CN117036599A - Virtual data set generation method and device, electronic equipment and storage medium


Info

Publication number
CN117036599A
Authority
CN
China
Prior art keywords
data set
virtual data
virtual
initial
target virtual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310943962.XA
Other languages
Chinese (zh)
Inventor
刘雪梅
杨帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Priority to CN202310943962.XA, filed 2023-07-28
Publication of CN117036599A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T15/00 - 3D [Three Dimensional] image rendering
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion


Abstract

The application relates to a virtual data set generation method and device, an electronic device and a storage medium, wherein the virtual data set generation method comprises the following steps: loading a three-dimensional model of an object to be identified into a virtual space; controlling a virtual camera in the virtual space to shoot the three-dimensional model at preset shooting points to obtain an initial virtual data set; and post-processing the initial virtual data set to obtain a target virtual data set. Compared with the prior art, the application has the advantages of high efficiency, strong applicability and the like.

Description

Virtual data set generation method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a virtual data set generating method, device, electronic apparatus, and storage medium.
Background
Target detection is one of the most challenging problems in the field of machine vision. Its task is to find a target object in an image and determine the object's category and position, and it has wide application scenarios and task requirements in industrial production.
At present, target detection algorithms based on artificial neural networks have gradually surpassed traditional detection algorithms in accuracy and generality. Unlike traditional detection algorithms, artificial neural networks rely not only on continuously iterated and refined algorithm models but also on tens of thousands of high-quality samples as a training set to guarantee the detection effect, which makes the production of the data set a key link in training a detection model.
At present, the related art still produces training data sets manually or semi-automatically, so the training efficiency of the detection model is low.
Disclosure of Invention
The application aims to overcome the defects of the prior art and provide a virtual data set generating method, a device, electronic equipment and a storage medium with high efficiency.
The aim of the application can be achieved by the following technical scheme:
according to a first aspect of the present application, there is provided a virtual data set generation method, the method comprising:
loading a three-dimensional model of an object to be identified into a virtual space;
controlling a virtual camera in the virtual space to shoot the three-dimensional model at a preset shooting point to acquire an initial virtual data set;
and carrying out post-processing on the initial virtual data set to obtain a target virtual data set.
As a preferred technical solution, the target virtual data set includes: a single target virtual dataset;
the single-target virtual data set comprises an initial single-target virtual image and a mask map of the initial single-target virtual image;
the post-processing the initial virtual data set to obtain a target virtual data set includes:
acquiring a rendering contour map of the initial single-target virtual image;
and obtaining a mask map of the initial single-target virtual image according to the rendering contour map.
As a preferred technical solution, the target virtual data set includes: a multi-target virtual dataset;
the multi-target virtual dataset includes an initial multi-target virtual image and a mask map of the initial multi-target virtual image;
the post-processing the initial virtual data set to obtain a target virtual data set includes:
obtaining a quasi-solid color rendering map of the initial multi-target virtual image;
and obtaining a mask map of the initial multi-target virtual image according to the quasi-solid color rendering map.
As a preferred technical solution, the three-dimensional model includes material information, appearance information and tag information of the object to be identified.
As a preferred technical solution, after the post-processing is performed on the initial virtual data set to obtain a target virtual data set, the method further includes:
and generating a tag file of the target virtual data set according to the tag information in the three-dimensional model.
As a preferred technical solution, the preset shooting points are:
a point of the virtual space, which is a preset distance away from the object to be identified;
and/or a point in the virtual space on a preset longitude;
and/or points in the virtual space at a preset latitude.
According to a second aspect of the present application, there is provided an instance segmentation model training method, the method comprising:
training an instance segmentation model by using a virtual data set generated by the virtual data set generation method provided in the first aspect or any one of the possible implementations of the first aspect;
verifying the trained instance segmentation model by adopting a real data set;
and iterating the above steps cyclically until the instance segmentation model meets the verification requirement.
According to a third aspect of the present application, there is provided a virtual data set generating apparatus comprising:
the model loading module is used for loading the three-dimensional model of the object to be identified into the virtual space;
the initial virtual data set acquisition module is used for controlling a virtual camera in the virtual space to shoot the three-dimensional model at a preset shooting point to acquire an initial virtual data set;
and the post-processing module is used for carrying out post-processing on the initial virtual data set to obtain a target virtual data set.
According to a fourth aspect of the present application, there is provided an electronic device comprising a memory and a processor, the memory having stored thereon a computer program which, when executed by the processor, implements the method provided by the first aspect or any one of the possible implementations of the first aspect.
According to a fifth aspect of the present application, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method provided by the first aspect or any one of the possible implementations of the first aspect.
Compared with the prior art, the application has the following beneficial effects:
1. High efficiency: the virtual data set generation method of the present application can generate virtual data sets fully automatically and can produce, in a short time, a large number of virtual samples for training a target detection model. This saves substantial manpower and time and improves the generation efficiency of the virtual data set, thereby helping to improve the training efficiency of the target detection model.
2. Strong applicability: the virtual data set generation method of the present application can be applied not only to detection and identification of objects in daily life, but also to scenes that require contour data of a target object, such as industrial grasping and obstacle avoidance; in addition, the virtual data set can replace an actual scene, so the method can also be applied to scenes in which real data sets are difficult to acquire, such as medical surgery and underwater detection, which effectively improves its applicability.
3. A target detection model with good performance can be obtained: when the virtual data set generated by the virtual data set generation method of the present application is used to train a target detection model such as an instance segmentation model, a good target detection effect can be obtained.
Drawings
FIG. 1 is a flow chart of a virtual data set generating method according to an embodiment of the application;
FIG. 2 is the model of part 1 used for single-target virtual data set production in an embodiment of the present application;
FIG. 3 is the part model used for multi-target virtual data set production in an embodiment of the present application;
FIG. 4 is a schematic diagram of an initial virtual dataset acquisition process according to an embodiment of the present application;
FIG. 5 is a schematic diagram of the mask map of an initial single-target virtual image in an embodiment of the present application;
FIG. 6 shows blended images of virtual images and their corresponding mask maps in an embodiment of the present application;
wherein figs. 6(a) and 6(b) are the blends of the initial single-target virtual images of part 1 and part 2 with their corresponding mask maps, respectively, and fig. 6(c) is the blended effect of combining part 1 from fig. 6(a) with part 2 from fig. 6(b);
FIG. 7 is the training loss curve of the instance segmentation model in an embodiment of the present application;
FIG. 8 shows the detection results on real images using the instance segmentation model in an embodiment of the present application;
wherein fig. 8(a) shows the detection result for a single target, and figs. 8(b) to 8(e) show the detection results for multiple targets;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description of the application and the claims and the description of the drawings above are intended to cover a non-exclusive inclusion.
In the description of embodiments of the present application, the technical terms "first," "second," and the like are used merely to distinguish between different objects and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated, a particular order or a primary or secondary relationship. In the description of the embodiments of the present application, the meaning of "plurality" is two or more unless explicitly defined otherwise.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In the description of the embodiments of the present application, the term "and/or" is merely an association relationship describing an association object, and indicates that three relationships may exist, for example, a and/or B may indicate: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
Fig. 1 is a flow chart of a virtual data set generating method according to an embodiment of the present application. The present application provides the method operation steps described in the embodiments or flowcharts, but more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only execution order. The method can be implemented in software and/or hardware. Referring to fig. 1, the method may include:
step S110: loading a three-dimensional model of an object to be identified into a virtual space;
step S120: shooting the three-dimensional model at a preset shooting point by controlling a virtual camera in the virtual space to obtain an initial virtual data set;
step S130: and carrying out post-processing on the initial virtual data set to obtain a target virtual data set.
Alternatively, the virtual space in step S110 may be constructed using the SolidWorks software, and the three-dimensional model of the object to be identified may be a three-dimensional digital model previously constructed by the SolidWorks software. Of course, the three-dimensional model in step S110 may be constructed by other software.
It is understood that the three-dimensional model in the step S110 may be directly constructed by the electronic device performing the virtual data set generating method, or may be constructed by another electronic device and then the constructed three-dimensional model is sent to the electronic device performing the virtual data set generating method.
Optionally, the three-dimensional model in step S110 includes material information, appearance information, and tag information of the object to be identified, where the tag information may be used to generate the tag file of the target virtual data set.
Optionally, before the three-dimensional model of the object to be identified is loaded into the virtual space, step S110 may set a virtual background of the virtual space, where the virtual background includes the background environment, object positions, illumination parameters, and the like.
It is understood that three-dimensional models can be classified into single-target three-dimensional models, which contain only one target to be recognized, and multi-target three-dimensional models, which contain several targets to be recognized. A single-target three-dimensional model may be a part model as shown in fig. 2, a multi-target three-dimensional model may be an assembly model as shown in fig. 3, and the relative positions of the multiple targets may be determined by the relative positions of the parts in the assembly model.
Alternatively, step S120 may use PhotoView 360, the rendering plug-in built into SolidWorks, to perform rendering and shooting; the rendering parameter settings may refer to Table 1. The contour color of the rendering contour map in the single-target virtual data set can be set according to the virtual background color and the three-dimensional model color, so that the contour is clearly distinguishable from both the virtual background and the three-dimensional model.
Table 1 Rendering and shooting parameters
Optionally, the preset shooting points in step S120 are: a point of a virtual space, which is a preset distance away from the object to be identified; and/or a point in the virtual space that is on a preset longitude; and/or points in the virtual space that are at a preset latitude.
It can be understood that, in this embodiment, the longitude and latitude of the camera and its distance from the object can be modified to achieve omnidirectional continuous shooting of the target object; a specific implementation is shown in fig. 4. Here L (unit: mm) is the distance from the camera to the target object; a1 (unit: degree) is the longitude of the camera in its spherical coordinate system; a2 (unit: degree) is the latitude of the camera in its spherical coordinate system; and q is a counter that counts the virtual pictures generated by rendering and provides the serial numbers used as file names when the pictures are saved. It can also be understood that the implementation shown in fig. 4 is equivalent to controlling the virtual camera to shoot at preset shooting points: changing the longitude, latitude and distance values between the camera and the object to be identified is precisely the process of moving from one preset shooting point to another.
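As an illustration of this traversal, the following Python sketch enumerates preset shooting points by stepping through distance, longitude and latitude values and converts each triple into Cartesian camera coordinates centered on the object. The sampling ranges and the render_at_point helper are hypothetical; in practice the camera pose would be set through the SolidWorks/PhotoView 360 interface.

```python
import math
from itertools import product

def spherical_to_cartesian(L, lon_deg, lat_deg):
    """Convert a (distance, longitude, latitude) shooting point into Cartesian
    camera coordinates centered on the target object."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    x = L * math.cos(lat) * math.cos(lon)
    y = L * math.cos(lat) * math.sin(lon)
    z = L * math.sin(lat)
    return x, y, z

# Hypothetical sampling ranges: L in mm, a1/a2 in degrees.
distances = [300, 400]
longitudes = range(0, 360, 15)
latitudes = range(15, 90, 15)

q = 0  # counter: serial number used as the file name of each rendered picture
for L, a1, a2 in product(distances, longitudes, latitudes):
    cam_pos = spherical_to_cartesian(L, a1, a2)
    # render_at_point(cam_pos, filename=f"{q}.jpg")  # placeholder for the
    # SolidWorks API call that moves the camera and triggers a render.
    q += 1
```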
Optionally, the target virtual data set comprises: a single target virtual dataset; the single-target virtual data set comprises an initial single-target virtual image and a mask map of the initial single-target virtual image;
At this time, the post-processing of the initial virtual data set in step S130 to obtain a target virtual data set includes: acquiring a rendering contour map of the initial single-target virtual image; and obtaining a mask map of the initial single-target virtual image according to the rendering contour map.
It can be appreciated that the initial single-target virtual image and the rendering contour map can be obtained by rendering twice from the same camera view, and they are saved in JPG and BMP format respectively. In the SolidWorks software, the rendering parameters ContourLineThickness and ContourLineColor can be set so that the rendered picture contains surface feature contours, which makes it easy to distinguish the target object from the background and facilitates subsequent data processing.
Alternatively, the mask map of the initial single-target virtual image may be an indexed image file with a bit depth of 8, acquired as follows (see fig. 5). After the rendering contour map of the initial single-target virtual image is acquired, it contains a colored contour (for example, a red contour), and the three RGB channels can be traversed according to the contour color so that the contour is extracted from the rendering contour map. The image is then binarized, and the holes inside the contour are filled by judging the connectivity of the target pixels. Finally, the picture is converted into the required format: on the basis of retaining the original data, each pixel is converted from binary to 8-bit depth, the image is loaded into an index table and rewritten into the indexed-image format, and its file format is changed from BMP to PNG, yielding the mask map of the initial single-target virtual image.
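The following Python sketch, using OpenCV and Pillow, illustrates this contour-to-mask post-processing under the stated assumptions (a red contour on a background that reaches the image border); the color thresholds, palette and file paths are illustrative, not the embodiment's exact parameters.

```python
import cv2
import numpy as np
from PIL import Image

def mask_from_contour_render(path_bmp, path_png):
    """Sketch of the single-target mask pipeline: extract the red rendered
    contour, fill its interior via connectivity, and save an 8-bit indexed
    PNG (index 0 = background, index 1 = target)."""
    img = cv2.imread(path_bmp)                       # OpenCV loads as BGR
    b, g, r = cv2.split(img)
    # Pixels that are strongly red and weakly blue/green -> contour pixels.
    contour = ((r > 180) & (g < 100) & (b < 100)).astype(np.uint8) * 255

    # Fill holes inside the contour: flood-fill the background from a border
    # pixel; whatever the fill cannot reach is interior and becomes mask.
    h, w = contour.shape
    flood = contour.copy()
    ff_mask = np.zeros((h + 2, w + 2), np.uint8)
    cv2.floodFill(flood, ff_mask, (0, 0), 255)
    filled = contour | cv2.bitwise_not(flood)

    # Convert the binary result to an 8-bit indexed image and save as PNG.
    index_map = (filled > 0).astype(np.uint8)
    out = Image.fromarray(index_map, mode="P")
    out.putpalette([0, 0, 0, 255, 255, 255])         # index 0 black, 1 white
    out.save(path_png)
```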
Optionally, the target virtual data set comprises: a multi-target virtual dataset; the multi-target virtual dataset includes an initial multi-target virtual image and a mask map of the initial multi-target virtual image;
At this time, the post-processing of the initial virtual data set in step S130 to obtain a target virtual data set includes: obtaining a quasi-solid color rendering map of the initial multi-target virtual image; and obtaining a mask map of the initial multi-target virtual image according to the quasi-solid color rendering map.
It can be appreciated that multi-target data contain several target objects in one picture, and each object needs to be associated with its own label; this embodiment therefore uses a quasi-solid color rendering mode to provide data support for the subsequent generation of the mask file. The initial multi-target virtual image is rendered by batch-processing the same group of parts twice: one pass renders the original image, and the other performs quasi-solid color rendering. The two passes keep the same part placement and ensure that the camera angle and distance at every rotation are exactly the same, so that the original image matches the mask and the running time of the algorithm is reduced.
According to the quasi-solid color rendering map, the mask map of the initial multi-target virtual image can be acquired as follows:
first, the colors corresponding to the target parts are identified in the quasi-solid color rendering map according to the color order in the index table; combining the generated label information, every pixel in the image is traversed to judge which part it belongs to, and a pixel that belongs to no part is judged to be background;
the coordinates of the judged pixels are then stored in separate matrices, one per target; after binarization, hole filling and format conversion are performed on each matrix, an independent mask of each target is obtained, whose background pixel value is 0 and whose mask pixel value is 1; the target-area pixel values in each color matrix are modified according to the order of the index table, and the matrices are merged;
after the data matrix is produced, the index table is loaded and the result is saved in PNG format.
It can be understood that this embodiment designs an overlap detection algorithm: before the matrix data are modified, the masks are summed once; since every mask pixel value is 1 at this point, any pixel whose value is greater than 1 after summation belongs to an overlapping region. Overlaps occur only at the edge transitions of two or more parts, and the pixel values of the overlapping region are set to zero so that it becomes black background.
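A minimal NumPy sketch of this multi-target masking with overlap zeroing, assuming each part was rendered in a known solid color; the color table and tolerance are assumptions, not the embodiment's parameters.

```python
import numpy as np

def multi_target_index_mask(solid_render, part_colors, tol=30):
    """Build an indexed mask from a quasi-solid color render.
    solid_render: HxWx3 uint8 RGB image; part_colors: list of RGB tuples,
    one per part, in index-table order (index 0 is background)."""
    h, w, _ = solid_render.shape
    per_part = []
    for color in part_colors:
        diff = np.abs(solid_render.astype(int) - np.array(color)).sum(axis=2)
        per_part.append((diff < tol).astype(np.uint8))  # binary mask, 0/1

    # Overlap detection: summing the 0/1 masks makes overlapping pixels > 1.
    stack = np.stack(per_part)               # shape (num_parts, H, W)
    overlap = stack.sum(axis=0) > 1
    stack[:, overlap] = 0                    # zero overlaps -> black background

    # Merge into one index map: part i gets index i + 1, background stays 0.
    index_map = np.zeros((h, w), dtype=np.uint8)
    for i, mask in enumerate(stack):
        index_map[mask == 1] = i + 1
    return index_map
```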
Optionally, after the mask map of each target in the initial multi-target virtual image is obtained from the quasi-solid color rendering map, a blended overlay image can be created by calling the imfuse function in MATLAB: the virtual original image and the mask map are alpha-blended, and the intensity values in the two images are jointly scaled so that they are mixed into one image. The blends of the original images and mask maps are shown in fig. 6: figs. 6(a) and 6(b) are the single-target blends of the initial single-target virtual images of part 1 and part 2 with their corresponding mask maps, and fig. 6(c) is the blended effect of combining part 1 from fig. 6(a) with part 2 from fig. 6(b).
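For readers without MATLAB, an equivalent alpha blend can be sketched in Python with OpenCV; the equal 0.5/0.5 weighting is an assumption, since imfuse's exact intensity scaling may differ.

```python
import cv2

def blend_preview(original_path, mask_path, out_path, alpha=0.5):
    """Overlay a virtual image with its (colorized) mask for visual checking,
    roughly mirroring MATLAB's imfuse 'blend' behaviour."""
    original = cv2.imread(original_path)
    mask = cv2.imread(mask_path)
    mask = cv2.resize(mask, (original.shape[1], original.shape[0]))
    blended = cv2.addWeighted(original, alpha, mask, 1.0 - alpha, 0)
    cv2.imwrite(out_path, blended)
```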
optionally, after post-processing the initial virtual data set in step S130 to obtain a target virtual data set, the method further includes: and generating a tag file of the target virtual data set according to the tag information in the three-dimensional model.
In this embodiment, the part label is directly written into the Excel file, and the label source is a preset part name in the three-dimensional model.
The part names of multi-target data belong to an assembly, in which similar parts may exist. For the case where an assembly contains an uncertain number of part names, the number and names of the parts visible under the current camera view are read through the API interface, and the name labels are read in order and written into Excel according to the number of visible parts. When several similar parts exist, parts of the same type are distinguished by a suffix appended to the name. The Pattern Recognition Toolbox add-on is loaded in MATLAB, MATLAB reads the content of the Excel intermediate file that stores the label information, and the result is finally saved in a YML file.
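The embodiment reads the labels via the SolidWorks API and MATLAB; purely as an illustration of the Excel-to-YML step, a Python sketch using openpyxl and PyYAML might look as follows, where the one-name-per-row column layout is an assumption about the intermediate file.

```python
import openpyxl
import yaml

def excel_labels_to_yml(xlsx_path, yml_path):
    """Read one part label per row from the intermediate Excel file and save
    them as a YML label file, giving duplicate part types a numeric suffix."""
    sheet = openpyxl.load_workbook(xlsx_path).active
    labels = [row[0] for row in sheet.iter_rows(values_only=True) if row[0]]
    seen, unique = {}, []
    for name in labels:
        n = seen.get(name, 0)
        unique.append(name if n == 0 else f"{name}_{n}")  # e.g. "part2_1"
        seen[name] = n + 1
    with open(yml_path, "w", encoding="utf-8") as f:
        yaml.safe_dump({"labels": unique}, f, allow_unicode=True)
```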
The foregoing describes an embodiment of the virtual data set generating method of the present application; the following describes an embodiment of an instance segmentation model training method to further explain the scheme of the present application.
The embodiment of the application also provides an instance segmentation model training method, which comprises the following steps:
training an instance segmentation model by adopting the virtual data set generated by the virtual data set generation method;
verifying the trained instance segmentation model by adopting a real data set;
and iterating the above steps cyclically until the instance segmentation model meets the verification requirement.
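This loop can be summarized by the following Python-style sketch; model.train, model.evaluate and the IoU-threshold stopping rule are hypothetical stand-ins, since the application does not fix a concrete verification criterion.

```python
def train_until_valid(model, virtual_data, real_data, target_iou=0.8):
    """Train on the generated virtual data set, verify on a real data set,
    and iterate until the verification requirement is met. The 0.8 IoU
    threshold and the model interface are illustrative assumptions."""
    while True:
        model.train(virtual_data)          # train on virtual samples
        score = model.evaluate(real_data)  # verify on real samples
        if score >= target_iou:            # verification requirement met
            return model
```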
The instance segmentation model may employ a Mask R-CNN model, and the virtual data set consists of 504 virtual images of part 1, 504 virtual images of part 2, and 504 multi-target virtual images containing part 1 and part 2.
In this embodiment, a GPU-based deep learning computing platform is built; its hardware parameters are listed in Table 2. The software runs on Windows 10, the deep learning framework is TensorFlow with Keras, and the programming language is Python 3.6. Python data processing libraries and image processing libraries such as OpenCV, NumPy, Pillow and scikit are configured.
Table 2 Hardware configuration of the computing platform
The virtual data set is used to train a Mask R-CNN detection model. The backbone network is ResNet-101, COCO pre-training weights are adopted, the learning rate is 0.001, and 100 epochs are trained in total. The model training loss curve is shown in figure 7.
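As a sketch of such a training setup, assuming the widely used Matterport Mask_RCNN implementation (which matches the TensorFlow/Keras stack described above); the dataset objects, class count and paths are hypothetical, and the embodiment's actual scripts may differ.

```python
from mrcnn.config import Config
import mrcnn.model as modellib

class PartsConfig(Config):
    NAME = "parts"
    NUM_CLASSES = 1 + 2          # background + part 1 + part 2
    BACKBONE = "resnet101"       # ResNet-101 backbone, as in the embodiment
    LEARNING_RATE = 0.001

def train_parts_model(train_set, val_set):
    """train_set / val_set are hypothetical mrcnn Dataset objects built from
    the virtual images and their PNG mask maps."""
    config = PartsConfig()
    model = modellib.MaskRCNN(mode="training", config=config, model_dir="logs")
    # Start from COCO pre-trained weights, skipping heads whose shapes differ.
    model.load_weights("mask_rcnn_coco.h5", by_name=True,
                       exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                                "mrcnn_bbox", "mrcnn_mask"])
    model.train(train_set, val_set,
                learning_rate=config.LEARNING_RATE,
                epochs=100, layers="all")
    return model
```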
Real pictures containing part 1 and part 2 were detected using the trained instance segmentation model, with the results shown in fig. 8. Analysis of the detection results shows that the detection model trained on the virtual data set performs instance segmentation well on target objects in a real environment. Fig. 8(a) shows that the model segments a single-target real picture well. The backgrounds of figs. 8(b) to 8(e) are dot patterns that do not appear in the data set, with background reflections of varying degrees; in fig. 8(d), a part 2 made of cast iron and of slightly different shape is detected, although this material does not appear in the virtual training data set. These results show that the detection model generalizes well and can identify part 2 instances of different materials or slightly different shapes. The non-target parts in fig. 8(e) were not detected, indicating that the detection model rejects interference items well.
In this embodiment, a statistical analysis of the instance segmentation accuracy of the two types of parts is performed; the instance segmentation accuracy and classification accuracy of the two targets are shown in Table 3. The instance segmentation accuracy IoU is the ratio of the intersection of the segmentation result and the ground-truth region of the object to their union; it is dimensionless, its theoretical maximum is 1, and the closer it is to 1, the higher the segmentation accuracy. Precision is the proportion of the samples predicted to be true that are actually true; recall is the proportion of all actually true samples that are detected. If one of these indices is high while the other is too low, the detection model has a classification problem.
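These definitions can be made concrete with a short Python sketch; the pixel-level IoU and the count-based precision/recall below follow the definitions above, with toy counts as assumptions.

```python
import numpy as np

def mask_iou(pred_mask, gt_mask):
    """IoU of a predicted binary mask and the ground-truth mask:
    intersection over union; dimensionless, at most 1."""
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    return np.logical_and(pred, gt).sum() / np.logical_or(pred, gt).sum()

def precision_recall(tp, fp, fn):
    """precision = true detections / all detections;
    recall = true detections / all ground-truth instances."""
    return tp / (tp + fp), tp / (tp + fn)

# Toy usage with hypothetical counts:
# p, r = precision_recall(tp=82, fp=2, fn=3)
```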
Table 3 Instance segmentation accuracy and classification accuracy of the two targets

Target object | Segmentation accuracy IoU (%) | Precision (%) | Recall (%)
Part 1        | 82.4                          | 97.6          | 96.4
Part 2        | 85.5                          | 98.8          | 96.5
The analysis results show that the virtual data set produced by this scheme converges during Mask R-CNN network model training and, at the same time, achieves a good detection effect on real pictures, which verifies the feasibility of the virtual data set production method.
The above description of the method embodiments further describes the solution of the present application by means of device embodiments.
The embodiment of the application also provides a virtual data set generating device, which comprises:
the model loading module is used for loading the three-dimensional model of the object to be identified into the virtual space;
the initial virtual data set acquisition module is used for controlling a virtual camera in the virtual space to shoot the three-dimensional model at a preset shooting point to acquire an initial virtual data set;
and the post-processing module is used for carrying out post-processing on the initial virtual data set to obtain a target virtual data set.
Optionally, the target virtual data set acquired by the post-processing module includes: a single target virtual dataset;
the single-target virtual data set comprises an initial single-target virtual image and a mask map of the initial single-target virtual image;
at this time, the post-processing module is specifically configured to: acquiring a rendering contour map of the initial single-target virtual image; and obtaining a mask map of the initial single-target virtual image according to the rendering contour map.
Optionally, the target virtual data set acquired by the post-processing module includes: a multi-target virtual dataset; the multi-target virtual dataset comprises an initial multi-target virtual image and a mask map of the initial multi-target virtual image;
at this time, the post-processing module is specifically configured to: obtaining a quasi-solid color rendering map of the initial multi-target virtual image; and obtaining a mask map of the initial multi-target virtual image according to the quasi-solid color rendering map.
Optionally, the three-dimensional model loaded into the virtual space by the model loading module includes material information, appearance information and label information of the object to be identified.
Optionally, the virtual data set generating device further includes:
and the tag file generation module is used for generating a tag file of the target virtual data set according to the tag information in the three-dimensional model.
Optionally, the preset shooting points in the initial virtual data set acquisition module are: a point of the virtual space, which is a preset distance away from the object to be identified; and/or a point in the virtual space on a preset longitude; and/or points in the virtual space at a preset latitude.
Fig. 9 shows a schematic block diagram of an electronic device that may be used to implement embodiments of the present application. As shown in fig. 9, the electronic device of the present application includes a central processing unit (CPU) that can perform various appropriate actions and processes according to computer program instructions stored in a read-only memory (ROM) or loaded from a storage unit into a random access memory (RAM). The RAM can also store various programs and data required for the operation of the device. The CPU, the ROM and the RAM are connected to each other by a bus, and an input/output (I/O) interface is also connected to the bus.
A plurality of components in a device are connected to an I/O interface, comprising: an input unit such as a keyboard, a mouse, etc.; an output unit such as various types of displays, speakers, and the like; a storage unit such as a magnetic disk, an optical disk, or the like; and communication units such as network cards, modems, wireless communication transceivers, and the like. The communication unit allows the device to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processing unit performs the respective methods and processes described above, such as the inventive method steps S110 to S130. For example, in some embodiments, method steps S110-S130 of the present application may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as a storage unit. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device via the ROM and/or the communication unit. When the computer program is loaded into RAM and executed by the CPU, one or more of the steps of the method steps S110 to S130 of the application described above may be performed. Alternatively, in other embodiments, the CPU may be configured to perform the inventive method steps S110-S130 by any other suitable means (e.g. by means of firmware).
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
Program code for carrying out methods of the present application may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present application, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
While the application has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (10)

1. A method of generating a virtual dataset, the method comprising:
loading a three-dimensional model of an object to be identified into a virtual space;
controlling a virtual camera in the virtual space to shoot the three-dimensional model at a preset shooting point to acquire an initial virtual data set;
and carrying out post-processing on the initial virtual data set to obtain a target virtual data set.
2. The method of claim 1, wherein the target virtual data set comprises: a single target virtual dataset;
the single-target virtual data set comprises an initial single-target virtual image and a mask map of the initial single-target virtual image;
the post-processing the initial virtual data set to obtain a target virtual data set includes:
acquiring a rendering contour map of the initial single-target virtual image;
and obtaining a mask map of the initial single-target virtual image according to the rendering contour map.
3. The method of claim 1, wherein the target virtual data set comprises: a multi-target virtual dataset;
the multi-target virtual dataset includes an initial multi-target virtual image and a mask map of the initial multi-target virtual image;
the post-processing the initial virtual data set to obtain a target virtual data set includes:
obtaining a quasi-solid color rendering map of the initial multi-target virtual image;
and obtaining a mask map of the initial multi-target virtual image according to the quasi-solid color rendering map.
4. A virtual data set generating method according to any one of claims 1 to 3, wherein the three-dimensional model includes material information, appearance information, and tag information of the object to be identified.
5. The method of generating a virtual data set according to claim 4, further comprising, after said post-processing said initial virtual data set to obtain a target virtual data set:
and generating a tag file of the target virtual data set according to the tag information in the three-dimensional model.
6. A virtual data set generating method according to any one of claims 1 to 3, wherein the preset shooting points are:
a point of the virtual space, which is a preset distance away from the object to be identified;
and/or a point in the virtual space on a preset longitude;
and/or points in the virtual space at a preset latitude.
7. An instance segmentation model training method, the method comprising:
training an instance segmentation model by using the virtual data set generated by the virtual data set generation method according to any one of claims 1 to 6;
verifying the trained instance segmentation model by adopting a real data set;
and iterating the above steps cyclically until the instance segmentation model meets the verification requirement.
8. A virtual data set generating apparatus, the apparatus comprising:
the model loading module is used for loading the three-dimensional model of the object to be identified into the virtual space;
the initial virtual data set acquisition module is used for controlling a virtual camera in the virtual space to shoot the three-dimensional model at a preset shooting point to acquire an initial virtual data set;
and the post-processing module is used for carrying out post-processing on the initial virtual data set to obtain a target virtual data set.
9. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program, characterized in that the processor, when executing the program, implements the method according to any of claims 1-6 or 7.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the method according to any of claims 1-6 or 7.

Publications (1)

CN117036599A, published 2023-11-10. Family ID: 88638295.

