CN109614983B - Training data generation method, device and system - Google Patents

Training data generation method, device and system

Info

Publication number
CN109614983B
CN109614983B
Authority
CN
China
Prior art keywords
image
target sample
training data
labeling
instruction
Prior art date
Legal status
Active
Application number
CN201811260697.0A
Other languages
Chinese (zh)
Other versions
CN109614983A (en)
Inventor
刘源 (Liu Yuan)
Current Assignee
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd filed Critical Advanced New Technologies Co Ltd
Priority to CN201811260697.0A priority Critical patent/CN109614983B/en
Publication of CN109614983A publication Critical patent/CN109614983A/en
Application granted granted Critical
Publication of CN109614983B publication Critical patent/CN109614983B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting


Abstract

The embodiments of the specification provide a training data generation method, device and system. The method comprises the following steps: sending an image acquisition instruction to an image acquisition device so that the image acquisition device acquires a first image of a target sample according to the instruction of the image acquisition instruction; acquiring the first image from the image acquisition device, and replacing the background image of the first image with a set background image to obtain a second image of the target sample; and finally, labeling the target sample in the second image through a data labeling algorithm to obtain training data; wherein the background image is the area of the first image other than the target sample.

Description

Training data generation method, device and system
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a method, an apparatus, and a system for generating training data.
Background
With the rapid development of computer technology, machine learning has been widely applied in many fields. For example, in the field of unmanned retail, a computer-vision-based scheme can use image recognition technology to identify the commodity a user purchases, completing the purchase and payment process. Currently, the mainstream approach to image recognition is deep learning based on convolutional neural networks (CNNs).
When image recognition is performed with a deep learning algorithm, a large amount of training data generally needs to be collected in advance; the training data is used to train a CNN model, and the trained CNN model is then used to perform image recognition.
Therefore, a method for generating training data quickly and efficiently is needed to meet the model training requirements.
Disclosure of Invention
The embodiments of the specification aim to provide a training data generation method, device and system. When training data is generated, the image acquisition device can be controlled to acquire a first image of a target sample by sending it an image acquisition instruction, realizing automatic acquisition of images of the target sample. In addition, in the embodiments of the specification, the target sample in the second image is labeled by a data labeling algorithm, realizing automatic labeling of the target sample and improving data labeling efficiency. The embodiments of the specification thus realize automatic generation of training data, which improves generation efficiency, reduces labor cost, and yields training data of high accuracy.
In order to solve the above technical problems, the embodiments of the present specification are implemented as follows:
The embodiment of the specification provides a training data generation method, which comprises the following steps:
sending an image acquisition instruction to an image acquisition device so that the image acquisition device acquires a first image of a target sample according to the instruction of the image acquisition instruction;
acquiring the first image, and replacing the background image of the first image by using a set background image to obtain a second image of the target sample; wherein the background image is an area of the first image other than the target sample;
and labeling the target sample in the second image through a data labeling algorithm to obtain the training data.
The embodiment of the specification also provides a training data generating device, which comprises:
a first sending module, configured to send an image acquisition instruction to an image acquisition device so that the image acquisition device acquires a first image of a target sample according to the instruction of the image acquisition instruction;
an acquisition module, configured to acquire the first image;
a replacing module, configured to replace the background image of the first image with a set background image to obtain a second image of the target sample, wherein the background image is the area of the first image other than the target sample;
and a labeling module, configured to label the target sample in the second image through a data labeling algorithm to obtain the training data.
The embodiment of the specification also provides a training data generation system, which comprises an image acquisition device and a training data generation device, wherein the training data generation device comprises the training data generating apparatus described above;
the image acquisition equipment is used for receiving an image acquisition instruction sent by the training data generation equipment and acquiring a first image of a target sample according to the instruction of the image acquisition instruction;
the training data generation device is used for sending an image acquisition instruction to the image acquisition device; and the image acquisition device is further used for acquiring the first image from the image acquisition device and replacing the background image of the first image with a set background image to obtain a second image of the target sample; wherein the background image is an area of the first image other than the target sample; and labeling the target sample in the second image through a data labeling algorithm to obtain the training data.
The embodiment of the specification also provides a training data generating device, which comprises:
A processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
sending an image acquisition instruction to an image acquisition device so that the image acquisition device acquires a first image of a target sample according to the instruction of the image acquisition instruction;
acquiring the first image, and replacing the background image of the first image by using a set background image to obtain a second image of the target sample; wherein the background image is an area of the first image other than the target sample;
and labeling the target sample in the second image through a data labeling algorithm to obtain training data.
The present description also provides a storage medium for storing computer-executable instructions that, when executed, implement the following:
sending an image acquisition instruction to an image acquisition device so that the image acquisition device acquires a first image of a target sample according to the instruction of the image acquisition instruction;
acquiring the first image, and replacing the background image of the first image by using a set background image to obtain a second image of the target sample; wherein the background image is an area of the first image other than the target sample;
and labeling the target sample in the second image through a data labeling algorithm to obtain training data.
According to the above technical solutions, when training data is generated, the image acquisition device can be controlled to acquire the first image of the target sample by sending it an image acquisition instruction, realizing automatic acquisition of images of the target sample. In addition, in the embodiments of the specification, the target sample in the second image is labeled by a data labeling algorithm, realizing automatic labeling of the target sample and improving data labeling efficiency. The embodiments of the specification thus realize automatic generation of training data, which improves generation efficiency, reduces labor cost, and yields training data of high accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present description or the technical solutions in the prior art, the drawings required by the embodiments or by the description of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments described in the present application, and a person of ordinary skill in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a first flowchart of the training data generation method provided in the embodiments of the present disclosure;
FIG. 2 is a flowchart of the method for labeling a target sample in the training data generation method provided in the embodiments of the present disclosure;
FIG. 3 is a second flowchart of the training data generation method provided in the embodiments of the present disclosure;
FIG. 4 is a third flowchart of the training data generation method provided in the embodiments of the present disclosure;
FIG. 5 is a schematic block diagram of the training data generating device provided in the embodiments of the present disclosure;
FIG. 6 is a first schematic structural diagram of the training data generating system provided in the embodiments of the present disclosure;
FIG. 7 is a second schematic structural diagram of the training data generating system provided in the embodiments of the present disclosure;
FIG. 8 is a schematic structural diagram of the training data generating apparatus provided in the embodiments of the present disclosure.
Detailed Description
In order to make the technical solutions in the present application better understood by those skilled in the art, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present application.
The embodiment of the specification provides a method, a device, a system and a storage medium for generating training data, which can realize automatic acquisition and automatic labeling of the training data, thereby realizing automatic generation of the training data; the generation efficiency of the training data and the accuracy of the generated training data can be improved.
One specific application field of the method provided in the embodiments of the present disclosure is unmanned retail, where the commodity purchased by a user must be identified in order to complete the transaction. Specifically, an image recognition model may be used to recognize the commodity, so the image recognition model needs to be trained in advance; and before the model can be trained, training data needs to be generated.
FIG. 1 is a first flowchart of the training data generating method provided in the embodiments of the present disclosure; the method shown in FIG. 1 includes at least the following steps:
step 102, sending an image acquisition instruction to the image acquisition device so that the image acquisition device acquires a first image of the target sample according to the instruction of the image acquisition instruction.
The execution subject of the method provided in the embodiments of the present disclosure is a training data generating device, that is, a device with image processing capability such as a computer, a mobile phone or a tablet computer; more specifically, it may be a training data generating apparatus installed on the training data generating device.
Specifically, a communication connection is established between the training data generating device and the image acquisition device. When an image of the target sample needs to be acquired, the training data generating device sends an image acquisition instruction to the image acquisition device; after receiving the instruction, the image acquisition device acquires an image of the target sample according to it, which is recorded as the first image.
In implementation, the image acquisition instruction may be triggered by the user through the training data generating device, or the training data generating device may automatically send image acquisition instructions to the image acquisition device at a set frequency. The present specification does not limit the triggering manner of the image acquisition instruction.
In this embodiment of the present disclosure, the target sample may be any object such as a model, an animal or plant specimen, or a commodity to be sold, and the specific content represented by the target sample may be determined according to an actual application scenario, which is not limited in this embodiment of the present disclosure.
In particular, for a single target sample it may be necessary in some cases to acquire images from multiple viewing angles. Therefore, in implementation, after the image acquisition device acquires one first image of the target sample, it sends the acquired first image to the training data generating device so that the training data generating device generates training data for that first image; of course, in other embodiments, the training data generating device may instead generate the training data for the target sample after all of its images have been acquired.
Step 104, acquiring the first image, and replacing the background image of the first image with a set background image to obtain a second image of the target sample; wherein the background image is the area of the first image other than the target sample.
In step 104, the first image sent by the image acquisition device is received. When the first image of the target sample is captured, the captured content includes both the target sample and the background area in which it is currently located; thus, the area of the first image corresponding to the target sample may be recorded as the foreground image, and the area corresponding to the background as the background image.
In a specific application, the target sample may be placed in different scenes, so the background in which it appears may change. Therefore, to further improve the recognition accuracy of the trained image recognition model, in the present embodiment the background image of the first image is replaced with the set background image after the first image of the target sample is acquired.
The set background image is the background in which the target sample appears in the actual application scene. For example, if the target sample is a commodity sold in a vending machine, then when the commodity is identified in practice, the background of the captured commodity image is the background of the commodity's placement position inside the vending machine. Therefore, to further improve the accuracy of the training data and the recognition performance of the trained image recognition model, the background image in the first image of the commodity can be replaced with an image of the corresponding background inside the vending machine.
And 106, labeling the target sample in the second image through a data labeling algorithm to obtain training data.
In the above step 106, labeling the target sample in the second image means labeling the position of the target sample in the second image together with attribute information such as the name and category of the target sample. Of course, other information about the target sample may also be labeled; the embodiments of the present disclosure do not limit this.
Specifically, in step 106, the position of the target sample in the second image may be marked by marking the pixel point corresponding to the target sample, so that the marking accuracy of the target sample is higher.
In the embodiment of the present disclosure, attribute information such as the name and category of the target sample, together with the labeled pixel-point annotation image corresponding to the target sample, serves as the labeling data of the target sample, and the second image together with its labeling data serves as the training data.
In the embodiment of the specification, the image acquisition device can be controlled to acquire the first image of the target sample by sending the image acquisition instruction to the image acquisition device, so that the automatic acquisition of the image of the target sample is realized; in addition, in the embodiment of the specification, the target sample in the second image is marked by a data marking algorithm, so that the automatic marking of the target sample is realized, and the data marking efficiency is improved; therefore, the automatic generation of the training data is realized, the generation efficiency of the training data is improved, the labor cost is reduced, and the accuracy of the generated training data is higher.
In order to facilitate understanding of the training data generation method provided in the embodiments of the present disclosure, the specific implementation process of each step will be described in detail below.
In this embodiment of the present disclosure, the image acquisition device may be an RGB (red, green, blue) camera, in which case the captured first image is an RGB image. Alternatively, in another embodiment, to capture stereoscopic information about the target sample and add dimensional information to the training data, the image acquisition device may include a depth camera in addition to the RGB camera, used to capture a depth image of the target sample; in that case the first image of the target sample includes both the RGB image and the depth image. Since the implementation of replacing the background image of the first image with the set background image in step 104 differs between these two cases, the implementation of step 104 is discussed below for each.
In the first case,
the image acquisition equipment only comprises an RGB camera, and correspondingly, the first image is an RGB image;
in this case, in step 104, the background image of the first image is replaced with the set background image to obtain a second image of the target sample, including the following steps (1) and (2);
Step (1), extracting a foreground image of a first image; the foreground image is an area corresponding to the target sample;
and (2) synthesizing the foreground image and the set background image to obtain a second image.
In this embodiment of the present disclosure, the foreground image of the first image may be extracted by a mask-based method, which specifically includes the following steps:
First, perform image graying and binarization on the first image and extract the contour of the target sample in the first image. Then create a mask image of the same size as the first image; for example, if the first image is 640 × 480 (both numbers being pixel counts), the created mask image is also 640 × 480. Initialize the values of all pixel points on the newly created mask image to 0, so that the mask image is a fully black image.
On the mask image, use the extracted contour of the target sample to circle a region of interest, and set the values of all pixel points inside the region of interest to 255, making it a white area. Thus, on the mask image, pixel points inside the region of interest are non-zero and pixel points outside it are 0. Then perform an AND operation between each pixel point on the mask image and the corresponding pixel point on the first image. Outside the region of interest the result of the AND operation is zero, so the resulting image retains only the target sample region (the region of interest), and the pixel values of all other regions are zero, i.e. black.
Finally, matte the target sample area directly out of the resulting image, and take the matted-out target area as the foreground image of the first image.
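For illustration, a minimal Python/OpenCV sketch of this mask-based extraction follows; the binarization threshold (127) and the use of the largest contour as the target sample's outline are assumptions of the sketch, not requirements of this embodiment.

```python
import cv2
import numpy as np

def extract_foreground(first_image):
    """Mask-based foreground extraction as described above (illustrative)."""
    gray = cv2.cvtColor(first_image, cv2.COLOR_BGR2GRAY)           # image graying
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)   # binarization
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    target = max(contours, key=cv2.contourArea)                    # assumed: largest contour is the target sample

    # Mask image of the same size as the first image, all pixel values initialized to 0 (black)
    mask = np.zeros(first_image.shape[:2], dtype=np.uint8)
    # Circle the region of interest with the contour and set its pixel values to 255 (white)
    cv2.drawContours(mask, [target], -1, color=255, thickness=cv2.FILLED)

    # AND each pixel with the corresponding pixel of the first image:
    # outside the region of interest the result is zero (black)
    masked = cv2.bitwise_and(first_image, first_image, mask=mask)

    # Matte the target sample area out along its bounding box
    x, y, w, h = cv2.boundingRect(target)
    return masked[y:y + h, x:x + w], mask[y:y + h, x:x + w]
```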
Of course, the foregoing merely describes a specific implementation process of extracting the foreground image from the first image, and the foreground image may be extracted from the first image by other manners, which are not listed in the embodiments of the present disclosure.
In step (2), to synthesize the foreground image with the set background image, an overlay area for the foreground image may first be determined in the set background image, and the foreground image is then superimposed onto that area of the set background image.
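Continuing the sketch above, the synthesis of step (2) might look as follows; the centred placement of the overlay area is an assumption, since the embodiment only requires that some overlay area be determined in the set background image.

```python
def composite(foreground, fg_mask, background):
    """Superimpose the extracted foreground onto the set background image (illustrative)."""
    second_image = background.copy()
    h, w = foreground.shape[:2]
    y0 = (background.shape[0] - h) // 2   # assumed: centred overlay area
    x0 = (background.shape[1] - w) // 2
    roi = second_image[y0:y0 + h, x0:x0 + w]
    # copy only the pixels that belong to the target sample
    roi[fg_mask > 0] = foreground[fg_mask > 0]
    return second_image
```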
In the second case,
the image acquisition device comprises an RGB camera and a depth camera, wherein the RGB camera is used for acquiring an RGB image of the target sample and the depth camera is used for acquiring a depth image of the target sample, so that the first image of the target sample comprises the RGB image and the depth image;
in this case, in step 104, the background image of the first image is replaced with the set background image to obtain the second image of the target sample, which specifically includes the following steps (1), (2) and (3):
Step (1), extracting a foreground image of the first image; wherein the foreground image is the area corresponding to the target sample;
Step (2), generating a virtual viewpoint image corresponding to the foreground image according to the foreground image and the depth image;
Step (3), synthesizing the virtual viewpoint image with the set background image to obtain the second image.
The specific implementation of step (1) here may refer to step (1) in the first case, and is not repeated.
In step (2), the virtual viewpoint image corresponding to the foreground image is generated from the foreground image and the depth image as follows:
project the foreground image into the world coordinate system according to the depth image to obtain the projection coordinates of the foreground image in the world coordinate system; then project the foreground image onto a virtual image plane according to those projection coordinates, thereby obtaining the virtual viewpoint image of the foreground image in the virtual image plane.
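A minimal sketch of this projection, assuming a pinhole camera model in which the original camera frame serves as the world coordinate system; the intrinsic matrix K and the virtual camera pose (R, t) are hypothetical inputs, since the embodiment does not prescribe a particular camera model, and occlusion holes are left unfilled.

```python
import numpy as np

def virtual_viewpoint(foreground, depth, K, R, t):
    """Depth-based reprojection of the foreground onto a virtual image plane (illustrative).
    K: 3x3 intrinsics assumed shared by both views; R (3x3), t (3,): virtual camera pose."""
    h, w = depth.shape[:2]
    out = np.zeros_like(foreground)
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    for v in range(h):
        for u in range(w):
            z = float(depth[v, u])
            if z <= 0:                     # no depth / background pixel
                continue
            # project pixel (u, v) into the world coordinate system
            X = np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])
            # re-project the world point onto the virtual image plane
            p = K @ (R @ X + t)
            if p[2] <= 0:
                continue
            u2, v2 = int(round(p[0] / p[2])), int(round(p[1] / p[2]))
            if 0 <= u2 < w and 0 <= v2 < h:
                out[v2, u2] = foreground[v, u]
    return out
```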
Of course, the above-described manner of generating the virtual viewpoint image from the planar image (foreground image) and the depth image is not limited thereto, and may be implemented by other manners, and the embodiments of the present specification are not listed.
In the embodiment of the present disclosure, by capturing both the depth image and the RGB image of the target sample, information about the target sample in more dimensions can be obtained, so the generated training data carries more data information.
Specifically, as shown in fig. 2, in the step 106, the target sample in the second image is labeled by a data labeling algorithm to obtain training data, which specifically includes the following steps:
step 1062, determining the attribute tag of the target sample; wherein the attribute tag includes at least a sample name of the target sample;
step 1064, determining a pixel point corresponding to the target sample by performing image segmentation on the second image, and generating a pixel point labeling image corresponding to the target sample;
step 1066, determining the second image, the attribute tag, and the pixel annotation image as training data.
In the embodiment of the present disclosure, the attribute tag of the target sample refers to each attribute information of the target sample, and may be, for example, information such as a category of the target sample, a name of the target sample, a price of the target sample, a production place of the target sample, and a production date of the target sample.
In step 1062, the attribute tag of the target sample may be determined at least in the following ways:
First: the training data generating device receives information such as the sample name and sample category of the target sample input by the user, and uses the sample information input by the user as the attribute tag of the target sample.
In implementation, when the training data generating device performs the step of labeling the target sample in the second image, a target-sample attribute-tag input box may be displayed on the screen of the training data generating device so that the user can input the attribute information of the target sample, which the training data generating device then receives. Alternatively, the training data generating device may send the attribute-tag input box to the user's terminal device so that the user inputs the attribute information of the target sample through the terminal device, and the training data generating device receives the attribute information the user sends via the terminal device.
Second: while controlling the image acquisition device to acquire the first image of the target sample, a scanning device can be controlled to scan an identification code on the target sample, such as a bar code, and identify information such as the sample name and category of the target sample; when labeling the target sample in the second image, the training data generating device obtains the sample name, category and other information of the target sample from the scanning device and uses them as the attribute tag of the target sample.
Alternatively, in implementation, when the training data generating device labels the target sample in the second image, it sends a scanning instruction to a connected scanning device so that the scanning device scans the identification code on the target sample, identifies attribute information such as the sample name and category, and returns the identified attribute information to the training data generating device to serve as the attribute tag of the target sample.
Specifically, in step 1064, the second image may be segmented by green-screen segmentation and/or static-background segmentation; since both are conventional image segmentation techniques, their specific implementation is not described here.
Labeling pixel points directly on the original second image is inconvenient and may affect subsequent model generation. Therefore, in the embodiment of the present disclosure, after the pixel points corresponding to the target sample are determined, a new copy of the second image is generated, and the pixel points corresponding to the target sample are labeled on that copy, which is recorded as the pixel-point annotation image corresponding to the target sample. The copy is identical to the original second image; its sole purpose is to carry the labels of the pixel points corresponding to the target sample.
Specifically, when the pixel points corresponding to the target sample are labeled, those pixel points may be circled, or their pixel values may all be set to the same value, and so on.
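As an illustration, the following sketch uses static-background segmentation, comparing the second image against the known set background, to produce a pixel-point annotation image in which all pixel points of the target sample are set to one value; the difference threshold and the label value are assumptions, and green-screen (chroma-key) segmentation could equally be substituted.

```python
import cv2
import numpy as np

def pixel_annotation_image(second_image, set_background, thresh=30, label=255):
    """Static-background segmentation followed by pixel-point annotation (illustrative)."""
    diff = cv2.absdiff(second_image, set_background)   # per-pixel difference from the known background
    dist = diff.max(axis=2)                            # strongest channel difference per pixel
    # New image of the same size; pixels of the target sample all set to one value
    annotation = np.zeros(second_image.shape[:2], dtype=np.uint8)
    annotation[dist > thresh] = label
    return annotation
```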
Through the above process, training data corresponding to one image of the target sample can be generated; specifically, this is training data for one viewing angle of the target sample. After the labeling of the target sample is completed, the second image, the attribute tag of the target sample, and the pixel-point annotation image of the target sample can be stored together as one piece of training data of the target sample.
Specifically, after one piece of training data of the target sample is obtained, it may be stored locally on the training data generating device; generation of the next piece of training data for the target sample then begins, and once all training data corresponding to the target sample have been obtained, they can be uploaded to the cloud and stored there.
For example, in implementation, images of the target sample from different viewing angles may be acquired, and training data may be generated separately for each viewing angle. The front view of the target sample may be acquired first; after the training data corresponding to the front view is obtained, it is stored locally on the training data generating device. Then the left side view of the target sample is acquired and put through the image processing described above, and the resulting training data is likewise stored locally. The processes of image acquisition and processing continue for the right side view of the target sample, and once its training data is obtained, all training data corresponding to the target sample are uploaded to the cloud for storage.
Of course, in the implementation, after the training data of all the target samples are obtained, the step of uploading the training data to the cloud for storage may be performed.
In the embodiment of the specification, the training data is stored remotely; on the one hand this makes data storage more secure, and in addition it facilitates subsequent use of the training data.
In implementation, after one piece of training data of the target sample is obtained, an adjustment instruction can be sent to the image acquisition device so that, through operations such as rotation and translation, the image acquisition device can shoot other viewing angles of the target sample. Alternatively, the target sample may be placed on a movable motion platform equipped with a motion controller; the training data generating device sends an adjustment instruction to the motion controller, and the motion controller controls the motion platform to rotate, translate, rise or descend according to the instruction, so that the image acquisition device can shoot other viewing angles of the target sample.
For ease of understanding, the following examples are presented.
For example, after the front view of the target sample is acquired, a side view of the target sample needs to be acquired, at this time, an instruction to rotate 90 ° clockwise or counterclockwise may be sent to the motion controller on the motion platform, or an instruction to rotate 90 ° clockwise or counterclockwise may also be sent to the image acquisition device, so that the image acquisition device may acquire the side view of the target sample.
In addition, in the embodiment of the present specification, before the image of the target sample is acquired, in order that the image of the target sample at the set angle of view can be acquired, the position of the target sample or the position of the image acquisition apparatus also needs to be adjusted.
Whether before the first image of the target sample is acquired, or after the first image has been acquired and before images of other viewing angles of the target sample are acquired, the target sample or the image acquisition device needs to be adjusted. Thus, before performing step 102 described above, the method provided in the embodiments of the present disclosure further includes the following steps:
transmitting a first adjustment instruction to the image acquisition equipment so that the image acquisition equipment rotates or translates according to the instruction of the first adjustment instruction;
or,
sending a second adjusting instruction to a motion controller corresponding to the motion platform so that the motion controller controls the motion platform to rotate or move according to the instruction of the second adjusting instruction; the motion platform is used for placing a target sample.
In the embodiment of the specification, sending an adjustment instruction to the image acquisition device or to the motion controller automatically adjusts the shooting angle of the target sample, so multiple viewing angles of the target sample can be photographed without the user manually adjusting the position of the image acquisition device or of the target sample; the operation is thus simple and convenient.
In addition, in implementation, before an adjustment instruction is sent to the image acquisition device or to the motion controller corresponding to the motion platform, the current position of the target sample can be detected to determine whether such an instruction needs to be sent, and what its specific content should be.
Specifically, the current position of the target sample can be detected by checking the positions of preset key points on the target sample. For example, if the target sample is a can of cola, several key points may be selected on the can; the image acquisition device is aimed at the can, and it is detected whether the preset key points are located at preset positions on the preview shooting interface of the image acquisition device.
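A minimal sketch of such a position check, assuming the key points have already been detected on the preview interface (detection itself, e.g. by feature matching, is not specified in this embodiment) and that a tolerance of a few pixels is acceptable:

```python
def sample_in_position(detected, preset, tol=10):
    """Check whether each preset key point lies at its preset position (illustrative).
    Both arguments are lists of (x, y) pixel coordinates; tol is an assumed tolerance."""
    return all(abs(dx - px) <= tol and abs(dy - py) <= tol
               for (dx, dy), (px, py) in zip(detected, preset))
```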
In addition, in order to increase the data volume of the target sample, increase the diversity of the target sample data, and increase the robustness of the trained model, the method provided by the embodiment of the present disclosure further includes the following steps before performing the step 106 described above:
and carrying out data enhancement processing on the second image.
Specifically, data enhancement of the second image means applying operations such as rotation, translation, scaling up or down, and color jittering to the second image so as to obtain multiple second images of the target sample. In the subsequent step 106, these second images may be labeled separately, so the training data corresponding to the target sample is diverse.
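For illustration, a small sketch of such data enhancement with OpenCV; the rotation angle, scale factor, translation offset and jitter range are assumed values:

```python
import cv2
import numpy as np

def augment(second_image, angle=15.0, scale=1.2, shift=10, jitter=20):
    """Produce variants of the second image by rotation, translation,
    scaling and colour jittering (illustrative parameter values)."""
    h, w = second_image.shape[:2]
    variants = []
    # rotation combined with scaling about the image centre
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    variants.append(cv2.warpAffine(second_image, M, (w, h)))
    # translation by (shift, shift) pixels
    T = np.float32([[1, 0, shift], [0, 1, shift]])
    variants.append(cv2.warpAffine(second_image, T, (w, h)))
    # colour jitter: a random offset per channel, clipped back to 8-bit range
    offset = np.random.randint(-jitter, jitter + 1, size=3)
    variants.append(np.clip(second_image.astype(np.int16) + offset,
                            0, 255).astype(np.uint8))
    return variants
```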
In addition, in the embodiment of the present disclosure, so that the target sample in the captured first image looks close to the actual target sample, before step 104 is executed the method provided in the embodiment of the present disclosure further includes the following step:
and performing image preprocessing on the first image.
Specifically, preprocessing the first image may mean adjusting parameters such as its resolution, brightness and color so that the target sample in the first image looks closer to the actual target sample.
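A minimal sketch of such preprocessing, assuming a simple linear brightness/contrast correction; the target resolution and the gain/bias values are illustrative:

```python
import cv2

def preprocess(first_image, size=(640, 480), alpha=1.1, beta=10.0):
    """Adjust resolution, then apply a linear brightness/contrast
    correction out = alpha * in + beta (illustrative values)."""
    resized = cv2.resize(first_image, size)
    return cv2.convertScaleAbs(resized, alpha=alpha, beta=beta)
```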
In addition, in the embodiment of the present disclosure, before the image of the target sample is acquired, parameters of the image acquisition device, illumination parameters, motion parameters of the motion platform, and the like need to be set, which may specifically be set according to an actual application scenario.
FIG. 3 is a second flowchart of the training data generating method provided in an embodiment of the present disclosure; the method shown in FIG. 3 includes at least the following steps:
step 302, a training data generating device detects whether a target sample is located at a set position; if yes, go to step 306, otherwise go to step 304.
In the above step 302, whether the target sample is located at the set position may be detected by detecting whether the preset key point on the target sample is located at the preset position on the preview shooting interface of the RGB camera.
Step 304, the training data generating device sends an adjusting instruction to the RGB camera or the motion controller corresponding to the motion platform so as to adjust the shooting visual angle of the target sample; the motion platform is used for placing a target sample.
Step 306, sending an image acquisition instruction to the RGB camera.
In step 308, after receiving the image acquisition instruction sent by the training data generating device, the RGB camera acquires the first image of the target sample and sends the acquired image to the training data generating device.
In step 310, the training data generating device extracts a foreground image in the first image, and synthesizes the foreground image with the set background image.
In step 312, the training data generating device obtains the attribute tag of the target sample.
In step 314, the training data generating device performs image segmentation on the second image by green-screen segmentation and/or static-background segmentation, determines the pixel points corresponding to the target sample, and generates the pixel-point annotation image corresponding to the target sample.
In step 316, the training data generating device determines the second image, the attribute tag and the pixel point labeling image as training data corresponding to the target sample, and stores the training data locally.
In step 318, the training data generating device uploads all the training data of the target sample to the cloud for storage.
Specifically, the specific implementation process of each step in the embodiment corresponding to fig. 3 is the same as the implementation process of each step in the method corresponding to fig. 1 and 2, so the specific implementation process of each step in the embodiment corresponding to fig. 3 may refer to the embodiment corresponding to fig. 1 and 2, and will not be repeated herein.
FIG. 4 is a third flowchart of the training data generating method provided in an embodiment of the present disclosure; the method shown in FIG. 4 includes at least the following steps:
Step 402, the training data generating device detects whether the target sample is located at a set position; if yes, go to step 406, otherwise go to step 404.
In step 402, whether the target sample is located at the set position may be detected by detecting whether the preset key point on the target sample is located at the preset position on the preview shooting interface of the RGB camera and the depth camera.
Step 404, the training data generating device sends an adjustment instruction to the motion controller corresponding to the motion platform to adjust the shooting view angle of the target sample; the motion platform is used for placing a target sample.
Alternatively, in step 404, the training data generating device may send an adjustment instruction to the RGB camera and the depth camera, adjusting the shooting angle of the target sample by adjusting the positions, angles and the like of the two cameras.
Step 406, sending an image acquisition instruction to the RGB camera and the depth camera.
Step 408, after receiving an image acquisition instruction sent by the training data generating device, the RGB camera acquires an RGB image of the target sample and sends the acquired RGB image to the training data generating device; and the depth camera acquires a depth image of the target sample after receiving an image acquisition instruction sent by the training data generating equipment, and sends the acquired depth image to the training data generating equipment.
In step 410, the training data generating device extracts a foreground image from the RGB image and generates a virtual viewpoint image from the foreground image and the depth image.
In step 412, the training data generating apparatus synthesizes the virtual viewpoint image and the set background image.
In step 414, the training data generating device obtains the attribute tag of the target sample.
In step 416, the training data generating device performs image segmentation on the second image by green-screen segmentation and/or static-background segmentation, determines the pixel points corresponding to the target sample, and generates the pixel-point annotation image corresponding to the target sample.
In step 418, the training data generating device determines the second image, the attribute tag and the pixel point labeling image as training data corresponding to the target sample, and stores the training data locally.
Step 420, the training data generating device uploads all training data of the target sample to the cloud for storage.
Specifically, the specific implementation process of each step in the embodiment corresponding to fig. 4 is the same as the implementation process of each step in the method corresponding to fig. 1 and 2, so the specific implementation process of each step in the embodiment corresponding to fig. 4 may refer to the embodiment corresponding to fig. 1 and 2, and will not be repeated herein.
According to the training data generation method provided by the embodiments of the specification, when training data is generated, the image acquisition device can be controlled to acquire the first image of the target sample by sending it an image acquisition instruction, realizing automatic acquisition of images of the target sample; in addition, the target sample in the second image is labeled by a data labeling algorithm, realizing automatic labeling of the target sample and improving data labeling efficiency. The embodiments thus realize automatic generation of training data, which improves generation efficiency, reduces labor cost, and yields training data of high accuracy.
Corresponding to the method for generating training data provided in the embodiment of the present disclosure, based on the same concept, the embodiment of the present disclosure provides a device for generating training data, configured to execute the method for generating training data provided in the embodiment of the present disclosure, and fig. 5 is a schematic diagram of module components of the device for generating training data provided in the embodiment of the present disclosure, where the device shown in fig. 5 includes:
a first sending module 501, configured to send an image acquisition instruction to an image acquisition device, so that the image acquisition device acquires a first image of a target sample according to an instruction of the image acquisition instruction;
an acquisition module 502, configured to acquire the first image;
a replacing module 503, configured to replace the background image of the first image with the set background image to obtain a second image of the target sample; the background image is an area except for the target sample in the first image;
the labeling module 504 is configured to label the target sample in the second image by using a data labeling algorithm, so as to obtain training data.
Optionally, the labeling module 504 includes:
a first determining unit, configured to determine an attribute tag of a target sample; wherein the attribute tag comprises at least a sample name of the target sample;
the first generation unit is used for determining the pixel point corresponding to the target sample by carrying out image segmentation on the second image and generating a pixel point labeling image corresponding to the target sample;
and the second determining unit is used for determining the second image, the attribute label and the pixel point labeling image as training data.
Optionally, the apparatus provided in the embodiments of the present specification further includes:
the second sending module is used for sending a first adjusting instruction to the image acquisition equipment so as to enable the image acquisition equipment to rotate or translate according to the indication of the first adjusting instruction;
or,
The third sending module is used for sending a second adjusting instruction to the motion controller corresponding to the motion platform so that the motion controller can control the motion platform to rotate or move according to the instruction of the second adjusting instruction; the motion platform is used for placing a target sample.
Optionally, the first image is an RGB image;
the replacing module 503 includes:
a first extraction unit configured to extract a foreground image of the first image; the foreground image is an area corresponding to the target sample;
and the first synthesis unit is used for synthesizing the foreground image and the set background image to obtain a second image.
Optionally, the first image includes an RGB image and a depth image;
the replacing module 503 includes:
a second extraction unit configured to extract a foreground image of the first image; the foreground image is an area corresponding to the target sample;
the second generation unit is used for generating a virtual viewpoint image corresponding to the foreground image according to the foreground image and the depth image;
and the second synthesis unit is used for synthesizing the virtual viewpoint image and the set background image to obtain a second image.
Optionally, the apparatus provided in the embodiments of the present specification further includes:
And the enhancement processing module is used for carrying out data enhancement processing on the second image.
The training data generating device in the embodiment of the present disclosure may further execute the method executed by the training data generating device in fig. 1 to 4, and implement the functions of the training data generating device in the embodiment shown in fig. 1 to 4, which are not described herein.
According to the training data generating device provided by the embodiments of the specification, when training data is generated, the image acquisition device can be controlled to acquire the first image of the target sample by sending it an image acquisition instruction, realizing automatic acquisition of images of the target sample; in addition, the target sample in the second image is labeled by a data labeling algorithm, realizing automatic labeling of the target sample and improving data labeling efficiency. The embodiments thus realize automatic generation of training data, which improves generation efficiency, reduces labor cost, and yields training data of high accuracy.
Corresponding to the training data generation method provided in the embodiments of the present disclosure and based on the same concept, the embodiments of the present disclosure further provide a training data generation system. FIG. 6 is a first schematic structural diagram of the training data generation system provided in the embodiments of the present disclosure; the system shown in FIG. 6 includes an image acquisition device 601 and an image processing device 602, the image processing device 602 including the training data generating apparatus;
The image acquisition device 601 is configured to receive an image acquisition instruction sent by the training data generating device, and acquire a first image of a target sample according to an instruction of the image acquisition instruction;
an image processing device 602, configured to send an image acquisition instruction to the image acquisition device; and further configured to acquire the first image from the image acquisition device and replace the background image of the first image with the set background image to obtain a second image of the target sample, wherein the background image is the area of the first image other than the target sample; and to label the target sample in the second image through a data labeling algorithm to obtain training data.
Optionally, the image processing device 602 is specifically configured to:
determining an attribute label of the target sample; wherein the attribute tag comprises at least a sample name of the target sample; determining a pixel point corresponding to the target sample by carrying out image segmentation on the second image, and generating a pixel point labeling image corresponding to the target sample; and determining the second image, the attribute tag and the pixel point labeling image as training data.
Optionally, the image processing device 602 is further configured to:
transmitting a first adjustment instruction to the image acquisition equipment so that the image acquisition equipment rotates or translates according to the instruction of the first adjustment instruction;
or,
sending a second adjusting instruction to a motion controller corresponding to the motion platform so that the motion controller controls the motion platform to rotate or move according to the instruction of the second adjusting instruction; the motion platform is used for placing a target sample.
Optionally, the image processing device 602 is further configured to:
and carrying out data enhancement processing on the second image.
Optionally, if the first image is an RGB image, the image processing apparatus 602 is further specifically configured to:
extracting a foreground image of the first image; the foreground image is an area corresponding to the target sample; and synthesizing the foreground image and the set background image to obtain a second image.
Optionally, if the first image includes an RGB image and a depth image, the image processing device 602 is further specifically configured to:
extracting a foreground image of the first image; the foreground image is an area corresponding to the target sample; generating a virtual viewpoint image corresponding to the foreground image according to the foreground image and the depth image; and synthesizing the virtual viewpoint image and the set background image to obtain a second image.
In a specific embodiment, the training data generating system further includes a motion platform 603, as shown in fig. 7. When training data is generated, the target sample is placed on the motion platform 603. The motion platform 603 is connected to a motion controller 604; specifically, the motion controller 604 may be integrated on the motion platform 603 or may be a device independent of it. The image processing device 602 controls the rotation or movement of the motion platform 603 by sending adjustment instructions to the motion controller. The image processing device 602 is also connected to the image acquisition device 601; it controls the image acquisition device 601 to capture images of the target sample, and controls it to rotate or translate.
That is, the image processing device 602 is connected to the motion controller 604, the motion controller 604 is connected to the motion platform 603, and the motion platform 603 rotates or moves under the control of the image processing device 602 via the motion controller 604.
Fig. 7 shows only one possible implementation form of the training data generating system, and the specific form of the training data generating system is not limited thereto, which is not limited by the embodiment of the present disclosure.
According to the training data generation system provided by the embodiments of the specification, when training data is generated, the image processing device can control the image acquisition device to acquire the first image of the target sample by sending it an image acquisition instruction, realizing automatic acquisition of images of the target sample; in addition, the target sample in the second image is labeled by a data labeling algorithm, realizing automatic labeling of the target sample and improving data labeling efficiency. The embodiments thus realize automatic generation of training data, which improves generation efficiency, reduces labor cost, and yields training data of high accuracy.
Further, based on the methods shown in fig. 1 to fig. 4, the embodiment of the present disclosure further provides a training data generating device, as shown in fig. 8.
The configuration or performance of the training data generating device may vary considerably. It may include one or more processors 801 and a memory 802, and the memory 802 may store one or more applications or data. The memory 802 may be transient storage or persistent storage. An application program stored in the memory 802 may include one or more modules (not shown in the figures), and each module may include a series of computer-executable instruction information for the training data generating device. Further, the processor 801 may be configured to communicate with the memory 802 and execute, on the training data generating device, the series of computer-executable instruction information in the memory 802. The training data generating device may also include one or more power sources 803, one or more wired or wireless network interfaces 804, one or more input/output interfaces 805, one or more keyboards 806, and the like.
In a specific embodiment, the training data generating device includes a memory and one or more programs, where the one or more programs are stored in the memory and may include one or more modules, each module may include a series of computer executable instruction information for the training data generating device, and the one or more programs configured to be executed by the one or more processors include computer executable instruction information for performing the following:
Sending an image acquisition instruction to the image acquisition equipment so that the image acquisition equipment acquires a first image of a target sample according to the instruction of the image acquisition instruction;
acquiring a first image, and replacing a background image of the first image by using a set background image to obtain a second image of the target sample; the background image is an area except for the target sample in the first image;
and labeling the target sample in the second image through a data labeling algorithm to obtain training data.
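The three steps above (send the acquisition instruction, acquire the first image and replace its background, label the result) could be orchestrated as in the following Python sketch; the `Camera` and `Labeler` interfaces and the injected `replace_background` callable are hypothetical stand-ins for the devices and algorithms that the embodiment deliberately leaves open.

```python
from typing import Any, Callable, Protocol

class Camera(Protocol):
    """Hypothetical interface for the image acquisition device."""
    def send_capture_instruction(self) -> None: ...
    def fetch_image(self) -> Any: ...

class Labeler(Protocol):
    """Hypothetical interface wrapping the data labeling algorithm."""
    def label(self, image: Any) -> dict: ...

def generate_training_data(camera: Camera, labeler: Labeler,
                           replace_background: Callable[[Any], Any],
                           n_views: int = 36) -> list:
    """One pass of the described flow: acquire, replace background, label."""
    training_data = []
    for _ in range(n_views):
        camera.send_capture_instruction()         # image acquisition instruction
        first_image = camera.fetch_image()        # first image of the target sample
        second_image = replace_background(first_image)
        annotation = labeler.label(second_image)  # output of the labeling algorithm
        training_data.append((second_image, annotation))
    return training_data
```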
Optionally, when the computer executable instruction information is executed, labeling the target sample in the second image through a data labeling algorithm to obtain training data includes:
determining an attribute label of the target sample; wherein the attribute tag comprises at least a sample name of the target sample;
determining a pixel point corresponding to the target sample by carrying out image segmentation on the second image, and generating a pixel point labeling image corresponding to the target sample;
and determining the second image, the attribute tag and the pixel point labeling image as training data.
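As one concrete but non-normative realization of these three sub-steps, the Python sketch below uses OpenCV's GrabCut as the image segmentation algorithm and returns the (second image, attribute tag, pixel point labeling image) triple; the centered initialization rectangle is an assumption about where the sample sits in the frame.

```python
import cv2
import numpy as np

def label_target_sample(second_image, sample_name):
    """Segment the target sample and build one training data record.

    second_image: HxWx3 uint8 BGR image (the background-replaced image).
    sample_name: attribute tag content, e.g. the product name.
    """
    h, w = second_image.shape[:2]
    mask = np.zeros((h, w), np.uint8)
    rect = (10, 10, w - 20, h - 20)   # assume the sample is roughly centered
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(second_image, mask, rect, bgd_model, fgd_model,
                5, cv2.GC_INIT_WITH_RECT)
    # Pixels marked (probably) foreground become the pixel point labeling image.
    fg = (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)
    pixel_label_image = np.where(fg, 255, 0).astype(np.uint8)
    attribute_tag = {"sample_name": sample_name}
    return second_image, attribute_tag, pixel_label_image
```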
Optionally, the computer executable instruction information, when executed, may further perform the following steps before sending the image acquisition instruction to the image acquisition device:
Transmitting a first adjustment instruction to the image acquisition equipment so that the image acquisition equipment rotates or translates according to the instruction of the first adjustment instruction;
or, alternatively,
sending a second adjusting instruction to a motion controller corresponding to the motion platform so that the motion controller controls the motion platform to rotate or move according to the instruction of the second adjusting instruction; the motion platform is used for placing a target sample.
Optionally, when the computer executable instruction information is executed, the following steps may be further executed before the target sample in the second image is labeled by the data labeling algorithm to obtain training data:
and carrying out data enhancement processing on the second image.
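A minimal sketch of such data enhancement is shown below; the specific augmentations (horizontal flip, small rotation, brightness jitter) are illustrative choices, and the geometric transforms are applied to the pixel point labeling image as well so that the annotation stays aligned with the enhanced image.

```python
import cv2
import numpy as np

def enhance(second_image, label_image, rng=None):
    """Return augmented copies of the image and its pixel labeling image."""
    if rng is None:
        rng = np.random.default_rng()
    img, lbl = second_image.copy(), label_image.copy()
    if rng.random() < 0.5:                       # random horizontal flip
        img, lbl = cv2.flip(img, 1), cv2.flip(lbl, 1)
    h, w = img.shape[:2]
    angle = rng.uniform(-10.0, 10.0)             # small random rotation
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    img = cv2.warpAffine(img, M, (w, h))
    lbl = cv2.warpAffine(lbl, M, (w, h), flags=cv2.INTER_NEAREST)
    factor = rng.uniform(0.8, 1.2)               # brightness jitter (image only)
    img = np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)
    return img, lbl
```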
Optionally, when the computer executable instruction information is executed, the first image is an RGB image;
correspondingly, replacing the background image of the first image with the set background image to obtain a second image of the target sample includes the following steps:
extracting a foreground image of the first image; the foreground image is an area corresponding to the target sample;
and synthesizing the foreground image and the set background image to obtain a second image.
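A minimal Python sketch of this compositing step follows; how the foreground mask is obtained (chroma keying, GrabCut, background subtraction, etc.) is left open by the embodiment, so the mask is taken as an input here.

```python
import numpy as np

def replace_background(first_image, foreground_mask, new_background):
    """Paste the target-sample region of the first image onto the set background.

    first_image, new_background: HxWx3 uint8 arrays of identical shape.
    foreground_mask: HxW boolean array marking the target sample region.
    """
    assert first_image.shape == new_background.shape
    second_image = new_background.copy()
    second_image[foreground_mask] = first_image[foreground_mask]
    return second_image
```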
Optionally, when the computer executable instruction information is executed, the first image comprises an RGB image and a depth image;
Correspondingly, replacing the background image of the first image with the set background image to obtain a second image of the target sample includes the following steps:
extracting a foreground image of the first image; the foreground image is an area corresponding to the target sample;
generating a virtual viewpoint image corresponding to the foreground image according to the foreground image and the depth image;
and synthesizing the virtual viewpoint image and the set background image to obtain a second image.
According to the training data generating device provided by the embodiments of the present specification, when training data is generated, the image acquisition device can be controlled to acquire the first image of the target sample by sending an image acquisition instruction to it, so that the image of the target sample is acquired automatically. In addition, the target sample in the second image is labeled by a data labeling algorithm, so that the target sample is labeled automatically and the data labeling efficiency is improved. The embodiments of the present specification thus realize automatic generation of training data, which improves the efficiency of training data generation, reduces labor cost, and yields highly accurate training data.
Further, based on the methods shown in fig. 1 to fig. 4, the embodiments of the present disclosure further provide a storage medium for storing computer executable instruction information. In a specific embodiment, the storage medium may be a USB flash drive, an optical disc, a hard disk, or the like, and the computer executable instruction information stored in the storage medium, when executed by a processor, can implement the following flow:
Sending an image acquisition instruction to the image acquisition equipment so that the image acquisition equipment acquires a first image of a target sample according to the instruction of the image acquisition instruction;
acquiring a first image, and replacing a background image of the first image by using a set background image to obtain a second image of the target sample; the background image is an area except for the target sample in the first image;
and labeling the target sample in the second image through a data labeling algorithm to obtain training data.
Optionally, when the computer executable instruction information stored in the storage medium is executed by the processor, labeling the target sample in the second image through a data labeling algorithm to obtain training data includes:
determining an attribute label of the target sample; wherein the attribute tag comprises at least a sample name of the target sample;
determining a pixel point corresponding to the target sample by carrying out image segmentation on the second image, and generating a pixel point labeling image corresponding to the target sample;
and determining the second image, the attribute tag and the pixel point labeling image as training data.
Optionally, the computer executable instruction information stored in the storage medium, when executed by the processor, may further perform the following steps before sending the image acquisition instruction to the image acquisition apparatus:
Transmitting a first adjustment instruction to the image acquisition equipment so that the image acquisition equipment rotates or translates according to the instruction of the first adjustment instruction;
or, alternatively,
sending a second adjusting instruction to a motion controller corresponding to the motion platform so that the motion controller controls the motion platform to rotate or move according to the instruction of the second adjusting instruction; the motion platform is used for placing a target sample.
Optionally, when the computer executable instruction information stored in the storage medium is executed by the processor, the following steps may be further executed before the target sample in the second image is labeled by the data labeling algorithm to obtain training data:
and carrying out data enhancement processing on the second image.
Optionally, when the computer executable instruction information stored in the storage medium is executed by the processor, the first image is an RGB image;
correspondingly, replacing the background image of the first image with the set background image to obtain a second image of the target sample includes the following steps:
extracting a foreground image of the first image; the foreground image is an area corresponding to the target sample;
and synthesizing the foreground image and the set background image to obtain a second image.
Optionally, when the computer executable instruction information stored in the storage medium is executed by the processor, the first image comprises an RGB image and a depth image;
Correspondingly, replacing the background image of the first image with the set background image to obtain a second image of the target sample includes the following steps:
extracting a foreground image of the first image; the foreground image is an area corresponding to the target sample;
generating a virtual viewpoint image corresponding to the foreground image according to the foreground image and the depth image;
and synthesizing the virtual viewpoint image and the set background image to obtain a second image.
When the computer executable instruction information stored in the storage medium provided by the embodiments of the present specification is executed by the processor, the image acquisition device can be controlled, during training data generation, to acquire the first image of the target sample by sending an image acquisition instruction to it, so that the image of the target sample is acquired automatically. In addition, the target sample in the second image is labeled by a data labeling algorithm, so that the target sample is labeled automatically and the data labeling efficiency is improved. The embodiments of the present specification thus realize automatic generation of training data, which improves the efficiency of training data generation, reduces labor cost, and yields highly accurate training data.
In the 1990s, an improvement to a technology could clearly be distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). However, as technology has developed, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (e.g., a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a single PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually making integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the original code to be compiled must be written in a specific programming language called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing a logic method flow can be readily obtained merely by slightly logically programming the method flow into an integrated circuit using the above hardware description languages.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320; a memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer readable program code, it is entirely possible to logically program the method steps such that the controller implements the same functions in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. Or even the means for performing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function. Of course, when the present application is implemented, the functions of the units may be implemented in one or more pieces of software and/or hardware.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instruction information. This computer program instruction information may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instruction information executed by the processor of the computer or other programmable data processing apparatus creates means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
This computer program instruction information may also be stored in a computer readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instruction information stored in the computer readable memory produces an article of manufacture including instruction means that implement the function specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
This computer program instruction information may also be loaded onto a computer or other programmable data processing apparatus, so that a series of operational steps are performed on the computer or other programmable apparatus to produce a computer implemented process, such that the instruction information executed on the computer or other programmable apparatus provides steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer readable medium.
Computer readable media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer readable instruction information, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technologies, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information accessible by a computing device. As defined herein, computer readable media do not include transitory computer readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The application may be described in the general context of computer-executable instruction information, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in this specification are described in a progressive manner; identical and similar parts among the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are described relatively simply because they are substantially similar to the method embodiments; for relevant parts, refer to the corresponding description of the method embodiments.
The foregoing is merely an embodiment of the present application and is not intended to limit the present application. Various modifications and variations of the present application will occur to those skilled in the art. Any modification, equivalent substitution, improvement, or the like made within the spirit and principles of the present application shall be included within the scope of the claims of the present application.

Claims (14)

1. A method for generating training data, applied to a generating device of training data, the method comprising:
sending an image acquisition instruction to an image acquisition device so that the image acquisition device acquires a first image of a target sample according to the instruction of the image acquisition instruction;
acquiring the first image, and replacing the background image of the first image by using a set background image to obtain a second image of the target sample; wherein the background image is an area of the first image other than the target sample;
labeling the target sample in the second image through a data labeling algorithm to obtain the training data; the training data comprises the second image and labeling data corresponding to the second image, the labeling data comprises a pixel point labeling image, and the pixel point labeling image is an image in which the pixel points corresponding to the target sample are labeled in the second image.
2. The method of claim 1, wherein labeling the target sample in the second image by the data labeling algorithm to obtain the training data comprises:
determining an attribute tag of the target sample; wherein the attribute tag includes at least a sample name of the target sample;
determining a pixel point corresponding to the target sample by carrying out image segmentation on the second image, and generating a pixel point labeling image corresponding to the target sample;
and determining the second image, the attribute tag and the pixel point labeling image as the training data.
3. The method of claim 1, further comprising, prior to sending the image acquisition instruction to the image acquisition device:
sending a first adjusting instruction to the image acquisition equipment so as to enable the image acquisition equipment to rotate or translate according to the instruction of the first adjusting instruction;
or, alternatively,
sending a second adjusting instruction to a motion controller corresponding to the motion platform so that the motion controller controls the motion platform to rotate or move according to the instruction of the second adjusting instruction; the motion platform is used for placing the target sample.
4. The method of claim 1, further comprising, before labeling the target sample in the second image by the data labeling algorithm to obtain the training data:
and carrying out data enhancement processing on the second image.
5. The method of any one of claims 1-4, the first image being an RGB image;
correspondingly, the replacing the background image of the first image with the set background image to obtain a second image of the target sample includes:
extracting a foreground image of the first image; the foreground image is an area corresponding to the target sample;
and synthesizing the foreground image and the set background image to obtain the second image.
6. The method of any of claims 1-4, the first image comprising an RGB image and a depth image;
correspondingly, the replacing the background image of the first image with the set background image to obtain a second image of the target sample includes:
extracting a foreground image of the first image; the foreground image is an area corresponding to the target sample;
generating a virtual viewpoint image corresponding to the foreground image according to the foreground image and the depth image;
and synthesizing the virtual viewpoint image and the set background image to obtain the second image.
7. A training data generating apparatus provided in a training data generating device, the apparatus comprising:
the first sending module is used for sending an image acquisition instruction to the image acquisition equipment so that the image acquisition equipment acquires a first image of a target sample according to the instruction of the image acquisition instruction;
the acquisition module is used for acquiring the first image;
a replacing module, configured to replace a background image of the first image with a set background image to obtain a second image of the target sample; wherein the background image is an area of the first image other than the target sample;
the labeling module is used for labeling the target sample in the second image through a data labeling algorithm to obtain the training data; the training data comprises the second image and labeling data corresponding to the second image, the labeling data comprises a pixel point labeling image, and the pixel point labeling image is an image in which the pixel points corresponding to the target sample are labeled in the second image.
8. The apparatus of claim 7, the labeling module comprising:
a first determining unit, configured to determine an attribute tag of the target sample; wherein the attribute tag includes at least a sample name of the target sample;
the first generation unit is used for determining the pixel point corresponding to the target sample through image segmentation of the second image and generating a pixel point labeling image corresponding to the target sample;
and the second determining unit is used for determining the second image, the attribute label and the pixel point labeling image as the training data.
9. The apparatus of claim 7, the apparatus further comprising:
the second sending module is used for sending a first adjusting instruction to the image acquisition equipment so as to enable the image acquisition equipment to rotate or translate according to the indication of the first adjusting instruction;
or, alternatively,
the third sending module is used for sending a second adjusting instruction to a motion controller corresponding to the motion platform so that the motion controller can control the motion platform to rotate or move according to the instruction of the second adjusting instruction; the motion platform is used for placing the target sample.
10. The apparatus of any of claims 7-9, the first image being an RGB image;
the replacement module comprises:
a first extraction unit configured to extract a foreground image of the first image; the foreground image is an area corresponding to the target sample;
and the first synthesis unit is used for synthesizing the foreground image and the set background image to obtain the second image.
11. The apparatus of any of claims 7-9, the first image comprising an RGB image and a depth image;
the replacement module comprises:
a second extraction unit configured to extract a foreground image of the first image; the foreground image is an area corresponding to the target sample;
the second generation unit is used for generating a virtual viewpoint image corresponding to the foreground image according to the foreground image and the depth image;
and the second synthesis unit is used for synthesizing the virtual viewpoint image and the set background image to obtain the second image.
12. A training data generation system comprises an image acquisition device and an image processing device; the image processing device comprises a generating device of the training data;
The image acquisition device is used for receiving the image acquisition instruction sent by the training data generating apparatus, and for acquiring a first image of a target sample according to the instruction of the image acquisition instruction;
the image processing device is used for sending the image acquisition instruction to the image acquisition device, and is further used for acquiring the first image from the image acquisition device and replacing the background image of the first image with a set background image to obtain a second image of the target sample; wherein the background image is an area of the first image other than the target sample; and for labeling the target sample in the second image through a data labeling algorithm to obtain the training data; the training data comprises the second image and labeling data corresponding to the second image, the labeling data comprises a pixel point labeling image, and the pixel point labeling image is an image in which the pixel points corresponding to the target sample are labeled in the second image.
13. A training data generation apparatus comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
Sending an image acquisition instruction to an image acquisition device so that the image acquisition device acquires a first image of a target sample according to the instruction of the image acquisition instruction;
acquiring the first image, and replacing the background image of the first image by using a set background image to obtain a second image of the target sample; wherein the background image is an area of the first image other than the target sample;
labeling the target sample in the second image through a data labeling algorithm to obtain training data; the training data comprises the second image and labeling data corresponding to the second image, the labeling data comprises a pixel point labeling image, and the pixel point labeling image is an image in which the pixel points corresponding to the target sample are labeled in the second image.
14. A storage medium storing computer-executable instructions that when executed implement the following:
sending an image acquisition instruction to an image acquisition device so that the image acquisition device acquires a first image of a target sample according to the instruction of the image acquisition instruction;
acquiring the first image, and replacing the background image of the first image by using a set background image to obtain a second image of the target sample; wherein the background image is an area of the first image other than the target sample;
labeling the target sample in the second image through a data labeling algorithm to obtain training data; the training data comprises the second image and labeling data corresponding to the second image, the labeling data comprises a pixel point labeling image, and the pixel point labeling image is an image in which the pixel points corresponding to the target sample are labeled in the second image.
CN201811260697.0A 2018-10-26 2018-10-26 Training data generation method, device and system Active CN109614983B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811260697.0A CN109614983B (en) 2018-10-26 2018-10-26 Training data generation method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811260697.0A CN109614983B (en) 2018-10-26 2018-10-26 Training data generation method, device and system

Publications (2)

Publication Number Publication Date
CN109614983A CN109614983A (en) 2019-04-12
CN109614983B true CN109614983B (en) 2023-06-16

Family

ID=66002359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811260697.0A Active CN109614983B (en) 2018-10-26 2018-10-26 Training data generation method, device and system

Country Status (1)

Country Link
CN (1) CN109614983B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110070540B (en) * 2019-04-28 2023-01-10 腾讯科技(深圳)有限公司 Image generation method and device, computer equipment and storage medium
CN110060265A (en) * 2019-05-15 2019-07-26 北京艺泉科技有限公司 A method of divide from painting and calligraphy cultural relic images and extracts seal
CN110287850A (en) * 2019-06-20 2019-09-27 北京三快在线科技有限公司 A kind of model training and the method and device of object identification
CN110866504B (en) * 2019-11-20 2023-10-17 北京百度网讯科技有限公司 Method, device and equipment for acquiring annotation data
CN111383267B (en) * 2020-03-03 2024-04-05 重庆金山医疗技术研究院有限公司 Target repositioning method, device and storage medium
CN111402334B (en) * 2020-03-16 2024-04-02 达闼机器人股份有限公司 Data generation method, device and computer readable storage medium
CN111783874A (en) * 2020-06-29 2020-10-16 联想(北京)有限公司 Sample generation method, sample training method and device
CN112802049B (en) * 2021-03-04 2022-10-11 山东大学 Method and system for constructing household article detection data set
CN113012176B (en) * 2021-03-17 2023-12-15 阿波罗智联(北京)科技有限公司 Sample image processing method and device, electronic equipment and storage medium
CN113570534A (en) * 2021-07-30 2021-10-29 山东大学 Article identification data set expansion method and system for deep learning
CN113688887A (en) * 2021-08-13 2021-11-23 百度在线网络技术(北京)有限公司 Training and image recognition method and device of image recognition model
CN115861739B (en) * 2023-02-08 2023-07-14 海纳云物联科技有限公司 Training method, device, equipment, storage medium and product of image segmentation model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108010034A (en) * 2016-11-02 2018-05-08 广州图普网络科技有限公司 Commodity image dividing method and device
CN108492343A (en) * 2018-03-28 2018-09-04 东北大学 A kind of image combining method for the training data expanding target identification
CN108520223A (en) * 2018-04-02 2018-09-11 广州华多网络科技有限公司 Dividing method, segmenting device, storage medium and the terminal device of video image

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103024421B (en) * 2013-01-18 2015-03-04 山东大学 Method for synthesizing virtual viewpoints in free viewpoint television
CN104156947B (en) * 2014-07-23 2018-03-16 小米科技有限责任公司 Image partition method, device and equipment
US9916522B2 (en) * 2016-03-11 2018-03-13 Kabushiki Kaisha Toshiba Training constrained deconvolutional networks for road scene semantic segmentation
CN106162137B (en) * 2016-06-30 2018-10-12 北京大学 Virtual visual point synthesizing method and device
CN108388833A (en) * 2018-01-15 2018-08-10 阿里巴巴集团控股有限公司 A kind of image-recognizing method, device and equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108010034A (en) * 2016-11-02 2018-05-08 广州图普网络科技有限公司 Commodity image dividing method and device
CN108492343A (en) * 2018-03-28 2018-09-04 东北大学 A kind of image combining method for the training data expanding target identification
CN108520223A (en) * 2018-04-02 2018-09-11 广州华多网络科技有限公司 Dividing method, segmenting device, storage medium and the terminal device of video image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
[Digital Image Processing Series IV] Summary and implementation of image data set augmentation methods; CHEONG_KG; https://blog.csdn.net/feilong_csdn/article/details/82813382; full text *
Data Augmentation for Bounding Boxes: Rethinking Image Transforms for Object Detection; https://blog.paperspace.com/data-augmentation-for-bounding-boxes/; full text *

Also Published As

Publication number Publication date
CN109614983A (en) 2019-04-12

Similar Documents

Publication Publication Date Title
CN109614983B (en) Training data generation method, device and system
US11164001B2 (en) Method, apparatus, and system for automatically annotating a target object in images
TWI677252B (en) Vehicle damage image acquisition method, device, server and terminal device
EP3457683B1 (en) Dynamic generation of image of a scene based on removal of undesired object present in the scene
CN107409166B (en) Automatic generation of panning shots
US10580140B2 (en) Method and system of real-time image segmentation for image processing
EP3072103B1 (en) User feedback for real-time checking and improving quality of scanned image
KR20230013243A (en) Maintain a fixed size for the target object in the frame
CN107710280B (en) Object visualization method
CN105705993A (en) Controlling a camera with face detection
CN109691080A (en) Shoot image method, device and terminal
US20190155883A1 (en) Apparatus, method and computer program product for recovering editable slide
CN109656363A (en) It is a kind of for be arranged enhancing interaction content method and apparatus
KR101515845B1 (en) Method and device for gesture recognition
KR102517205B1 (en) Method for displaying content derived from light field data on a 2D display device
AU2015258346A1 (en) Method and system of transitioning between images
US10990802B2 (en) Imaging apparatus providing out focusing and method for controlling the same
US11170574B2 (en) Method and apparatus for generating a navigation guide
KR20230014606A (en) Method and apparatus for generating mega size augmented reality image information
KR20120087232A (en) System and method for expressing augmented reality-based content
CN106713726A (en) Method and apparatus for recognizing photographing way
KR101828340B1 (en) Method and apparatus for object extraction
US20130321690A1 (en) Methods and Apparatus for Refocusing via Video Capture
KR101754993B1 (en) Photo zone photographing apparatus using Augmented Reality
CN113056905A (en) System and method for taking tele-like images

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200923

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200923

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: P.O. Box 847, Fourth Floor, Capital Building, Grand Cayman, Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant