CN111325984A - Sample data acquisition method and device and electronic equipment - Google Patents

Sample data acquisition method and device and electronic equipment

Info

Publication number
CN111325984A
Authority
CN
China
Prior art keywords
image
target object
augmented reality
background
model
Prior art date
Legal status
Granted
Application number
CN202010192073.0A
Other languages
Chinese (zh)
Other versions
CN111325984B (en)
Inventor
尚子钰
潘杰
张浩悦
Current Assignee
Apollo Intelligent Technology Beijing Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010192073.0A
Publication of CN111325984A
Application granted
Publication of CN111325984B
Status: Active

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/09 Arrangements for giving variable traffic instructions
    • G08G1/0962 Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages
    • G08G1/0967 Systems involving transmission of highway information, e.g. weather, speed limits
    • G08G1/096708 Systems involving transmission of highway information, where the received information might be used to generate an automatic action on the vehicle control
    • G08G1/096725 Systems involving transmission of highway information, where the received information generates an automatic action on the vehicle control

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Atmospheric Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Embodiments of the application provide a sample data acquisition method, an apparatus, and an electronic device, which can be used for automatic driving and in particular in the technical field of autonomous parking. The method comprises the following steps: first, a background image and a background annotation image labeled with attribute information of the background are acquired; the 3D model of a target object is rendered together with the background image to obtain an augmented reality image; the 3D model of the target object is rendered to obtain a target object annotation image labeled with attribute information of the target object; the augmented reality image, the background annotation image, and the target object annotation image constitute the required sample data including the target object. When acquiring sample data, the embodiments therefore rely on augmented reality technology and generate sample data including the target object from the background image and the 3D model of the target object, which reduces the time consumed by acquisition and annotation and improves the efficiency of acquiring sample data.

Description

Sample data acquisition method and device and electronic equipment
Technical Field
The application relates to the technical field of data processing, in particular to the technical field of automatic driving.
Background
In an automatic driving scenario, visual perception is usually performed with a deep learning model, and obtaining such a model requires a large number of training samples. Training samples are hard to come by: collection is difficult, objects appear in many poses, and annotation is costly. Some training samples, such as small-sample data, are therefore difficult to obtain, and a training set lacking small-sample data yields a deep learning model of low accuracy.
To improve the accuracy of the trained deep learning model, the prior art usually collects and annotates the missing small-sample data directly. However, data collection and annotation consume a large amount of time and money, and for some small-sample objects, such as police cars and fire trucks, which are rarely seen in daily life, it is difficult to collect a large amount of related sample data in different scenes.
Existing sample data acquisition methods are therefore inefficient.
Disclosure of Invention
Embodiments of the application provide a sample data acquisition method, an apparatus, and an electronic device that improve the efficiency of acquiring sample data.
In a first aspect, an embodiment of the present application provides a method for obtaining sample data, where the method for obtaining sample data may include:
respectively acquiring a background image and a background labeling image; wherein, the background labeling image is labeled with the attribute information of the background.
And rendering the 3D model of the target object and the background image to obtain an augmented reality image.
Rendering the 3D model of the target object to obtain a target object labeling image; wherein, the target object labeling image is labeled with the attribute information of the target object, and the sample data comprises the augmented reality image, the background labeling image and the target object labeling image.
It can be seen that, unlike the prior art, the embodiment acquires sample data on the basis of augmented reality technology: the background image and the 3D model of the target object are used to generate sample data that includes the target object. A large amount of small-sample data that could not previously be collected can thus be generated in a short time, reducing the time consumed by acquisition and annotation and improving acquisition efficiency; moreover, compared with the unsmooth samples obtained by directly pasting a target object onto a background image, the accuracy of the acquired sample data is improved.
In a possible implementation manner, the rendering the 3D model of the target object and the background image to obtain the augmented reality image may include:
acquiring key information influencing a rendering effect; wherein the key information includes at least one of a model parameter of a 3D model of the target object, a shooting parameter when the background image is acquired, or an environmental parameter when the background image is acquired.
Rendering the 3D model of the target object, the key information and the background image to obtain the augmented reality image.
It can be seen that when the augmented reality image is rendered from the 3D model of the target object and the background image, rendering them together with the key information that affects the sample data makes the acquired augmented reality image more realistic, vivid, and diverse.
In a possible implementation manner, the key information includes a model parameter of the 3D model of the target object, and the rendering the 3D model of the target object, the key information, and the background image to obtain the augmented reality image may include:
rendering the 3D model of the target object, the model parameters and the background image to obtain the augmented reality image; wherein the model parameters comprise coordinate system parameters and/or rotation angles.
Correspondingly, the rendering the 3D model of the target object to obtain a target object labeled image includes:
rendering the 3D model of the target object and the model parameters to obtain the target object labeling image.
Therefore, in the possible scene, when the augmented reality image is obtained through rendering, the model parameters including the coordinate system parameters and/or the rotation angle are rendered together with the 3D model of the target object and the background image, so that the augmented reality image including the target object and the road background in different postures can be obtained, and the obtained augmented reality image is more diversified.
In one possible implementation, the shooting parameters include a shooting focal length and/or a mechanical transformation coefficient.
Therefore, in the possible scene, when the augmented reality image is obtained through rendering, the shooting parameters including the shooting focal length and/or the mechanical transformation coefficient are rendered together with the 3D model of the target object and the background image, so that the augmented reality image including the target object and the road background under different shooting visual angles can be obtained, and the obtained augmented reality image is more diversified.
In one possible implementation, the environmental parameter includes illumination intensity and/or air quality.
Therefore, in the possible scene, when the augmented reality image is obtained through rendering, the environmental parameters including the illumination intensity and/or the air quality are rendered together with the 3D model of the target object and the background image, so that the augmented reality image including the target object and the road background under different illumination intensities and/or air qualities can be obtained, and the obtained augmented reality image is more realistic, vivid and diversified.
In a possible implementation manner, the method for obtaining sample data may further include:
and judging whether the image is occluded in the augmented reality image.
Generating an augmented reality annotation image corresponding to the augmented reality image according to the judgment result; the augmented reality annotation image is marked with the attribute information of the background and the attribute information of the target object in the augmented reality image, so that the accuracy of the acquired sample data can be improved.
In a possible implementation manner, the generating an augmented reality labeled image corresponding to the augmented reality image according to the determination result may include:
if the augmented reality image does not have the image blocked, the current background labeling image is the labeling data of the background in the augmented reality image, and the target object labeling image is the labeling data of the target object in the augmented reality image. Therefore, in this case, when the augmented reality annotation image corresponding to the augmented reality image is acquired, the background annotation image and the target object annotation image can be directly synthesized, so that the augmented reality annotation image corresponding to the augmented reality image can be obtained. The augmented reality labeling image is labeled with attribute information of a background in the augmented reality image and attribute information of a target object.
If the image exists in the augmented reality image and is shielded, it is indicated that the current background labeled image is no longer the labeled data of the background in the augmented reality image, the attribute information of the shielded image needs to be deleted in the background labeled image, and the obtained new background labeled image is the labeled data of the background in the augmented reality image. Therefore, in this case, when obtaining the augmented reality labeled image corresponding to the augmented reality image, the attribute information of the shielded image needs to be deleted in the background labeled image to obtain a new background labeled image; and then, synthesizing the new background annotation image and the target object annotation image, so that the augmented reality annotation image corresponding to the augmented reality image can be obtained, the background annotation image can be prevented from including the annotation data of the shielded image, and the accuracy of the obtained sample data is further improved.
In a possible implementation manner, the method for obtaining sample data may further include:
and acquiring a 3D model set, and searching the 3D model of the target object in the 3D model set so as to acquire the 3D model of the target object.
In a second aspect, an embodiment of the present application further provides a device for acquiring sample data, where the device may include:
the acquisition module is used for respectively acquiring a background image and a background labeling image; wherein, the background labeling image is labeled with the attribute information of the background.
The processing module is used for rendering the 3D model of the target object and the background image to obtain an augmented reality image; rendering the 3D model of the target object to obtain a target object labeling image; wherein, the target object labeling image is labeled with the attribute information of the target object, and the sample data comprises the augmented reality image, the background labeling image and the target object labeling image.
In a possible implementation manner, the processing module is specifically configured to obtain key information that affects a rendering effect; rendering the 3D model of the target object, the key information and the background image to obtain the augmented reality image; wherein the key information includes at least one of a model parameter of a 3D model of the target object, a shooting parameter when the background image is acquired, or an environmental parameter when the background image is acquired.
In a possible implementation manner, the key information includes model parameters of a 3D model of the target object, and the processing module is specifically configured to:
rendering the 3D model of the target object, the model parameters and the background image to obtain the augmented reality image; wherein the model parameters comprise coordinate system parameters and/or rotation angles.
Correspondingly, the processing module is further specifically configured to:
rendering the 3D model of the target object and the model parameters to obtain the target object labeling image.
In one possible implementation, the shooting parameters include a shooting focal length and/or a mechanical transformation coefficient.
In one possible implementation, the environmental parameter includes illumination intensity and/or air quality.
In a possible implementation manner, the processing module is further configured to determine whether an image is occluded in the augmented reality image; generating an augmented reality annotation image corresponding to the augmented reality image according to the judgment result; and the augmented reality labeling image is labeled with attribute information of a background and attribute information of a target object in the augmented reality image.
In a possible implementation manner, the processing module is specifically configured to, if no image in the augmented reality image is occluded, perform synthesis processing on the background annotation image and the target object annotation image to obtain an augmented reality annotation image corresponding to the augmented reality image; if the image exists in the augmented reality image and is shielded, deleting the attribute information of the shielded image in the background labeling image to obtain a new background labeling image; and synthesizing the new background annotation image and the target object annotation image to obtain an augmented reality annotation image corresponding to the augmented reality image.
In a possible implementation manner, the obtaining module is specifically configured to obtain a 3D model set; and searching the 3D model of the target object in the 3D model set.
In a third aspect, an embodiment of the present application further provides an electronic device, where the electronic device may include:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method for obtaining sample data described in any one of the possible implementations of the first aspect.
In a fourth aspect, embodiments of the present application further provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the method for acquiring sample data described in any one of the foregoing possible implementation manners of the first aspect.
One embodiment in the above application has the following advantages or benefits: when sample data is acquired, a background image and a background annotation image labeled with attribute information of the background are acquired respectively; the 3D model of the target object is rendered with the background image to obtain an augmented reality image; the 3D model of the target object is rendered to obtain a target object annotation image labeled with attribute information of the target object; the augmented reality image, the background annotation image, and the target object annotation image are the required sample data including the target object. Unlike the prior art, the embodiment generates sample data on the basis of augmented reality technology from the background image and the 3D model of the target object, so a large amount of small-sample data that could not previously be collected can be generated in a short time, reducing acquisition and annotation time and improving the efficiency of acquiring sample data.
Other effects of the above-described alternative will be described below with reference to specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
fig. 1 is a scene diagram of a sample data obtaining method that can implement the embodiment of the present application;
fig. 2 is a schematic flowchart of a sample data obtaining method according to a first embodiment of the present application;
fig. 3 is a schematic diagram of a background labeled image of a parking lot according to a first embodiment of the present application;
fig. 4 is a schematic view of a 3D model of a police car provided in a first embodiment of the present application;
FIG. 5 is a diagram illustrating an internal screenshot of a renderer according to a first embodiment of the present application;
fig. 6 is a schematic diagram of an augmented reality image provided in the first embodiment of the present application;
fig. 7 is a schematic flowchart of a method for acquiring an augmented reality image according to a second embodiment of the present application;
fig. 8 is a schematic diagram of acquiring an augmented reality image and a target object annotation image according to a second embodiment of the present application;
fig. 9 is a schematic diagram of a 3D model and an augmented reality image of a police car in a first pose provided by a second embodiment of the present application;
FIG. 10 is a schematic illustration of a 3D model and augmented reality image of a police car in a second pose as provided by a second embodiment of the present application;
fig. 11 is a schematic flowchart of a method for acquiring an augmented reality annotation image corresponding to an augmented reality image according to a third embodiment of the present application;
fig. 12 is a schematic structural diagram of an apparatus for acquiring sample data according to a fourth embodiment of the present application;
fig. 13 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, read in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding, and these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Descriptions of well-known functions and constructions are likewise omitted in the following description for clarity and conciseness.
In the embodiments of the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. In the description of the text of the present application, the character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
It can be understood that the sample data acquisition method provided by the embodiments can be applied to an automatic driving scenario. To ensure driving safety, the road conditions ahead of the vehicle must be perceived, and visual perception is generally performed with a deep learning model. However, building a deep learning model requires a large number of training samples, and some of those samples are small-sample data; for example, sample data whose target object is a police car and whose background is a parking lot. Please refer to fig. 1, a scene diagram for implementing the sample data acquisition method of the embodiments: when acquiring sample data of a police car parked in a parking lot, the acquisition efficiency is low because police cars are rarely seen in daily life.
To improve the acquisition efficiency of sample data, one may try to synthesize sample data of a police car parked in a parking lot by cutting the police car out of an image that contains it and pasting it onto another parking lot image. Although this raises acquisition efficiency to some extent, the method can only extract the police car from an existing image and paste it directly, so the synthesized image of the police car in the parking lot is not smooth, differs greatly from actually collected data, and yields sample data of low accuracy.
Based on the above discussion, and in order to improve the efficiency of acquiring sample data, the embodiments of the present application provide a sample data acquisition method. When acquiring sample data, a background image and a background annotation image labeled with attribute information of the background are acquired respectively; the 3D model of the target object is rendered with the background image to obtain an augmented reality image; the 3D model of the target object is rendered to obtain a target object annotation image labeled with attribute information of the target object; the augmented reality image, the background annotation image, and the target object annotation image are the required sample data including the target object. Unlike the prior art, the embodiments generate sample data on the basis of augmented reality technology from the background image and the 3D model of the target object, so a large amount of small-sample data that could not previously be collected can be generated in a short time, reducing acquisition and annotation time, improving acquisition efficiency, and improving the accuracy of the acquired sample data compared with the unsmooth samples obtained by directly pasting the target object onto the background image.
The types of the target object and the background depend on the application scenario of the sample data. For example, in an automatic driving scenario, the target object may be a police car or a fire truck, and the background may be a parking lot or a road. When the target object is a police car, its attribute information may be the position and size of the police car; when the target object is a fire truck, its attribute information may be the position and size of the fire truck. When the background is a parking lot, its attribute information may be the position of the parking lot and the sizes of the parking spaces; when the background is a road, its attribute information may be the position of the road, the number of lanes, and the lane width. The method can of course also be applied to other scenarios: in a house decoration scenario, for example, the target object may be a table and the background images may be different house layouts. In the following, the sample data acquisition method provided in the embodiments is described taking a police car as the target object and a parking lot as the background, but the embodiments are not limited thereto.
In general, when acquiring sample data for training a deep learning model, not only the sample image data but also the annotation data corresponding to it must be acquired; the sample image data together with its annotation data form one complete set of sample data. In the embodiments, the augmented reality image obtained by rendering the 3D model of the target object with the background image can be understood as the sample image data, and the target object annotation image obtained by rendering the 3D model of the target object can be understood as part of the corresponding annotation data; the augmented reality image, the background annotation image, and the target object annotation image are one complete set of sample data including the target object.
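By way of illustration only, one complete set of sample data could be held in a small data structure such as the following Python sketch; the class and field names are assumptions for illustration and are not prescribed by the application.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class SampleData:
    """One complete set of sample data (hypothetical field names)."""
    augmented_reality_image: Any  # rendered 2D image: the sample image data
    background_annotation: Any    # attribute information of the background
    target_annotation: Any        # attribute information of the target object
```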
It should be noted that it is only small-sample data that is difficult to collect and annotate; the embodiments therefore use the provided method to acquire small-sample data. The method is not limited to small samples, however, and can equally acquire large-sample data, i.e., sample data of common target objects such as ordinary cars.
Hereinafter, the method for acquiring sample data provided in the present application will be described in detail by specific examples. It is to be understood that the following detailed description may be combined with other embodiments, and that the same or similar concepts or processes may not be repeated in some embodiments.
Example one
Fig. 2 is a flowchart illustrating a method for obtaining sample data according to a first embodiment of the present application, where the method for obtaining sample data may be executed by software and/or a hardware device, for example, the hardware device may be a device for obtaining sample data, and the device for obtaining sample data may be disposed in an electronic device. For example, please refer to fig. 2, the method for obtaining sample data may include:
s201, respectively acquiring a background image and a background annotation image.
Wherein, the background labeling image is labeled with the attribute information of the background. The background may be a parking lot or a road, for example. When the background is a parking lot, the attribute information of the background can be information such as the position of the parking lot and the size of a parking space; when the background is a road, the attribute information of the background may be information such as a position of the road, the number of lanes of the road, and a lane width of the road.
For example, taking the background as a parking lot as shown in fig. 1, the parking lot image may be captured with an acquisition device (e.g., a camera) or received from another device. It can be understood that after the parking lot image is acquired, it may be given to a professional annotator for labeling so as to obtain the background annotation image of the parking lot. For example, please refer to fig. 3, which is a schematic diagram of a background annotation image of a parking lot according to the first embodiment of the present application.
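A minimal sketch of S201, assuming the background image is a file on disk and the annotation is stored as JSON (the file paths, the annotation format, and the use of OpenCV are assumptions for illustration):

```python
import json

import cv2  # OpenCV, assumed available for image loading


def acquire_background(image_path: str, annotation_path: str):
    """Load a background image and its annotation (S201, illustrative)."""
    background_image = cv2.imread(image_path)
    with open(annotation_path, "r", encoding="utf-8") as f:
        # e.g. parking-space positions and sizes for a parking-lot background
        background_annotation = json.load(f)
    return background_image, background_annotation
```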
After the background image and the background annotation image are acquired respectively, the following S202 may be performed:
s202, rendering the 3D model of the target object and the background image to obtain an augmented reality image.
For example, the target object may be a police car or a fire truck, and when the target object is a police car, the attribute information of the target object may be information such as a position of the police car and a size of the police car; when the target object is a fire truck, the attribute information of the target object may be information such as a position of the fire truck and a size of the fire truck.
It will be appreciated that the 3D model of the target object needs to be acquired before rendering the 3D model and the background image of the target object. For example, referring to fig. 1, taking a target object as a police car as an example, when obtaining a 3D model of the police car, a 3D model set may be obtained first, and the 3D model of the police car is searched in the 3D model set, so as to obtain the 3D model of the police car, for example, please refer to fig. 4, where fig. 4 is a schematic diagram of the 3D model of the police car provided in the first embodiment of the present application.
After the 3D model of the police car and the parking lot image are acquired, they can be input into a renderer (an internal screenshot of the renderer is shown in fig. 5, a schematic diagram provided in the first embodiment of the present application) and rendered to obtain a 2D augmented reality image, which can be understood as the sample image data in the sample data to be acquired. The 2D augmented reality image includes both the target object, the police car, and the background, the parking lot. For example, please refer to fig. 6, a schematic diagram of an augmented reality image provided in the first embodiment of the present application.
When acquiring sample data including a police car, not only the augmented reality image but also the corresponding annotation data must be acquired, which in this embodiment consists of the police car annotation image and the parking lot annotation image. Since the parking lot annotation image was already acquired together with the parking lot image in S201, only the police car annotation image remains to be acquired, i.e., the following S203 is executed:
and S203, rendering the 3D model of the target object to obtain a target object annotation image.
Wherein, the target object labeling image is labeled with the attribute information of the target object. For example, the attribute information of the target object may include information such as a position of the target object and a size of the target object.
After the 3D model of the police car is obtained, it may be input into the renderer and rendered to obtain the police car annotation image, which corresponds to the augmented reality image shown in fig. 6 with the parking lot image removed. The police car annotation image, the parking lot annotation image obtained in S201, and the augmented reality image including the police car obtained in S202 together form a complete set of sample data including the police car.
It should be noted that, in the embodiment of the present application, there is no sequence between S202 and S203, and S202 may be executed first, and then S203 may be executed; or executing S203 first and then executing S202; of course, S202 and S203 may also be executed simultaneously, and may be specifically set according to actual needs, and here, the embodiment of the present application is only described by taking the example of executing S202 first and then executing S203, but the embodiment of the present application is not limited thereto.
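Putting S202 and S203 together, a hedged sketch of producing one sample set might look as follows; `renderer`, `load_model`, and `render` are hypothetical names standing in for whatever rendering engine is used, which the application does not specify:

```python
def build_sample(renderer, model_path: str, background_image, background_annotation):
    """Produce one complete sample set from a 3D model and a background image."""
    target_model = renderer.load_model(model_path)  # e.g. the police-car 3D model
    # S202: render the 3D model together with the background image
    ar_image = renderer.render(target_model, background=background_image)
    # S203: render the 3D model alone to obtain the target object annotation image
    target_annotation_image = renderer.render(target_model, background=None)
    # the three parts form one complete set of sample data including the target object
    return ar_image, background_annotation, target_annotation_image
```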
It can be seen that, different from the prior art, when sample data is acquired, the embodiment of the application is based on an augmented reality technology, and the 3D model of the background image and the target object is used to finally generate the sample data including the target object, so that a large amount of small sample data which cannot be acquired in the prior art can be generated in a short time, the acquisition time and the labeling time of the sample data are reduced, the acquisition efficiency of the sample data is improved, and the accuracy of the acquired sample data is improved compared with unsmooth sample data which is obtained by directly pasting the target object and the background image.
The embodiment shown in fig. 2 describes in detail how sample data is obtained. Continuing with the police car and parking lot example: in a real scene, when image data including a police car is collected, the position and angle of the police car vary, which can be reproduced through the model parameters of the police car's 3D model; similarly, the shooting parameters and/or environmental parameters at the time the parking lot image data was collected also vary. To make the acquired sample data including the police car more diverse and realistic, the key information that affects the sample data can be rendered together with the 3D model of the target object and the background image in S202 to obtain the augmented reality image. For example, the key information may include at least one of a model parameter of the 3D model of the target object, a shooting parameter when the background image was acquired, or an environmental parameter when the background image was acquired. How to acquire the augmented reality image in combination with the key information is described in detail in the second embodiment below.
Example two
Fig. 7 is a flowchart illustrating an augmented reality image acquiring method according to a second embodiment of the present application, for example, please refer to fig. 7, where the augmented reality image acquiring method may include:
and S701, acquiring key information influencing the rendering effect.
The key information includes at least one of a model parameter of the 3D model of the target object, a shooting parameter when the background image was acquired, or an environmental parameter when the background image was acquired; other parameters may also be included and can be set according to actual needs. It can be understood that the more parameters the key information contains when rendered together with the 3D model of the target object and the background image, the more diverse and realistic the resulting augmented reality image.
For example, in the embodiment of the present application, the model parameters include coordinate system parameters and/or rotation angles; the shooting parameters may include camera intrinsic parameters such as a shooting focal length and/or a mechanical transformation coefficient; the environmental parameters may include illumination intensity and/or air quality. Other parameters may of course also be included and can be set according to actual needs, which the embodiments do not further limit.
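For illustration, the key information could be grouped as below; the grouping and all field names are assumptions, and any subset of the fields may be used, matching the "at least one of" language above:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class KeyInfo:
    """Key information affecting the rendering effect (hypothetical names)."""
    # model parameters of the target object's 3D model
    coordinates: Optional[Tuple[float, float, float]] = None  # coordinate system parameters
    rotation_deg: Optional[float] = None                      # rotation angle
    # shooting parameters when the background image was acquired
    focal_length_mm: Optional[float] = None
    mechanical_coefficient: Optional[float] = None
    # environmental parameters when the background image was acquired
    illumination: Optional[float] = None
    air_quality: Optional[float] = None
```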
After obtaining the key information affecting the rendering effect, the key information may be rendered together with the 3D model of the target object and the background image, that is, the following S702 is performed:
s702, rendering the 3D model, the key information and the background image of the target object to obtain an augmented reality image.
For example, in one possible scenario in which the key information is a model parameter of the 3D model of the target object, the 3D model of the target object, the model parameters including the coordinate system parameters and/or the rotation angle, and the background image may be input into the renderer together and rendered to obtain the augmented reality image. It is worth noting that when model parameters are added to the rendering of the augmented reality image, the target object in the augmented reality image and the target object in the target object annotation image must keep consistent attitude angles and sizes so that the target object annotation image matches the annotation data of the target object in the augmented reality image. Therefore, when rendering the target object annotation image, the 3D model of the target object and the same model parameters, including the coordinate system parameters and/or the rotation angle, are input into the renderer together and rendered to obtain the target object annotation image.
It can be understood that, when the model parameters including the coordinate system parameters and/or the rotation angles are rendered together with the 3D model of the target object and the background image to obtain the augmented reality image, and the model parameters including the coordinate system parameters and/or the rotation angles are rendered together with the 3D model of the target object to obtain the labeled image of the target object, values of the model parameters are different, and the posture of the target object in the image generated based on the model parameters is different.
Therefore, in the possible scene, when the augmented reality image is obtained through rendering, the model parameters including the coordinate system parameters and/or the rotation angle are rendered together with the 3D model of the target object and the background image, so that the augmented reality image including the target object and the road background in different postures can be obtained, and the obtained augmented reality image is more diversified.
In another possible implementation manner, when the key information is a shooting parameter, and an augmented reality image is obtained by rendering, the 3D model of the target object, the shooting parameter including the shooting focal length and/or the mechanical transformation coefficient, and the background image may be input to the renderer together, and the 3D model of the target object, the shooting parameter including the shooting focal length and/or the mechanical transformation coefficient, and the background image may be rendered, so that the augmented reality image is obtained.
It can be understood that when the 3D model of the target object, the shooting parameters including the shooting focal length and/or the mechanical transformation coefficient, and the background image are rendered together to obtain the augmented reality image, values of the shooting parameters are different, and the posture of the target object in the image generated based on the shooting parameters is different.
Therefore, in the possible scene, when the augmented reality image is obtained through rendering, the shooting parameters including the shooting focal length and/or the mechanical transformation coefficient are rendered together with the 3D model of the target object and the background image, so that the augmented reality image including the target object and the road background under different shooting visual angles can be obtained, and the obtained augmented reality image is more diversified.
In yet another possible implementation manner, when the key information is an environmental parameter, when the augmented reality image is obtained by rendering, the 3D model of the target object, the environmental parameter including the illumination intensity and/or the air quality, and the background image may be input to the renderer together, and the 3D model of the target object, the environmental parameter including the illumination intensity and/or the air quality, and the background image may be rendered, so as to obtain the augmented reality image.
It can be understood that when the 3D model of the target object, the environmental parameters including the illumination intensity and/or the air quality, and the background image are rendered together to obtain the augmented reality image, values of the environmental parameters are different, and the posture of the target object in the image generated based on the environmental parameters is different.
Therefore, in the possible scene, when the augmented reality image is obtained through rendering, the environmental parameters including the illumination intensity and/or the air quality are rendered together with the 3D model of the target object and the background image, so that the augmented reality image including the target object and the road background under different illumination intensities and/or air qualities can be obtained, and the obtained augmented reality image is more realistic, vivid and diversified.
The three possible scenarios above describe how to render the 3D model of the target object, the key information, and the background image to obtain the augmented reality image when the key information is, respectively, a model parameter of the 3D model, a shooting parameter when the background image was acquired, or an environmental parameter when the background image was acquired. Of course, the key information may also include all three parameters at once, in which case the 3D model of the target object, the three parameters, and the background image are rendered together to obtain the augmented reality image; correspondingly, the 3D model of the target object and the model parameters, including the coordinate system parameters and/or the rotation angle, are input into the renderer together and rendered to obtain the target object annotation image. For example, please refer to fig. 8, a schematic diagram of acquiring an augmented reality image and a target object annotation image according to the second embodiment of the present application. The acquired augmented reality image is thereby more realistic, vivid, and diverse. The rendering method when all three parameters are used is similar to the rendering method for any one of them in the three possible scenarios above, so the description is not repeated here.
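A sketch of S702 under the hypothetical `KeyInfo` and renderer interfaces above (the keyword arguments are assumptions; a real renderer would expose its own parameter API):

```python
def render_with_key_info(renderer, target_model, key_info, background_image):
    """Render the 3D model, the key information, and the background together."""
    ar_image = renderer.render(
        target_model,
        background=background_image,
        pose=(key_info.coordinates, key_info.rotation_deg),
        camera=(key_info.focal_length_mm, key_info.mechanical_coefficient),
        environment=(key_info.illumination, key_info.air_quality),
    )
    # the annotation image reuses the same model parameters so that the pose and
    # size of the target object stay consistent with the augmented reality image
    target_annotation_image = renderer.render(
        target_model,
        background=None,
        pose=(key_info.coordinates, key_info.rotation_deg),
    )
    return ar_image, target_annotation_image
```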
As an example, still taking a police car as the target object and a parking lot as the background: when the key information includes a coordinate system parameter, a rotation angle, a shooting focal length, and an illumination intensity, and the parameters take a first set of values, the 3D model of the police car and the police car's pose in the augmented reality image generated from it are in a first pose, as shown in fig. 9, a schematic diagram of the 3D model and augmented reality image of a police car in the first pose according to the second embodiment of the present application. When the same parameters take a second set of values, the pose is a second pose, as shown in fig. 10, a schematic diagram of the 3D model and augmented reality image of a police car in the second pose according to the second embodiment. As is apparent from fig. 9 and fig. 10, different parameter values in the key information produce different poses of the police car in the generated images.
In the embodiment shown in fig. 2 or fig. 7, when the 3D model of the target object and the background image are rendered, part of the background in the resulting augmented reality image may be occluded by the target object; without further processing of the annotation data, the accuracy of the acquired sample data would suffer. To improve accuracy, after the augmented reality image and the background annotation image are acquired, it may further be determined whether any image in the augmented reality image is occluded, and the augmented reality annotation image corresponding to the augmented reality image is generated according to the determination result; see the third embodiment below.
EXAMPLE III
Fig. 11 is a schematic flowchart of a method for acquiring an augmented reality annotation image corresponding to an augmented reality image according to a third embodiment of the present application, for example, please refer to fig. 11, where the method for acquiring sample data may further include:
s1101, judging whether the image is shielded in the augmented reality image.
And S1102, if the augmented reality image is not shielded, synthesizing the background annotation image and the target object annotation image to obtain the augmented reality annotation image corresponding to the augmented reality image.
After the judgment, if no image is occluded in the augmented reality image, it is indicated that the current background labeled image is the labeled data of the background in the augmented reality image, and the target object labeled image is the labeled data of the target object in the augmented reality image. Therefore, in this case, when the augmented reality annotation image corresponding to the augmented reality image is acquired, the background annotation image and the target object annotation image can be directly synthesized, so that the augmented reality annotation image corresponding to the augmented reality image can be obtained. The augmented reality labeling image is labeled with attribute information of a background in the augmented reality image and attribute information of a target object.
S1103, if the image is occluded in the augmented reality image, deleting attribute information of the occluded image in the background annotation image to obtain a new background annotation image; and synthesizing the new background annotation image and the target object annotation image to obtain an augmented reality annotation image corresponding to the augmented reality image.
If, after the determination, an image in the augmented reality image is occluded, the current background annotation image is no longer the annotation data of the background in the augmented reality image, and the attribute information of the occluded image must be deleted from the background annotation image; the resulting new background annotation image is then the annotation data of the background. In this case, the augmented reality annotation image corresponding to the augmented reality image is obtained by first deleting the attribute information of the occluded image from the background annotation image to obtain a new background annotation image, and then synthesizing the new background annotation image with the target object annotation image. This prevents the background annotation image from including annotation data of occluded images and further improves the accuracy of the acquired sample data.
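A minimal sketch of S1102/S1103, assuming annotations are lists of records each carrying an "id" key and that the set of occluded background elements has already been determined in S1101 (the representation is an assumption for illustration):

```python
def merge_annotations(background_annotation: list, target_annotation: list,
                      occluded_ids: set) -> list:
    """Generate the augmented reality annotation from the two annotations."""
    if occluded_ids:
        # S1103: delete attribute information of occluded images first,
        # yielding the new background annotation
        background_annotation = [record for record in background_annotation
                                 if record["id"] not in occluded_ids]
    # S1102 / S1103: synthesize the (new) background annotation with the
    # target object annotation to obtain the augmented reality annotation
    return background_annotation + target_annotation
```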
It should be noted that the generation of the annotation data corresponds to the intended use of the sample data. If the sample data is used to train a detection model, bounding-box annotation data of obstacles must be generated from the augmented reality image and the background annotation image, taking occlusion in the augmented reality image into account; after the bounding-box annotation data is generated, the image detection model is trained with the augmented reality image and the bounding-box annotation data. If the sample data is used to train a segmentation model, segmentation annotation data is generated from the augmented reality image and the background annotation image, and the image segmentation model is trained with the augmented reality image and the segmentation annotation data. If the sample data is used to train a classification model, only one classification label per augmented reality image needs to be determined, and the image classification model is trained with the augmented reality image and the classification label data.
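For the detection case, for example, a bounding box could be derived from a binary mask of the target object in the augmented reality image; this is a generic sketch, not a procedure prescribed by the application:

```python
import numpy as np

def mask_to_bbox(mask: np.ndarray):
    """Return (x_min, y_min, x_max, y_max) of the nonzero region of a binary mask."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # target absent or fully occluded
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```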
Example four
Fig. 12 is a schematic structural diagram of a device 120 for acquiring sample data according to a fourth embodiment of the present application, for example, please refer to fig. 12, where the device 120 for acquiring sample data may include:
an obtaining module 1201, configured to obtain a background image and a background labeling image respectively; wherein, the background labeling image is labeled with the attribute information of the background.
A processing module 1202, configured to render the 3D model of the target object and the background image to obtain an augmented reality image; rendering the 3D model of the target object to obtain a target object labeling image; the target object labeling image is labeled with attribute information of the target object, and the sample data comprises an augmented reality image, a background labeling image and a target object labeling image.
Optionally, the processing module 1202 is specifically configured to obtain key information that affects rendering effects; rendering the 3D model, the key information and the background image of the target object to obtain an augmented reality image; wherein the key information includes at least one of a model parameter of the 3D model of the target object, a shooting parameter when acquiring the background image, or an environmental parameter when acquiring the background image.
Optionally, the key information includes model parameters of the 3D model of the target object, and the processing module 1202 is specifically configured to:
rendering the 3D model, the model parameters and the background image of the target object to obtain an augmented reality image; wherein the model parameters comprise coordinate system parameters and/or rotation angles.
Correspondingly, the processing module 1202 is further specifically configured to:
and rendering the 3D model and the model parameters of the target object to obtain a target object labeling image.
Optionally, the shooting parameters include a shooting focal length and/or a mechanical transformation coefficient.
Optionally, the environmental parameter comprises illumination intensity and/or air quality.
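By way of rough illustration only, the two environmental parameters might be imitated in image space as below; a real pipeline would model them inside the renderer, and both the intensity scaling and the grey haze blend are simplifying assumptions:

    import numpy as np

    def apply_environment(rendered: np.ndarray, illumination: float, haze: float) -> np.ndarray:
        """Scale pixel intensity by the illumination factor, then blend the
        image toward light grey to imitate reduced air quality (haze)."""
        img = rendered.astype(np.float32) * illumination
        grey = np.full_like(img, 192.0)  # haze colour: an arbitrary choice
        img = (1.0 - haze) * img + haze * grey
        return np.clip(img, 0.0, 255.0).astype(np.uint8)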
Optionally, the processing module 1202 is further configured to judge whether any image in the augmented reality image is occluded, and to generate an augmented reality annotation image corresponding to the augmented reality image according to the judgment result; wherein the augmented reality annotation image is annotated with attribute information of the background and attribute information of the target object in the augmented reality image.
Optionally, the processing module 1202 is specifically configured to: if no image in the augmented reality image is occluded, synthesize the background annotation image and the target object annotation image to obtain an augmented reality annotation image corresponding to the augmented reality image; if an image in the augmented reality image is occluded, delete the attribute information of the occluded image from the background annotation image to obtain a new background annotation image, and synthesize the new background annotation image and the target object annotation image to obtain the augmented reality annotation image corresponding to the augmented reality image.
Optionally, the obtaining module 1201 is specifically configured to obtain a 3D model set and to find the 3D model of the target object in the 3D model set.
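A trivial sketch of this lookup, assuming the 3D model set is a mapping from object category to a loaded mesh (both the class and its interface are illustrative, not part of the claimed apparatus):

    class ModelSet:
        """Hypothetical wrapper around a set of 3D models keyed by category."""

        def __init__(self, models: dict):
            self._models = models  # e.g. {"pedestrian": mesh, "bicycle": mesh}

        def find(self, target: str):
            """Return the 3D model of the target object, if the set has one."""
            try:
                return self._models[target]
            except KeyError:
                raise KeyError(f"no 3D model for target object: {target}")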
The apparatus 120 for acquiring sample data provided in this embodiment of the present application can execute the technical solution of the method for acquiring sample data in any of the embodiments described above; its implementation principle and beneficial effects are similar to those of the method and are therefore not repeated here.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
As shown in fig. 13, fig. 13 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the present application described and/or claimed herein.
As shown in fig. 13, the electronic device includes: one or more processors 1301, a memory 1302, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing part of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 13, one processor 1301 is taken as an example.
Memory 1302 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to execute the method for acquiring sample data provided by the application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the method for acquiring sample data provided by the present application.
The memory 1302, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for acquiring sample data in the embodiments of the present application (e.g., the obtaining module 1201 and the processing module 1202 shown in fig. 12). The processor 1301 executes various functional applications and data processing of the server by running the non-transitory software programs, instructions, and modules stored in the memory 1302, that is, implements the method for acquiring sample data in the above method embodiments.
The memory 1302 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the electronic device implementing the sample data acquisition method, and the like. Further, the memory 1302 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 1302 may optionally include memories remotely located from the processor 1301, and these remote memories may be connected over a network to the electronic device implementing the sample data acquisition method. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device implementing the sample data acquisition method may further include: an input device 1303 and an output device 1304. The processor 1301, the memory 1302, the input device 1303, and the output device 1304 may be connected by a bus or in other manners; in fig. 13, connection by a bus is taken as an example.
The input device 1303 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, for example a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, or a joystick. The output device 1304 may include a display device, auxiliary lighting devices (e.g., LEDs), tactile feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical scheme of the embodiments of the present application, when sample data is obtained, a background image and a background labeling image labeled with attribute information of the background are obtained respectively; the 3D model of the target object and the background image are rendered to obtain an augmented reality image; and the 3D model of the target object is rendered to obtain a target object labeling image labeled with attribute information of the target object. The obtained augmented reality image, background labeling image, and target object labeling image together constitute the required sample data including the target object. It can be seen that, unlike the prior art, the embodiments of the present application acquire sample data based on augmented reality technology, using the background image and the 3D model of the target object to generate sample data that includes the target object. A large amount of rare ("small-sample") data that cannot be collected in the prior art can thus be generated in a short time, which reduces the acquisition time and the labeling time of the sample data, improves the acquisition efficiency, and improves the accuracy of the acquired sample data compared with the unnatural samples obtained by directly pasting the target object onto the background image.
It should be understood that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, and no limitation is imposed herein as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (18)

1. A method for acquiring sample data is characterized by comprising the following steps:
respectively acquiring a background image and a background labeling image; wherein, the background labeling image is labeled with the attribute information of the background;
rendering the 3D model of the target object and the background image to obtain an augmented reality image;
rendering the 3D model of the target object to obtain a target object labeling image; wherein, the target object labeling image is labeled with the attribute information of the target object, and the sample data comprises the augmented reality image, the background labeling image and the target object labeling image.
2. The method of claim 1, wherein rendering the 3D model of the target object and the background image to obtain an augmented reality image comprises:
acquiring key information influencing a rendering effect; wherein the key information comprises at least one of model parameters of a 3D model of the target object, shooting parameters when acquiring the background image, or environment parameters when acquiring the background image;
rendering the 3D model of the target object, the key information and the background image to obtain the augmented reality image.
3. The method of claim 2, wherein the key information comprises model parameters of a 3D model of the target object, and wherein the rendering the 3D model of the target object, the key information, and the background image to obtain the augmented reality image comprises:
rendering the 3D model of the target object, the model parameters and the background image to obtain the augmented reality image; wherein the model parameters comprise coordinate system parameters and/or rotation angles;
correspondingly, the rendering the 3D model of the target object to obtain a target object labeled image includes:
rendering the 3D model of the target object and the model parameters to obtain the target object labeling image.
4. The method of claim 2,
the shooting parameters comprise a shooting focal length and/or a mechanical variable coefficient.
5. The method of claim 2,
the environmental parameters include light intensity and/or air quality.
6. The method according to any one of claims 1-5, further comprising:
judging whether any image in the augmented reality image is occluded;
generating an augmented reality annotation image corresponding to the augmented reality image according to the judgment result; and the augmented reality labeling image is labeled with attribute information of a background and attribute information of a target object in the augmented reality image.
7. The method according to claim 6, wherein the generating an augmented reality annotation image corresponding to the augmented reality image according to the determination result comprises:
if no image in the augmented reality image is occluded, synthesizing the background annotation image and the target object annotation image to obtain an augmented reality annotation image corresponding to the augmented reality image;
if an image in the augmented reality image is occluded, deleting the attribute information of the occluded image from the background labeling image to obtain a new background labeling image; and synthesizing the new background annotation image and the target object annotation image to obtain an augmented reality annotation image corresponding to the augmented reality image.
8. The method according to any one of claims 1-5, further comprising:
acquiring a 3D model set;
and searching the 3D model of the target object in the 3D model set.
9. An apparatus for acquiring sample data, comprising:
the acquisition module is used for respectively acquiring a background image and a background labeling image; wherein, the background labeling image is labeled with the attribute information of the background;
the processing module is used for rendering the 3D model of the target object and the background image to obtain an augmented reality image; rendering the 3D model of the target object to obtain a target object labeling image; wherein, the target object labeling image is labeled with the attribute information of the target object, and the sample data comprises the augmented reality image, the background labeling image and the target object labeling image.
10. The apparatus of claim 9,
the processing module is specifically used for acquiring key information influencing the rendering effect; rendering the 3D model of the target object, the key information and the background image to obtain the augmented reality image; wherein the key information includes at least one of a model parameter of a 3D model of the target object, a shooting parameter when the background image is acquired, or an environmental parameter when the background image is acquired.
11. The apparatus according to claim 10, wherein the key information comprises model parameters of the 3D model of the target object, and the processing module is specifically configured to:
rendering the 3D model of the target object, the model parameters and the background image to obtain the augmented reality image; wherein the model parameters comprise coordinate system parameters and/or rotation angles;
correspondingly, the processing module is further specifically configured to:
rendering the 3D model of the target object and the model parameters to obtain the target object labeling image.
12. The apparatus of claim 10,
the shooting parameters comprise a shooting focal length and/or a mechanical variable coefficient.
13. The apparatus of claim 10,
the environmental parameters include light intensity and/or air quality.
14. The apparatus according to any one of claims 9 to 13,
the processing module is further configured to determine whether any image in the augmented reality image is occluded, and to generate an augmented reality annotation image corresponding to the augmented reality image according to the judgment result; wherein the augmented reality annotation image is annotated with attribute information of the background and attribute information of the target object in the augmented reality image.
15. The apparatus of claim 14,
the processing module is specifically configured to: if no image in the augmented reality image is occluded, synthesize the background annotation image and the target object annotation image to obtain an augmented reality annotation image corresponding to the augmented reality image; if an image in the augmented reality image is occluded, delete the attribute information of the occluded image from the background labeling image to obtain a new background labeling image, and synthesize the new background annotation image and the target object annotation image to obtain the augmented reality annotation image corresponding to the augmented reality image.
16. The apparatus according to any one of claims 9 to 13,
the obtaining module is specifically configured to obtain a 3D model set; and searching the 3D model of the target object in the 3D model set.
17. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of obtaining sample data of any one of claims 1 to 8.
18. A non-transitory computer readable storage medium storing computer instructions for causing a computer to execute the method for acquiring sample data according to any one of claims 1 to 8.
CN202010192073.0A 2020-03-18 2020-03-18 Sample data acquisition method and device and electronic equipment Active CN111325984B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010192073.0A CN111325984B (en) 2020-03-18 2020-03-18 Sample data acquisition method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010192073.0A CN111325984B (en) 2020-03-18 2020-03-18 Sample data acquisition method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN111325984A true CN111325984A (en) 2020-06-23
CN111325984B CN111325984B (en) 2023-05-05

Family

ID=71167674

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010192073.0A Active CN111325984B (en) 2020-03-18 2020-03-18 Sample data acquisition method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111325984B (en)

Citations (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6574360B1 (en) * 1999-07-23 2003-06-03 International Business Machines Corp. Accelerated occlusion culling using directional discretized occluders and system therefore
US6864888B1 (en) * 1999-02-25 2005-03-08 Lockheed Martin Corporation Variable acuity rendering for a graphic image processing system
CN101976341A (en) * 2010-08-27 2011-02-16 中国科学院自动化研究所 Method for detecting position, posture, and three-dimensional profile of vehicle from traffic images
CN102163340A (en) * 2011-04-18 2011-08-24 宁波万里电子科技有限公司 Method for labeling three-dimensional (3D) dynamic geometric figure data information in computer system
KR20130019485A (en) * 2011-08-17 2013-02-27 (주)인사이트앤드인스퍼레이션 Augmented reality system based on marker and object augment method thereof
CN103489214A (en) * 2013-09-10 2014-01-01 北京邮电大学 Virtual reality occlusion handling method, based on virtual model pretreatment, in augmented reality system
US20150123968A1 (en) * 2013-11-07 2015-05-07 Autodesk, Inc. Occlusion render mechanism for point clouds
CN106803286A (en) * 2017-01-17 2017-06-06 湖南优象科技有限公司 Mutual occlusion real-time processing method based on multi-view image
CN107025642A (en) * 2016-01-27 2017-08-08 百度在线网络技术(北京)有限公司 Vehicle's contour detection method and device based on cloud data
US20170365100A1 (en) * 2016-06-17 2017-12-21 Imagination Technologies Limited Augmented Reality Occlusion
CN107689073A (en) * 2016-08-05 2018-02-13 阿里巴巴集团控股有限公司 The generation method of image set, device and image recognition model training method, system
CN108564103A (en) * 2018-01-09 2018-09-21 众安信息技术服务有限公司 Data processing method and device
DE102018111935A1 (en) * 2017-05-22 2018-11-22 Aisin Aw Co., Ltd. Image processing system, image processing method, information processing apparatus and recording medium
CN108898678A (en) * 2018-07-09 2018-11-27 百度在线网络技术(北京)有限公司 Augmented reality method and apparatus
CN109145489A (en) * 2018-09-07 2019-01-04 百度在线网络技术(北京)有限公司 A kind of distribution of obstacles emulation mode, device and terminal based on probability graph
CN109155078A (en) * 2018-08-01 2019-01-04 深圳前海达闼云端智能科技有限公司 Generation method, device, electronic equipment and the storage medium of the set of sample image
CN109166170A (en) * 2018-08-21 2019-01-08 百度在线网络技术(北京)有限公司 Method and apparatus for rendering augmented reality scene
CN109377552A (en) * 2018-10-19 2019-02-22 珠海金山网络游戏科技有限公司 Image occlusion test method, apparatus calculates equipment and storage medium
CN109522866A (en) * 2018-11-29 2019-03-26 宁波视睿迪光电有限公司 Naked eye 3D rendering processing method, device and equipment
CN109635853A (en) * 2018-11-26 2019-04-16 深圳市玛尔仕文化科技有限公司 The method for automatically generating artificial intelligence training sample based on computer graphics techniques
US20190156577A1 (en) * 2017-11-22 2019-05-23 Google Llc Positional recognition for augmented reality environment
CN110084304A (en) * 2019-04-28 2019-08-02 北京理工大学 A kind of object detection method based on generated data collection
CN110176054A (en) * 2018-02-14 2019-08-27 辉达公司 For training the generation of the composograph of neural network model
CN110288019A (en) * 2019-06-21 2019-09-27 北京百度网讯科技有限公司 Image labeling method, device and storage medium
CN110378336A (en) * 2019-06-24 2019-10-25 南方电网科学研究院有限责任公司 Semantic class mask method, device and the storage medium of target object in training sample
CN110428388A (en) * 2019-07-11 2019-11-08 阿里巴巴集团控股有限公司 A kind of image-data generating method and device
CN110490960A (en) * 2019-07-11 2019-11-22 阿里巴巴集团控股有限公司 A kind of composograph generation method and device
CN110619674A (en) * 2019-08-15 2019-12-27 重庆特斯联智慧科技股份有限公司 Three-dimensional augmented reality equipment and method for accident and alarm scene restoration
US20200005083A1 (en) * 2018-07-02 2020-01-02 Mastercard International Incorporated Methods for generating a dataset of corresponding images for machine vision learning
JP2020008917A (en) * 2018-07-03 2020-01-16 株式会社Eidea Augmented reality display system, augmented reality display method, and computer program for augmented reality display
CN110837764A (en) * 2018-08-17 2020-02-25 广东虚拟现实科技有限公司 Image processing method and device, electronic equipment and visual interaction system


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHOU ZHONG, ZHOU YI, XIAO JIANGJIAN: "A Survey of Virtual Reality Augmentation Technology", SCIENTIA SINICA INFORMATIONIS *
XU WEIPENG; WANG YONGTIAN; LIU YUE; WENG DONGDONG: "A Survey of Virtual-Real Occlusion Handling in Augmented Reality" *
LI DACHENG; LIU NA: "Design and Management of Virtual Labels Based on Augmented Reality Cameras" *

Also Published As

Publication number Publication date
CN111325984B (en) 2023-05-05

Similar Documents

Publication Publication Date Title
US20200302241A1 (en) Techniques for training machine learning
US20210383096A1 (en) Techniques for training machine learning
CN111739005B (en) Image detection method, device, electronic equipment and storage medium
CN111860167B (en) Face fusion model acquisition method, face fusion model acquisition device and storage medium
CN112132829A (en) Vehicle information detection method and device, electronic equipment and storage medium
CN111722245B (en) Positioning method, positioning device and electronic equipment
CN111598164B (en) Method, device, electronic equipment and storage medium for identifying attribute of target object
JP2021103555A (en) Image detection method, device, electronic apparatus, storage medium, and program
CN110619312B (en) Method, device and equipment for enhancing positioning element data and storage medium
CN111324945B (en) Sensor scheme determining method, device, equipment and storage medium
CN111401251B (en) Lane line extraction method, lane line extraction device, electronic equipment and computer readable storage medium
CN111294665A (en) Video generation method and device, electronic equipment and readable storage medium
CN110675635B (en) Method and device for acquiring external parameters of camera, electronic equipment and storage medium
CN112270669A (en) Human body 3D key point detection method, model training method and related device
CN112527374A (en) Marking tool generation method, marking method, device, equipment and storage medium
CN112581533B (en) Positioning method, positioning device, electronic equipment and storage medium
CN111666876A (en) Method and device for detecting obstacle, electronic equipment and road side equipment
CN111753739A (en) Object detection method, device, equipment and storage medium
CN110866504B (en) Method, device and equipment for acquiring annotation data
CN111539347A (en) Method and apparatus for detecting target
CN112330815A (en) Three-dimensional point cloud data processing method, device and equipment based on obstacle fusion
CN112487979A (en) Target detection method, model training method, device, electronic device and medium
KR20210146770A (en) Method for indoor localization and electronic device
US20230169680A1 (en) Beijing baidu netcom science technology co., ltd.
CN111325984B (en) Sample data acquisition method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211022

Address after: 105 / F, building 1, No. 10, Shangdi 10th Street, Haidian District, Beijing 100085

Applicant after: Apollo Intelligent Technology (Beijing) Co.,Ltd.

Address before: 2 / F, baidu building, 10 Shangdi 10th Street, Haidian District, Beijing 100085

Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant