CN114067172A - Simulation image generation method, simulation image generation device and electronic equipment


Info

Publication number
CN114067172A
Authority
CN
China
Prior art keywords
image
target
environment
camera
simulation
Legal status
Pending
Application number
CN202111277649.4A
Other languages
Chinese (zh)
Inventor
蔡诗晗
熊友军
赵明国
Current Assignee
Ubtech Robotics Corp
Original Assignee
Ubtech Robotics Corp
Application filed by Ubtech Robotics Corp
Priority to CN202111277649.4A
Publication of CN114067172A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G06T 5/70

Abstract

A simulation image generation method, a simulation image generation apparatus, and an electronic device are provided. The application discloses a simulation image generation method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises the following steps: importing an environment to be tested into a preset simulation platform, wherein at least one camera and at least one environment object are created in the environment to be tested, the environment object comprises a target to be identified and an intelligent device to be tested, and the intelligent device to be tested comprises at least one camera; controlling the intelligent device to be tested and/or the target to be identified to move in the environment to be tested, and during this process obtaining a processed image through parameter control of the camera and/or an image processing operation on the image output by the camera; acquiring attribute data of each environment object in the processed image; and labeling the processed image based on the attribute data to obtain a simulation image. With this scheme, a large number of high-fidelity simulation images that can be used for training a visual algorithm model can be obtained quickly in a short time.

Description

Simulation image generation method, simulation image generation device and electronic equipment
Technical Field
The present application relates to image processing technologies, and in particular, to a method and an apparatus for generating a simulation image, an electronic device, and a computer-readable storage medium.
Background
Visual algorithm models based on deep learning often require large training data sets. At present, training of a visual algorithm model is typically accomplished using a large-scale public data set as the training data set. However, public large-scale data sets may not cover every need and scenario. This requires that, after data is collected for a specific need, the visual algorithm model be retrained or fine-tuned based on the collected data.
Currently, data acquisition is usually performed manually, on site, in the specific demand scenario. However, during on-site data acquisition, complicated acquisition conditions are often encountered, which makes it difficult to acquire the desired data. In addition, the collected data may turn out to have no practical application value due to various factors. Therefore, how to quickly obtain data that can be used for training a visual algorithm model has become a problem to be solved.
Disclosure of Invention
The application provides a simulation image generation method, a simulation image generation device, electronic equipment and a computer readable storage medium, which can quickly obtain a large number of high-fidelity simulation images which can be used for training a visual algorithm model in a short time.
In a first aspect, the present application provides a method for generating a simulation image, including:
importing a to-be-tested environment in a preset simulation platform, wherein at least one camera and at least one environment object are created in the to-be-tested environment, the environment object comprises a target to be identified and intelligent equipment to be tested, and the intelligent equipment to be tested comprises at least one camera;
controlling the intelligent equipment to be detected and/or the target to be identified to move in the environment to be detected;
when the intelligent device to be detected and/or the target to be identified move in the environment to be detected, obtaining a processed image through parameter control of the camera and/or image processing operation of an image output by the camera;
acquiring attribute data of each environmental object in the processed image;
and labeling the processed image based on the attribute data to obtain a simulated image.
In a second aspect, the present application provides a simulation image generation apparatus, including:
an importing module, configured to import an environment to be tested into a preset simulation platform, wherein at least one camera and at least one environment object are created in the environment to be tested, the environment object comprises a target to be identified and an intelligent device to be tested, and the intelligent device to be tested comprises at least one camera;
the control module is used for controlling the intelligent equipment to be detected and/or the target to be identified to move in the environment to be detected;
the processing module is used for controlling the parameters of the camera and/or processing the image output by the camera to obtain a processed image when the intelligent device to be detected and/or the target to be identified move in the environment to be detected;
the acquisition module is used for acquiring attribute data of each environmental object in the processed image;
and the marking module is used for marking the processed image based on the attribute data to obtain a simulation image.
In a third aspect, the present application provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method according to the first aspect when executing the computer program.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the first aspect.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by one or more processors, performs the steps of the method of the first aspect as described above.
Compared with the prior art, the application has the following beneficial effects: first, an environment to be tested is imported into a preset simulation platform, wherein at least one camera and at least one environment object are created in the environment to be tested, the environment object comprises a target to be identified and an intelligent device to be tested, and the intelligent device to be tested comprises at least one camera; then the intelligent device to be tested and/or the target to be identified are controlled to move in the environment to be tested, and during this process a processed image is obtained through parameter control of the camera and/or an image processing operation on the image output by the camera; attribute data of each environment object in the processed image is acquired; and finally the processed image can be labeled based on the attribute data to obtain a simulation image. In this process, no manual on-site data acquisition is needed; instead, the environment is simulated through the simulation platform to generate the simulation image, which greatly reduces the time required for data acquisition. Moreover, when the requirement changes, a simulation image adapted to the changed requirement can be obtained simply by importing a different environment to be tested, applying different parameter control to the camera, or adopting different image processing operations, which gives the scheme strong expansibility and flexibility. It can be understood that the beneficial effects of the second aspect to the fifth aspect can be found in the related description of the first aspect, and are not repeated here.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
Fig. 1 is a schematic flow chart of an implementation of a simulation image generation method provided in an embodiment of the present application;
FIG. 2 is an exemplary diagram of a processed image obtained in an occlusion scene according to an embodiment of the present application;
FIG. 3 is an exemplary diagram of a processed image obtained from a blurred perspective view provided by an embodiment of the present application;
FIG. 4 is an exemplary diagram of a processed image obtained from a blurred near view provided by an embodiment of the present application;
FIG. 5 is an exemplary diagram of a processed image obtained in a dimly lit scene according to an embodiment of the present application;
FIG. 6 is an exemplary diagram of a processed image obtained in a light overexposure scenario according to an embodiment of the present application;
FIG. 7 is a block diagram of a simulation image generation apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
In order to explain the technical solution proposed in the present application, the following description will be given by way of specific examples.
A method for generating a simulation image according to an embodiment of the present application is described below. Referring to fig. 1, the method for generating a simulation image includes:
step 101, importing an environment to be tested into a preset simulation platform.
In the embodiment of the application, a simulation platform can be used to simulate the actual environment required for data acquisition. For the case in which the intelligent device to be tested is a robot, by way of example only, the simulation platform may be the Omniverse Isaac Sim simulation platform, a simulation platform developed by NVIDIA Corporation for Artificial Intelligence (AI) robots. The Omniverse Isaac Sim simulation platform integrates real-time ray tracing (RTX) technology and can provide photorealistic rendering from the viewpoint of a vision sensor; it also integrates the physics engine development kit PhysX SDK for rigid-body kinematics calculation on a Graphics Processing Unit (GPU), so fast and stable physical simulation can be provided. In addition, it provides data transmission interfaces for the Robot Operating System (ROS) and the robot software development kit Isaac SDK. Furthermore, the user can use a Python-based script interface and an extension system to customize the interface and functions of the simulation platform.
For the Omniverse Isaac Sim simulation platform, the scene files used are Pixar's Universal Scene Description (USD) and NVIDIA's Material Definition Language (MDL). USD provides rich scene-description interfaces (APIs), and complex functionality can be given to a scene by directly modifying these APIs; MDL is a physically based material description format used by the real-time ray-traced rendering solution, and it ensures that materials remain basically consistent when exchanged between multiple applications. In addition, files in both formats are easy to transmit, which effectively ensures the performance of remote file transfer.
When the electronic device imports the environment to be tested into the simulation platform, at least one camera and at least one environment object are created in the imported environment to be tested, wherein the environment objects comprise the target to be identified and the intelligent device to be tested, and the intelligent device to be tested comprises at least one camera. That is, two kinds of cameras are involved in the environment to be tested: one is a camera created at some fixed position in the environment to be tested, and the other is a camera on the intelligent device to be tested. In fact, the intelligent device to be tested is also created by means of import. For convenience of explaining the scheme of the embodiment of the present application, the following description takes the Omniverse Isaac Sim simulation platform as the simulation platform and a robot as the intelligent device to be tested, and describes each step accordingly.
The electronic device can import the USD file of the environment to be tested into the simulation platform, that is, the import of the environment to be tested can be realized, and the environment to be tested can be an environment of a robot world cup RoboCup tournament, as an example only. Of course, the corresponding USD file of the environment to be tested may also be built and imported according to the specific requirements of different projects, so as to implement the replacement of the environment to be tested.
The electronic device can also adjust the initial position data of the robot, write the initial position data as a parameter into an existing USD file of the robot (the USD file contains information such as the shapes and materials of all components of the robot, as well as the robot's speed and initial position), and import the USD file into the simulation platform, thereby realizing the import of the robot (namely the intelligent device to be tested). In addition, the electronic device can adjust the parameters of the robot's camera so that they are consistent with the parameters of the ZED camera used by the real robot. Specifically, these parameters include, but are not limited to, focal length parameters, field-of-view parameters, and shutter parameters.
The use of the USD files can be understood from the following description: the USD file of the environment to be tested is similar to the main function in a program. All movable environment objects in the environment to be tested (such as the soccer ball and the robot) also have corresponding USD files. The USD file of the environment to be tested can call these other USD files, similar to how other functions are called inside the main function, while fixed environment objects in the environment to be tested (e.g. the goals and the lawn) resemble plain declarative statements in the main function. It can therefore be understood that information such as the initial position of an object in a called USD file is similar to a parameter of a called function: it may be left at a default value or modified when the "function" is called.
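As a minimal illustration of this analogy (the file names, prim path, and coordinates below are assumptions for the sketch, not files from the patent), the environment USD can reference a movable object's USD and override its initial position through the pxr Python API:

```python
# Minimal sketch of the "main function calls other USD files" analogy above.
# File names, prim paths, and coordinates are assumed example values.
from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.Open("robocup_field.usd")      # environment USD ("main function")

# Reference a movable environment object (e.g. the soccer ball) from its own USD file.
ball = stage.DefinePrim("/World/Ball", "Xform")
ball.GetReferences().AddReference("ball.usd")    # like calling another function

# Override the "parameter" of the called file: the ball's initial position.
UsdGeom.XformCommonAPI(ball).SetTranslate(Gf.Vec3d(0.0, 0.0, 0.5))

stage.GetRootLayer().Save()
```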
Specifically, after the environment to be measured is introduced, the environment to be measured can be more perfected by the following method:
one way is to set up obstacles and obstructions in the environment to be measured as displayed by the simulation platform, which can simulate the occlusion impact factors in the tracking of visual targets. As shown in fig. 2, it can be simulated that a football (i.e. an object to be recognized) is occluded by a left character in a foreground through a simulation platform.
Another way is to create a plurality of cameras at different positions and/or different angles in the environment to be tested displayed by the simulation platform; of course, because the robot also carries a camera, the camera on the robot can be called to simulate recording the dynamic motion of the target from different viewing angles. For example only, when the environment to be tested is a RoboCup match, the robot's camera may be set up in advance according to the RoboCup rules, and only needs to be fine-tuned in the RoboCup environment displayed by the simulation platform before it can be called directly. It will be appreciated that, since the RoboCup tournament only permits use of the robot's left-eye camera, no additional cameras are added to the robot in this example. If other projects have specific requirements, cameras can be added to the robot body to cover more robot viewing angles.
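As a minimal sketch of this second way (the prim path, position, and rotation values are assumptions), an additional fixed camera can be defined in the scene with the USD camera schema:

```python
# Sketch only: create one extra fixed camera in the environment USD.
from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.Open("robocup_field.usd")               # assumed environment USD
cam = UsdGeom.Camera.Define(stage, "/World/SideCamera")   # new fixed camera prim
xform = UsdGeom.XformCommonAPI(cam.GetPrim())
xform.SetTranslate(Gf.Vec3d(3.0, 0.0, 1.2))               # place it beside the field
xform.SetRotate(Gf.Vec3f(0.0, 0.0, 90.0))                 # aim it toward the pitch
stage.GetRootLayer().Save()
```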
The third way is to set up a wooden-board slope with adjustable height in the environment to be tested displayed by the simulation platform. When the environment to be tested is a RoboCup tournament, the target to be identified is a soccer ball, which can subsequently roll down the slope to obtain an initial motion speed.
And 102, controlling the intelligent device to be detected and/or the target to be identified to move in the environment to be detected.
In the embodiment of the application, the electronic device can control the robot and/or the target to be recognized to move in the environment to be detected in order to simulate various conditions which may occur in the actual environment.
For the robot, it can be controlled to walk in all directions and/or rotate its head via the head motor in the environment to be tested displayed by the simulation platform. On one hand, the shaking of the robot while moving can simulate the blur and jitter of the robot's camera during dynamic motion; on the other hand, moving the robot to different positions can simulate influence factors in visual target tracking such as the tracked target moving out of view and scale change.
For the target to be recognized, the target to be recognized can obtain the initial motion speed through the third way of perfecting the environment to be detected, which is described in the step 101, so that the target to be recognized is in a continuous motion state, and a richer simulation image is generated subsequently.
And 103, when the intelligent device to be detected and/or the target to be recognized move in the environment to be detected, obtaining a processed image by controlling parameters of the camera and/or performing image processing operation on the image output by the camera.
When the intelligent device to be detected and/or the target to be identified move in the environment to be detected, each camera can start working, namely image acquisition is carried out. In the simulation platform, various image processing operations may be performed to obtain processed images of different image effects. Of course, the parameters of the camera may also be adjusted to obtain processed images with different image effects. It is understood that steps 102 and 103 are both intended to simulate images in non-laboratory ideal environments, such as: simulating images with the problems of blurring or picture flickering and the like caused by the shaking of a camera of the robot due to the movement of the robot; simulating an image with dim light or overexposure and other problems caused by weather or direct illumination and other reasons.
In an application scenario, the image processing operation on the image output by the camera may be: noise points are set in the image output by the camera. This may result in a reduced sharpness of the processed image being obtained and increased complexity.
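A minimal sketch of such a noise-injection operation (the Gaussian noise model and the sigma value are assumptions for illustration, not the patent's exact operation):

```python
# Sketch: add Gaussian noise to a rendered frame output by the simulated camera.
import numpy as np

def add_noise(image: np.ndarray, sigma: float = 10.0) -> np.ndarray:
    """Return a noisier copy of an HxWx3 uint8 image."""
    noise = np.random.normal(0.0, sigma, image.shape)
    noisy = image.astype(np.float32) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)
```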
In another application scenario, the image processing operation on the image output by the camera may be: adjusting parameters, associated with the image output by the camera, that affect the motion blur of the target to be identified. For example only, these parameters may include a blur diameter parameter and an exposure parameter. In practice, adjusting the values of these two parameters adjusts the blur filter diameter and thus the degree of blurring of the target to be identified. For example only, specific blur effects can be seen in the soccer ball in fig. 3 and fig. 4. As shown in fig. 3, the simulation platform can simulate a processed image obtained from a blurred distant view when the soccer ball (i.e. the target to be identified, at the bottom left of fig. 3) moves rapidly. As shown in fig. 4, the simulation platform can simulate a processed image obtained from a blurred close-up view when the soccer ball moves rapidly.
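For reference, motion-blur post-processing in Omniverse can be adjusted from a Python script through carb settings; the setting keys below are assumptions about typical "/rtx/post/motionblur/..." paths and may differ between Isaac Sim versions:

```python
# Hedged sketch: the exact setting paths are assumptions, not a documented API contract.
import carb.settings

settings = carb.settings.get_settings()
settings.set("/rtx/post/motionblur/enabled", True)
settings.set("/rtx/post/motionblur/maxBlurDiameterFraction", 0.05)  # blur diameter
settings.set("/rtx/post/motionblur/exposureFraction", 1.0)          # exposure
```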
In another application scenario, the parameter control of the camera may be controlling the shutter parameters of the camera. This changes the illumination brightness of the image output by the camera, so that processed images under different illumination conditions can be obtained. As shown in fig. 5, the simulation platform can simulate the processed image obtained when the light is dim. As shown in fig. 6, the simulation platform can simulate the processed image obtained when the light is overexposed. Of course, control of the camera's focal length parameter and/or field-of-view parameter can also be combined to simulate the effect of influence factors such as scale change and motion blur on the image output by the camera, so as to obtain corresponding processed images.
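A hedged sketch of such camera parameter control through the USD camera schema (the prim path and the numeric values are assumptions; how exposure and shutter map to rendered brightness and blur depends on the renderer):

```python
# Sketch: adjust exposure, shutter, and focal length on the simulated camera prim.
from pxr import Usd, UsdGeom

stage = Usd.Stage.Open("robocup_field.usd")
camera = UsdGeom.Camera(stage.GetPrimAtPath("/World/Robot/left_eye_camera"))

camera.GetExposureAttr().Set(-2.0)     # darker image, simulating dim light
camera.GetShutterOpenAttr().Set(0.0)   # shutter open/close times
camera.GetShutterCloseAttr().Set(10.0) # longer interval -> more motion blur
camera.GetFocalLengthAttr().Set(24.0)  # focal length / field-of-view control
```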
And 104, acquiring attribute data of each environmental object in the processed image.
In the embodiment of the application, the Omniverse Isaac Sim simulation platform provides a recording function, the Synthetic Data Recorder, which can be operated manually through a visual window or driven directly by a Python script. Through this recording function, the processed images can be extracted, and at the same time the attribute data of each environment object in each processed image can be extracted. Specifically, the attribute data includes state data and position data of an environment object, where the state data is used to indicate whether the environment object exists in the corresponding processed image, and the position data is used to indicate the coordinates of the environment object in the corresponding processed image when it exists.
By way of example only, when the environment to be tested is a RoboCup tournament, the environment objects include not only the soccer ball (i.e. the target to be identified), but also people, robots, goals, and the lawn, and even the ceiling and the ground under the lawn. Each environment object has attribute data and can be located by a unique object number and a class (each class having a unique class number). For example, the object number of the soccer ball is 0002 and its category is Ball; the object numbers of the two goals are 0004 and 0012, and their category is Goal. The attribute data obtained by the electronic device is actually stored in a file in NPY format (i.e. an .npy file), where each line of the file corresponds to one environment object in the environment to be tested; for a given processed image, the format of an environment object's attribute data may be (object number, category number, x1, y1, x2, y2), where x1 and y1 are the abscissa and ordinate of the top-left vertex of the environment object's calibration rectangular box, and x2 and y2 are the abscissa and ordinate of its bottom-right vertex. If the environment object is outside the field of view of the camera, its x1, y1, x2 and y2 can be set to predetermined fixed values to indicate that the environment object is not present in the processed image obtained from that camera.
For example only, if the attribute data of the extracted soccer ball for a certain processing image is (1081,2,641,189,658,207), the attribute data expresses the following meaning: the object number of the football in the environment to be measured is 1081, the classification number is 2, and the corresponding classification is Ball. In the processed image, the coordinates of the upper left vertex and the coordinates of the lower right vertex of the calibration rectangular frame of the soccer ball are (641,189) and (658,207), and the width and height of the calibration rectangular frame of the soccer ball can be obtained by these coordinates.
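A small sketch of reading one frame's attribute file under the layout described above (the file name and the assumption that the data loads as a plain (N, 6) array are illustrative, not the recorder's guaranteed format):

```python
# Sketch: locate the soccer ball's row in the (object_id, class_id, x1, y1, x2, y2) layout.
import numpy as np

BALL_CLASS_ID = 2                      # class number of "Ball" in the example above
attrs = np.load("frame_0001.npy")      # one row per environment object (assumed layout)

ball_rows = attrs[attrs[:, 1] == BALL_CLASS_ID]
if ball_rows.size:
    obj_id, cls_id, x1, y1, x2, y2 = ball_rows[0]
    print(f"ball box: top-left=({x1},{y1}) size=({x2 - x1}x{y2 - y1})")
```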
And 105, labeling the processed image based on the attribute data to obtain a simulated image.
In the embodiment of the application, data cleaning and extraction operations can be performed on the attribute data, and the processed image can be labeled based on the cleaned and extracted attribute data, so that the accurate ground-truth data required for training the visual algorithm model can be obtained. It can be understood that the simulation image is a processed image carrying labeled data, i.e. a labeled processed image. Because the labeling operation is executed by the electronic device, the time spent on labeling is reduced, and data labeling errors are reduced as well.
Specifically, the processed image obtained in this embodiment is a file in PNG format (i.e. a .png file), and the attribute data obtained is a file in NPY format (i.e. an .npy file). In addition, the Omniverse Isaac Sim simulation platform can also provide depth information and object segmentation information, which are not described here again. It will be appreciated that each processed image (.png file) has attribute data of the corresponding environment objects (.npy file), i.e. the .png files correspond one-to-one with the .npy files. In addition, the .png and .npy files obtained from the same camera are arranged and numbered in time order. The process of the data cleaning and extraction operations is briefly described as follows:
the .png files and .npy files obtained from the same camera, i.e. the processed images and the attribute data, which are mixed together, can be separated by a preset Python script. The .png files may be stored in a predetermined image folder, and the .npy files in a predetermined data folder. It will be appreciated that all the processed images in the image folder, combined in time order, form the video recorded by the camera via the Synthetic Data Recorder; the data folder stores the attribute data (i.e. state data and position data) of all environment objects in that video, arranged in time order.
For each .npy file, the attribute data of the environment object to be extracted can be extracted based on that object's classification number. Of course, if there are several similar objects of that kind in the environment to be tested, the object number of the environment object also needs to be considered. For example, if the environment object to be extracted is the soccer ball, the attribute data corresponding to the soccer ball can be extracted from the .npy file using the soccer ball's classification number 2 (optionally together with its object number 1081). The width w and height h of the target rectangular box can be obtained by subtracting the top-left vertex coordinates from the bottom-right vertex coordinates of the calibration rectangular box contained in the attribute data. In this way the most common label format in target tracking, (x, y, w, h), can be formed, representing the coordinates (x, y) of the top-left vertex of the target rectangular box and its side lengths (w, h). By going through the .npy files in ascending order of file number and extracting the soccer ball's attribute data from each .npy file, a soccer-ball label file (ground-truth file) is formed, in which each line represents, in time order, the position and size of the soccer ball in each frame of processed image.
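A hedged sketch of this cleaning and extraction step (folder names and the class id are assumptions, and it assumes the ball is visible in every frame):

```python
# Sketch: build an (x, y, w, h) ground-truth file from time-ordered .npy attribute files.
import glob
import numpy as np

BALL_CLASS_ID = 2
lines = []
for npy_path in sorted(glob.glob("data/*.npy")):     # .npy files numbered in time order
    attrs = np.load(npy_path)
    ball = attrs[attrs[:, 1] == BALL_CLASS_ID][0]    # row of the soccer ball
    _, _, x1, y1, x2, y2 = ball
    w, h = x2 - x1, y2 - y1                          # convert corners to width/height
    lines.append(f"{x1},{y1},{w},{h}")               # common tracking label format

with open("groundtruth.txt", "w") as f:              # one line per processed frame
    f.write("\n".join(lines))
```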
It can be understood that, under the condition of manually labeling data, on one hand, a person needs to perform operations such as visually recognizing an object, judging the position of the object, manually drawing a labeling rectangular frame of the object and the like on a processed image one by one, and then extract the position of the labeling rectangular frame in the labeled processed image to form a labeling file. Under the condition of large data volume, the manual labeling time is correspondingly increased, and the long-time labeling may cause fatigue of human vision, so that the labeling error rate is high. On the other hand, each operation in the labeling process depends on the judgment of the person, which may result in different target rectangular frames being obtained when different persons label data of the same processed image. And when the object is shielded, the probability of wrong labeling is higher. The simulation image is generated based on the Omniverse Isaac Sim simulation platform, so that the possibility of marking errors is basically eliminated, time consumption is extremely short, and the data acquisition efficiency is greatly improved.
In some embodiments, after step 105, the following operations may also be performed:
and A1, training the visual algorithm model to be trained through the simulation image.
Because the simulation images are already labeled, they can be put directly into the training process of the visual algorithm model to be trained. Specifically, the simulation images may first be divided, for example randomly, into a training set, a validation set, and a test set in a 6:2:2 ratio. The electronic device can then train the visual algorithm model to be trained on a high-performance graphics processor using the training set, validation set, and test set. It should be noted that the embodiment of the present application does not limit the specific type of the visual algorithm model; that is, steps A1 through A4 may be applied to any kind of visual algorithm model.
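A minimal sketch of the 6:2:2 random split (the directory layout is assumed):

```python
# Sketch: randomly split labeled simulation images into train/validation/test sets.
import glob
import random

def split_dataset(image_paths, seed=0, ratios=(0.6, 0.2, 0.2)):
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    n_train = int(ratios[0] * len(paths))
    n_val = int(ratios[1] * len(paths))
    return paths[:n_train], paths[n_train:n_train + n_val], paths[n_train + n_val:]

train_set, val_set, test_set = split_dataset(sorted(glob.glob("images/*.png")))
```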
And A2, testing the accuracy of the trained visual algorithm model to obtain the accuracy of the trained visual algorithm model under at least one influence factor.
After the trained visual algorithm model is obtained through step A1, its accuracy may be tested under at least one influence factor, so as to obtain the accuracy of the trained visual algorithm model under each influence factor. It can be understood that the influence factors can be set according to the specific application scenario (i.e. the training target) of the visual algorithm model. For example, if the specific application scenario in the foregoing example is a RoboCup tournament, the influence factors may be set as the following eight: motion blur, illumination change, moving out of view, camera motion, scale change, occlusion, fast motion, and rotation. Rotation can be divided into in-plane rotation (e.g. tilting the head to the right or left) and out-of-plane rotation (e.g. lowering or raising the head).
In addition, the influence factors may include: deformation (e.g. a person changing from standing to crouching into a ball), low resolution (typically less than 800 x 600, while the resolution used by the camera in a RoboCup tournament is typically 1280 x 720), and background clutter (e.g. a background crowded with people), etc. In this example, since the target to be identified is a soccer ball, deformation and rotation can be merged into one category, and the deformation influence factor does not need to be considered separately. The application scenario is a RoboCup match, in which the camera conditions are basically stable, so the low-resolution influence factor can be ignored. The competition area is an artificial lawn, and apart from the competing robots, the referees, and the handler following each robot, the people present in the competition area are limited (others are not allowed to enter the competition area), so the background-clutter influence factor can also be ignored.
A3, detecting whether the target influence factor exists.
The electronic device can preset an accuracy threshold, and compare the accuracy of the trained visual algorithm model under each influence factor with the accuracy threshold. If the accuracy of the trained visual algorithm model under a certain influence factor is lower than the accuracy threshold, the influence factor can be determined as a target influence factor.
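A minimal sketch of this comparison (the threshold, factor names, and accuracy numbers are placeholder examples):

```python
# Sketch of step A3: pick out influence factors whose accuracy falls below the threshold.
ACCURACY_THRESHOLD = 0.85

accuracy_by_factor = {
    "motion_blur": 0.91,
    "illumination_change": 0.78,   # below threshold -> target influence factor
    "occlusion": 0.88,
}

target_factors = [f for f, acc in accuracy_by_factor.items() if acc < ACCURACY_THRESHOLD]
```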
And A4, if the target influence factor exists, taking the trained visual algorithm model as a new visual algorithm model to be trained, and returning to execute the step of controlling the intelligent device to be tested and/or the target to be recognized to move in the environment to be tested and the subsequent steps based on the target influence factor until the target influence factor does not exist.
It can be understood that, in the process of generating the simulation images, the image processing operations and camera parameter control involved in step 103, and the movement of the intelligent device to be tested and of the target to be identified involved in step 102, all affect the scene to which a simulation image corresponds. It should be noted that the scene referred to here does not mean the environment represented by the image, but the image under a certain influence factor: for example, the dim-light scene and the overexposed scene under the illumination-change influence factor, or the occluded scene under the occlusion influence factor. Accordingly, the operation of returning, based on the target influence factor, to the step of controlling the intelligent device to be tested and/or the target to be identified to move in the environment to be tested and the subsequent steps can be embodied as follows: first, a target simulation image is determined among the simulation images according to the target influence factor, i.e. the simulation images affected by the target influence factor are found and used as target simulation images; then, since several scenes are possible under one target influence factor, the target simulation images need to be analyzed to determine in which scene under the target influence factor the accuracy of the visual algorithm model is lower, so that the target scene can be determined from the analysis result; finally, the step of controlling the intelligent device to be tested and/or the target to be identified to move in the environment to be tested and the subsequent steps are executed again based on the target scene, i.e. more new simulation data for the target scene are generated through the simulation platform, and the new visual algorithm model to be trained is trained on these new simulation data, and this continues until the trained visual algorithm model obtained has high accuracy under all influence factors.
By way of example only, the illumination influence factor may cover the previously illustrated dim-light scene, the previously illustrated overexposed scene, and a complex light-changing scene (e.g. flickering, changing stage lights). Assume that the target influence factor determined in step A3 is the illumination influence factor; then, in step A4, further analysis shows that the accuracy of the visual algorithm model is low in the dim-light scene while its accuracy in the overexposed scene is acceptable; the electronic device can then generate a large number of new simulation images of the dim-light scene through steps 101-105 to further train the visual algorithm model, so that the later accuracy of the visual algorithm model becomes higher in both the dim-light and overexposed scenes, thereby improving its accuracy under the illumination influence factor.
Therefore, through the embodiment of the application, on-site data acquisition is not needed manually, but the environment is simulated through the simulation platform to generate the simulation image, so that the time required by data acquisition can be greatly saved. Moreover, when the requirement changes, the simulation image adaptive to the requirement change can be obtained only by changing and importing different environments to be tested, or performing different parameter control on the camera, or adopting different image processing operations, so that the simulation image generation method provided by the embodiment of the application has strong expansibility and flexibility.
Corresponding to the simulation image generation method proposed in the foregoing, the embodiment of the present application provides a simulation image generation apparatus. Referring to fig. 7, a simulation image generating apparatus 700 according to an embodiment of the present application includes:
an importing module 701, configured to import an environment to be tested into a preset simulation platform, where at least one camera and at least one environment object are created in the environment to be tested, the environment object includes a target to be identified and an intelligent device to be tested, and the intelligent device to be tested includes at least one camera;
a control module 702, configured to control the to-be-tested smart device and/or the to-be-identified target to move in the to-be-tested environment;
a processing module 703, configured to obtain a processed image through parameter control on the camera and/or image processing operation on an image output by the camera when the to-be-detected smart device and/or the to-be-identified target moves in the to-be-detected environment;
an obtaining module 704, configured to obtain attribute data of each environmental object in the processed image;
the labeling module 705 is configured to label the processed image based on the attribute data to obtain a simulated image.
Optionally, the image processing operation on the image output by the camera includes:
and setting a noise point in the image output by the camera.
Optionally, the parameter control of the camera includes:
and controlling the shutter parameters of the camera.
Optionally, the image processing operation on the image output by the camera includes:
and adjusting parameters which can influence the motion blur of the target to be recognized and are contained in the image output by the camera.
Optionally, the attribute data includes state data and position data of an environmental object, where the state data is used to indicate whether the environmental object exists in the corresponding processed image, and the position data is used to indicate coordinates of the environmental object in the corresponding processed image when the environmental object exists in the corresponding processed image.
Optionally, the simulation image generating apparatus 700 further includes:
the training module is used for training the visual algorithm model to be trained through the simulation image after the simulation image is obtained;
the testing module is used for testing the accuracy of the trained visual algorithm model to obtain the accuracy of the trained visual algorithm model under at least one influence factor;
the detection module is used for detecting whether a target influence factor exists, wherein the accuracy of the trained visual algorithm model under the target influence factor is lower than a preset accuracy threshold;
and an alternation module, configured to, if the target impact factor exists, take the trained visual algorithm model as a new visual algorithm model to be trained, and trigger operation of the control module 702 based on the target impact factor until the target impact factor does not exist.
Optionally, a scene corresponding to each processed image is determined based on an image processing operation corresponding to the processed image, a parameter of a camera corresponding to the processed image, movement of the to-be-detected smart device, and/or movement of the to-be-identified target; the above-mentioned alternating module, comprising:
a first determining unit, configured to determine a target simulation image in the simulation image according to the target influence factor;
the second determining unit is used for analyzing the target simulation image and determining a target scene in a scene corresponding to the target simulation image according to an analysis result;
an alternation unit, configured to trigger operation of the control module 702 based on the target scenario.
Therefore, through the embodiment of the application, on-site data acquisition is not needed manually, but the environment is simulated through the simulation platform to generate the simulation image, so that the time required by data acquisition can be greatly saved. Moreover, when the requirement changes, the simulation image adaptive to the requirement change can be obtained only by changing and importing different environments to be tested, or performing different parameter control on the camera, or adopting different image processing operations, so that the simulation image generation method provided by the embodiment of the application has strong expansibility and flexibility.
An embodiment of the present application further provides an electronic device, please refer to fig. 8, where the electronic device 8 in the embodiment of the present application includes: a memory 801, one or more processors 802 (only one shown in fig. 8), and computer programs stored on the memory 801 and executable on the processors. The memory 801 is used for storing software programs and units, and the processor 802 executes various functional applications and data processing by running the software programs and units stored in the memory 801, so as to acquire resources corresponding to the preset events. Specifically, the processor 802 realizes the following steps by running the above-described computer program stored in the memory 801:
importing a to-be-tested environment in a preset simulation platform, wherein at least one camera and at least one environment object are created in the to-be-tested environment, the environment object comprises a target to be identified and intelligent equipment to be tested, and the intelligent equipment to be tested comprises at least one camera;
controlling the intelligent equipment to be detected and/or the target to be identified to move in the environment to be detected;
when the intelligent device to be detected and/or the target to be identified move in the environment to be detected, obtaining a processed image through parameter control of the camera and/or image processing operation of an image output by the camera;
acquiring attribute data of each environmental object in the processed image;
and labeling the processed image based on the attribute data to obtain a simulated image.
Assuming that the above is the first possible implementation manner, in a second possible implementation manner provided on the basis of the first possible implementation manner, the image processing operation on the image output by the camera includes:
and setting a noise point in the image output by the camera.
In a third possible embodiment based on the first possible embodiment, the parameter control of the camera includes:
and controlling the shutter parameters of the camera.
In a fourth possible implementation form that is provided on the basis of the first possible implementation form, the image processing operation on the image output by the camera includes:
and adjusting parameters which can influence the motion blur of the target to be recognized and are contained in the image output by the camera.
In a fifth possible embodiment based on the first possible embodiment, the attribute data includes state data of an environmental object indicating whether or not the environmental object exists in the corresponding processed image and position data indicating coordinates of the environmental object in the corresponding processed image when the environmental object exists in the corresponding processed image.
In a sixth possible implementation manner, which is provided based on the first possible implementation manner, the second possible implementation manner, the third possible implementation manner, the fourth possible implementation manner, or the fifth possible implementation manner, after the simulated image is obtained, the processor 802 further implements the following steps when running the computer program stored in the memory 801:
training a visual algorithm model to be trained through the simulation image;
performing accuracy test on the trained visual algorithm model to obtain the accuracy of the trained visual algorithm model under at least one influence factor;
detecting whether a target influence factor exists, wherein the accuracy of the trained visual algorithm model under the target influence factor is lower than a preset accuracy threshold;
and if the target influence factor exists, taking the trained visual algorithm model as a new visual algorithm model to be trained, and returning to execute the step of controlling the intelligent device to be tested and/or the target to be recognized to move in the environment to be tested and subsequent steps based on the target influence factor until the target influence factor does not exist.
In a seventh possible implementation manner provided on the basis of the sixth possible implementation manner, a scene corresponding to each processed image is determined based on an image processing operation corresponding to the processed image, parameters of a camera corresponding to the processed image, movement of the intelligent device to be tested, and/or movement of the target to be identified; the step of controlling the intelligent device to be tested and/or the target to be identified to move in the environment to be tested and the subsequent steps are executed based on the target influence factor, and include:
determining a target simulation image in the simulation images according to the target influence factor;
analyzing the target simulation image, and determining a target scene in a scene corresponding to the target simulation image according to an analysis result;
and returning to execute the step of controlling the intelligent equipment to be tested and/or the target to be identified to move in the environment to be tested and the subsequent steps based on the target scene.
It should be understood that in the embodiments of the present Application, the Processor 802 may be a CPU, and the Processor may be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field-Programmable Gate arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 801 may include read-only memory and random access memory, and provides instructions and data to the processor 802. Some or all of memory 801 may also include non-volatile random access memory. For example, the memory 801 may also store device class information.
Therefore, through the embodiment of the application, on-site data acquisition is not needed manually, but the environment is simulated through the simulation platform to generate the simulation image, so that the time required by data acquisition can be greatly saved. Moreover, when the requirement changes, the simulation image adaptive to the requirement change can be obtained only by changing and importing different environments to be tested, or performing different parameter control on the camera, or adopting different image processing operations, so that the simulation image generation method provided by the embodiment of the application has strong expansibility and flexibility.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned functions may be distributed as different functional units and modules according to needs, that is, the internal structure of the apparatus may be divided into different functional units or modules to implement all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art would appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of external device software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described system embodiments are merely illustrative, and for example, the division of the above-described modules or units is only one logical functional division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The integrated unit may be stored in a computer-readable storage medium if it is implemented in the form of a software functional unit and sold or used as a separate product. Based on such understanding, all or part of the flow in the method of the embodiments described above can be realized by a computer program, which can be stored in a computer-readable storage medium and can realize the steps of the embodiments of the methods described above when the computer program is executed by a processor. The computer program includes computer program code, and the computer program code may be in a source code form, an object code form, an executable file or some intermediate form. The computer-readable storage medium may include: any entity or device capable of carrying the above-described computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer readable Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signal, telecommunication signal, software distribution medium, etc. It should be noted that the computer readable storage medium may contain other contents which can be appropriately increased or decreased according to the requirements of the legislation and the patent practice in the jurisdiction, for example, in some jurisdictions, the computer readable storage medium does not include an electrical carrier signal and a telecommunication signal according to the legislation and the patent practice.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A method for generating a simulation image, comprising:
importing a to-be-tested environment in a preset simulation platform, wherein at least one camera and at least one environment object are created in the to-be-tested environment, the environment object comprises a target to be identified and to-be-tested intelligent equipment, and the to-be-tested intelligent equipment comprises at least one camera;
controlling the intelligent equipment to be detected and/or the target to be identified to move in the environment to be detected;
when the intelligent device to be detected and/or the target to be identified move in the environment to be detected, obtaining a processed image through parameter control of the camera and/or image processing operation of an image output by the camera;
acquiring attribute data of each environmental object in the processed image;
and labeling the processed image based on the attribute data to obtain a simulation image.
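For illustration only, the following Python sketch outlines how the pipeline of claim 1 might be driven. The simulation-platform API used here (for example `platform.load_environment`, `env.step`, `camera.capture`) and the helper functions passed in are assumptions, not part of the disclosure.

```python
# Illustrative sketch only: the simulation-platform API (load_environment, step,
# capture, ...) and the helpers (random_pose, random_shutter, add_noise,
# label_image) are assumed names.

def generate_simulation_images(platform, env_path, num_frames, random_pose,
                               random_shutter, add_noise, label_image):
    # Import the environment to be tested into the preset simulation platform.
    env = platform.load_environment(env_path)
    device = env.get_object("intelligent_device")   # device under test, carries a camera
    target = env.get_object("target_to_identify")   # target to be identified
    camera = device.cameras[0]

    simulation_images = []
    for _ in range(num_frames):
        # Control the device and/or the target to move in the environment.
        device.move(random_pose())
        target.move(random_pose())
        env.step()

        # Parameter control of the camera and an image processing operation
        # on its output yield the processed image.
        camera.set_shutter(random_shutter())
        processed = add_noise(camera.capture())

        # Acquire attribute data of each environment object in the processed
        # image and label the image to obtain a simulation image.
        attributes = env.get_attributes(processed)
        simulation_images.append(label_image(processed, attributes))
    return simulation_images
```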
2. The simulation image generation method according to claim 1, wherein the image processing operation on the image output by the camera comprises:
setting noise points in the image output by the camera.
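One way the noise setting of claim 2 could be realized, sketched with NumPy; the Gaussian-plus-salt-and-pepper model and its parameters are assumptions rather than the claimed operation itself.

```python
import numpy as np

def add_noise(image, sigma=8.0, salt_pepper_ratio=0.001):
    """Add Gaussian noise and sparse salt-and-pepper noise to an 8-bit image."""
    # Gaussian sensor noise.
    noisy = image.astype(np.float32) + np.random.normal(0.0, sigma, image.shape)
    noisy = np.clip(noisy, 0, 255).astype(np.uint8)

    # Salt-and-pepper noise: flip a small fraction of pixels to black or white.
    mask = np.random.random(image.shape[:2]) < salt_pepper_ratio
    values = np.random.choice([0, 255], size=int(mask.sum()))
    noisy[mask] = values if noisy.ndim == 2 else values[:, None]
    return noisy
```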
3. The simulation image generation method according to claim 1, wherein the parameter control of the camera comprises:
controlling a shutter parameter of the camera.
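A hedged sketch of the shutter-parameter control of claim 3, assuming the simulated camera exposes an exposure-time setter; the `set_shutter` name and the value range are assumptions.

```python
import random

def randomize_shutter(camera, min_exposure_s=1 / 2000, max_exposure_s=1 / 60):
    """Randomly vary the simulated camera's exposure time between frames.

    Longer exposures brighten the image and increase motion blur of moving
    objects; shorter exposures darken the image and freeze motion.
    """
    exposure_s = random.uniform(min_exposure_s, max_exposure_s)
    camera.set_shutter(exposure_s)  # assumed setter on the simulated camera
    return exposure_s
```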
4. The simulation image generation method according to claim 1, wherein the image processing operation on the image output by the camera comprises:
adjusting a parameter that affects motion blur of the target to be identified contained in the image output by the camera.
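As one hedged illustration of claim 4, motion blur of a moving target can be approximated by rendering several sub-frame images while the target keeps moving and averaging them; how the sub-frames are rendered is left to the simulation platform and is assumed here.

```python
import numpy as np

def synthesize_motion_blur(sub_frames):
    """Approximate motion blur by averaging images rendered at sub-frame steps.

    `sub_frames` is a list of 8-bit images captured while the target moves
    within one exposure; more sub-frames or faster motion gives stronger blur.
    """
    stack = np.stack([f.astype(np.float32) for f in sub_frames], axis=0)
    return np.clip(stack.mean(axis=0), 0, 255).astype(np.uint8)
```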
5. The simulation image generation method according to claim 1, wherein the attribute data includes state data and position data of the environmental object, wherein the state data is used to indicate whether the environmental object exists in the corresponding processed image, and the position data is used to indicate the coordinates of the environmental object in the corresponding processed image when the environmental object exists in the corresponding processed image.
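A minimal sketch of how the attribute data of claim 5 (state data plus position data) might be recorded per environment object and per processed image; the field names and the bounding-box convention are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class ObjectAttributes:
    """Attribute data of one environment object for one processed image."""
    name: str                                         # e.g. "target_to_identify"
    present: bool                                     # state data: does the object appear in the image?
    bbox: Optional[Tuple[int, int, int, int]] = None  # position data: (x_min, y_min, x_max, y_max)
                                                      # in pixel coordinates, set only when present
```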
6. The simulation image generation method according to any one of claims 1 to 5, wherein after the obtaining of the simulation image, the simulation image generation method further comprises:
training a visual algorithm model to be trained through the simulation image;
performing an accuracy test on the trained visual algorithm model to obtain the accuracy of the trained visual algorithm model under at least one influence factor;
detecting whether a target influence factor exists, wherein the accuracy of the trained visual algorithm model under the target influence factor is lower than a preset accuracy threshold;
and if the target influence factor exists, taking the trained visual algorithm model as a new visual algorithm model to be trained, and returning to execute, based on the target influence factor, the step of controlling the intelligent device to be tested and/or the target to be identified to move in the environment to be tested and the subsequent steps, until the target influence factor does not exist.
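A hedged sketch of the train / test / regenerate loop of claim 6; the `simulator` and `model` interfaces (`generate`, `test_set`, `fit`, `evaluate`) are assumed names, not a disclosed API.

```python
def train_until_robust(model, simulator, influence_factors,
                       accuracy_threshold=0.9, max_rounds=10):
    """Train, test accuracy per influence factor, and regenerate data for weak factors."""
    images = simulator.generate()                      # initial simulation images
    for _ in range(max_rounds):
        model.fit(images)                              # train the visual algorithm model

        # Accuracy test under each influence factor.
        accuracy = {f: model.evaluate(simulator.test_set(f)) for f in influence_factors}

        # Target influence factors: accuracy below the preset threshold.
        weak = [f for f, acc in accuracy.items() if acc < accuracy_threshold]
        if not weak:
            break                                      # no target influence factor remains

        # Regenerate simulation images focused on the weak factors and retrain.
        images = simulator.generate(focus_factors=weak)
    return model
```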
7. The simulation image generation method according to claim 6, wherein a scene corresponding to each processed image is determined based on the image processing operation corresponding to the processed image, the parameter of the camera corresponding to the processed image, the movement of the intelligent device to be tested, and/or the movement of the target to be identified; and the returning to execute, based on the target influence factor, the step of controlling the intelligent device to be tested and/or the target to be identified to move in the environment to be tested and the subsequent steps comprises:
determining a target simulation image among the simulation images according to the target influence factor;
analyzing the target simulation image, and determining a target scene among the scenes corresponding to the target simulation image according to the analysis result;
and returning to execute, based on the target scene, the step of controlling the intelligent device to be tested and/or the target to be identified to move in the environment to be tested and the subsequent steps.
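One possible reading of the scene selection in claim 7, sketched under the assumption that each simulation image records the scene it was generated from and the influence factors it exhibits; the `scene` and `factors` attributes are assumptions.

```python
from collections import Counter

def select_target_scenes(simulation_images, target_factor, top_k=3):
    """Pick the scenes most strongly associated with the target influence factor."""
    # Target simulation images: those exhibiting the target influence factor.
    target_images = [img for img in simulation_images if target_factor in img.factors]

    # Analyze them: rank the corresponding scenes by how often they occur,
    # so data can be regenerated from the top-ranked (target) scenes.
    scene_counts = Counter(img.scene for img in target_images)
    return [scene for scene, _ in scene_counts.most_common(top_k)]
```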
8. A simulation image generation apparatus, comprising:
an importing module, configured to import an environment to be tested into a preset simulation platform, wherein at least one camera and at least one environment object are created in the environment to be tested, the environment object comprises a target to be identified and an intelligent device to be tested, and the intelligent device to be tested comprises at least one camera;
a control module, configured to control the intelligent device to be tested and/or the target to be identified to move in the environment to be tested;
a processing module, configured to obtain a processed image through parameter control of the camera and/or an image processing operation on an image output by the camera when the intelligent device to be tested and/or the target to be identified move in the environment to be tested;
an acquisition module, configured to acquire attribute data of each environmental object in the processed image;
and a labeling module, configured to label the processed image based on the attribute data to obtain a simulation image.
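A brief sketch of how the module decomposition in claim 8 might map onto code; the class name and the module interfaces are illustrative assumptions.

```python
class SimulationImageGenerator:
    """Composes the modules of the apparatus; all interfaces are assumed."""

    def __init__(self, importer, controller, processor, acquirer, labeler):
        self.importer = importer      # imports the environment to be tested
        self.controller = controller  # moves the device under test and/or the target
        self.processor = processor    # camera parameter control / image processing
        self.acquirer = acquirer      # collects per-object attribute data
        self.labeler = labeler        # labels processed images into simulation images

    def generate_one(self, env_path):
        env = self.importer.load(env_path)
        self.controller.move(env)
        processed = self.processor.process(env)
        attributes = self.acquirer.acquire(env, processed)
        return self.labeler.label(processed, attributes)
```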
9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
CN202111277649.4A 2021-10-29 2021-10-29 Simulation image generation method, simulation image generation device and electronic equipment Pending CN114067172A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111277649.4A CN114067172A (en) 2021-10-29 2021-10-29 Simulation image generation method, simulation image generation device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111277649.4A CN114067172A (en) 2021-10-29 2021-10-29 Simulation image generation method, simulation image generation device and electronic equipment

Publications (1)

Publication Number Publication Date
CN114067172A true CN114067172A (en) 2022-02-18

Family

ID=80236133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111277649.4A Pending CN114067172A (en) 2021-10-29 2021-10-29 Simulation image generation method, simulation image generation device and electronic equipment

Country Status (1)

Country Link
CN (1) CN114067172A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114818072A (en) * 2022-04-29 2022-07-29 中南建筑设计院股份有限公司 Virtual construction method based on physical simulation system and storage medium
CN114818072B (en) * 2022-04-29 2024-04-02 中南建筑设计院股份有限公司 Virtual building method based on physical simulation system and storage medium
CN115840828A (en) * 2023-02-13 2023-03-24 湖北芯擎科技有限公司 Image comparison display method, device, equipment and medium
CN115840828B (en) * 2023-02-13 2023-06-06 湖北芯擎科技有限公司 Image comparison display method, device, equipment and medium

Similar Documents

Publication Publication Date Title
US11379987B2 (en) Image object segmentation based on temporal information
Scheerlinck et al. CED: Color event camera dataset
CN104182718B (en) A kind of man face characteristic point positioning method and device
CN109345556A (en) Neural network prospect for mixed reality separates
TW201814445A (en) Performing operations based on gestures
US11893789B2 (en) Deep neural network pose estimation system
CN107543530B (en) Method, system, and non-transitory computer-readable recording medium for measuring rotation of ball
CN110378945A (en) Depth map processing method, device and electronic equipment
CN114067172A (en) Simulation image generation method, simulation image generation device and electronic equipment
CN111228821B (en) Method, device and equipment for intelligently detecting wall-penetrating plug-in and storage medium thereof
CN109711246A (en) A kind of dynamic object recognition methods, computer installation and readable storage medium storing program for executing
CN110136130A (en) A kind of method and device of testing product defect
CN106780546A (en) The personal identification method of the motion blur encoded point based on convolutional neural networks
CN111667420B (en) Image processing method and device
Li et al. Photo-realistic simulation of road scene for data-driven methods in bad weather
CN107256082B (en) Throwing object trajectory measuring and calculating system based on network integration and binocular vision technology
CN111160261A (en) Sample image labeling method and device for automatic sales counter and storage medium
CN111339902A (en) Liquid crystal display number identification method and device of digital display instrument
Paulin et al. Review and analysis of synthetic dataset generation methods and techniques for application in computer vision
WO2023001110A1 (en) Neural network training method and apparatus, and electronic device
US20220398800A1 (en) Increasing the speed of computation of a volumetric scattering render technique
CN114245907A (en) Auto-exposure ray tracing
CN111898525A (en) Smoke recognition model construction method, smoke detection method and smoke detection device
Foo Design and develop of an automated tennis ball collector and launcher robot for both able-bodied and wheelchair tennis players-ball recognition systems.
CN111080743B (en) Character drawing method and device for connecting head and limb characteristic points

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Cai Shihan

Inventor after: Xiong Youjun

Inventor before: Cai Shihan

Inventor before: Xiong Youjun

Inventor before: Zhao Mingguo