CN111476834B - Method and device for generating image and electronic equipment - Google Patents
- Publication number
- CN111476834B (application number CN201910068605.7A)
- Authority
- CN
- China
- Prior art keywords
- image
- light source
- object model
- preset object
- depth
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/586—Depth or shape recovery from multiple images from multiple light sources, e.g. photometric stereo
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
- Processing Or Creating Images (AREA)
Abstract
Disclosed is a method of generating an image, comprising: determining reflection information of each pixel point in a first image; determining light source information in the scene where the first image was shot according to the first image, a surface normal map corresponding to the first image and the reflection information; editing and rendering a preset object model to be added to the first image according to the first image, the surface normal map, the reflection information and the light source information to obtain a second image; and obtaining a second depth image corresponding to the second image according to a first depth image corresponding to the first image and the preset object model. Different preset object models can be added at different positions in the first image as needed, so a large number of second images and second depth images can be obtained; this saves the time and effort of training the neural network, reduces cost, reduces the possibility of errors in the second depth images, and avoids additional adversarial training.
Description
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and an apparatus for generating an image, and an electronic device.
Background
In recent years, machine learning techniques centered on deep learning have attracted wide attention. Through deep learning, the distance between the current vehicle and surrounding vehicles, pedestrians and obstacles can be estimated, gradually making automatic driving of automobiles possible. Among deep learning methods, depth estimation algorithms based on monocular images have the advantages of convenient deployment and low computational cost, and have attracted growing attention from academia and industry.
Existing depth estimation algorithms based on monocular images require a large number of images and corresponding depth images (depth-labeled data) to train a depth estimation neural network model. Acquiring such training data is time-consuming and labor-intensive, costly, and, being affected by factors such as noise, prone to errors.
Disclosure of Invention
In order to solve the technical problems, the embodiment of the application provides a method and a device for generating an image and electronic equipment.
According to one aspect of the present application, there is provided a method of generating an image, comprising: determining reflection information of each pixel point in the first image; determining light source information in a scene where the first image is shot according to the first image, a surface normal map corresponding to the first image and the reflection information; editing and rendering a preset object model to be added in the first image according to the first image, the surface normal map, the reflection information and the light source information to obtain a second image; and obtaining a second depth image corresponding to the second image according to the first depth image corresponding to the first image and the preset object model.
According to another aspect of the present application, there is provided an apparatus for generating an image, comprising: the reflection information determining module is used for determining reflection information of each pixel point in the first image; the light source determining module is used for determining light source information in a scene where the first image is shot according to the first image, a surface normal map corresponding to the first image and the reflection information; the second image acquisition module is used for editing and rendering a preset object model to be added in the first image according to the first image, the surface normal map, the reflection information and the light source information to obtain a second image; and the second depth image acquisition module is used for acquiring a second depth image corresponding to the second image according to the first depth image corresponding to the first image and the preset object model.
According to another aspect of the present application, there is provided a computer readable storage medium storing a computer program for performing any one of the methods described above.
According to another aspect of the present application, there is provided an electronic device including: a processor; a memory for storing the processor-executable instructions; the processor is configured to perform any of the methods described above.
According to the method for generating an image in the embodiments of the present application, different preset object models can be added at different positions in the first image as needed, so that a large number of second images and second depth images can be obtained from the different preset object models. These second images and second depth images are then used as annotation data for training the depth estimation neural network model, which saves the time and effort of training the neural network; because the collection of a large amount of sample data is avoided, the cost can be reduced; in addition, because the second depth image is obtained from the real first image, the possibility of errors in the second depth image can be reduced, and additional adversarial training on the annotation data is avoided.
Drawings
The above and other objects, features and advantages of the present application will become more apparent from the detailed description of embodiments of the present application with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and are incorporated in and constitute a part of this specification; they illustrate the application and, together with the embodiments of the application, serve to explain it, and do not constitute a limitation of the application. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 is a schematic diagram of a scenario of an exemplary system of the present application.
Fig. 2 is a flow chart of a method for generating an image according to an exemplary embodiment of the present application.
Fig. 3 is a schematic flow chart of determining light source information in a scene where a first image is captured according to the first image, a surface normal map corresponding to the first image, and reflection information according to an exemplary embodiment of the present application.
Fig. 4 is a schematic flow chart of editing and rendering a preset object model to be added in a first image according to the first image, a surface normal map, reflection information and light source information according to an exemplary embodiment of the present application, so as to obtain a second image.
Fig. 5 is a schematic flow chart of obtaining a second depth image corresponding to a second image according to a first depth image corresponding to a first image and a preset object model according to an exemplary embodiment of the present application.
Fig. 6 is a flowchart illustrating determining pixel coordinates of a preset object model according to camera parameters of a first image and three-dimensional coordinates of the preset object model according to an exemplary embodiment of the present application.
Fig. 7 is a flow chart of a method of generating an image according to another exemplary embodiment of the present application.
Fig. 8 is a schematic structural view of an apparatus for generating an image according to an exemplary embodiment of the present application.
Fig. 9 is a schematic diagram of a structure of a light source determining module in an apparatus for generating an image according to an exemplary embodiment of the present application.
Fig. 10 is a schematic structural view of a second image acquisition module in an apparatus for generating an image according to an exemplary embodiment of the present application.
Fig. 11 is a schematic structural view of a second depth image acquiring module in an apparatus for generating an image according to an exemplary embodiment of the present application.
Fig. 12 is a schematic diagram of a structure of a pixel coordinate determining unit in an apparatus for generating an image according to an exemplary embodiment of the present application.
Fig. 13 is a schematic structural view of an apparatus for generating an image according to another exemplary embodiment of the present application.
Fig. 14 is a block diagram of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
Hereinafter, exemplary embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
Summary of the application
Currently, images may be synthesized by 3D (three-dimensional) engine rendering while depth images are generated at the same time. However, an image synthesized by 3D-engine rendering differs significantly from a real captured image, and training a depth estimation neural network model with such images typically requires introducing additional adversarial training to reduce the impact of the difference from real captured images.
In view of the above technical problems, the basic idea of the present application is to provide a method, an apparatus and an electronic device for generating an image, in which different preset object models can be added at different positions in a first image as needed, so that a large number of second images and second depth images can be obtained from the different preset object models. These second images and second depth images are then used as annotation data for training a depth estimation neural network model, which saves the time and effort of training the neural network; because the collection of a large amount of sample data is avoided, the cost can be reduced; in addition, because the second depth image is obtained from the real first image, the possibility of errors in the second depth image can be reduced, and additional adversarial training on the annotation data is avoided.
It should be noted that the application range of the present application is not limited to the technical field of image processing. For example, the technical scheme provided by the embodiment of the application can be applied to other intelligent mobile devices for providing technical support for image processing of the intelligent mobile devices.
Various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.
Exemplary System
FIG. 1 is a schematic diagram of a scenario of an exemplary system of the present application. As shown in fig. 1, parameter estimation is performed on a first image (the first image may be an RGB image or a gray image); light source estimation is performed according to the first image and the result of the parameter estimation (or according to the first image, the result of the parameter estimation and the first depth image); editing and rendering are performed according to the light source estimation result, the first image, the first depth image and the preset object model to obtain a second image and a second depth image. For a detailed description of the specific implementation process, see the method and apparatus embodiments below.
Exemplary method
Fig. 2 is a flow chart of a method for generating an image according to an exemplary embodiment of the present application. The method for generating the image provided by the embodiment of the application can be applied to the technical field of image processing of automobiles, and can also be applied to the functional field of image processing of intelligent robots. As shown in fig. 2, the method for generating an image according to the embodiment of the present application includes the following steps:
step 101, determining reflection information of each pixel point in the first image.
It should be noted that the first image may be an RGB image or a gray image, and the first image may be a sample image in a sample library.
The reflection information includes diffuse reflection parameters and specular reflection parameters. In this embodiment, the reflection information of each pixel point may refer to a diffuse reflection parameter corresponding to the pixel point. It should be noted that diffuse reflection refers to a phenomenon that light is irregularly reflected by a rough surface in all directions, and is used to indicate how the material of an object reflects illumination. In the present embodiment, the diffuse reflection parameter r (x, y) of each pixel point (x, y) in the first image may be determined by the following formula:
wherein r(x, y) represents the diffuse reflection parameter of the pixel point (x, y), i_x represents the gradient of the pixel point (x, y) in the horizontal direction, i_y represents the gradient of the pixel point (x, y) in the vertical direction, T is a preset threshold value with 0 ≤ T ≤ 255, and p is a natural number, generally taking the value 1 or 2.
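As an illustration only, the following Python sketch computes the horizontal and vertical gradients i_x and i_y used above; since the patent's exact formula for r(x, y) is not reproduced in this text, the thresholded p-norm combination in the sketch is an assumed placeholder, and the threshold T and exponent p values are illustrative.

```python
# Illustrative sketch only: the patent's exact formula for r(x, y) is not
# reproduced here, so the combination below (a p-norm of the two gradients
# compared against the threshold T) is an assumed placeholder.
import numpy as np
import cv2

def diffuse_reflection_map(gray, T=30, p=2):
    """Estimate a per-pixel diffuse-reflection indicator from image gradients.

    gray : HxW image (the first image, converted to grayscale)
    T    : preset threshold, 0 <= T <= 255
    p    : natural number exponent, typically 1 or 2
    """
    gray = gray.astype(np.float64)
    i_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)  # horizontal gradient
    i_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)  # vertical gradient
    # Assumed combination of the two gradients (placeholder for the patent's formula)
    magnitude = (np.abs(i_x) ** p + np.abs(i_y) ** p) ** (1.0 / p)
    return np.where(magnitude > T, magnitude, 0.0)
```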
And 102, determining light source information in a scene where the first image is shot according to the first image, the surface normal map corresponding to the first image and the reflection information.
In an embodiment, the light source information may include the light source position, the light source intensity, and so on. The light source information in the scene where the first image is shot is the light source information of that scene at the moment the first image is taken. For example: the scene is a room that is photographed by a camera to obtain the first image, and the first image contains a window of the room and a desk lamp in a lighting state; then the sunlight coming through the window and the lit desk lamp can be regarded as the light sources.
In an embodiment, the first image may be input into a trained preset normal map extraction neural network to obtain the surface normal map corresponding to the first image. The preset normal map extraction neural network can be obtained by training a convolutional neural network on a large number of sample images.
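A minimal inference sketch of this step is shown below, assuming a trained surface normal extraction network is available as a PyTorch module named normal_net; the module name, input format and preprocessing are assumptions, not part of the original text.

```python
import torch
import torch.nn.functional as F

def extract_normal_map(normal_net, first_image):
    """first_image: HxWx3 float tensor in [0, 1]; returns an HxWx3 map of unit normals."""
    x = first_image.permute(2, 0, 1).unsqueeze(0)   # 1x3xHxW batch for the network
    with torch.no_grad():
        n = normal_net(x)                           # 1x3xHxW raw normal prediction
    n = F.normalize(n, dim=1)                       # normalize each pixel's normal to unit length
    return n.squeeze(0).permute(1, 2, 0)            # back to HxWx3
```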
And step 103, editing and rendering a preset object model to be added in the first image according to the first image, the surface normal map, the reflection information and the light source information to obtain a second image.
In an embodiment, the preset object model may be a person, an animal, a machine, etc., and may be added to the first image according to the actual application and then edited and rendered to obtain the second image. For example: the scene where the first image is shot is a room containing a window and a desk lamp in a lighting state; if the preset object model is a three-dimensional model of a cat, the three-dimensional model of the cat can be added below the window in the first image and edited and rendered to obtain the second image, so that the second image contains the three-dimensional model of the cat.
And 104, obtaining a second depth image corresponding to the second image according to the first depth image corresponding to the first image and the preset object model.
It should be noted that the first depth image corresponds to the first image, and the first depth image may be a sample depth image in the sample library.
According to the method for generating an image in the embodiments of the present application, different preset object models can be added at different positions in the first image as needed, so that a large number of second images and second depth images can be obtained from the different preset object models. These second images and second depth images are then used as annotation data for training the depth estimation neural network model, which saves the time and effort of training the neural network; because the collection of a large amount of sample data is avoided, the cost can be reduced; in addition, because the second depth image is obtained from the real first image, the possibility of errors in the second depth image can be reduced, and additional adversarial training on the annotation data is avoided.
An exemplary embodiment of the present application provides another method of generating an image. The embodiment of the present application extends from the embodiment of fig. 2, and differences between the embodiment of the present application and the embodiment of fig. 2 are mainly described below, which are not repeated. The method for generating the image provided by the embodiment of the application further comprises the following steps:
And determining a surface normal corresponding to each pixel point in the first depth image to obtain a surface normal map corresponding to the first image.
In an embodiment, the surface normal corresponding to each pixel point may be obtained by fitting a plane, in the 3D coordinate system, to the pixel point and a preset number of surrounding pixels and taking the normal of that plane. The number of surrounding pixels can be selected according to the actual application and is not particularly limited here.
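The following sketch illustrates this plane-fitting approach under stated assumptions: the camera intrinsic matrix K is used to back-project depth pixels to 3D, and a 3×3 neighborhood is used as the preset number of surrounding pixels; both choices are illustrative.

```python
# Sketch: per-pixel normals from the first depth image by fitting a plane to
# each pixel's 3-D neighborhood (window size and intrinsics K are assumptions).
import numpy as np

def normals_from_depth(depth, K, win=3):
    """depth: HxW depth map; K: 3x3 camera intrinsic matrix; returns HxWx3 normals."""
    H, W = depth.shape
    K_inv = np.linalg.inv(K)
    # Back-project every pixel to a 3-D point: P = depth * K^-1 [x, y, 1]^T
    xs, ys = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).astype(np.float64)
    points = depth[..., None] * (pix @ K_inv.T)      # HxWx3 3-D points
    normals = np.zeros((H, W, 3))
    r = win // 2
    for y in range(r, H - r):
        for x in range(r, W - r):
            patch = points[y - r:y + r + 1, x - r:x + r + 1].reshape(-1, 3)
            centered = patch - patch.mean(axis=0)
            # Normal of the best-fit plane = singular vector of the smallest singular value
            _, _, vt = np.linalg.svd(centered, full_matrices=False)
            n = vt[-1]
            if n[2] > 0:                             # orient the normal toward the camera
                n = -n
            normals[y, x] = n
    return normals
```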
According to the method for generating the image, disclosed by the embodiment of the application, the surface normal map corresponding to the first image can be directly obtained by utilizing the first depth image, so that the implementation process is simple, additional resources are not needed, resources and space can be saved, and the implementation speed is improved.
Fig. 3 is a schematic flow chart of determining light source information in a scene where a first image is captured according to the first image, a surface normal map corresponding to the first image, and reflection information according to an exemplary embodiment of the present application. The embodiment of fig. 3 of the present application is extended from the embodiment of fig. 2 of the present application, and differences between the embodiment of fig. 3 and the embodiment of fig. 2 are mainly described below, which will not be repeated.
As shown in fig. 3, in the method for generating an image according to the embodiment of the present application, the light source information includes a light source position and a light source intensity; determining light source information in a scene where the first image is taken according to the first image, a surface normal map corresponding to the first image and reflection information (i.e. step 102), including:
And 102a, performing image segmentation on the first image to obtain a plurality of image sub-areas.
The first image is subjected to image segmentation to obtain a plurality of image sub-regions (collections of pixels, also referred to as superpixels). Superpixels are small areas composed of pixel points that are adjacent in position and similar in color, brightness, texture, etc. These small areas mostly retain the information useful for further image segmentation and generally do not destroy the boundary information of the objects in the image.
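As a sketch of this segmentation step, SLIC superpixels from scikit-image could be used; the library choice, segment count and compactness below are assumptions, since the patent does not specify the segmentation algorithm.

```python
# Sketch of the segmentation step using SLIC superpixels (illustrative parameters).
from skimage.segmentation import slic

def segment_into_subregions(first_image, n_segments=200):
    """first_image: HxWx3 RGB array; returns an HxW label map, one label per image sub-region."""
    return slic(first_image, n_segments=n_segments, compactness=10, start_label=0)
```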
Step 102b, determining the feature vector of each image sub-region by using the surface normal map and the reflection information.
It should be noted that, the feature vector of each image sub-area is determined by using the surface normal map and the reflection information, and may be implemented in any feasible manner according to the actual application condition, which is not limited specifically.
In this embodiment, the reflection information is a diffuse reflection parameter, and the feature vector E_j of each image sub-region j is determined by using the surface normal map and the reflection information as follows:
wherein E_j(n) represents the value of the feature vector of image sub-region j calculated by the n-th operator, S_j represents the region range of image sub-region j, I(x, y) represents the superposition of the surface normal value and the diffuse reflection parameter value corresponding to the pixel point (x, y), F_n(x, y) represents the n operators, n takes the value 17 (comprising 9 texture template operators, 6 edge operators in different directions and 2 color operators), and k takes the values 2 and 4, representing energy features when k is 2 and kurtosis (peak) features when k is 4.
The feature vectors of the four image sub-regions adjacent to image sub-region j, and the feature vectors at two scales, are then calculated by the same formula and concatenated to construct a feature vector of 17 × 2 × 5 × 2 = 340 dimensions, where, from left to right, 17 represents the 17 operators, 2 represents the two cases k = 2 and k = 4, 5 represents the 5 image sub-regions, and 2 represents the 2 scales. If image sub-region j is located at a corner and therefore does not have four adjacent sub-regions, the feature vector of each missing neighbor is replaced with 0. The two scales are usually chosen as the original scale of image sub-region j and a scale smaller than the original scale (typically 50% of it).
Step 102c, determining the light source position in the first image according to the feature vector of each image subarea and the preset light source classification neural network.
It should be noted that the feature vector of each image sub-region (which may also be referred to as a superpixel) is used as the input of the preset light source classification neural network (a binary classifier), and whether the image sub-region is a light source is the output; if an image sub-region is judged to be a light source, the position of that image sub-region is a light source position. Determining the light source position in the first image thus means determining the pixel coordinates of the l-th sub-region judged to be a light source, denoted (x_l, y_l), where l is a natural number.
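A minimal sketch of this classification step follows; the patent does not specify the architecture of the preset light source classification neural network, so the small two-layer classifier and the 0.5 decision threshold below are assumptions.

```python
# Sketch: binary light-source classification of sub-region feature vectors.
# The architecture and threshold are illustrative assumptions.
import torch
import torch.nn as nn

light_source_classifier = nn.Sequential(
    nn.Linear(340, 64),
    nn.ReLU(),
    nn.Linear(64, 1),
    nn.Sigmoid(),          # probability that the sub-region is a light source
)

def light_source_positions(features, centers, threshold=0.5):
    """features: Nx340 tensor of sub-region feature vectors; centers: list of N
    (x_l, y_l) pixel coordinates; returns the coordinates judged to be light sources."""
    with torch.no_grad():
        probs = light_source_classifier(features).squeeze(-1)
    return [c for c, p in zip(centers, probs) if p > threshold]
```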
Step 102d, determining the light source intensity in the first image according to the first image and the light source position in the first image.
It should be noted that, according to the first image and the light source position in the first image, the light source intensity in the first image may be determined by any feasible method according to the actual application condition, which is not limited specifically.
In the embodiment of the application, the determination of the light source intensity in the first image is realized by adopting the following formula:
wherein L_l represents the intensity of the l-th light source, the summation runs over the pixel points in the first image, I_l represents the pixel value of the pixel point (x_l, y_l), and R_l(L_l) represents the pixel value of the pixel point (x_l, y_l) rendered under the l-th light source with intensity L_l. Several candidate values of the light source intensity of the l-th light source may be estimated, and the candidate that minimizes the above expression is taken as the light source intensity L_l.
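The following sketch illustrates this selection over candidate intensities; the renderer render_with_light and the candidate grid are assumptions introduced for illustration, since the patent does not fix the rendering procedure or the search strategy.

```python
# Sketch: pick the candidate intensity whose rendering best matches the first image.
# `render_with_light` is an assumed helper, not defined in the original text.
import numpy as np

def estimate_light_intensity(first_image, render_with_light, light_pos,
                             candidates=np.linspace(0.1, 10.0, 100)):
    best_L, best_err = None, np.inf
    for L in candidates:
        rendered = render_with_light(light_pos, L)          # same shape as first_image
        err = np.sum((first_image.astype(np.float64) - rendered) ** 2)
        if err < best_err:
            best_L, best_err = L, err
    return best_L
```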
According to the method for generating the image, disclosed by the embodiment of the application, the position and the intensity of the light source in the first image can be obtained, so that the second image and the second depth image generated according to the first image are more real and effective.
Fig. 4 is a schematic flow chart of editing and rendering a preset object model to be added in a first image according to the first image, a surface normal map, reflection information and light source information according to an exemplary embodiment of the present application, so as to obtain a second image. The embodiment of fig. 4 of the present application is extended from the embodiment of fig. 3 of the present application, and differences between the embodiment of fig. 4 and the embodiment of fig. 3 are emphasized below, which will not be repeated.
As shown in fig. 4, in the method for generating an image provided in the embodiment of the present application, according to a first image, a surface normal map, reflection information and light source information, a preset object model to be added in the first image is edited and rendered to obtain a second image (i.e. step 103), which includes:
Step 103a, limiting the placement position of the preset object model through the surface normal map.
In an embodiment, the placement position of the preset object model may be constrained by the surface normal map, so as to avoid that the preset object model is placed outside the boundary of the first image.
Step 103b, determining camera parameters of the first image.
The camera parameters include intrinsic parameters and extrinsic parameters. The intrinsic parameters are related to the characteristics of the camera itself, such as its focal length and pixel size. The extrinsic parameters describe the camera in the world coordinate system, such as its position and rotation direction.
Step 103c, determining pixel coordinates of the preset object model according to the camera parameters of the first image and the three-dimensional coordinates of the preset object model.
The three-dimensional coordinates, that is, three-dimensional cartesian coordinates (x, y, z), are expressions of points in a three-dimensional cartesian coordinate system, where x, y, z are coordinate values of x-axis, y-axis, and z-axis that have a common zero point and are orthogonal to each other, respectively. The pixel coordinates (x, y) are the locations of the pixels in the image.
And 103d, editing and rendering the first image and the preset object model according to the pixel coordinates, the reflection information, the light source position and the light source intensity of the preset object model to obtain a second image.
It should be noted that when the first image and the preset object model are edited and rendered, the rendered object model replaces the content at the corresponding position in the first image, so as to obtain the second image.
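As an illustration of how the rendered model pixels replace the corresponding first-image pixels, the sketch below uses a simple Lambertian shading term built from the surface normals, the diffuse reflection parameters, the light source direction and the light source intensity; the patent does not spell out the rendering equation, so this shading model is an assumption.

```python
# Illustrative compositing sketch with an assumed Lambertian shading term
# (diffuse parameter x light intensity x max(0, n·l)).
import numpy as np

def composite_second_image(first_image, model_pixels, model_mask,
                           normals, diffuse, light_dir, light_intensity):
    """first_image: HxWx3 image; model_pixels: HxWx3 base colors of the projected model;
    model_mask: HxW bool mask of the model's pixels; normals: HxWx3 unit normals of the
    model surface; diffuse: HxW diffuse parameters; light_dir: unit vector toward the light."""
    shading = np.clip(normals @ light_dir, 0.0, None) * diffuse * light_intensity
    rendered = model_pixels * shading[..., None]
    second_image = first_image.astype(np.float64).copy()
    second_image[model_mask] = rendered[model_mask]          # model replaces the scene here
    return np.clip(second_image, 0, 255).astype(first_image.dtype)
```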
According to the method for generating the image, disclosed by the embodiment of the application, the placement position of the preset object model can be limited through the surface normal map, so that the preset object model can be prevented from exceeding the boundary of the first image, and the generated second image is more real and effective.
Fig. 5 is a schematic flow chart of obtaining a second depth image corresponding to a second image according to a first depth image corresponding to a first image and a preset object model according to an exemplary embodiment of the present application. The embodiment of fig. 5 of the present application is extended from the embodiment of fig. 4 of the present application, and differences between the embodiment of fig. 5 and the embodiment of fig. 4 are emphasized below, and are not repeated.
As shown in fig. 5, in the method for generating an image according to the embodiment of the present application, according to a first depth image corresponding to a first image and a preset object model, a second depth image corresponding to a second image is obtained (i.e. step 104), which includes:
step 104a, obtaining a depth value of each pixel point in the preset object model according to the three-dimensional coordinates of the preset object model.
It should be noted that, the z coordinate value of the three-dimensional coordinate (x, y, z) of each point in the preset object model may be used as the depth value of each corresponding pixel point in the preset object model.
And step 104b, obtaining a second depth image according to the first depth image and the depth value of each pixel point in the preset object model.
It should be noted that, in the second depth image, the depth values of the pixel points outside the preset object model are the depth values of the corresponding pixel points in the first depth image, while the depth values of the pixel points belonging to the preset object model are the depth values of the preset object model's pixel points.
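A minimal sketch of this depth-composition rule, assuming a boolean mask of the pixels covered by the preset object model is available:

```python
# Sketch: pixels covered by the inserted object model take the model's depth
# values; all other pixels keep the first depth image.
def compose_second_depth(first_depth, model_depth, model_mask):
    """first_depth: HxW depth of the first image; model_depth: HxW z values of the
    rendered object model; model_mask: HxW boolean mask of the model's pixels."""
    second_depth = first_depth.copy()
    second_depth[model_mask] = model_depth[model_mask]
    return second_depth
```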
According to the method for generating the image, disclosed by the embodiment of the application, the depth value of each pixel point in the preset object model can be obtained according to the three-dimensional coordinates of the preset object model, and the second depth image can be obtained according to the first depth image and the depth value of each pixel point in the preset object model, so that the method is simple and quick to realize, and the data are real and effective.
Fig. 6 is a flowchart illustrating determining pixel coordinates of a preset object model according to camera parameters of a first image and three-dimensional coordinates of the preset object model according to an exemplary embodiment of the present application. The embodiment of fig. 6 of the present application is extended from the embodiment of fig. 4 of the present application, and differences between the embodiment of fig. 6 and the embodiment of fig. 4 are emphasized below, which will not be repeated.
As shown in fig. 6, in the method for generating an image according to the embodiment of the present application, determining pixel coordinates of a preset object model according to camera parameters of a first image and three-dimensional coordinates of the preset object model (i.e. step 103 c) includes:
step 103c1, setting a reference pixel point of the preset object model.
It should be noted that the pixel at the center of the preset object model may be set as the reference pixel point (x_1, y_1).
Step 103c2, setting the pixel coordinates and depth values of the reference pixel point.
It should be noted that the pixel coordinates of the reference pixel point may be set by changing the position of the preset object model in the first image, which changes the coordinates of the preset object model in the pixel coordinate system. Changing the position of the preset object model in the first image may be achieved by dragging or the like. According to the actual application, a depth value d is set for the reference pixel point (x_1, y_1) such that 0 < d ≤ D(x_1, y_1), where D(x_1, y_1) represents the depth value of the pixel point in the first depth image corresponding to the reference pixel point (x_1, y_1).
Step 103c3, calculating the three-dimensional coordinates of the reference pixel point by using a preset three-dimensional coordinate calculation formula according to the camera parameters of the first image, the pixel coordinates and the depth values of the reference pixel point.
It should be noted that, the preset three-dimensional coordinate calculation formula may be selected according to the actual application condition, which is not limited.
The preset three-dimensional coordinate calculation formula in this embodiment is:
W(x_2, y_2, z_2) = D(x_1, y_1) · K⁻¹ · [x_1, y_1, 1]^T
wherein W(x_2, y_2, z_2) represents the three-dimensional coordinates of the reference pixel point (x_1, y_1), K represents the camera intrinsic matrix, and D(x_1, y_1) represents the depth value of the pixel point in the first depth image corresponding to the reference pixel point (x_1, y_1).
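A short sketch of this back-projection step, assuming K is the 3×3 camera intrinsic matrix:

```python
# Sketch of step 103c3: W = D(x1, y1) * K^-1 [x1, y1, 1]^T
import numpy as np

def backproject_reference_pixel(K, x1, y1, depth_value):
    """Returns W(x2, y2, z2), the 3-D coordinates of the reference pixel point (x1, y1)."""
    pixel_h = np.array([x1, y1, 1.0])               # homogeneous pixel coordinates
    return depth_value * (np.linalg.inv(K) @ pixel_h)
```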
Step 103c4, calculating the pixel coordinates of each pixel point in the preset object model by using a preset pixel coordinate calculation formula according to the camera parameters of the first image, the three-dimensional coordinates of the reference pixel point, the three-dimensional coordinates of the preset object model, and the relative positions of the reference pixel point and each pixel point in the preset object model.
It should be noted that the preset pixel coordinate calculation formula may be selected according to the actual application condition and is not limited here.
The preset pixel coordinate calculation formula in this embodiment is:
wherein (x_t, y_t) represents the pixel coordinates of the point (x_t, y_t, z_t) in the preset object model, (x_2, y_2, z_2) represents the three-dimensional coordinates of the reference pixel point (x_1, y_1), Δx_t represents the relative position (also referred to as the offset) of x_t and x_2, which may take the value x_t − x_2, Δy_t represents the relative position of y_t and y_2 (value y_t − y_2), Δz_t represents the relative position of z_t and z_2 (value z_t − z_2), and K represents the camera intrinsic matrix.
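Since the projection formula itself is not reproduced in this text, the sketch below uses the standard pinhole projection applied to the reference point plus each model point's offset; it is an assumed reconstruction consistent with the quantities listed above.

```python
# Sketch of step 103c4 (assumed pinhole projection of each model point).
import numpy as np

def project_model_points(K, ref_3d, model_offsets):
    """K: 3x3 intrinsic matrix; ref_3d: (x2, y2, z2) of the reference pixel point;
    model_offsets: Nx3 offsets (dx_t, dy_t, dz_t) of the model points relative to ref_3d."""
    pixel_coords = []
    for offset in model_offsets:
        cam_point = np.asarray(ref_3d) + offset     # 3-D position of this model point
        proj = K @ cam_point                        # homogeneous pixel coordinates
        pixel_coords.append((proj[0] / proj[2], proj[1] / proj[2]))  # (x_t, y_t)
    return pixel_coords
```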
According to the method for generating the image, which is disclosed by the embodiment of the application, the pixel coordinates of the preset object model can be obtained, so that the subsequent generation of the second image is facilitated.
Fig. 7 is a flow chart of a method of generating an image according to another exemplary embodiment of the present application. The embodiment of the present application shown in fig. 7 is extended from the embodiments of fig. 2-6, and differences between the embodiment of fig. 7 and the embodiments of fig. 2-6 are emphasized below, which are not repeated.
As shown in fig. 7, in the method for generating an image according to the embodiment of the present application, editing and rendering are performed on a preset object model to be added in a first image according to the first image, a surface normal map corresponding to the first image, reflection information, and light source information, so as to obtain a second image (i.e. before step 103), the method further includes:
step 105, adding a preset object model in the first image.
It should be noted that, the preset object model may be added to the corresponding position in the first image according to the specific content of the first image and the specific content of the preset object model. The preset object model is a 3D model, and can be a person, an animal, a plant, a machine and the like. A large number of object models can be constructed according to actual needs and added into the first image, and a large number of second images and second depth images are generated.
According to the method for generating the image, disclosed by the embodiment of the application, the second image and the second depth image can be generated by adding the preset object model into the first image, and a large amount of sample data is not required to be acquired, so that time and energy can be saved, and the cost can be reduced.
Exemplary apparatus
Fig. 8 is a schematic structural view of an apparatus for generating an image according to an exemplary embodiment of the present application. The device for generating the image provided by the embodiment of the application can be applied to the field of image processing of automobiles and also can be applied to the field of image processing functions of intelligent robots. As shown in fig. 8, an apparatus for generating an image according to an embodiment of the present application includes:
a reflection information determining module 201, configured to determine reflection information of each pixel point in the first image;
a light source determining module 202, configured to determine light source information in a scene where the first image is captured according to the first image, a surface normal map corresponding to the first image, and reflection information;
the second image obtaining module 203 is configured to edit and render a preset object model to be added in the first image according to the first image, the surface normal map, the reflection information and the light source information, so as to obtain a second image;
The second depth image obtaining module 204 is configured to obtain a second depth image corresponding to the second image according to the first depth image corresponding to the first image and the preset object model.
An exemplary embodiment of the present application provides a schematic structural view of the reflection information determination module 201 in an apparatus for generating an image. The embodiment of the present application extends from the embodiment of fig. 8, and differences between the embodiment of the present application and the embodiment of fig. 8 are mainly described below, which are not repeated.
In the apparatus for generating an image according to the embodiment of the present application, the reflection information determining module 201 is further configured to determine a surface normal corresponding to each pixel in the first depth image, so as to obtain a surface normal map corresponding to the first image.
Fig. 9 is a schematic diagram of the structure of the light source determining module 202 in the apparatus for generating an image according to an exemplary embodiment of the present application. The embodiment of fig. 9 of the present application is extended from the embodiment of fig. 8 of the present application, and differences between the embodiment of fig. 9 and the embodiment of fig. 8 are described below with emphasis, and the details of the differences are not repeated.
As shown in fig. 9, in the apparatus for generating an image provided in the embodiment of the present application, the light source information includes a light source position and a light source intensity, and the light source determining module 202 includes:
An image segmentation unit 202a, configured to perform image segmentation on the first image to obtain a plurality of image sub-areas;
a feature vector determination unit 202b for determining a feature vector for each image sub-region using the surface normal map and the reflection information;
a light source position determining unit 202c, configured to determine a light source position in the first image according to the feature vector of each image sub-region and a preset light source classification neural network;
the light source intensity determining unit 202d is configured to determine the light source intensity in the first image according to the first image and the light source position in the first image.
Fig. 10 is a schematic structural diagram of a second image acquisition module 203 in an apparatus for generating an image according to an exemplary embodiment of the present application. The embodiment of fig. 10 of the present application is extended from the embodiment of fig. 9 of the present application, and differences between the embodiment of fig. 10 and the embodiment of fig. 9 are emphasized below, and are not repeated.
As shown in fig. 10, in the apparatus for generating an image according to the embodiment of the present application, the second image acquisition module 203 includes:
a position limiting unit 203a for limiting a placement position of the preset object model by a surface normal map;
A camera parameter determining unit 203b for determining camera parameters of the first image;
a pixel coordinate determining unit 203c, configured to determine pixel coordinates of a preset object model according to the camera parameters of the first image and the three-dimensional coordinates of the preset object model;
and the second image determining unit 203d is configured to edit and render the first image and the preset object model according to the pixel coordinates, the reflection information, the light source position and the light source intensity of the preset object model, so as to obtain a second image.
Fig. 11 is a schematic structural diagram of the second depth image acquiring module 204 in the apparatus for generating an image according to an exemplary embodiment of the present application. The embodiment of fig. 11 of the present application is extended from the embodiment of fig. 10 of the present application, and differences between the embodiment of fig. 11 and the embodiment of fig. 10 are emphasized below, and are not repeated.
As shown in fig. 11, in the apparatus for generating an image according to the embodiment of the present application, the second depth image acquiring module 204 includes:
a depth value determining unit 204a, configured to obtain a depth value of each pixel point in the preset object model according to the three-dimensional coordinates of the preset object model;
the second depth image determining unit 204b is configured to obtain a second depth image according to the first depth image and a depth value of each pixel point in the preset object model.
Fig. 12 is a schematic diagram of the structure of a pixel coordinate determining unit 203c in an apparatus for generating an image according to an exemplary embodiment of the present application. The embodiment of fig. 12 of the present application is extended from the embodiment of fig. 10 of the present application, and differences between the embodiment of fig. 12 and the embodiment of fig. 10 are emphasized below, and are not repeated.
As shown in fig. 12, in the apparatus for generating an image provided in the embodiment of the present application, a pixel coordinate determining unit 203c includes:
a reference pixel setting subunit 203c1, configured to set a reference pixel of a preset object model;
a data setting sub-unit 203c2 for setting pixel coordinates and depth values of the reference pixel points;
a three-dimensional coordinate calculation subunit 203c3, configured to calculate, according to the camera parameter of the first image, the pixel coordinates and the depth values of the reference pixel point, the three-dimensional coordinates of the reference pixel point by using a preset three-dimensional coordinate calculation formula;
the pixel coordinate calculating subunit 203c4 is configured to calculate, according to the camera parameter of the first image, the three-dimensional coordinates of the reference pixel point, the three-dimensional coordinates of the preset object model, the relative positions of the reference pixel point and each pixel point in the preset object model, the pixel coordinates of each pixel point in the preset object model by using a preset pixel coordinate calculation formula.
Fig. 13 is a schematic structural view of an apparatus for generating an image according to another exemplary embodiment of the present application. The embodiment of fig. 13 of the present application extends from the embodiment of fig. 8-12 of the present application, and differences between the embodiment of fig. 13 and the embodiment of fig. 8-12 are emphasized below, which are not repeated.
As shown in fig. 13, in the apparatus for generating an image according to the embodiment of the present application, further includes:
an adding module 205 is configured to add a preset object model to the first image.
It should be understood that figs. 8 to 13 provide the apparatus for generating an image with the reflection information determining module 201, the light source determining module 202, the second image acquisition module 203, the second depth image acquisition module 204 and the adding module 205. The operations and functions of these modules, of the image segmentation unit 202a, the feature vector determining unit 202b, the light source position determining unit 202c and the light source intensity determining unit 202d included in the light source determining module 202, of the position limiting unit 203a, the camera parameter determining unit 203b, the pixel coordinate determining unit 203c and the second image determining unit 203d included in the second image acquisition module 203, of the depth value determining unit 204a and the second depth image determining unit 204b included in the second depth image acquisition module 204, and of the reference pixel setting subunit 203c1, the data setting subunit 203c2, the three-dimensional coordinate calculating subunit 203c3 and the pixel coordinate calculating subunit 203c4 included in the pixel coordinate determining unit 203c, may refer to the methods of generating an image provided in figs. 1 to 7 above, and are not repeated here.
Exemplary electronic device
Fig. 14 illustrates a block diagram of an electronic device according to an embodiment of the application.
As shown in fig. 14, the electronic device 11 includes one or more processors 11a and a memory 11b.
The processor 11a may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 11 to perform desired functions.
The memory 11b may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random Access Memory (RAM) and/or cache memory (cache), and the like. The non-volatile memory may include, for example, read Only Memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer readable storage medium that can be executed by the processor 11a to perform the methods of generating images and/or other desired functions of the various embodiments of the application described above. Various contents such as an input signal, a signal component, a noise component, and the like may also be stored in the computer-readable storage medium.
In one example, the electronic device 11 may further include: an input device 11c and an output device 11d, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
For example, the input device 11c may be a camera or a microphone, a microphone array, or the like as described above, for capturing an input signal of an image or a sound source. When the electronic device is a stand-alone device, the input means 11c may be a communication network connector for receiving the acquired input signal from the neural network processor.
In addition, the input device 11c may also include, for example, a keyboard, a mouse, and the like.
The output device 11d may output various information including the determined output voltage, output current information, and the like to the outside. The output device 11d may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, etc.
Of course, only some of the components of the electronic device 11 relevant to the present application are shown in fig. 14 for simplicity, components such as buses, input/output interfaces, and the like being omitted. In addition, the electronic device 11 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer readable storage Medium
In addition to the methods and apparatus described above, embodiments of the application may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps in the method of generating an image according to various embodiments of the application described in the "Exemplary Method" section of this specification.
The computer program product may write program code for performing operations of embodiments of the present application in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present application may also be a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the steps in the method of generating an image according to various embodiments of the present application described in the "Exemplary Method" section above in the present specification.
The computer readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present application have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present application are merely examples and not intended to be limiting, and these advantages, benefits, effects, etc. are not to be considered as essential to the various embodiments of the present application. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the application is not necessarily limited to practice with the above described specific details.
The block diagrams of the devices, apparatuses, equipment and systems referred to in the present application are only illustrative examples and are not intended to require or imply that the connections, arrangements and configurations must be made in the manner shown in the block diagrams. As will be appreciated by those skilled in the art, these devices, apparatuses, equipment and systems may be connected, arranged and configured in any manner. Words such as "including", "comprising", "having" and the like are open-ended words meaning "including but not limited to" and may be used interchangeably with that phrase. The terms "or" and "and" as used herein refer to, and are used interchangeably with, the term "and/or" unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as but not limited to".
It is also noted that in the apparatus, devices and methods of the present application, the components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered as equivalent aspects of the present application.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the application to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.
Claims (9)
1. A method of generating an image, comprising:
determining reflection information of each pixel point in the first image;
determining light source information in a scene where the first image is shot according to the first image, a surface normal map corresponding to the first image and the reflection information; the light source information comprises a light source position and a light source intensity;
editing and rendering a preset object model to be added in the first image according to the first image, the surface normal map, the reflection information and the light source information to obtain a second image;
obtaining a second depth image corresponding to the second image according to the first depth image corresponding to the first image and the preset object model;
the determining, according to the first image, the surface normal map corresponding to the first image, and the reflection information, light source information in a scene where the first image is captured includes: image segmentation is carried out on the first image to obtain a plurality of image subareas; determining a feature vector of each image subarea by utilizing a surface normal map corresponding to the first image and the reflection information; determining the position of a light source in the first image according to the feature vector of each image subarea and a preset light source classification neural network; determining the light source intensity in the first image according to the first image and the light source position in the first image.
2. The method of claim 1, wherein the method further comprises:
and determining a surface normal corresponding to each pixel point in the first depth image to obtain a surface normal map corresponding to the first image.
3. The method of claim 2, wherein editing and rendering a preset object model to be added in the first image according to the first image, the surface normal map, the reflection information and the light source information to obtain a second image, including:
limiting the placement position of the preset object model through the surface normal map;
determining camera parameters of the first image;
determining pixel coordinates of the preset object model according to the camera parameters of the first image and the three-dimensional coordinates of the preset object model;
and editing and rendering the first image and the preset object model according to the pixel coordinates of the preset object model, the reflection information, the light source position and the light source intensity to obtain the second image.
4. The method of claim 3, wherein obtaining the second depth image corresponding to the second image from the first depth image corresponding to the first image and the preset object model comprises:
obtaining a depth value of each pixel point in the preset object model according to the three-dimensional coordinates of the preset object model;
and obtaining the second depth image according to the first depth image and the depth value of each pixel point in the preset object model.
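One natural, though not mandated, way to realize this step is a z-buffer style merge in which the inserted object's depth replaces the scene depth wherever the object is closer to the camera; the sketch below assumes that reading.

```python
# Sketch only: merge the model's per-pixel depth values into the first depth
# image with a z-buffer style update (the closer surface wins).
import numpy as np

def compose_depth(first_depth, model_depth_pixels):
    """first_depth: HxW metric depth; model_depth_pixels: iterable of (row, col, depth)."""
    second_depth = first_depth.copy()
    for row, col, d in model_depth_pixels:
        if d < second_depth[row, col]:      # the inserted object occludes the original scene
            second_depth[row, col] = d
    return second_depth
```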
5. The method of claim 3, wherein determining the pixel coordinates of the preset object model from the camera parameters of the first image and the three-dimensional coordinates of the preset object model comprises:
setting a reference pixel point of the preset object model;
setting pixel coordinates and a depth value of the reference pixel point;
calculating three-dimensional coordinates of the reference pixel point by using a preset three-dimensional coordinate calculation formula according to the camera parameters of the first image and the pixel coordinates and the depth value of the reference pixel point;
and calculating the pixel coordinates of each pixel point in the preset object model by using a preset pixel coordinate calculation formula according to the camera parameters of the first image, the three-dimensional coordinates of the reference pixel point, the three-dimensional coordinates of the preset object model and the relative positions of the reference pixel point and each pixel point in the preset object model.
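The claim leaves the preset three-dimensional coordinate and pixel coordinate calculation formulas open; the standard pinhole camera model is one plausible instantiation and is assumed in the sketch below, with intrinsics fx, fy, cx, cy standing in for the camera parameters of the first image.

```python
# Sketch assuming a pinhole camera: back-project the reference pixel to 3-D,
# then project each model point (given as an offset relative to the reference
# point) back into pixel coordinates.
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) with a depth value -> 3-D camera-space point (X, Y, Z)."""
    X = (u - cx) * depth / fx
    Y = (v - cy) * depth / fy
    return np.array([X, Y, depth], dtype=np.float64)

def project(point, fx, fy, cx, cy):
    """3-D camera-space point -> pixel coordinates (u, v)."""
    X, Y, Z = point
    return (fx * X / Z + cx, fy * Y / Z + cy)

def model_pixel_coordinates(ref_uv, ref_depth, model_points_rel, fx, fy, cx, cy):
    """model_points_rel: Nx3 offsets of each model point relative to the reference point."""
    ref_3d = backproject(ref_uv[0], ref_uv[1], ref_depth, fx, fy, cx, cy)
    return [project(ref_3d + offset, fx, fy, cx, cy) for offset in np.asarray(model_points_rel)]
```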
6. The method according to any one of claims 1-5, wherein, before editing and rendering the preset object model to be added to the first image according to the first image, the surface normal map, the reflection information, and the light source information to obtain the second image, the method further comprises:
adding the preset object model to the first image.
7. An apparatus for generating an image, comprising:
a reflection information determining module, configured to determine reflection information of each pixel point in a first image;
a light source determining module, configured to determine, according to the first image, a surface normal map corresponding to the first image, and the reflection information, light source information of a scene in which the first image was captured, the light source information comprising a light source position and a light source intensity;
a second image acquisition module, configured to edit and render a preset object model to be added to the first image according to the first image, the surface normal map, the reflection information, and the light source information to obtain a second image; and
a second depth image acquisition module, configured to obtain a second depth image corresponding to the second image according to a first depth image corresponding to the first image and the preset object model;
wherein the light source determining module comprises:
an image segmentation unit, configured to perform image segmentation on the first image to obtain a plurality of image sub-regions;
a feature vector determining unit, configured to determine a feature vector of each image sub-region by using the surface normal map corresponding to the first image and the reflection information;
a light source position determining unit, configured to determine the light source position in the first image according to the feature vector of each image sub-region and a preset light source classification neural network; and
a light source intensity determining unit, configured to determine the light source intensity in the first image according to the first image and the light source position in the first image.
8. A computer-readable storage medium storing a computer program for performing the method of generating an image according to any one of claims 1-6.
9. An electronic device, comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform the method of generating an image according to any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910068605.7A CN111476834B (en) | 2019-01-24 | 2019-01-24 | Method and device for generating image and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910068605.7A CN111476834B (en) | 2019-01-24 | 2019-01-24 | Method and device for generating image and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111476834A CN111476834A (en) | 2020-07-31 |
CN111476834B true CN111476834B (en) | 2023-08-11 |
Family
ID=71743594
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910068605.7A Active CN111476834B (en) | 2019-01-24 | 2019-01-24 | Method and device for generating image and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111476834B (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9947134B2 (en) * | 2012-07-30 | 2018-04-17 | Zinemath Zrt. | System and method for generating a dynamic three-dimensional model |
JP6152635B2 (en) * | 2012-11-08 | 2017-06-28 | ソニー株式会社 | Image processing apparatus and method, and program |
CN104134230B (en) * | 2014-01-22 | 2015-10-28 | 腾讯科技(深圳)有限公司 | A kind of image processing method, device and computer equipment |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002117413A (en) * | 2000-10-10 | 2002-04-19 | Univ Tokyo | Image generating device and image generating method for reflecting light source environmental change in real time |
CN105825544A (en) * | 2015-11-25 | 2016-08-03 | 维沃移动通信有限公司 | Image processing method and mobile terminal |
WO2017192467A1 (en) * | 2016-05-02 | 2017-11-09 | Warner Bros. Entertainment Inc. | Geometry matching in virtual reality and augmented reality |
CN106710003A (en) * | 2017-01-09 | 2017-05-24 | 成都品果科技有限公司 | Three-dimensional photographing method and system based on OpenGL ES (Open Graphics Library for Embedded System) |
CN106873828A (en) * | 2017-01-21 | 2017-06-20 | 司承电子科技(上海)有限公司 | A kind of implementation method of the 3D press key input devices for being applied to virtual reality products |
CN108509887A (en) * | 2018-03-26 | 2018-09-07 | 深圳超多维科技有限公司 | A kind of acquisition ambient lighting information approach, device and electronic equipment |
CN108525298A (en) * | 2018-03-26 | 2018-09-14 | 广东欧珀移动通信有限公司 | Image processing method, device, storage medium and electronic equipment |
CN109155078A (en) * | 2018-08-01 | 2019-01-04 | 深圳前海达闼云端智能科技有限公司 | Generation method, device, electronic equipment and the storage medium of the set of sample image |
CN109118582A (en) * | 2018-09-19 | 2019-01-01 | 东北大学 | A kind of commodity three-dimensional reconstruction system and method for reconstructing |
CN109087346A (en) * | 2018-09-21 | 2018-12-25 | 北京地平线机器人技术研发有限公司 | Training method, training device and the electronic equipment of monocular depth model |
Non-Patent Citations (1)
Title |
---|
Enrico Boni et al., "Ultrasound Open Platforms for Next-Generation Imaging Technique Development," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2018, vol. 65, no. 7, pp. 1078-1092. *
Also Published As
Publication number | Publication date |
---|---|
CN111476834A (en) | 2020-07-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20200302241A1 (en) | Techniques for training machine learning | |
EP3660787A1 (en) | Training data generation method and generation apparatus, and image semantics segmentation method therefor | |
CN108010118B (en) | Virtual object processing method, virtual object processing apparatus, medium, and computing device | |
CN108381549B (en) | Binocular vision guide robot rapid grabbing method and device and storage medium | |
CA2667538C (en) | System and method for recovering three-dimensional particle systems from two-dimensional images | |
CN112639846A (en) | Method and device for training deep learning model | |
JP2018163554A (en) | Image processing device, image processing method, image processing program, and teacher data generating method | |
CN110782517B (en) | Point cloud labeling method and device, storage medium and electronic equipment | |
CN111739005B (en) | Image detection method, device, electronic equipment and storage medium | |
EP4068220A1 (en) | Image processing device, image processing method, moving device, and storage medium | |
CN112651881B (en) | Image synthesizing method, apparatus, device, storage medium, and program product | |
CN111292334B (en) | Panoramic image segmentation method and device and electronic equipment | |
CN109117806B (en) | Gesture recognition method and device | |
CN112017246B (en) | Image acquisition method and device based on inverse perspective transformation | |
CN114255314A (en) | Automatic texture mapping method, system and terminal for shielding avoidance three-dimensional model | |
CN115393322A (en) | Method and device for generating and evaluating change detection data based on digital twins | |
CN115861733A (en) | Point cloud data labeling method, model training method, electronic device and storage medium | |
CN113033248A (en) | Image identification method and device and computer readable storage medium | |
CN111476834B (en) | Method and device for generating image and electronic equipment | |
CN117315295A (en) | BIM model similarity calculation method, system, equipment and storage medium | |
CN115273013B (en) | Lane line detection method, system, computer and readable storage medium | |
JP2008084338A (en) | Pseudo three-dimensional image forming device, pseudo three-dimensional image forming method and pseudo three-dimensional image forming program | |
CN113034675B (en) | Scene model construction method, intelligent terminal and computer readable storage medium | |
CN114299271A (en) | Three-dimensional modeling method, three-dimensional modeling apparatus, electronic device, and readable storage medium | |
CN114330708A (en) | Neural network training method, system, medium and device based on point cloud data |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |