WO2021006189A1 - Image generation device, image generation method, and image generation program - Google Patents

Image generation device, image generation method, and image generation program Download PDF

Info

Publication number
WO2021006189A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature amount
parameters
renderer
image generation
Prior art date
Application number
PCT/JP2020/026091
Other languages
French (fr)
Japanese (ja)
Inventor
五十嵐 健夫
承鐸 盧
昌彦 足立
高橋 健一
Original Assignee
The University of Tokyo (国立大学法人東京大学)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The University of Tokyo (国立大学法人東京大学)
Publication of WO2021006189A1 publication Critical patent/WO2021006189A1/en

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00: 3D [Three Dimensional] image rendering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00: Manipulating 3D models or images for computer graphics

Definitions

  • the present invention relates to an image generator, an image generation method, and an image generation program.
  • Conventionally, images of various models have been generated using CG (Computer Graphics) technology, and research has been conducted on converting the style of an image into a painting style using neural networks (Non-Patent Document 1).
  • Research is also being conducted on "inverse rendering," which estimates the parameters of a renderer so that a given reference image is reproduced (Non-Patent Document 2).
  • the present invention provides an image generation device, an image generation method, and an image generation program that calculate image generation parameters that reproduce the image of the reference model.
  • the image generation device includes a renderer control unit that causes a renderer to render an image of a model according to a plurality of parameters, a feature amount calculation unit that calculates a first feature amount, which is a feature amount of the rendered image, and a second feature amount, which is a feature amount of an image of a reference model, and an update unit that updates the plurality of parameters so as to reduce the difference between the first feature amount and the second feature amount.
  • by updating the plurality of parameters so as to reduce the difference between the feature amount of the reference-model image and the feature amount of the rendered image, image-generation parameters that reproduce the image of the reference model can be calculated.
  • the renderer control unit may specify a rendering rule for procedural modeling by the plurality of parameters, and have the renderer render an image of the model based on the rendering rule.
  • by updating the plurality of parameters so as to reduce the difference between the feature amount of the reference-model image and the feature amount of the rendered image, procedural-modeling parameters that reproduce the image of the reference model can be calculated.
  • the renderer may be a 3D renderer that renders an image of a 3D model based on rendering rules.
  • the update unit may update a plurality of parameters so as to satisfy predetermined constraint conditions regarding the plurality of parameters.
  • the feature amount calculation unit may include a pre-trained convolutional neural network.
  • the features of the image can be appropriately captured by using the pre-trained convolutional neural network as a feature extractor.
  • the update unit may update the plurality of parameters so as to minimize a loss function that evaluates the difference between the first feature amount and the second feature amount, using at least one of particle swarm optimization, the covariance matrix adaptation evolution strategy (CMA-ES), and Bayesian optimization.
  • even if the loss function is not differentiable with respect to the parameters, globally optimal parameters can be calculated and the image of the reference model can be reproduced.
  • An image generation method includes causing a renderer to render an image of a model according to a plurality of parameters, calculating a first feature amount, which is a feature amount of the rendered image, and a second feature amount, which is a feature amount of an image of a reference model, and updating the plurality of parameters so as to reduce the difference between the first feature amount and the second feature amount.
  • by updating the plurality of parameters so as to reduce the difference between the feature amount of the reference-model image and the feature amount of the rendered image, image-generation parameters that reproduce the image of the reference model can be calculated.
  • the image generation program causes a calculation unit provided in the image generation device to function as a renderer control unit that causes a renderer to render an image of a model according to a plurality of parameters, a feature amount calculation unit that calculates a first feature amount, which is a feature amount of the rendered image, and a second feature amount, which is a feature amount of an image of a reference model, and an update unit that updates the plurality of parameters so as to reduce the difference between the first feature amount and the second feature amount.
  • by updating the plurality of parameters so as to reduce the difference between the feature amount of the reference-model image and the feature amount of the rendered image, image-generation parameters that reproduce the image of the reference model can be calculated.
  • the present invention can provide an image generation device, an image generation method, and an image generation program that calculate image-generation parameters that reproduce the image of a reference model.
  • FIG. 1 is a diagram showing a functional block of the image generation device 10 according to the embodiment of the present invention.
  • the image generation device 10 includes a renderer control unit 11, a renderer 12, a feature amount calculation unit 13, a storage unit 14, and an update unit 15.
  • the renderer control unit 11 causes the renderer 12 to render an image of the model according to a plurality of parameters.
  • the renderer control unit 11 may, for example, cause the renderer 12 to render the model image according to parameters such as brightness, contrast, and color temperature, or according to parameters related to the image-quality adjustment functions of the renderer 12.
  • the renderer control unit 11 may specify a rendering rule for procedural modeling by a plurality of parameters, and cause the renderer 12 to render an image of the model based on the rendering rule.
  • the rendering rule is a rule used in the procedural modeling function of the renderer 12, and includes a rendering algorithm specified by a plurality of parameters.
  • the renderer 12 is a 3D renderer that renders an image of a 3D model based on rendering rules specified by a plurality of parameters.
  • the renderer 12 may be composed of a commercially available general-purpose rendering engine.
  • the feature amount calculation unit 13 calculates the first feature amount, which is the feature amount of the image rendered by the renderer 12, and the second feature amount, which is the feature amount of the image of the reference model (reference image 14a).
  • the feature amount calculation unit 13 may include a pre-trained convolutional neural network (CNN) 13a.
  • the CNN 13a may be composed of, for example, VGGNet (Karen Simonyan and Andrew Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition", arXiv:1409.1556, 2014).
  • the feature amount calculation unit 13 may calculate, as the feature amount, the Gram matrix of the feature maps computed by the CNN 13a, as described in Leon Gatys, Alexander S. Ecker, and Matthias Bethge, "Texture Synthesis Using Convolutional Neural Networks", Advances in Neural Information Processing Systems 28, Curran Associates, Inc., pp. 262-270. In this way, by using a pre-trained convolutional neural network as the feature extractor, the features of the image can be appropriately captured.
  • the storage unit 14 stores the reference image 14a.
  • the reference image 14a is an image of the reference model and is a target of image generation by the renderer 12.
  • the reference model can be anything, for example fur.
  • the update unit 15 updates a plurality of parameters so as to reduce the difference between the first feature amount and the second feature amount.
  • here, the second feature amount, which is the feature amount of the reference image 14a, is fixed, and the update unit 15 updates the plurality of parameters that specify the rendering rule so that the first feature amount, which is the feature amount of the image rendered by the renderer 12, approaches the second feature amount. In this way, by updating the parameters so as to reduce the difference between the feature amount of the reference-model image and the feature amount of the rendered image, procedural-modeling parameters that reproduce the image of the reference model can be calculated. As a result, the computation cost can be reduced significantly compared with, for example, an exhaustive search over the parameters.
  • the update unit 15 may update the plurality of parameters so as to satisfy predetermined constraint conditions on the plurality of parameters. For example, when the parameters include the thickness of the root of a hair and the thickness of its tip, the update unit 15 may impose a constraint that the tip thickness is equal to or less than the root thickness. The update unit 15 may also impose constraints that set an upper limit and a lower limit for each parameter. In this way, the search space for the plurality of parameters can be narrowed, and procedural-modeling parameters that reproduce the image of the reference model can be calculated faster.
  • the update unit 15 may update the plurality of parameters so as to minimize a loss function that evaluates the difference between the first feature amount and the second feature amount, using at least one of particle swarm optimization, the covariance matrix adaptation evolution strategy (CMA-ES), and Bayesian optimization.
  • particle swarm optimization, CMA-ES, and Bayesian optimization are all algorithms that can be applied without computing partial derivatives of the loss function with respect to the parameters.
  • moreover, these algorithms search for a global optimum rather than only a local one. In this way, even if the loss function is not differentiable with respect to the parameters, globally optimal parameters can be calculated, and the image of the reference model can be reproduced.
  • FIG. 2 is a conceptual diagram of parameter optimization processing by the image generation device 10 according to the present embodiment.
  • the image generation device 10 specifies a rendering rule by a plurality of parameters p_dst and generates an image I_dst of the model with the renderer 12. The image generation device 10 also stores the reference image I_ref.
  • the image generation device 10 uses the CNN 13a to calculate a first feature amount x_dst, which is the feature amount of the image I_dst, and a second feature amount x_ref, which is the feature amount of the reference image I_ref. The difference between the first feature amount x_dst and the second feature amount x_ref is then evaluated by a loss function (Loss).
  • the loss function may be, for example, ||x_dst - x_ref||_2, where ||.||_2 denotes the L2 norm.
  • the image generation device 10 updates the plurality of parameters p_dst so as to minimize the loss function, for example using particle swarm optimization.
  • the image generation device 10 repeats the above parameter-update process, determines the optimized parameters p_dst* when a predetermined condition is satisfied, and uses the optimized parameters p_dst* to cause the renderer 12 to render an image of the model.
  • the predetermined condition may be that the value of the loss function is equal to or less than the threshold value, or that the number of epochs (the number of repetitions of the parameter update process) is equal to or greater than the predetermined number of times.
  • FIG. 3 is a diagram showing a physical configuration of the image generation device 10 according to the present embodiment.
  • the image generation device 10 includes a CPU (Central Processing Unit) 10a corresponding to a calculation unit, a RAM (Random Access Memory) 10b and a ROM (Read Only Memory) 10c corresponding to storage units, a communication unit 10d, an input unit 10e, and a display unit 10f. These components are connected to one another via a bus so that data can be exchanged. In this example the image generation device 10 is described as a single computer, but it may be realized by combining a plurality of computers. The configuration shown in FIG. 3 is an example; the image generation device 10 may have components other than these, or may lack some of them.
  • the image generation device 10 may have, for example, a GPU (Graphical Processing Unit).
  • the CPU 10a is a control unit that controls execution of a program stored in the RAM 10b or ROM 10c, calculates data, and processes data.
  • the CPU 10a is a calculation unit that executes a program (image generation program) that optimizes the parameters of procedural modeling so as to reproduce the image of the reference model.
  • the CPU 10a receives various data from the input unit 10e and the communication unit 10d, displays the calculation result of the data on the display unit 10f, and stores the data in the RAM 10b.
  • the RAM 10b is a storage unit in which data can be rewritten, and may be composed of, for example, a semiconductor storage element.
  • the RAM 10b may store data such as a program executed by the CPU 10a and a reference image. It should be noted that these are examples, and data other than these may be stored in the RAM 10b, or a part of these may not be stored.
  • the ROM 10c is a storage unit capable of reading data, and may be composed of, for example, a semiconductor storage element.
  • the ROM 10c may store, for example, an image generation program or data that is not rewritten.
  • the communication unit 10d is an interface for connecting the image generator 10 to another device.
  • the communication unit 10d may be connected to a communication network such as the Internet.
  • the input unit 10e receives data input from the user, and may include, for example, a keyboard and a touch panel.
  • the display unit 10f visually displays the calculation result by the CPU 10a, and may be configured by, for example, an LCD (Liquid Crystal Display).
  • the display unit 10f may display, for example, a reference image, procedural modeling parameter values, and a rendered image.
  • the image generation program may be stored in a storage medium readable by a computer such as RAM 10b or ROM 10c and provided, or may be provided via a communication network connected by the communication unit 10d.
  • the CPU 10a executes the image generation program to realize the various operations described with reference to FIG. 2. Note that these physical configurations are examples and do not necessarily have to be independent components.
  • the image generation device 10 may include an LSI (Large-Scale Integration) in which the CPU 10a, the RAM 10b, and the ROM 10c are integrated.
  • FIG. 4 is an example of the image I ref of the reference model referred to by the image generator 10 according to the present embodiment.
  • the image I ref of the reference model is an image of fur cut out to a size of 10 cm ⁇ 10 cm.
  • the shooting conditions of the image I ref may be arbitrarily adjusted in advance.
  • the image I ref may be an image of the reference model (fur in this example) taken from diagonally above. By shooting the reference model from diagonally above, the three-dimensional features of the reference model can be captured in a single image.
  • the image generation device 10 may use a plurality of images I_ref for a single reference model.
  • FIG. 5 is an example of the image I_dst generated by the image generation device 10 according to the present embodiment. The generated image I_dst was rendered by the renderer 12 using the optimized parameters p_dst*, which were calculated by repeating the parameter-update process 20 times.
  • as shown in the figure, the image generation device 10 obtains a visually realistic image that is almost indistinguishable from the actual image I_ref.
  • FIG. 6 is an example of the parameters estimated by the image generator 10 according to the present embodiment.
  • 15 parameters in procedural modeling are calculated by the image generator 10.
  • the vertical axis of the graph shown in the figure is the number of the 15 parameters, and the horizontal axis is the value of the parameter.
  • in the figure, the 15 parameters are shown normalized to values between 0 and 1.
  • the points indicated by black circles in the graph of FIG. 6 indicate the values of the parameters obtained by trial and error so as to reproduce the reference image by a procedural modeling expert.
  • the points indicated by white circles in the graph of FIG. 6 indicate the values of the parameters calculated so as to reproduce the reference image by the image generation device 10 according to the present embodiment.
  • among the 15 parameters, some are almost identical between the values obtained by the expert and those calculated by the image generation device 10, while others differ significantly.
  • since either parameter set nevertheless produces an image similar to the reference image, the parameter values that reproduce the reference image may not be unique, which suggests that the region of parameter space corresponding to the reference image has a certain extent.
  • FIG. 7 is a flowchart of the parameter optimization process executed by the image generation device 10 according to the present embodiment.
  • the image generation device 10 calculates a second feature amount, which is a feature amount of the reference image (S10).
  • the image generation device 10 initializes a plurality of parameters for designating the rendering rule (S11).
  • the image generation device 10 may initialize the plurality of parameters by setting them to predetermined default values or by selecting them at random.
  • the image generation device 10 specifies a rendering rule by a plurality of parameters, and renders the model image by the renderer 12 based on the rendering rule (S12). Then, the image generation device 10 calculates the first feature amount, which is the feature amount of the rendered image (S13).
  • the image generation device 10 updates a plurality of parameters so as to reduce the difference between the first feature amount and the second feature amount by particle swarm optimization (S14).
  • the image generation device 10 may update a plurality of parameters by other algorithms such as a covariance matrix adaptive evolution strategy and Bayesian optimization.
  • if the learning end condition is not satisfied, the image generation device 10 executes the processes S12 to S14 again and updates the plurality of parameters.
  • if the learning end condition is satisfied, the image generation device 10 ends the parameter optimization process.
  • the learning end condition may be that the number of epochs (the number of times the processes S12 to S14 have been executed) is equal to or greater than a predetermined number, or that the difference between the first feature amount and the second feature amount is equal to or less than a predetermined value.
  • the image generator 10 can generate an image of a fur-clad 3D model having an arbitrary three-dimensional shape by using a plurality of parameters calculated to reproduce the fur reference model.
  • similarly, even for objects with surface patterns or shapes that have been difficult to reproduce, the image generation device 10 according to the present embodiment can generate visually realistic images of 3D models.
  • 10 ... Image generator 10a ... CPU, 10b ... RAM, 10c ... ROM, 10d ... Communication unit, 10e ... Input unit, 10f ... Display unit, 11 ... Renderer control unit, 12 ... Renderer, 13 ... Feature amount calculation unit, 13a ... CNN, 14 ... storage unit, 14a ... reference image, 15 ... update unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Image Generation (AREA)

Abstract

Provided are an image generation device, an image generation method, and an image generation program for calculating parameters for generating an image which reproduces an image of a reference model. An image generation device 10 is provided with: a renderer control unit 11 for, in accordance with a plurality of parameters, causing a renderer 12 to render an image of a model; a feature amount calculation unit 13 for calculating a first feature amount that is a feature amount of the image and a second feature amount that is a feature amount of an image of a reference model; and an update unit 15 for updating the plurality of parameters such that a difference between the first feature amount and the second feature amount is reduced.

Description

Image generation device, image generation method, and image generation program

Cross-reference of related applications

 This application is based on Japanese Patent Application No. 2019-127078, filed on July 8, 2019, the contents of which are incorporated herein by reference.
 The present invention relates to an image generation device, an image generation method, and an image generation program.
 Conventionally, images of various models have been generated using CG (Computer Graphics) technology. In recent years, research has been conducted on converting the style of an image into a painting style using neural networks (Non-Patent Document 1 below), and on "inverse rendering," which estimates the parameters of a renderer so that a given reference image is reproduced (Non-Patent Document 2).
 However, there are models, such as fur, for which it is difficult to generate visually realistic images. For this reason, a procedural-modeling expert has conventionally compared the image of the reference model with the rendered image and repeated parameter adjustment and rendering by trial and error until a realistic image is produced.
 As described above, adjusting the multiple parameters that determine the conditions of image generation so that the generated image approaches a target image generally takes a long time, even for an expert.
 Therefore, the present invention provides an image generation device, an image generation method, and an image generation program that calculate image-generation parameters that reproduce the image of a reference model.
 An image generation device according to one aspect of the present invention includes a renderer control unit that causes a renderer to render an image of a model according to a plurality of parameters, a feature amount calculation unit that calculates a first feature amount, which is a feature amount of the rendered image, and a second feature amount, which is a feature amount of an image of a reference model, and an update unit that updates the plurality of parameters so as to reduce the difference between the first feature amount and the second feature amount.
 According to this aspect, by updating the plurality of parameters so as to reduce the difference between the feature amount of the reference-model image and the feature amount of the rendered image, image-generation parameters that reproduce the image of the reference model can be calculated.
 In the above aspect, the renderer control unit may specify a rendering rule for procedural modeling using the plurality of parameters, and cause the renderer to render the image of the model based on the rendering rule.
 According to this aspect, by updating the plurality of parameters so as to reduce the difference between the feature amount of the reference-model image and the feature amount of the rendered image, procedural-modeling parameters that reproduce the image of the reference model can be calculated.
 In the above aspect, the renderer may be a 3D renderer that renders an image of a 3D model based on the rendering rule.
 According to this aspect, procedural-modeling parameters that reproduce the image of the reference model can be calculated, and an image of a 3D model similar to the reference model can be rendered.
 In the above aspect, the update unit may update the plurality of parameters so as to satisfy predetermined constraint conditions on the plurality of parameters.
 According to this aspect, the search space for the plurality of parameters can be narrowed, and parameters that reproduce the image of the reference model can be calculated faster.
 In the above aspect, the feature amount calculation unit may include a pre-trained convolutional neural network.
 According to this aspect, the features of the image can be appropriately captured by using the pre-trained convolutional neural network as a feature extractor.
 In the above aspect, the update unit may update the plurality of parameters so as to minimize a loss function that evaluates the difference between the first feature amount and the second feature amount, using at least one of particle swarm optimization, the covariance matrix adaptation evolution strategy (CMA-ES), and Bayesian optimization.
 According to this aspect, even if the loss function is not differentiable with respect to the parameters, globally optimal parameters can be calculated and the image of the reference model can be reproduced.
 An image generation method according to another aspect of the present invention includes causing a renderer to render an image of a model according to a plurality of parameters, calculating a first feature amount, which is a feature amount of the rendered image, and a second feature amount, which is a feature amount of an image of a reference model, and updating the plurality of parameters so as to reduce the difference between the first feature amount and the second feature amount.
 According to this aspect, by updating the plurality of parameters so as to reduce the difference between the feature amount of the reference-model image and the feature amount of the rendered image, image-generation parameters that reproduce the image of the reference model can be calculated.
 An image generation program according to another aspect of the present invention causes a calculation unit provided in an image generation device to function as a renderer control unit that causes a renderer to render an image of a model according to a plurality of parameters, a feature amount calculation unit that calculates a first feature amount, which is a feature amount of the rendered image, and a second feature amount, which is a feature amount of an image of a reference model, and an update unit that updates the plurality of parameters so as to reduce the difference between the first feature amount and the second feature amount.
 According to this aspect, by updating the plurality of parameters so as to reduce the difference between the feature amount of the reference-model image and the feature amount of the rendered image, image-generation parameters that reproduce the image of the reference model can be calculated.
 According to the present invention, it is possible to provide an image generation device, an image generation method, and an image generation program that calculate image-generation parameters that reproduce the image of a reference model.
FIG. 1 is a diagram showing the functional blocks of an image generation device according to an embodiment of the present invention. FIG. 2 is a conceptual diagram of the parameter optimization process performed by the image generation device according to the present embodiment. FIG. 3 is a diagram showing the physical configuration of the image generation device according to the present embodiment. FIG. 4 is an example of an image of a reference model referred to by the image generation device according to the present embodiment. FIG. 5 is an example of an image generated by the image generation device according to the present embodiment. FIG. 6 is an example of parameters estimated by the image generation device according to the present embodiment. FIG. 7 is a flowchart of the parameter optimization process executed by the image generation device according to the present embodiment.
 An embodiment of the present invention will be described with reference to the accompanying drawings. In each figure, elements given the same reference numerals have the same or similar configurations.
 FIG. 1 is a diagram showing the functional blocks of the image generation device 10 according to the embodiment of the present invention. The image generation device 10 includes a renderer control unit 11, a renderer 12, a feature amount calculation unit 13, a storage unit 14, and an update unit 15.
 The renderer control unit 11 causes the renderer 12 to render an image of a model according to a plurality of parameters. The renderer control unit 11 may, for example, cause the renderer 12 to render the model image according to parameters such as brightness, contrast, and color temperature, or according to parameters related to the image-quality adjustment functions of the renderer 12.
 The renderer control unit 11 may also specify a rendering rule for procedural modeling using the plurality of parameters and cause the renderer 12 to render the image of the model based on the rendering rule. Here, the rendering rule is a rule used in the procedural-modeling function of the renderer 12 and includes a rendering algorithm specified by the plurality of parameters.
 In the present embodiment, the renderer 12 is a 3D renderer that renders an image of a 3D model based on the rendering rule specified by the plurality of parameters. The renderer 12 may be composed of a commercially available general-purpose rendering engine.
 The feature amount calculation unit 13 calculates a first feature amount, which is the feature amount of the image rendered by the renderer 12, and a second feature amount, which is the feature amount of the image of the reference model (reference image 14a).
 The feature amount calculation unit 13 may include a pre-trained convolutional neural network (CNN) 13a. The CNN 13a may be composed of, for example, VGGNet (Karen Simonyan and Andrew Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition", arXiv:1409.1556, 2014). The feature amount calculation unit 13 may calculate, as the feature amount, the Gram matrix of the feature maps computed by the CNN 13a, as described in Leon Gatys, Alexander S. Ecker, and Matthias Bethge, "Texture Synthesis Using Convolutional Neural Networks", Advances in Neural Information Processing Systems 28, Curran Associates, Inc., pp. 262-270. In this way, by using a pre-trained convolutional neural network as the feature extractor, the features of the image can be appropriately captured.
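 As a concrete illustration, the following minimal Python sketch computes Gram-matrix features with a pre-trained VGG network from torchvision. It is only a sketch under assumptions: the use of VGG-19, the selected layer indices, and the 224x224 input size are illustrative choices and are not specified by the embodiment.

    # Minimal sketch of Gram-matrix feature extraction with a pre-trained VGG.
    # Assumptions: torchvision VGG-19, 224x224 RGB input, illustrative layer indices.
    import torch
    import torchvision.models as models
    import torchvision.transforms as T
    from PIL import Image

    vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()
    for p in vgg.parameters():
        p.requires_grad_(False)

    preprocess = T.Compose([
        T.Resize((224, 224)),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    def gram_features(image_path, layers=(1, 6, 11, 20, 29)):
        """Return the concatenated, flattened Gram matrices of selected feature maps."""
        x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
        feats = []
        for i, layer in enumerate(vgg):
            x = layer(x)
            if i in layers:
                _, c, h, w = x.shape
                f = x.view(c, h * w)
                gram = (f @ f.t()) / (c * h * w)   # normalized Gram matrix
                feats.append(gram.flatten())
        return torch.cat(feats)

 A feature vector computed this way from the rendered image and from the reference image can then be compared by the loss function described below.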
 The storage unit 14 stores the reference image 14a. The reference image 14a is an image of the reference model and is the target of image generation by the renderer 12. The reference model may be anything, for example fur.
 The update unit 15 updates the plurality of parameters so as to reduce the difference between the first feature amount and the second feature amount. Here, the second feature amount, which is the feature amount of the reference image 14a, is fixed, and the update unit 15 updates the plurality of parameters that specify the rendering rule so that the first feature amount, which is the feature amount of the image rendered by the renderer 12, approaches the second feature amount. In this way, by updating the parameters so as to reduce the difference between the feature amount of the reference-model image and the feature amount of the rendered image, procedural-modeling parameters that reproduce the image of the reference model can be calculated. As a result, the computation cost can be reduced significantly compared with, for example, an exhaustive search over the parameters.
 The update unit 15 may update the plurality of parameters so as to satisfy predetermined constraint conditions on the plurality of parameters. For example, when the parameters include the thickness of the root of a hair and the thickness of its tip, the update unit 15 may impose a constraint that the tip thickness is equal to or less than the root thickness. The update unit 15 may also impose constraints that set an upper limit and a lower limit for each parameter. In this way, the search space for the plurality of parameters can be narrowed, and procedural-modeling parameters that reproduce the image of the reference model can be calculated faster.
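 The following Python sketch illustrates one possible way to enforce such constraints on a candidate parameter set. The parameter names and numeric bounds are hypothetical, chosen only for illustration; the embodiment does not prescribe them.

    import numpy as np

    # Hypothetical per-parameter bounds (lower, upper); the names are illustrative.
    BOUNDS = {
        "root_thickness": (0.01, 1.0),
        "tip_thickness":  (0.005, 1.0),
        "hair_length":    (0.1, 5.0),
    }

    def apply_constraints(params):
        """Clip each parameter to its bounds and keep tip thickness <= root thickness."""
        p = dict(params)
        for name, (lo, hi) in BOUNDS.items():
            p[name] = float(np.clip(p[name], lo, hi))
        # Constraint from the text: the tip must not be thicker than the root.
        p["tip_thickness"] = min(p["tip_thickness"], p["root_thickness"])
        return p

 Applying such a projection after each parameter update keeps every candidate inside the feasible region without changing the optimizer itself.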
 The update unit 15 may update the plurality of parameters so as to minimize a loss function that evaluates the difference between the first feature amount and the second feature amount, using at least one of particle swarm optimization, the covariance matrix adaptation evolution strategy (CMA-ES), and Bayesian optimization. Particle swarm optimization, CMA-ES, and Bayesian optimization are all algorithms that can be applied without computing partial derivatives of the loss function with respect to the parameters. Moreover, they search for a global optimum rather than only a local one. In this way, even if the loss function is not differentiable with respect to the parameters, globally optimal parameters can be calculated, and the image of the reference model can be reproduced.
 FIG. 2 is a conceptual diagram of the parameter optimization process performed by the image generation device 10 according to the present embodiment. The image generation device 10 specifies a rendering rule by a plurality of parameters p_dst and generates an image I_dst of the model with the renderer 12. The image generation device 10 also stores the reference image I_ref.
 The image generation device 10 uses the CNN 13a to calculate a first feature amount x_dst, which is the feature amount of the image I_dst, and a second feature amount x_ref, which is the feature amount of the reference image I_ref. The difference between the first feature amount x_dst and the second feature amount x_ref is then evaluated by a loss function (Loss). The loss function may be, for example, ||x_dst - x_ref||_2, where ||.||_2 denotes the L2 norm.
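 As a minimal sketch, the loss described above can be written as follows, assuming the two feature amounts are given as one-dimensional vectors (for example, the flattened Gram matrices of the earlier sketch). Squaring the norm, a common equivalent choice for minimization, would not change the minimizer.

    import numpy as np

    def feature_loss(x_dst, x_ref):
        """L2 distance between rendered-image features and reference-image features."""
        d = np.asarray(x_dst, dtype=float) - np.asarray(x_ref, dtype=float)
        return float(np.linalg.norm(d))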
 The image generation device 10 updates the plurality of parameters p_dst so as to minimize the loss function, for example using particle swarm optimization. The image generation device 10 repeats the above parameter-update process, determines the optimized parameters p_dst* when a predetermined condition is satisfied, and uses the optimized parameters p_dst* to cause the renderer 12 to render an image of the model. Here, the predetermined condition may be that the value of the loss function is equal to or less than a threshold value, or that the number of epochs (the number of repetitions of the parameter-update process) is equal to or greater than a predetermined number.
 FIG. 3 is a diagram showing the physical configuration of the image generation device 10 according to the present embodiment. The image generation device 10 includes a CPU (Central Processing Unit) 10a corresponding to a calculation unit, a RAM (Random Access Memory) 10b and a ROM (Read Only Memory) 10c corresponding to storage units, a communication unit 10d, an input unit 10e, and a display unit 10f. These components are connected to one another via a bus so that data can be exchanged. In this example the image generation device 10 is described as a single computer, but it may be realized by combining a plurality of computers. The configuration shown in FIG. 3 is an example; the image generation device 10 may have components other than these, or may lack some of them. The image generation device 10 may also have, for example, a GPU (Graphical Processing Unit).
 The CPU 10a is a control unit that controls the execution of programs stored in the RAM 10b or the ROM 10c and performs data calculation and processing. The CPU 10a is a calculation unit that executes a program (the image generation program) that optimizes the parameters of procedural modeling so as to reproduce the image of the reference model. The CPU 10a receives various data from the input unit 10e and the communication unit 10d, and displays calculation results on the display unit 10f or stores them in the RAM 10b.
 The RAM 10b is a rewritable storage unit and may be composed of, for example, semiconductor memory elements. The RAM 10b may store data such as the program executed by the CPU 10a and the reference image. These are examples; the RAM 10b may store other data, or some of this data may not be stored there.
 The ROM 10c is a readable storage unit and may be composed of, for example, semiconductor memory elements. The ROM 10c may store, for example, the image generation program and data that is not rewritten.
 The communication unit 10d is an interface that connects the image generation device 10 to other devices. The communication unit 10d may be connected to a communication network such as the Internet.
 The input unit 10e receives data input from the user and may include, for example, a keyboard and a touch panel.
 The display unit 10f visually displays the calculation results of the CPU 10a and may be configured by, for example, an LCD (Liquid Crystal Display). The display unit 10f may display, for example, the reference image, the values of the procedural-modeling parameters, and the rendered image.
 The image generation program may be provided stored in a computer-readable storage medium such as the RAM 10b or the ROM 10c, or may be provided via a communication network connected through the communication unit 10d. In the image generation device 10, the CPU 10a executes the image generation program to realize the various operations described with reference to FIG. 2. These physical configurations are examples and do not necessarily have to be independent components. For example, the image generation device 10 may include an LSI (Large-Scale Integration) in which the CPU 10a, the RAM 10b, and the ROM 10c are integrated.
 FIG. 4 is an example of the image I_ref of the reference model referred to by the image generation device 10 according to the present embodiment. The image I_ref of the reference model is an image of fur cut out to a size of 10 cm x 10 cm. The shooting conditions of the image I_ref may be adjusted arbitrarily in advance. The image I_ref may be an image of the reference model (fur in this example) taken from diagonally above. By photographing the reference model from diagonally above, the three-dimensional features of the reference model can be captured in a single image. The image generation device 10 may, however, use a plurality of images I_ref for a single reference model.
 FIG. 5 is an example of the image I_dst generated by the image generation device 10 according to the present embodiment. The generated image I_dst was rendered by the renderer 12 using the optimized parameters p_dst*, which were calculated by repeating the parameter-update process 20 times. As shown in the figure, the image generation device 10 obtains a visually realistic image that is almost indistinguishable from the actual image I_ref.
 FIG. 6 is an example of the parameters estimated by the image generation device 10 according to the present embodiment. In this example, 15 procedural-modeling parameters are calculated by the image generation device 10. The vertical axis of the graph in the figure is the index of the 15 parameters, and the horizontal axis is the parameter value. The figure shows the 15 parameters normalized to values between 0 and 1.
 The points indicated by black circles in the graph of FIG. 6 are the parameter values obtained by trial and error by a procedural-modeling expert so as to reproduce the reference image. The points indicated by white circles are the parameter values calculated by the image generation device 10 according to the present embodiment so as to reproduce the reference image. Among the 15 parameters, some are almost identical between the expert's values and those calculated by the image generation device 10, while others differ significantly. Since either parameter set nevertheless produces an image similar to the reference image, the parameter values that reproduce the reference image may not be unique, which suggests that the region of parameter space corresponding to the reference image has a certain extent.
 FIG. 7 is a flowchart of the parameter optimization process executed by the image generation device 10 according to the present embodiment. First, the image generation device 10 calculates the second feature amount, which is the feature amount of the reference image (S10). The image generation device 10 then initializes the plurality of parameters that specify the rendering rule (S11). The image generation device 10 may initialize the parameters by setting them to predetermined default values or by selecting them at random.
 Next, the image generation device 10 specifies the rendering rule by the plurality of parameters and renders the image of the model with the renderer 12 based on the rendering rule (S12). The image generation device 10 then calculates the first feature amount, which is the feature amount of the rendered image (S13).
 After that, the image generation device 10 updates the plurality of parameters by particle swarm optimization so as to reduce the difference between the first feature amount and the second feature amount (S14). The image generation device 10 may instead update the parameters with other algorithms such as the covariance matrix adaptation evolution strategy or Bayesian optimization.
 If the learning end condition is not satisfied (S15: NO), the image generation device 10 executes the processes S12 to S14 again and updates the plurality of parameters. If the learning end condition is satisfied (S15: YES), the image generation device 10 ends the parameter optimization process. Here, the learning end condition may be that the number of epochs (the number of times the processes S12 to S14 have been executed) is equal to or greater than a predetermined number, or that the difference between the first feature amount and the second feature amount is equal to or less than a predetermined value.
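 To make the flow of S10 to S15 concrete, the following self-contained Python sketch runs a small particle swarm optimization over a normalized parameter vector. The renderer and feature extractor are replaced by a toy stand-in function so the sketch runs end to end, and the swarm size, inertia and acceleration coefficients, and epoch limit are illustrative assumptions, not values taken from the embodiment.

    import numpy as np

    # Toy stand-in (assumption) for the chain "render with renderer 12 (S12), then
    # extract features with CNN 13a (S13)". In practice this would call the actual
    # renderer and feature extractor.
    def render_and_extract_features(params):
        return np.tanh(np.outer(params, params)).ravel()

    def optimize(x_ref, dim=15, n_particles=16, n_epochs=20,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
        """Particle swarm optimization of parameters normalized to [0, 1]."""
        rng = np.random.default_rng(seed)
        pos = rng.uniform(0.0, 1.0, (n_particles, dim))   # S11: random initialization
        vel = np.zeros_like(pos)

        def loss(p):                                       # S13/S14: feature difference
            return np.linalg.norm(render_and_extract_features(p) - x_ref)

        pbest = pos.copy()
        pbest_loss = np.array([loss(p) for p in pos])
        gbest = pbest[np.argmin(pbest_loss)].copy()

        for _ in range(n_epochs):                          # S15: epoch-count end condition
            r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
            vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
            pos = np.clip(pos + vel, 0.0, 1.0)             # keep parameters in bounds
            cur = np.array([loss(p) for p in pos])
            better = cur < pbest_loss
            pbest[better], pbest_loss[better] = pos[better], cur[better]
            gbest = pbest[np.argmin(pbest_loss)].copy()
        return gbest                                       # optimized parameters p_dst*

    # Example: recover a hidden reference parameter set (S10 computes x_ref).
    x_ref = render_and_extract_features(np.random.default_rng(1).uniform(0, 1, 15))
    p_opt = optimize(x_ref)

 Replacing the stand-in with calls to the actual renderer and feature extractor, and the fixed epoch count with a loss threshold, yields the loop of FIG. 7.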
 The embodiment described above is intended to facilitate understanding of the present invention and is not intended to limit its interpretation. The elements of the embodiment and their arrangement, materials, conditions, shapes, sizes, and the like are not limited to those illustrated and may be changed as appropriate. Configurations shown in different embodiments may be partially replaced with or combined with one another.
 For example, the image generation device 10 can generate an image of a fur-covered 3D model having an arbitrary three-dimensional shape by using the plurality of parameters calculated so as to reproduce the fur reference model. Similarly, even for objects with surface patterns or shapes that have been difficult to reproduce, the image generation device 10 according to the present embodiment can generate visually realistic images of 3D models.
 10 ... image generation device, 10a ... CPU, 10b ... RAM, 10c ... ROM, 10d ... communication unit, 10e ... input unit, 10f ... display unit, 11 ... renderer control unit, 12 ... renderer, 13 ... feature amount calculation unit, 13a ... CNN, 14 ... storage unit, 14a ... reference image, 15 ... update unit
 
10 ... Image generator, 10a ... CPU, 10b ... RAM, 10c ... ROM, 10d ... Communication unit, 10e ... Input unit, 10f ... Display unit, 11 ... Renderer control unit, 12 ... Renderer, 13 ... Feature amount calculation unit, 13a ... CNN, 14 ... storage unit, 14a ... reference image, 15 ... update unit

Claims (8)

  1.  An image generation device comprising:
     a renderer control unit that causes a renderer to render an image of a model according to a plurality of parameters;
     a feature amount calculation unit that calculates a first feature amount, which is a feature amount of the image, and a second feature amount, which is a feature amount of an image of a reference model; and
     an update unit that updates the plurality of parameters so as to reduce a difference between the first feature amount and the second feature amount.
  2.  The image generation device according to claim 1, wherein the renderer control unit specifies a rendering rule for procedural modeling using the plurality of parameters and causes the renderer to render the image of the model based on the rendering rule.
  3.  The image generation device according to claim 2, wherein the renderer is a 3D renderer that renders an image of a 3D model based on the rendering rule.
  4.  The image generation device according to any one of claims 1 to 3, wherein the update unit updates the plurality of parameters so as to satisfy predetermined constraint conditions on the plurality of parameters.
  5.  The image generation device according to any one of claims 1 to 4, wherein the feature amount calculation unit includes a pre-trained convolutional neural network.
  6.  The image generation device according to any one of claims 1 to 5, wherein the update unit updates the plurality of parameters so as to minimize a loss function that evaluates the difference between the first feature amount and the second feature amount, using at least one of particle swarm optimization, a covariance matrix adaptation evolution strategy, and Bayesian optimization.
  7.  An image generation method comprising:
     causing a renderer to render an image of a model according to a plurality of parameters;
     calculating a first feature amount, which is a feature amount of the image, and a second feature amount, which is a feature amount of an image of a reference model; and
     updating the plurality of parameters so as to reduce a difference between the first feature amount and the second feature amount.
  8.  An image generation program that causes a calculation unit provided in an image generation device to function as:
     a renderer control unit that causes a renderer to render an image of a model according to a plurality of parameters;
     a feature amount calculation unit that calculates a first feature amount, which is a feature amount of the image, and a second feature amount, which is a feature amount of an image of a reference model; and
     an update unit that updates the plurality of parameters so as to reduce a difference between the first feature amount and the second feature amount.
PCT/JP2020/026091 2019-07-08 2020-07-02 Image generation device, image generation method, and image generation program WO2021006189A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019127078 2019-07-08
JP2019-127078 2019-07-08

Publications (1)

Publication Number Publication Date
WO2021006189A1 true WO2021006189A1 (en) 2021-01-14

Family

ID=74114150

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/026091 WO2021006189A1 (en) 2019-07-08 2020-07-02 Image generation device, image generation method, and image generation program

Country Status (1)

Country Link
WO (1) WO2021006189A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7058434B1 (en) * 2021-07-07 2022-04-22 株式会社エクサウィザーズ Generation method, information processing device, program, and information processing system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190087985A1 (en) * 2017-09-06 2019-03-21 Nvidia Corporation Differentiable rendering pipeline for inverse graphics

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190087985A1 (en) * 2017-09-06 2019-03-21 Nvidia Corporation Differentiable rendering pipeline for inverse graphics

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KONDO, KUNIO ET AL.: "2-2 Technical Trends of Interactive Contents Design and Game Producing", THE JOURNAL OF THE INSTITUTE OF IMAGE INFORMATION AND TELEVISION ENGINEERS, vol. 68, no. 2, 1 February 2014 (2014-02-01), pages 121 - 124, XP055784983 *
SAKURAI, KAISEI AND MIYATA KAZUNORI: "Representation of an aggregate composed of staple fibers", IPSJ SIG TECHNICAL REPORT 2012 (HEISEI 24), vol. 2012 -CG, no. 7, 15 August 2012 (2012-08-15), pages 1 - 6, XP055783558 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7058434B1 (en) * 2021-07-07 2022-04-22 株式会社エクサウィザーズ Generation method, information processing device, program, and information processing system

Similar Documents

Publication Publication Date Title
Tan et al. Improved ArtGAN for conditional synthesis of natural image and artwork
JP7009614B2 (en) Deep Neural Network Normalization Methods and Devices, Instruments, and Storage Media
JP6504590B2 (en) System and computer implemented method for semantic segmentation of images and non-transitory computer readable medium
WO2021027759A1 (en) Facial image processing
JP2022505657A (en) Posture fluctuation 3D facial attribute generation
JP2022503647A (en) Cross-domain image conversion
Li et al. Exploring compositional high order pattern potentials for structured output learning
US11189096B2 (en) Apparatus, system and method for data generation
JP2019032820A (en) Data set for learning functions with image as input
US20170256098A1 (en) Three Dimensional Facial Expression Generation
CN115618098B (en) Cold-chain logistics recommendation method and device based on knowledge enhancement and cavity convolution
CN114925849A (en) Federal learning optimization method on graph data
JP2022036918A (en) Uv mapping on 3d object with the use of artificial intelligence
TW202209266A (en) 3d data system and 3d data generation method
CN112927339A (en) Graphic rendering method and device, storage medium and electronic equipment
WO2021006189A1 (en) Image generation device, image generation method, and image generation program
CN112883806B (en) Video style migration method and device based on neural network, computer equipment and storage medium
Xing et al. SVGDreamer: Text Guided SVG Generation with Diffusion Model
US9665955B1 (en) Pose-space shape fitting
WO2021256135A1 (en) Control device, method, and program
Levinski et al. Interactive function-based shape modeling for cyberworlds
EP3872768A1 (en) Method for processing two-dimensional image and device for executing method
JP5071900B2 (en) Image generating apparatus and method
KR20210071024A (en) Morph Target Animation
JP2020087310A (en) Learning method, learning device, program and recording medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20836019

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 20836019

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP