WO2022180782A1

WO2022180782A1 - Information processing device, information processing method, and information processing program

Info

Publication number: WO2022180782A1
Application number: PCT/JP2021/007334
Authority: WO
Inventors: 翔大山田; 弘員柿沼; 秀信長田; 浩太日高
Original assignee: 日本電信電話株式会社
Priority date: 2021-02-26
Filing date: 2021-02-26
Publication date: 2022-09-01

Abstract

An information processing device 1 is provided with: a transformation unit 12 for transforming the style of a content image using the style of a style image; and a finding unit 13 for finding values of various style transformation parameters such that the difference between a feature quantity of a prescribed region of the content image, the style of which has been transformed, and a feature quantity of a prescribed region of the style image is minimized.

Description

Information processing device, information processing method, and information processing program

The present invention relates to an information processing device, an information processing method, and an information processing program.

There is a technique for converting the style (texture, painting style) of a content image using the style of a style image (Non-Patent Document 1). In style conversion, when the granularity of the content image differs from that of the style image, the values of various parameters included in the style conversion algorithm are adjusted, such as the parameters related to image size and style granularity.

In style conversion, the values of various parameters must be adjusted so that the content image after style conversion resembles, on a human perceptual scale, the style of the reference range specified by the user on the style image, that is, so that the sense of scale of the content image and the style image match. Conventionally, however, the values of these parameters have been adjusted manually while visually checking the result of style conversion.

The present invention has been made in view of the above circumstances, and an object of the present invention is to provide a technique capable of improving the work efficiency of style conversion.

An information processing apparatus according to one aspect of the present invention includes: a conversion unit that converts the style of a content image using the style of a style image; and a search unit that searches for values of various parameters for style conversion that minimize the difference between a feature amount of a predetermined range of the content image after style conversion and a feature amount of a predetermined range of the style image.

An information processing method according to one aspect of the present invention is an information processing method performed by an information processing apparatus, comprising: a step of converting the style of a content image using the style of a style image; and a step of searching for values of various parameters for style conversion that minimize the difference between a feature amount of a predetermined range of the content image after style conversion and a feature amount of a predetermined range of the style image.

An information processing program according to one aspect of the present invention is an information processing program that causes a computer to function as the information processing apparatus.

According to the present invention, it is possible to provide a technology that can improve the work efficiency of style conversion.

FIG. 1 is a diagram showing a functional block configuration of an information processing apparatus. FIG. 2 is a diagram illustrating a processing flow of the information processing apparatus. FIG. 3 is a reference diagram for explaining the processing flow. FIG. 4 is a diagram showing the hardware configuration of the information processing apparatus.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. In the description of the drawings, the same parts are denoted by the same reference numerals, and the description thereof is omitted.

[Summary of Invention]
In the present invention, the style A of the entire content image is converted using the style B of the style image, and the values of various parameters for style conversion are automatically searched for so as to minimize the difference (maximize the similarity) between the feature amount of a predetermined range of the content image having the style A' after style conversion and the feature amount of a predetermined range of the style image. Since the values of various parameters for style conversion are searched for automatically, the task of adjusting the sense of scale between the content image and the style image is simplified, which reduces the operation cost of style conversion. As a result, it is possible to provide a technique capable of improving the work efficiency of style conversion.
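The search described above can be read as a single minimization problem. The notation below is introduced here only for illustration and does not appear in the source: C is the content image, S the style image, T_θ the style conversion run with parameter values θ taken from a candidate set Θ, R_1 and R_2 the predetermined ranges, φ the feature extractor, and d a distance between feature amounts.

```latex
\theta^{*} \;=\; \operatorname*{arg\,min}_{\theta \in \Theta}\;
  d\Bigl(\phi\bigl(T_{\theta}(C, S)\big|_{R_1}\bigr),\;
         \phi\bigl(S\big|_{R_2}\bigr)\Bigr)
```

Under this reading, minimizing d corresponds to the "minimize the difference (maximize the similarity)" wording, and θ* is the set of parameter values used for the final conversion.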

[Configuration of information processing device]
FIG. 1 is a diagram showing a functional block configuration of an information processing device 1 according to this embodiment. The information processing device 1 is a style conversion device that converts the style of a content image using a style image. The information processing device 1 includes an input unit 11, a conversion unit 12, a search unit 13, an output unit 14, a storage unit 15, and a display unit 16.

The input unit 11 is a functional unit that inputs, as the reference range R1, the range specified by the user on the content image of style A displayed on the display unit 16. The input unit 11 also inputs, as the reference range R2, the range specified by the user on the style image of style B displayed on the display unit 16.

The conversion unit 12 is a functional unit that converts the style A of the content image using the style B of the style image. The conversion unit 12 also converts the style A of the content image using the values of various parameters for style conversion found by the search unit 13.

The search unit 13 is a functional unit that searches for the values of various parameters for style conversion that minimize the difference between the feature amount of the reference range R1 of the content image of style A' after style conversion and the feature amount of the reference range R2 of the style image of style B.

The output unit 14 is a functional unit that outputs to the display unit 16 a content image whose style has been converted using the values of various parameters for style conversion found by the search unit 13.

The storage unit 15 is a functional unit that stores the values of various parameters for style conversion found by the search unit 13.

The display unit 16 is a functional unit that displays the style-converted content image output by the output unit 14. The display unit 16 is, for example, a touch panel display. The display unit 16 also displays the content image before style conversion, the style image, and the like, and provides a GUI operated by, for example, finger touch.

It should be noted that the functional division of the above functional units is an example. For example, the conversion unit 12, search unit 13, and output unit 14 may be combined into one processing unit.

[Operation of information processing device]
FIG. 2 is a diagram showing a processing flow of the information processing device 1. On the display unit 16, for example, a content image of style A showing a domestic cat and a style image of style B showing a polygonal pattern are displayed side by side. The user wishes to convert the cat image into a polygonal-style image.

Step S1;
First, the input unit 11 inputs a range (ROI; Region of Interest) designated by the user in the style image as a reference range R2.

Step S2;
Next, the input unit 11 determines whether or not the user has specified a range within the content image. Possible determination methods include, for example, displaying a selection screen that asks whether to specify a range within the content image and deciding based on the user's selection, or determining that no range is specified once a certain period of time has elapsed after execution of step S1.

Step S3;
When the user specifies a range within the content image, the input unit 11 inputs the range specified by the user within the content image as the reference range R1.

Step S4;
If the user does not specify a range within the content image, the input unit 11 inputs a range randomly sampled (selected) from the content image as the reference range R1. The sampling range is the range of all or part of the content image.
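Steps S2 to S4 can be sketched as follows. This is a minimal illustration, not the source's implementation: the `(left, top, width, height)` box layout, the function name `choose_reference_range`, and the default 64x64 sample size are all assumptions made here.

```python
import random
from typing import Optional, Tuple

# A range is (left, top, width, height) in pixels; this layout is an assumption.
Box = Tuple[int, int, int, int]


def choose_reference_range(
    image_size: Tuple[int, int],
    user_box: Optional[Box] = None,
    sample_size: Tuple[int, int] = (64, 64),
    rng: Optional[random.Random] = None,
) -> Box:
    """Return the reference range R1 for the content image (steps S2 to S4)."""
    if user_box is not None:
        # Step S3: the user specified a range, so use it as R1.
        return user_box
    # Step S4: no range was specified, so sample one at random.
    rng = rng or random.Random()
    img_w, img_h = image_size
    # Clamp the sampled size so the range always fits inside the image.
    w = min(sample_size[0], img_w)
    h = min(sample_size[1], img_h)
    left = rng.randint(0, img_w - w)
    top = rng.randint(0, img_h - h)
    return (left, top, w, h)
```

Passing a seeded `random.Random` makes the sampled range reproducible, which may help when comparing search results across runs.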

Step S5;
Next, the conversion unit 12 temporarily converts the style A of the content image into the style B of the style image (see FIG. 3).

Step S6;
Next, the search unit 13 searches for the values of various parameters for style conversion that minimize the difference between the feature amount of the reference range R1 of the content image of style A' after style conversion and the feature amount of the reference range R2 of the style image of style B (see FIG. 3). The various parameters include patch size, patch stride, style image size, and the like.

For example, the feature amount obtained by encoding a reference range of an image with VGG19 (a convolutional neural network with a depth of 19 layers) is treated as a feature amount that humans perceive. The search unit 13 then searches for the parameter values that minimize the difference between the VGG19-encoded feature amount of the reference range R1 of the content image after style conversion and the VGG19-encoded feature amount of the reference range R2 of the style image, that is, the parameter values for which the distance dist(VGG19(R1), VGG19(R2)) is shortest. Since the search unit 13 searches for the values of various parameters for style conversion in this way, the work of adjusting the sense of scale between the content image and the style image is simplified.
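The distance computation can be sketched as below. Several things here are assumptions rather than details from the source: the encoder is passed in as a plain callable so that a real VGG19 feature extractor could be substituted, `mean_and_spread` is a toy stand-in that is not VGG19, images are nested lists of grayscale values, and the Euclidean distance is one possible choice since the source does not name the metric.

```python
import math
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]        # grayscale rows, purely for illustration
Box = Tuple[int, int, int, int]  # (left, top, width, height), an assumed layout
Encoder = Callable[[Image], Sequence[float]]


def crop(image: Image, box: Box) -> Image:
    """Cut a reference range out of an image."""
    left, top, width, height = box
    return [row[left:left + width] for row in image[top:top + height]]


def feature_distance(
    converted: Image, r1: Box, style: Image, r2: Box, encode: Encoder
) -> float:
    """Euclidean distance between the encoded features of R1 and R2."""
    f1 = encode(crop(converted, r1))
    f2 = encode(crop(style, r2))
    if len(f1) != len(f2):
        raise ValueError("encoder must return vectors of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


def mean_and_spread(patch: Image) -> List[float]:
    """Toy stand-in encoder (NOT VGG19): mean and max-min of the pixels."""
    pixels = [p for row in patch for p in row]
    return [sum(pixels) / len(pixels), max(pixels) - min(pixels)]
```

Keeping the encoder as a parameter means the distance function does not depend on any particular deep learning library; only the callable would change when a VGG19 encoder is plugged in.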

It should be noted that the search process can be realized using existing techniques such as grid search, which automatically optimizes the parameters of a machine learning model.
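A grid search over the style conversion parameters might look like the following sketch. The parameter names follow the ones listed in step S6, but the candidate values, the `grid_search` helper, and `toy_score` are hypothetical; in the device, the score would come from running style conversion with each candidate and measuring the feature distance between R1 and R2.

```python
import itertools
from typing import Callable, Dict, Iterable, Mapping, Optional, Tuple

Params = Dict[str, int]


def grid_search(
    grid: Mapping[str, Iterable[int]],
    score: Callable[[Params], float],
) -> Tuple[Params, float]:
    """Try every combination in the grid and return the lowest-scoring one."""
    names = list(grid)
    best_params: Optional[Params] = None
    best_score = float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        # In the device, this call would convert the content image with
        # `params` and measure the R1/R2 feature distance of the result.
        current = score(params)
        if best_params is None or current < best_score:
            best_params, best_score = params, current
    if best_params is None:
        raise ValueError("grid must contain at least one combination")
    return best_params, best_score


# Hypothetical candidate values; the source names the parameters, not their ranges.
grid = {
    "patch_size": [3, 5, 7],
    "patch_stride": [1, 2],
    "style_image_size": [256, 512],
}


def toy_score(params: Params) -> float:
    """Stand-in for convert-then-measure; smallest at (5, 1, 512)."""
    return (
        abs(params["patch_size"] - 5)
        + abs(params["patch_stride"] - 1)
        + abs(params["style_image_size"] - 512) / 256
    )


best, best_dist = grid_search(grid, toy_score)
```

Exhaustive search is simple and deterministic, but its cost grows with the product of the candidate counts, so the grid would likely be kept small when each evaluation involves a full style conversion.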

Step S7;
Next, the conversion unit 12 converts the style A of the content image using the values of the searched various parameters for style conversion.

Step S8;
Finally, the output unit 14 outputs the content image subjected to style conversion in step S7 to the display unit 16 for preview display. In addition, the search unit 13 saves the parameter values found for style conversion in the storage unit 15.
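Saving the found parameter values could be as simple as the sketch below. The JSON file format, the file name `style_params.json`, and the helper names `save_params` and `load_params` are assumptions made for illustration; the source only states that the values are saved in the storage unit 15.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def save_params(path: Path, params: Dict[str, int]) -> None:
    """Persist the found parameter values (the role of storage unit 15)."""
    path.write_text(json.dumps(params, indent=2, sort_keys=True), encoding="utf-8")


def load_params(path: Path) -> Dict[str, int]:
    """Read back previously saved parameter values."""
    return json.loads(path.read_text(encoding="utf-8"))


# Usage: round-trip hypothetical values through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "style_params.json"
    found = {"patch_size": 5, "patch_stride": 1, "style_image_size": 512}
    save_params(store, found)
    restored = load_params(store)
```

Stored values could then be reused for a later conversion of the same image pair without repeating the search.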

[effect]
According to this embodiment, the style of the content image is converted by the style of the style image, and the difference between the feature amount of the predetermined range of the content image after style conversion and the feature amount of the predetermined range of the style image is minimized. Since the values of various parameters for style conversion are searched for, it is possible to simplify the task of adjusting the sense of scale between the content image and the style image, thereby reducing the operation cost of style conversion.

Further, according to this embodiment, when the user designates the range (ROI) of the style image, the range on the content image side to be style-converted is automatically selected and the values of various parameters for style conversion are automatically derived, so style conversion can be performed intuitively simply by specifying the ROI in the style image. Since the user can perform the style conversion process intuitively while looking at the style, without being conscious of the parameter values themselves, work efficiency is greatly improved.

From the above, we can provide a technology that can improve the work efficiency of style conversion.

[others]
The invention is not limited to the above embodiments. The present invention can be modified in many ways within the scope of the gist of the present invention.

The information processing apparatus 1 of the present embodiment described above can be realized using a general-purpose computer system including, for example, a CPU 901, a memory 902, a storage 903, a communication device 904, an input device 905, and an output device 906, as shown in FIG. 4. The memory 902 and the storage 903 are storage devices. In the computer system, each function of the information processing apparatus 1 is realized by the CPU 901 executing a predetermined program loaded into the memory 902.

The information processing device 1 may be implemented by one computer. The information processing device 1 may be implemented by a plurality of computers. The information processing device 1 may be a virtual machine implemented in a computer. Programs for the information processing device 1 can be stored in computer-readable recording media such as HDDs, SSDs, USB memories, CDs, and DVDs. The program for information processing device 1 can also be distributed via a communication network.

1: Information processing device
11: Input unit
12: Conversion unit
13: Search unit
14: Output unit
15: Storage unit
16: Display unit
901: CPU
902: Memory
903: Storage
904: Communication device
905: Input device
906: Output device

Claims

1. An information processing device comprising:
a conversion unit that converts the style of a content image using the style of a style image; and
a search unit that searches for values of various parameters for style conversion that minimize the difference between a feature amount of a predetermined range of the content image after style conversion and a feature amount of a predetermined range of the style image.

2. The information processing device according to claim 1, wherein the search unit searches for values of various parameters for style conversion that minimize the difference between a feature amount obtained by encoding the predetermined range of the content image after style conversion with VGG19 and a feature amount obtained by encoding the predetermined range of the style image with VGG19.

3. The information processing device according to claim 1, wherein the predetermined range of the content image is a range randomly selected from the content image.

4. An information processing method performed by an information processing device, the method comprising:
a step of converting the style of a content image using the style of a style image; and
a step of searching for values of various parameters for style conversion that minimize the difference between a feature amount of a predetermined range of the content image after style conversion and a feature amount of a predetermined range of the style image.

5. An information processing program that causes a computer to function as the information processing device according to any one of claims 1 to 3.