WO2021162055A1 - Mental image visualization method, mental image visualization device, and program - Google Patents
Mental image visualization method, mental image visualization device, and program
- Publication number
- WO2021162055A1 (application PCT/JP2021/005052)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- mental
- dnn
- images
- mental image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
Definitions
- This disclosure relates to a mental image visualization method, a mental image visualization device, and a program.
- sensibility is a sensory ability to receive a stimulus from the outside world, and is a sensation accompanied by a specific value judgment that a person feels when observing a visual object.
- A mental image is an image (mental representation) that a person forms and holds in the mind.
- Non-Patent Document 1 discloses a technique for visualizing a mental image.
- various face images are obtained by adding random noise to the prepared base image which is a face image.
- Non-Patent Document 1 discloses a technique in which the psychological inverse correlation method is used to have the subject select, from the various face images, the face image closest to a face of the subject's own race, thereby visualizing the subject's mental image of a face with respect to racial judgment.
- The psychological inverse correlation method is a technique that focuses on which stimulus was being presented when a certain sensibility (for example, "beautiful") arose, and thereby visualizes the image features that contribute to producing that sensibility.
- However, the various face images obtained by adding random noise to the base image are all derived from the base image. That is, the various face images are strongly constrained by the prepared base image. Therefore, the face image selected from them may be close to, but still different from, the mental image of the person who selected it.
- In addition, the prepared base image is a face image obtained by averaging face images taken from a database, so its quality is low.
- The present disclosure has been made in view of the above circumstances, and its purpose is to provide a mental image visualization method, a mental image visualization device, and a program capable of visualizing a human mental image with a higher quality image.
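- The mental image visualization method includes: a step of causing a DNN (Deep Neural Network) trained using a dataset of feature-learning images to generate a plurality of sample images, each showing a different object in the same category as the object shown in the feature-learning images; a step of inputting the plurality of sample images into the DNN; and a step of acquiring from the DNN a feature vector for each of the plurality of sample images, in which the corresponding sample image has been converted by the DNN into an n-dimensional vector (n is an integer of 100 or more).
- The feature vector is used to generate an image showing a mental image.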
- The mental image visualization device includes: a DNN (Deep Neural Network) trained using a dataset of feature-learning images; an acquisition unit that acquires a plurality of sample images generated by the DNN, each showing a different object in the same category as the object shown in the feature-learning images; and an input unit that inputs the plurality of sample images into the DNN.
- The acquisition unit acquires from the DNN a feature vector for each of the plurality of sample images, in which the corresponding sample image has been converted by the DNN into an n-dimensional vector (n is an integer of 100 or more). The feature vector is used to generate an image showing a mental image.
- According to the mental image visualization method of the present disclosure, it is possible to visualize a human mental image with a higher quality image.
- FIG. 1 is a block diagram showing an example of the configuration of the mental image visualization device according to the first embodiment.
- FIG. 2 is a diagram showing the structure of the Generator portion of styleGAN.
- FIG. 3A is a diagram showing an example of a sample image according to the first embodiment.
- FIG. 3B is a diagram showing an example of a feature vector of the sample image shown in FIG. 3A.
- FIG. 4 is a diagram showing an example of a hardware configuration of a computer that realizes the function of the mental image visualization device according to the first embodiment by software.
- FIG. 5 is a flowchart showing the operation of the mental image visualization device according to the first embodiment.
- FIG. 6 is a diagram showing an image obtained by adding and averaging the feature vectors of the two sample images according to the first embodiment.
- FIG. 7 is a block diagram showing an example of the configuration of the mental image visualization system according to the second embodiment.
- FIG. 8 is a diagram showing an example of the network structure of DCNN shown in FIG.
- FIG. 9 is a diagram for explaining the learning method of DCNN shown in FIG.
- FIG. 10 is a diagram showing an example of the results of sensitivity evaluation for a plurality of sample images according to the second embodiment.
- FIG. 11A is a diagram showing an example of a first feature vector calculated by the averaging unit according to the second embodiment.
- FIG. 11B is a diagram showing an image showing a mental image generated from the first feature vector shown in FIG. 11A.
- FIG. 12 is a diagram showing an overall picture of the mental image visualization method in the beauty ugliness evaluation according to the first and second embodiments.
- FIG. 13 is a block diagram showing an example of the configuration of the mental image visualization system according to the first modification of the second embodiment.
- FIG. 14 is a block diagram showing an example of a detailed configuration of the dimensional compression processing unit according to the first modification of the second embodiment.
- FIG. 15 is a diagram schematically showing an operation example of the mental image visualization system according to the first modification of the second embodiment.
- FIG. 16A is a diagram showing an example of a graph in which a plurality of eigenvalues obtained by executing the singular value decomposition according to the operation example shown in FIG. 15 are arranged in rank order.
- FIG. 16B is a diagram for explaining the relationship between an image showing a sub-mental image and an image showing a mental image using the table shown in FIG. 16A.
- FIG. 17 is a block diagram showing an example of a detailed configuration of the dimensional compression processing unit according to the second modification of the second embodiment.
- FIG. 18 is a block diagram showing an example of the configuration of the recommendation system according to the third embodiment.
- FIG. 19 is a diagram showing an example of a mental image or a sub-mental image according to the third embodiment.
- FIG. 20 is a diagram showing an example of the latent space according to the third embodiment and the position of the mental image or the sub-mental image.
- FIG. 21 is a diagram showing an example of the distance between the position of the mental image or sub-mental image and the position of one recommendation candidate image in the latent space according to the third embodiment.
- FIG. 22A is a diagram showing an example of a recommendation candidate image presented by the recommendation image generation UI according to the third embodiment.
- FIG. 22B is a diagram showing an example of a recommendation candidate image presented by the recommendation image generation UI according to the third embodiment.
- FIG. 22C is a diagram showing an example of a recommendation candidate image presented by the recommendation image generation UI according to the third embodiment.
- FIG. 23A is a diagram showing an example of a method of acquiring a mental image or a sub-mental image of the recommendation system according to the third embodiment.
- FIG. 23B is a diagram showing an example of a sample image presentation and evaluation method evaluated by the subject according to the third embodiment.
- FIG. 1 is a block diagram showing an example of the configuration of the mental image visualization device 1 according to the first embodiment.
- the mental image visualization device 1 is realized by a computer or the like using DNN (Deep Neural Networks). More specifically, the mental image visualization device 1 uses the DNN 10 to generate a plurality of sample images. Further, the mental image visualization device 1 uses the DNN 10 to acquire the feature vectors of each of the generated plurality of sample images.
- the sample image will be described as, for example, a face image, and the mental image will be described as relating to the beauty and ugliness of the face, but the present invention is not limited to this.
- the sample image may be an image showing the appearance of an automobile, an image showing the appearance of a house, or an image showing a product.
- the mental image does not have to be about the beauty and ugliness of the face, as long as it is about Kansei adjectives.
- the mental image may be related to the quality of the appearance of the automobile, the quality of the appearance of the house, or the quality of the product.
- any adjectives such as modern and Japanese can be used as Kansei adjectives.
- the mental image visualization device 1 includes a DNN 10, an acquisition unit 11, and an input unit 12, as shown in FIG. The details of each component will be described below.
- The DNN 10 is a multi-layer neural network trained using a dataset of feature-learning images, that is, images used for learning features.
- the DNN 10 generates a plurality of sample images in which different objects are shown in the same category as the objects shown in the feature learning image.
- the data set of the feature learning image may be a data set including various face images acquired from an existing database or the like, or may be a data set including various face images created by oneself.
- The DNN 10 also generates, for each of the plurality of sample images, a feature vector in which the corresponding sample image is converted into an n-dimensional vector (n is an integer of 100 or more). The n-dimensional feature vector is used to generate an image showing a mental image.
- The DNN 10 may be composed of, for example, styleGAN (A Style-Based Generator Architecture for Generative Adversarial Networks). As long as the DNN 10 can generate a plurality of sample images and can also generate an n-dimensional feature vector of an input sample image, it is not limited to styleGAN and may be composed of another GAN or another multi-layer neural network.
- styleGAN is a kind of GAN (Generative Adversarial Network), and can generate a high-resolution image of, for example, 1024 × 1024 pixels.
- styleGAN can generate an image while controlling attributes ranging from the global attributes of a person image (face contour, presence or absence of glasses, etc.) to its local attributes (wrinkles, skin texture, etc.).
- GAN is a kind of model (generative model) that learns data for learning and generates new data similar to the learned data.
- GAN is an architecture that has two networks, a Generator and a Discriminator, and trains them in competition with each other, learning features without being given correct answer data (unsupervised learning). By learning the features from the data, GAN can generate data that does not exist or transform data in accordance with existing data.
- FIG. 2 is a diagram showing the structure of the Generator portion of styleGAN.
- the styleGAN Generator is composed of a mapping network (Mapping network f) and a synthesis network (Synthesis Network g).
- The mapping network is composed of multiple fully connected layers (8 layers in the figure), that is, layers in which every neuron in the preceding layer is connected to every neuron in the following layer.
- The output of the mapping network has the same size (512 × 1) as the input layer.
- The mapping network obtains an intermediate vector, also called an intermediate latent variable w (w ∈ W), by mapping the input vector (latent variable z) to another space (intermediate latent space W).
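The mapping from z to w described above can be sketched as follows. This is a minimal illustration and not the trained styleGAN network: the weights are random, and the leaky ReLU activation, the initialization scale, and the names `make_mapping_network` and `mapping_network` are assumptions made for the example.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # Assumed activation; the text above does not state which one is used.
    return np.where(x > 0, x, slope * x)

def make_mapping_network(dim=512, num_layers=8, seed=0):
    """Create random (untrained) weights for fully connected layers of width `dim`."""
    rng = np.random.default_rng(seed)
    scale = np.sqrt(2.0 / dim)
    return [(rng.normal(0.0, scale, size=(dim, dim)), np.zeros(dim))
            for _ in range(num_layers)]

def mapping_network(z, layers):
    """Map a latent vector z to an intermediate latent vector w of the same size."""
    h = z
    for weight, bias in layers:
        h = leaky_relu(h @ weight + bias)
    return h

layers = make_mapping_network()                    # 8 fully connected layers
z = np.random.default_rng(1).normal(size=512)      # latent variable z
w = mapping_network(z, layers)                     # intermediate latent variable w
print(w.shape)
```

The output has the same 512 × 1 size as the input, which is the property the text relies on when it treats w as a 512-dimensional feature vector.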
- The synthesis network is a network composed of multiple layers (18 layers in the figure). The output of the final layer of the synthesis network is converted to RGB.
- The synthesis network has AdaIN (Adaptive Instance Normalization) and convolution layers.
- AdaIN combines the output of each convolution layer, to which noise has been added, with a style vector obtained by applying an affine transformation to the intermediate vector from the mapping network.
- AdaIN is applied to the output of the convolution layer at each resolution scale (4 × 4, 8 × 8, ...).
- AdaIN performs normalization separately for each feature map (that is, per channel).
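The per-channel normalization performed by AdaIN can be sketched as follows. This is a minimal illustration under stated assumptions: the affine transformation uses random weights, the noise input is omitted, and the names `adain`, `style_scale`, and `style_bias` are hypothetical.

```python
import numpy as np

def adain(features, style_scale, style_bias, eps=1e-8):
    """Adaptive instance normalization.

    features:    array of shape (channels, height, width)
    style_scale: per-channel scale, shape (channels,)
    style_bias:  per-channel bias, shape (channels,)
    """
    # Normalize each feature map (channel) to zero mean and unit variance.
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)
    # Re-scale and shift each channel using the style.
    return style_scale[:, None, None] * normalized + style_bias[:, None, None]

rng = np.random.default_rng(0)
num_channels, w_dim = 4, 512
features = rng.normal(size=(num_channels, 8, 8))   # output of a convolution layer
w = rng.normal(size=w_dim)                         # intermediate vector

# Affine transformation of w into a per-channel scale and bias (random weights here).
affine = rng.normal(scale=0.01, size=(w_dim, 2 * num_channels))
style = w @ affine
style_scale, style_bias = 1.0 + style[:num_channels], style[num_channels:]

out = adain(features, style_scale, style_bias)
print(out.shape)
```

After this operation the mean of each channel equals its style bias and the standard deviation equals the magnitude of its style scale, so the statistics of every feature map are set by the intermediate vector.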
- FIG. 3A is a diagram showing an example of a sample image according to the first embodiment.
- FIG. 3B is a diagram showing an example of a feature vector of the sample image shown in FIG. 3A.
- The sample image shown in FIG. 3A is shown in grayscale, but the sample image may also be a color image.
- The DNN 10 is trained using a dataset of feature-learning images consisting of a plurality of face images for learning the features of face images. The DNN 10 can thereby generate a plurality of sample images, for example face images, by using the generator of styleGAN. For example, the DNN 10 can generate a face image of a woman who does not exist as a sample image, as shown in FIG. 3A.
- When styleGAN is trained using a dataset of feature-learning images consisting of a plurality of face images, a feature vector converted into a 512-dimensional vector can be obtained as the intermediate vector.
- The styleGAN mapping network functions as a neural network that transforms an image of, for example, 512 × 512 pixels or 1024 × 1024 pixels into a 512-dimensional feature vector.
- The DNN 10 generates a feature vector by converting the sample image into a 512-dimensional vector using a part of the generator of styleGAN, namely the mapping network.
- DNN10 can generate the 512-dimensional feature vector shown in FIG. 3B from the sample image shown in FIG. 3A.
- the acquisition unit 11 acquires the feature vector of the sample image input to the DNN 10 by the input unit 12 from the DNN.
- the acquisition unit 11 acquires the feature vector by acquiring the output of the styleGAN mapping network.
- the input unit 12 inputs a plurality of sample images to the DNN 10. In the present embodiment, the input unit 12 inputs the sample image output from the acquisition unit 11 to the DNN 10.
- FIG. 4 is a diagram showing an example of a hardware configuration of a computer 1000 that realizes a mental image visualization function according to the present embodiment by software.
- The computer 1000 includes an input device 1001, an output device 1002, a CPU and GPU 1003, a built-in storage 1004, a RAM 1005, a reading device 1007, a transmission/reception device 1008, and a bus 1009.
- The input device 1001, the output device 1002, the CPU and GPU 1003, the built-in storage 1004, the RAM 1005, the reading device 1007, and the transmission/reception device 1008 are connected by the bus 1009.
- the input device 1001 is a device that serves as a user interface such as an input button, a touch pad, and a touch panel display, and accepts user operations.
- the input device 1001 may be configured to accept a user's contact operation, a voice operation, a remote control, or the like.
- the built-in storage 1004 is a flash memory or the like. Further, in the built-in storage 1004, at least one of a program for realizing the function of the mental image visualization device 1 and an application using the functional configuration of the mental image visualization device 1 may be stored in advance.
- The RAM 1005 is a random access memory, which is used to store data and the like when executing a program or application.
- the reading device 1007 reads information from a recording medium such as a USB (Universal Serial Bus) memory.
- the reading device 1007 reads the program or application from the recording medium on which the above program or application is recorded and stores the program or application in the built-in storage 1004.
- The transmission/reception device 1008 is a communication circuit for wireless or wired communication.
- The transmission/reception device 1008 communicates with, for example, a server device connected to a network, downloads a program or application as described above from the server device, and stores it in the built-in storage 1004.
- The CPU and GPU 1003 are a central processing unit and a graphics processing unit. They copy the programs and applications stored in the built-in storage 1004 to the RAM 1005, then sequentially read the instructions contained in those programs and applications from the RAM 1005 and execute them.
- FIG. 5 is a flowchart showing the operation of the mental image visualization device 1 according to the first embodiment.
- The mental image visualization device 1 causes the trained DNN 10 to generate a plurality of sample images (S10). More specifically, the mental image visualization device 1 first trains the DNN 10 shown in FIG. 1 using a dataset of feature-learning images. The mental image visualization device 1 then causes the DNN 10 trained in this way to generate a plurality of sample images, each showing a different object in the same category as the object shown in the feature-learning images.
- The mental image visualization device 1 inputs the plurality of sample images generated in step S10 into the DNN 10 (S11).
- The mental image visualization device 1 acquires the feature vectors of the plurality of sample images generated in step S10 from the DNN 10 (S12). More specifically, for each of the plurality of sample images generated in step S10, the mental image visualization device 1 acquires from the DNN 10 a feature vector in which the corresponding sample image has been converted by the DNN 10 into an n-dimensional vector (n is an integer of 100 or more).
- The DNN 10 trained using the dataset of feature-learning images can generate a plurality of sample images, each showing a different object in the same category as the object shown in the feature-learning images. Further, since the DNN 10 can convert each of the plurality of sample images into a feature vector that is an n-dimensional vector (n is an integer of 100 or more), the feature vectors can be acquired from the DNN 10.
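The flow of steps S10 to S12 can be sketched with a toy stand-in for the DNN 10. Everything here is an assumption made for illustration: `ToyDNN` is a hypothetical class that uses a random linear map as the generator and its pseudo-inverse as the image-to-vector conversion, whereas the DNN described in the text is a trained network such as styleGAN.

```python
import numpy as np

class ToyDNN:
    """Hypothetical stand-in for the DNN 10 (linear generator plus pseudo-inverse)."""

    def __init__(self, n_dims=128, n_pixels=256, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n_dims = n_dims                                   # n >= 100, as in the text
        self.decoder = self.rng.normal(size=(n_dims, n_pixels))
        self.encoder = np.linalg.pinv(self.decoder)            # shape (n_pixels, n_dims)

    def generate_samples(self, count):
        """Generate `count` flattened sample images."""
        latents = self.rng.normal(size=(count, self.n_dims))
        return latents @ self.decoder

    def feature_vectors(self, images):
        """Convert each input image into an n-dimensional feature vector."""
        return images @ self.encoder

dnn = ToyDNN()
sample_images = dnn.generate_samples(5)                 # S10: generate sample images
feature_vectors = dnn.feature_vectors(sample_images)    # S11 and S12: input images, get vectors
print(sample_images.shape, feature_vectors.shape)
```

In this toy setting, decoding the acquired feature vectors reproduces the sample images, which mirrors the role the feature vectors play later when an image showing a mental image is generated from them.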
- FIG. 6 is a diagram showing an image obtained by adding and averaging the feature vectors of the two sample images according to the first embodiment.
- The female face images and feature vectors shown in FIGS. 6(a) and 6(b) are examples of two different sample images and their respective feature vectors.
- The feature vector generated by the mental image visualization device 1 of the present embodiment is, for example, a 512-dimensional feature vector and has a certain degree of linearity. Therefore, the image generated from the average of the feature vectors of two different sample images, such as those shown in FIGS. 6(a) and 6(b), contains the features of both sample images on average, as shown by the female face image in FIG. 6(c). FIG. 6(c) also shows that this averaged image is of similarly high quality to the images shown in FIGS. 6(a) and 6(b).
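The averaging of two feature vectors can be sketched as follows. The vectors are random placeholders rather than real sample-image features, and the linear `generator` matrix is a toy assumption used only to show what "linearity" buys; the actual generator is nonlinear and only approximately linear.

```python
import numpy as np

def average_feature_vectors(vec_a, vec_b):
    """Element-wise mean of two feature vectors."""
    return (vec_a + vec_b) / 2.0

rng = np.random.default_rng(0)
w_a = rng.normal(size=512)   # placeholder feature vector of sample image (a)
w_b = rng.normal(size=512)   # placeholder feature vector of sample image (b)
w_mean = average_feature_vectors(w_a, w_b)

# Toy linear generator (assumption): maps a feature vector to 64 "pixels".
generator = rng.normal(size=(512, 64))
image_from_mean = w_mean @ generator
mean_of_images = (w_a @ generator + w_b @ generator) / 2.0
print(np.allclose(image_from_mean, mean_of_images))
```

For an exactly linear generator the image of the averaged vector equals the average of the two images; the text claims only "a certain degree" of this behavior for the real feature space.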
- the feature vector of the sample image having the highest sensibility evaluation score can be obtained.
- the image generated from the feature vector of the sample image having a high sensitivity evaluation score may be an image showing a mental image.
- the feature vectors of each of a plurality of sample images having a relatively high sensibility evaluation score may be obtained.
- An image generated by applying a nonlinear transformation F to the feature vector obtained as the weighted average of the feature vectors of the plurality of sample images having relatively high sensitivity evaluation scores may be used as the image showing the mental image.
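The weighted averaging of feature vectors by evaluation score can be sketched as follows. The text does not specify the weighting scheme, so using positive scores directly as weights, and the `top_k` selection of high-scoring samples, are assumptions; `weighted_mean_feature_vector` is a hypothetical helper name.

```python
import numpy as np

def weighted_mean_feature_vector(feature_vectors, scores, top_k=None):
    """Score-weighted average of feature vectors.

    feature_vectors: array of shape (num_samples, n_dims)
    scores:          positive evaluation scores, shape (num_samples,)
    top_k:           if given, keep only the top_k highest-scoring samples
    """
    feature_vectors = np.asarray(feature_vectors, dtype=float)
    scores = np.asarray(scores, dtype=float)
    if top_k is not None:
        keep = np.argsort(scores)[-top_k:]          # indices of the highest scores
        feature_vectors, scores = feature_vectors[keep], scores[keep]
    weights = scores / scores.sum()                 # normalize weights to sum to 1
    return weights @ feature_vectors

# Three 4-dimensional toy vectors with scores 1, 3, and 6.
vectors = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0]])
scores = np.array([1.0, 3.0, 6.0])
first_feature_vector = weighted_mean_feature_vector(vectors, scores, top_k=2)
print(first_feature_vector)
```

The resulting vector would then be passed to whatever transformation generates the image, for example the synthesis network of styleGAN.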
- the feature vector generated by the mental image visualization device 1 of the present embodiment can be used to generate an image showing the mental image.
- The DNN 10 may be used to generate an image from the feature vector.
- When the DNN 10 is composed of styleGAN, an image can be generated from a feature vector by using the synthesis network of styleGAN.
- A neural network that can generate an image from a multidimensional feature vector is not limited to the styleGAN synthesis network.
- According to the mental image visualization device 1 of the present embodiment, the sample images whose sensitivity is evaluated by the psychological inverse correlation method do not depend on the above-mentioned base image and can be generated with high resolution and high image quality, for example 1024 × 1024 pixels. Further, according to the mental image visualization device 1 of the present embodiment, the feature vectors of the generated sample images can also be generated. As a result, the feature vectors of the sample images whose sensitivity has been evaluated by the psychological inverse correlation method can be weighted according to the evaluation results and averaged, and the image generated from the feature vector calculated in this way can be obtained as an image showing the mental image. That is, according to the mental image visualization device 1 of the present embodiment, it is possible to visualize a human mental image with a higher quality image.
- a mental image visualization system 100 including a DCNN that evaluates the sensitivity of a sample image generated by the mental image visualization device and generates an image showing the mental image will be described.
- FIG. 7 is a block diagram showing an example of the configuration of the mental image visualization system 100 according to the second embodiment.
- the same elements as those in FIG. 1 are designated by the same reference numerals, and detailed description thereof will be omitted.
- the function of the mental image visualization system 100 is realized by software using the computer 1000 shown in FIG. 4, as in the first embodiment.
- the mental image visualization system 100 includes a mental image visualization device 1A, a DCNN 13, and an addition averaging unit 14. The details of each component will be described below.
- The mental image visualization device 1A shown in FIG. 7 has the same configuration as the mental image visualization device 1 shown in FIG. In the mental image visualization device 1A, it is additionally made explicit that the DNN 10 generates an image showing the mental image from the first feature vector obtained by the addition averaging unit 14.
- DNN10 generates an image showing a mental image from the first feature vector.
- the first feature vector is input to the DNN 10 by the input unit 12.
- the DNN 10 generates an image showing a mental image from the input first feature vector.
- When the DNN 10 is composed of styleGAN, the DNN 10 inputs the first feature vector into the synthesis network of styleGAN and causes the synthesis network to generate an image showing the mental image. Since the details are as described in the first embodiment, the description is omitted here.
- the acquisition unit 11 acquires an image showing the mental image generated by the DNN 10.
- The input unit 12 inputs the first feature vector obtained from the addition averaging unit 14 to the DNN 10.
- The DCNN 13 consists of a convolutional neural network trained using a learning dataset composed of a plurality of images prepared using the psychological inverse correlation method and the results of sensitivity evaluations of those images performed by a subject who has the mental image.
- the learning data set may be a data set including various face images acquired from an existing database or the like, or may be a data set including various face images created by oneself. In this way, the DCNN 13 can learn in advance the preference of the subject who has the mental image of the visualization target.
- the plurality of images prepared by using the psychological inverse correlation method are, for example, facial images as in the first embodiment.
- the sensitivity evaluation for a plurality of images is, for example, the sensitivity evaluation for the beauty and ugliness of the face.
- The DCNN 13 predicts the results of the sensitivity evaluation for the plurality of sample images and outputs them as the results of the sensitivity evaluation by the psychological inverse correlation method for the plurality of sample images.
- The DCNN 13 may be composed of, for example, a pre-trained CNN (Convolutional Neural Network), one or more convolutional layers provided after the CNN, and a GAP (Global Average Pooling) layer provided after the one or more convolutional layers. The CNN is a convolutional neural network having a plurality of convolutional layers and a plurality of pooling layers.
- FIG. 8 is a diagram showing an example of the network structure of DCNN13 shown in FIG.
- The DCNN 13 is, for example, as shown in FIG. 8, a convolutional neural network composed of a pre-trained VGG19, three convolutional layers, and one GAP layer.
- The pre-trained VGG19 is an example of the CNN included in the DCNN 13.
- The VGG19 can be obtained from public sources on the Internet.
- The CNN of the DCNN 13 is not limited to the pre-trained VGG19.
- the DCNN 13 may have one or more convolution layers formed after the CNN, and is not limited to the case where the three convolution layers shown in FIG. 8 are formed.
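The global average pooling step that closes the network described above can be sketched as follows. This is a minimal illustration in plain Python, not the patented implementation: the function name `global_average_pooling` and the tiny 2x2 feature maps are assumptions made for the example, and the pre-learned VGG19 and the convolutional layers that would produce the feature maps are not shown.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel: rows x columns


def global_average_pooling(feature_maps: List[FeatureMap]) -> List[float]:
    """Collapse each channel (a 2-D map) to the mean of its values."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Hypothetical output of the last convolutional layer: 2 channels of size 2x2.
maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
pooled = global_average_pooling(maps)  # one value per channel
```

If the last convolutional layer had a single channel, the pooled value could be read directly as the predicted sensitivity score; that reading is an assumption, since the text does not state the channel count.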
- FIG. 9 is a diagram for explaining the learning method of DCNN13 shown in FIG.
- A learning data set is prepared that is composed of a plurality of facial images prepared by using the psychological inverse correlation method and the results of sensitivity evaluations of the plurality of images performed by a subject having the mental image to be visualized.
- a learning data set is obtained by assigning a sensitivity evaluation score indicating how beautiful the subject feels to each of a plurality of facial images in which a female face is captured.
- the face images of the learning data set are input to DCNN13 one by one as input images, the score of the sensitivity evaluation given by the subject is predicted, and if there is a difference, feedback is given to DCNN13 so as to eliminate the difference.
- Training is performed over all the facial images of the learning data set so as to minimize the difference between the score predicted by the DCNN 13 and the score given by the subject. That is, the DCNN 13 undergoes supervised learning, in which correct answer data is given using the learning data set.
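The supervised learning loop described above (predict a score, compare it with the subject's score, feed the difference back) can be sketched as follows. This is only an illustration under stated assumptions: a linear model stands in for the DCNN 13, the two-dimensional feature vectors are hypothetical, and the function names `predict`, `train`, and `mean_squared_error` are invented for the example. The scores 3.7, 2.2, 4.2, and 3.1 are borrowed from the example values quoted for FIG. 10.

```python
from typing import List, Sequence, Tuple


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Linear stand-in for the DCNN: maps a feature vector to a predicted score."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mean_squared_error(weights, bias, samples, scores) -> float:
    """Average squared difference between predicted and subject-given scores."""
    errors = [predict(weights, bias, x) - y for x, y in zip(samples, scores)]
    return sum(e * e for e in errors) / len(errors)


def train(samples, scores, lr: float = 0.1, epochs: int = 2000) -> Tuple[List[float], float]:
    """Feed samples one by one and correct the parameters by the prediction error."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, scores):
            error = predict(weights, bias, x) - target  # predicted minus given score
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical image features paired with the subject's scores.
samples = [[0.9, 0.1], [0.2, 0.4], [1.0, 0.3], [0.5, 0.5]]
scores = [3.7, 2.2, 4.2, 3.1]

initial_mse = mean_squared_error([0.0, 0.0], 0.0, samples, scores)
weights, bias = train(samples, scores)
final_mse = mean_squared_error(weights, bias, samples, scores)
```

After training, `final_mse` is much smaller than `initial_mse`, which is the sense in which the model comes to evaluate images on behalf of the subject.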
- the DCNN 13 can learn the sensitivity evaluation of the target person (individual), so that the DCNN 13 can perform the sensitivity evaluation for any facial image on behalf of the target person.
- sensitivity judgment such as beauty and ugliness is performed based on the template (that is, mental image) that the individual has in his / her mind.
- Because the DCNN 13 can perform the sensitivity evaluation for an arbitrary facial image on behalf of the subject by appropriately learning its parameters, the inventors found that the DCNN 13 can be made to learn the mental image that the individual has in mind.
- By preparing the above-mentioned learning data set and training the DCNN 13 on it, the sensibility (mental image) of an individual with special skills, such as a famous artist or designer, can be saved in its parameters.
- FIG. 10 is a diagram showing an example of the results of sensitivity evaluation for a plurality of sample images according to the second embodiment.
- the facial images of the plurality of women shown in FIG. 10 are examples of a plurality of sample images whose sensibilities were evaluated on behalf of the subject by DCNN13.
- The values 3.7, 2.2, 4.2, 3.1, ... shown in FIG. 10 are examples of the sensibility evaluation scores predicted by the DCNN 13 on behalf of the subject for each of the facial images of the plurality of women.
- FIG. 10 also shows the feature vectors of the facial images of a plurality of women.
- The addition averaging unit 14 obtains the first feature vector by applying a nonlinear transformation F to the weighted average of the feature vectors corresponding to the plurality of sample images, where the weights follow the result of the sensitivity evaluation by the psychological inverse correlation method for the plurality of sample images.
- the addition averaging unit 14 outputs the first feature vector to the input unit 12 to input the first feature vector to the DNN 10.
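The computation performed by the addition averaging unit 14 can be sketched as follows. The text does not specify the nonlinear transformation F, so `math.tanh` is used here purely as a placeholder; using the raw scores as weights is likewise an assumption, and the function names and the two-dimensional vectors are hypothetical (the feature vectors in the text are 512-dimensional).

```python
import math
from typing import Callable, List, Sequence


def weighted_average(vectors: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Score-weighted mean of the feature vectors, computed per dimension."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total
            for i in range(dim)]


def first_feature_vector(vectors: Sequence[Sequence[float]],
                         scores: Sequence[float],
                         nonlinear: Callable[[float], float] = math.tanh) -> List[float]:
    """Weighted average followed by an element-wise nonlinear transformation F."""
    return [nonlinear(x) for x in weighted_average(vectors, scores)]


# Two hypothetical sample-image feature vectors and their evaluation scores.
vectors = [[1.0, 0.0], [0.0, 1.0]]
scores = [3.0, 1.0]
z1 = first_feature_vector(vectors, scores)  # vector that would be passed to the DNN 10
```

Sample images with higher scores pull the average toward their feature vectors, so the result leans toward what the subject rated highly.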
- FIG. 11A is a diagram showing an example of a first feature vector calculated by the addition averaging unit 14 according to the second embodiment.
- FIG. 11B is a diagram showing an image showing a mental image generated from the first feature vector shown in FIG. 11A.
- The first feature vector shown in FIG. 11A is obtained by weighted averaging of the feature vectors of the plurality of female face images shown in FIG. 10, based on the sensitivity evaluation scores predicted for those images, followed by a non-linear conversion.
- the first feature vector shown in FIG. 11A is input to the DNN 10 by the input unit 12.
- the synthetic network of DNN10 can generate the image shown in FIG. 11B as the image showing the mental image from the input first feature vector shown in FIG. 11A.
- the image shown in FIG. 11B corresponds to the mental image of the subject (individual) regarding the beautiful face.
- FIG. 12 is a diagram showing an overall picture of the mental image visualization method in the beauty ugliness evaluation according to the first and second embodiments.
- FIG. 12 shows a method of visualizing a mental image in the beauty ugliness evaluation by performing the beauty ugliness evaluation by an individual.
- The subject having the mental image may evaluate the sensibility of the plurality of sample images generated by the mental image visualization device 1A, or the DCNN 13 may predict the sensibility evaluation as described above.
- The mental image visualization system 100 can generate an image showing a mental image from the first feature vector obtained by averaging the feature vectors of the plurality of sample images, weighted according to the results of the sensitivity evaluation of the generated sample images, and then applying a non-linear conversion.
- The plurality of sample images generated by the mental image visualization device 1A are not images derived from a base image, that is, an actual image selected or prepared by the person attempting the visualization, but are non-existent images or images obtained by converting an existing image. Further, since the image generated from the first feature vector, which is obtained by averaging the feature vectors of the plurality of sample images and then applying a non-linear conversion, can be used as the image showing the mental image, the image showing the mental image is not limited to the prepared sample images. That is, according to the present embodiment, it is possible to generate an image that is closer to, or that shows, the mental image of the subject.
- The plurality of sample images generated by the mental image visualization device 1A are, for example, high-resolution and high-quality images of 1024 pixels × 1024 pixels. Therefore, an image showing a mental image generated from the first feature vector calculated from the feature vectors of a plurality of sample images can also be generated as a high-quality image.
- With the mental image visualization system 100 of the present embodiment, it is possible to visualize a human mental image with a higher quality image.
- the DCNN 13 can be made to learn the sensitivity evaluation (mental image) for each individual. This makes it possible to store the sensibilities (mental images) of individuals with special skills, such as renowned artists or designers, in their parameters.
- the artist or designer can save his or her sensibility at a certain point in time as a parameter in a multi-layer neural network called DCNN13. For this reason, an artist or designer can create a work or design at any time by referring to an image showing his or her sensibility in the past.
- If a learning data set can be prepared that consists of a plurality of images prepared by using the psychological inverse correlation method and the evaluation results of the plurality of images by a specific group, such as men in their 40s or residents of Kansai, the DCNN 13 can be made to learn the mental image of that specific group.
- By using a DCNN 13 that has learned the sensitivity evaluation (mental image) of a specific target person or a specific group, it is possible to predict, for example, the sensitivity evaluation of the pros and cons of a certain design. This has the effect of eliminating the need to actually conduct large-scale market research on the pros and cons of the design. Furthermore, it has the effect that the sensibility evaluation that the sales target will make can be grasped in advance without conducting large-scale market research on the design.
- Since an image showing the mental image of the designer can be generated, the mental image of the designer can be shared as an image with people other than the designer, such as developers or sales staff. For example, at the product image development stage, the mental image of a design held by the designer or developer can be visualized and shared within the group.
- With the mental image visualization system 100 of the present embodiment, it is possible to generate, in a short time, an image that visualizes the image (mental image) held by a customer ordering custom-built construction or the like. This also has the effect of making it possible to develop the product design required by the customer with high accuracy.
- For example, the ideal appearance of a house imagined by a high-income earner can be visualized as a concrete image, which also has the effect that it can be easily reflected in the design by a house builder.
- With the mental image visualization system 100 of the present embodiment, it is possible to visualize a specific ideal face for each individual as a high-quality image. As a result, it is possible to share with others an image of the result after makeup or cosmetic surgery, which shows the ideal face for each individual.
- The above description covered the case where the feature vectors corresponding to the plurality of sample images output by the mental image visualization device 1A are weighted and averaged according to the result of the sensitivity evaluation by the psychological inverse correlation method that the DCNN 13 produces for the plurality of sample images.
- In that case, the first feature vector is obtained by dimensionally compressing the 512-dimensional feature vectors corresponding to the plurality of sample images to one dimension, but the dimensional compression is not limited to one dimension. Dimensional compression may be performed to two or three dimensions.
- In Modification 1 of the second embodiment, a case where the dimensions are compressed to about two or three dimensions will be described.
- FIG. 13 is a block diagram showing an example of the configuration of the mental image visualization system 100B according to the first modification of the second embodiment.
- the same elements as those in FIG. 7 are designated by the same reference numerals, and detailed description thereof will be omitted.
- the function of the mental image visualization system 100B is realized by software using the computer 1000 shown in FIG. 4, as in the first embodiment.
- the mental image visualization system 100B shown in FIG. 13 is different from the mental image visualization system 100 shown in FIG. 7 in that it includes a dimensional compression processing unit 14B instead of the addition averaging unit 14.
- The dimensional compression processing unit 14B calculates weighted feature vectors by weighting the feature vectors corresponding to the plurality of sample images output by the mental image visualization device 1B according to the result of the sensitivity evaluation by the psychological inverse correlation method that the DCNN 13 produces for the plurality of sample images. The dimensional compression processing unit 14B then outputs a plurality of eigenvectors obtained by dimensionally compressing the weighted feature vectors by STC (Spike-triggered covariance) analysis.
- FIG. 14 is a block diagram showing an example of a detailed configuration of the dimensional compression processing unit 14B according to the first modification of the second embodiment.
- the dimensional compression processing unit 14B includes a variance-covariance matrix calculation unit 141, a singular value decomposition execution unit 142, an eigenvalue selection unit 143, and an eigenvector derivation unit 144.
- the variance-covariance matrix calculation unit 141 weights the feature vectors corresponding to the plurality of sample images according to the results of the sensitivity evaluation by the psychological inverse correlation method for the plurality of sample images.
- the variance-covariance matrix calculation unit 141 calculates the variance-covariance matrix of the weighted feature vector by STC (Spike-triggered covariance) analysis.
- the singular value decomposition execution unit 142 executes singular value decomposition on the calculated variance-covariance matrix to obtain a plurality of eigenvalues.
- the STC matrix of the weighted feature vector may be calculated and decomposed into singular values to obtain a plurality of eigenvalues.
- STC analysis is an analysis method similar to principal component analysis.
- STC analysis can be described as a method that re-selects mutually orthogonal axes of the space so as to maximize the variance of the distribution of the features of interest, within the distribution obtained by assigning random values to a multidimensional vector. Re-selecting mutually orthogonal axes in multiple dimensions can be realized by taking the eigenvectors of the STC matrix. By re-expressing the distribution of the desired features along the re-selected axes, the multidimensional vector can be expressed in a reduced (compressed) form.
- The eigenvalue selection unit 143 selects at least two eigenvalues from the plurality of eigenvalues obtained by the singular value decomposition execution unit 142. For example, the eigenvalue selection unit 143 may select, from the plurality of eigenvalues obtained by the singular value decomposition execution unit 142 and arranged in rank order, eigenvalues whose variance is higher than the average and eigenvalues whose variance is lower than the average. In this modification, the eigenvalue selection unit 143 selects three eigenvalues: the first and second largest eigenvalues and the smallest eigenvalue when arranged in rank order.
- the eigenvector derivation unit 144 derives at least two eigenvectors having any of the at least two eigenvalues selected by the eigenvalue selection unit 143.
- the eigenvector derivation unit 144 outputs at least two derived eigenvectors to the mental image visualization device 1B.
- the eigenvector derivation unit 144 derives three eigenvectors having the first and second largest eigenvalues and the smallest eigenvalue. In this case, the eigenvector derivation unit 144 outputs the three eigenvectors derived to the input unit 12 of the mental image visualization device 1B.
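The chain of units 141 to 144 described above can be sketched with NumPy as follows. This is a simplified illustration, not the patented procedure: it computes a score-weighted covariance matrix and decomposes it with `numpy.linalg.svd`, whereas a full STC analysis usually also compares against the covariance of the unweighted stimuli. The function name `stc_eigenvectors`, the random data, and the use of 8 dimensions instead of 512 are assumptions made for brevity.

```python
import numpy as np


def stc_eigenvectors(features: np.ndarray, scores: np.ndarray):
    """Return the two largest and the smallest eigenvalue with their eigenvectors.

    features: (N, D) feature vectors of the sample images.
    scores:   (N,) non-negative sensitivity evaluation scores used as weights.
    """
    weights = scores / scores.sum()
    mean = weights @ features                      # score-weighted mean, shape (D,)
    centered = features - mean
    # Score-weighted variance-covariance matrix, shape (D, D).
    cov = (centered * weights[:, None]).T @ centered
    # For a symmetric positive semi-definite matrix the singular values equal
    # the eigenvalues, returned in descending (rank) order.
    u, s, _ = np.linalg.svd(cov)
    selected = [0, 1, len(s) - 1]                  # first, second, and last in rank order
    return s[selected], u[:, selected].T


rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))    # hypothetical feature vectors
scores = rng.uniform(1.0, 5.0, size=200)  # hypothetical evaluation scores
values, vectors = stc_eigenvectors(features, scores)
```

Each row of `vectors` would then be passed to the generator to produce one sub-mental image.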
- the mental image visualization device 1B shown in FIG. 13 has the same configuration as the mental image visualization devices 1 and 1A shown in FIGS. 1 and 7.
- the DNN 10 generates an image showing at least two sub-mental images from at least two eigenvectors obtained by the dimensional compression processing unit 14B.
- Each of at least two sub-mental images corresponds to one image obtained by decomposing the above-mentioned mental image.
- the input unit 12 inputs at least two eigenvectors obtained from the dimensional compression processing unit 14B to the DNN 10.
- At least two eigenvectors are input to the DNN 10 by the input unit 12.
- The DNN 10 then generates an image showing at least two sub-mental images from the at least two input eigenvectors.
- the input unit 12 inputs each of at least two eigenvectors to the styleGAN generator.
- the styleGAN generator generates an image showing at least two sub-mental images constituting the mental image, which are assumed to be orthogonal to each other.
- the acquisition unit 11 acquires an image showing a sub-mental image generated by the DNN 10.
- FIG. 15 is a diagram schematically showing an operation example of the mental image visualization system 100B according to the first modification of the second embodiment.
- FIG. 15 shows a case in which an image showing a sub-mental image regarding the quality of the appearance of an automobile is generated.
- A plurality of sample images showing the appearance of the automobile generated by the mental image visualization device 1B are shown as sample images S_1, S_2, ..., S_{N-1}, S_N.
- As described above, the sample images S_1, S_2, ..., S_{N-1}, S_N are each represented by a 512-dimensional feature vector in the mental image visualization system 100B.
- The DCNN 13 is made to output the result of the sensitivity evaluation by the psychological inverse correlation method for the sample images S_1, S_2, ..., S_{N-1}, S_N.
- The dimensional compression processing unit 14B calculates the variance-covariance matrix 141a. Specifically, the dimensional compression processing unit 14B calculates, by STC analysis, the variance-covariance matrix 141a of the feature vectors W_1, W_2, ..., W_{N-1}, W_N, which are obtained by weighting the feature vectors of the sample images S_1, S_2, ..., S_{N-1}, S_N according to the result of the sensitivity evaluation by the psychological inverse correlation method.
- the dimensional compression processing unit 14B performs the eigenvector analysis 142a. Specifically, the dimensional compression processing unit 14B executes singular value decomposition on the calculated variance-covariance matrix 141a to obtain 512 eigenvalues. Then, the dimensional compression processing unit 14B creates a graph in which 512 eigenvalues obtained by executing the singular value decomposition are arranged in rank order, for example, a graph as shown in FIG. 16A.
- FIG. 16A is a diagram showing an example of a graph in which a plurality of eigenvalues obtained by executing the singular value decomposition according to the operation example shown in FIG. 15 are arranged in rank order.
- the vertical axis shown in FIG. 16A shows the variance (variation).
- In FIG. 16A, it can be seen that there are eigenvalues represented by dots that overlap and look like a line, and eigenvalues that lie far from what looks like a line. These distant points are the eigenvalues with the first and second largest variance (variation) values when arranged in rank order and the eigenvalue with the smallest variance (variation) value, shown as Sub1, Sub2, and Sub512, respectively.
- The dimensional compression processing unit 14B is made to select the first and second largest eigenvalues and the smallest eigenvalue when arranged in rank order, that is, the three eigenvalues shown as Sub1, Sub2, and Sub512.
- the selection of these three eigenvalues may be made by an operation on the mental image visualization system 100B or a predetermined algorithm.
- the dimensional compression processing unit 14B derives three eigenvectors having three eigenvalues shown as Sub1, Sub2, and Sub512.
- the mental image visualization device 1B is made to generate an image showing three sub-mental images. It should be noted that this generation may be performed by an operation on the mental image visualization system 100B or a predetermined algorithm. Further, FIG. 15 shows images Sub1, Sub2, and Sub512 showing three sub-mental images generated by the mental image visualization device 1B.
- The images Sub1, Sub2, and Sub512 are originally color images, like the sample images shown in grayscale in FIG. 3A, but are shown as schematic diagrams for convenience.
- FIG. 16B is a diagram for explaining the relationship between an image showing a sub-mental image and an image showing a mental image using the table shown in FIG. 16A.
- The image T_1 showing the mental image is originally a color image, like the sample images shown in grayscale in FIG. 3A, but is shown as a schematic diagram for convenience so that it can be easily compared on the drawing.
- It can be seen that the appearance of the automobile shown in the image Sub1, which shows the sub-mental image generated from the eigenvector with the highest eigenvalue, is close to the appearance of the automobile shown in the image T_1 showing the mental image. That is, it can be said that the sub-mental image generated from the eigenvector having the highest eigenvalue has a high contribution rate to the mental image and is close to the subject's preference (mental image).
- It can also be seen that the image Sub512, which shows the sub-mental image generated from the eigenvector with the lowest eigenvalue, is far from the image T_1 showing the mental image.
- the sub-mental image generated from the eigenvector with the lowest eigenvalue has a low contribution rate that constitutes the mental image and is not the ideal (mental image) of the subject.
- The subject's mental image, including one relating to the appearance of a vehicle, is composed not only of what the subject regards as ideal but also of what the subject does not.
- In other words, the mental image also has, as a component, something that differs from the subject's ideal (preference), albeit in a suppressive role.
- As described above, from two or more eigenvectors obtained from the multidimensional feature vectors corresponding to the plurality of sample images output by the mental image visualization device 1B, it is possible to generate and visualize images of sub-mental images, as if the mental image had been decomposed into components.
- DCNN13 is not essential in the mental image visualization system 100B according to the above-mentioned modification 1.
- the subject may evaluate the sensitivity of the sample image generated by the mental image visualization system 100B by the psychological inverse correlation method, and input the result to the dimension compression processing unit 14B.
- Modification 2 In the above-described modification 1, an example in which dimension compression is performed by STC analysis has been described, but the present invention is not limited to this. Dimension compression may be performed by applying DMD (Dynamic Mode Decomposition). Hereinafter, the points different from the first modification will be mainly described.
- FIG. 17 is a block diagram showing an example of a detailed configuration of the dimensional compression processing unit 14C according to the second modification of the second embodiment.
- the same elements as those in FIG. 14 are designated by the same reference numerals, and detailed description thereof will be omitted.
- The dimensional compression processing unit 14C calculates weighted feature vectors by weighting the feature vectors corresponding to the plurality of sample images output by the mental image visualization device 1B according to the result of the sensitivity evaluation by the psychological inverse correlation method that the DCNN 13 produces for the plurality of sample images. The dimensional compression processing unit 14C then outputs a plurality of eigenvectors obtained by applying DMD to the weighted feature vectors and performing dimensional compression.
- the dimensional compression processing unit 14C includes a DMD application unit 141C, an eigenvalue selection unit 143, and an eigenvector derivation unit 144, as shown in FIG.
- The DMD application unit 141C obtains a plurality of eigenvalues by applying DMD to the feature vectors that have been weighted according to the result of the sensitivity evaluation by the psychological inverse correlation method for the plurality of sample images.
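For readers unfamiliar with DMD, the standard "exact DMD" computation can be sketched with NumPy as follows. The text does not say how the weighted feature vectors are arranged before DMD is applied, so this sketch assumes, purely for illustration, that they form an ordered sequence of snapshot columns. The function name `dmd`, the `rank` parameter, and the small two-dimensional test system are all hypothetical.

```python
import numpy as np


def dmd(snapshots: np.ndarray, rank: int):
    """Exact DMD of a (D, T) matrix whose columns are ordered snapshots."""
    x1, x2 = snapshots[:, :-1], snapshots[:, 1:]
    # Truncated SVD of the earlier snapshots.
    u, s, vh = np.linalg.svd(x1, full_matrices=False)
    u, s, vh = u[:, :rank], s[:rank], vh[:rank, :]
    # Reduced linear operator that maps each snapshot to the next one.
    a_tilde = u.conj().T @ x2 @ vh.conj().T @ np.diag(1.0 / s)
    eigenvalues, w = np.linalg.eig(a_tilde)
    # DMD modes; unlike STC eigenvectors they are not orthogonal in general.
    modes = x2 @ vh.conj().T @ np.diag(1.0 / s) @ w
    return eigenvalues, modes


# Hypothetical linear system with known eigenvalues 0.9 and 0.5.
a = np.array([[0.9, 0.2], [0.0, 0.5]])
columns = [np.array([1.0, 1.0])]
for _ in range(5):
    columns.append(a @ columns[-1])
snapshots = np.stack(columns, axis=1)   # shape (2, 6)

eigenvalues, modes = dmd(snapshots, rank=2)
```

The non-orthogonality of the modes matches the remark in the text that, in this modification, the sub-mental images are not assumed to be orthogonal to each other.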
- At least two eigenvectors are input to the DNN 10 by the input unit 12.
- The DNN 10 then generates an image showing at least two sub-mental images from the at least two input eigenvectors.
- the input unit 12 inputs each of at least two eigenvectors to the styleGAN generator.
- the styleGAN generator generates an image showing at least two sub-mental images constituting the mental image but not assuming orthogonality to each other.
- DCNN13 is not essential in the mental image visualization system 100B according to this modified example.
- the subject may evaluate the sensitivity of the sample image generated by the mental image visualization system 100B by the psychological inverse correlation method, and input the result to the dimension compression processing unit 14C.
- FIG. 18 is a block diagram showing an example of the configuration of the recommendation system 200 according to the third embodiment.
- the function of the recommendation system 200 is realized by software using the computer 1000 shown in FIG.
- FIG. 19 is a diagram showing an example of a mental image or a sub-mental image according to the third embodiment.
- An example of the mental image or sub-mental image shown in FIG. 19 is originally a color image, but is shown as a line diagram for convenience.
- FIG. 20 is a diagram showing an example of the latent space according to the third embodiment and the position of the mental image or the sub-mental image.
- the recommendation system 200 includes a storage unit 20 and a recommendation image generation UI (User Interface) 21. The details of each component will be described below.
- the storage unit 20 is composed of an HDD (Hard Disk Drive), a memory, or the like, and stores a plurality of recommendation candidate images 201 and the like.
- The plurality of recommendation candidate images 201 are composed of a group of images of existing products, such as a plurality of product images, numbering for example from tens to hundreds, and are a group of images of candidate products to be recommended to the target person (user).
- the scale of the image group is an example, and the scale may exceed several hundreds.
- the plurality of recommendation candidate images 201 will be described as being composed of an image group of an existing product (interior product) constituting the interior.
- the recommendation image generation UI 21 presents to the subject a recommendation candidate image 201 showing an existing product close to the mental image of the subject among the plurality of recommendation candidate images 201 stored in the storage unit 20.
- The recommendation image generation UI 21 presents to the subject, from among the plurality of recommendation candidate images 201 that are stored in the storage unit 20 and each show an interior product, a recommendation candidate image 201 showing an interior product that is close to the mental image (preference) of the subject.
- the recommendation image generation UI 21 includes a memory 210, an acquisition unit 211, an embedding execution unit 212, a distance calculation unit 213, a selection image unit 214, and a display control unit 215.
- the memory 210 stores the DNN 2101 and the mental image image (sub-mental image image) 2102.
- The DNN 2101 may be a copy of the DNN 10 obtained from the mental image visualization system 100 (100B) shown in FIG. 7 (FIG. 13), or may be the trained styleGAN described in embodiments 1 and 2.
- the DNN 2101 may be in any form as long as it is stored in the memory 210 in a form in which the latent space of the trained styleGAN in the DNN 10 can be used.
- the styleGAN is pre-learned using, for example, a dataset containing a plurality of existing interior images.
- The mental image image (sub-mental image image) 2102 is generated by, for example, the mental image visualization system 100 (100B) shown in FIG. 7 (FIG. 13), is acquired in advance, and is stored in the memory 210.
- the mental image image (sub-mental image image) 2102 that has been acquired in advance and stored in the memory 210 is, for example, the image Tx of the interior product shown in FIG.
- the acquisition unit 211 acquires a plurality of recommendation candidate images 201 from the storage unit 20 and outputs them to the embedding execution unit 212. Further, the acquisition unit 211 acquires the mental image image (sub-mental image image) 2102 from the memory 210 and outputs it to the embedding execution unit 212.
- the acquisition unit 211 acquires the latent space of the DNN 10 in advance and stores it in the memory 210.
- the acquisition unit 211 acquires the latent space of the DNN 10 by acquiring a copy of the DNN 10 in advance from the mental image visualization system 100B (100).
- the acquisition unit 211 acquires the latent space of the styleGAN in which the points (vector positions) as shown in FIG. 20 are distributed.
- The embedding execution unit 212 embeds the mental image image (sub-mental image image) 2102, acquired in advance from the acquisition unit 211, into the latent space of the DNN 2101 and obtains the position (vector position) of the mental image image (sub-mental image image) 2102 in the latent space.
- The embedding execution unit 212 embeds, for example, the image Tx of the interior product shown in FIG. 19 into the latent space of the DNN 2101 and obtains the position (vector position) of the image Tx, as shown in FIG.
- the image Tx of the interior product shown in FIG. 19 is an example of the mental image image (sub-mental image image) 2102.
- The embedding execution unit 212 embeds each of the plurality of recommendation candidate images 201 acquired by the acquisition unit 211 into the latent space of the DNN 2101 and obtains the positions (vector positions) of the plurality of recommendation candidate images 201 in the latent space.
- The distance calculation unit 213 calculates the distance between the position (vector position) of the mental image in the latent space of the DNN 2101 and the position (vector position) of each of the plurality of embedded recommendation candidate images 201.
- FIG. 21 is a diagram showing an example of the distance between the position of the mental image or sub-mental image and the position of one recommendation candidate image 201a in the latent space according to the third embodiment.
- FIG. 21 shows the position of the image Tx of the interior product shown in FIG. 19 in the latent space shown in FIG. 20 and the position of one recommendation candidate image 201a in the latent space shown in FIG.
- One recommendation candidate image 201a is shown as an image of a curtain which is an example of an interior product.
- the distance calculation unit 213 calculates the distance d between the position of the image Tx of the interior product shown in FIG. 19 and the position of one recommendation candidate image 201a in the latent space shown in FIG. Similarly, the distance calculation unit 213 calculates the distance between the position of the image Tx of the interior product shown in FIG. 19 and the position of each of the plurality of recommendation candidate images 201 in the latent space shown in FIG.
- The selection image unit 214 selects, from among the plurality of recommendation candidate images 201 acquired by the acquisition unit 211, one or more recommendation candidate images 201 whose distances, among the plurality of distances calculated by the distance calculation unit 213, are equal to or less than a threshold value.
- The selection image unit 214 uses the plurality of distances calculated by the distance calculation unit 213 to select one or more interior products that are close to the image Tx of the interior product shown in FIG. 19, which is the mental image image (sub-mental image image) 2102 of the subject.
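The distance calculation and threshold selection described above can be sketched as follows. The embedding into the styleGAN latent space is not shown; the latent positions below are hypothetical two-dimensional stand-ins. The use of Euclidean distance is also an assumption, since the text only speaks of a "distance", and the function name `select_candidates` is invented for the example.

```python
import math
from typing import Dict, List, Sequence


def select_candidates(target: Sequence[float],
                      candidates: Dict[str, Sequence[float]],
                      threshold: float) -> List[str]:
    """Return the candidates within `threshold` of the target, nearest first."""
    distances = {name: math.dist(target, position)
                 for name, position in candidates.items()}
    close = [name for name, d in distances.items() if d <= threshold]
    return sorted(close, key=distances.get)


# Hypothetical latent positions of the image Tx and three candidate images.
tx_position = [0.0, 0.0]
candidate_positions = {
    "201a": [0.3, 0.4],   # distance about 0.5
    "201b": [1.0, 0.0],   # distance 1.0
    "201c": [3.0, 4.0],   # distance 5.0
}
selected = select_candidates(tx_position, candidate_positions, threshold=1.0)
```

With a threshold of 1.0, the candidates "201a" and "201b" are selected and "201c" is excluded; the selected images are what the display control unit would then present.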
- the display control unit 215 presents one or more recommendation candidate images 201 selected by the selection image unit 214 to the subject having the mental image image (sub-mental image image) 2102. That is, the display control unit 215 controls the display device 300 and presents the recommended product to the target person by displaying an image showing the recommended product on the display device 300.
- the display control unit 215 presents the recommendation candidate image 201 selected by the selection image unit 214, for example, shown in FIGS. 22A to 22C, to the target person by displaying the recommendation candidate image 201 on the display device 300.
- FIGS. 22A to 22C are diagrams showing an example of a recommendation candidate image presented by the recommendation image generation UI 21 according to the third embodiment, respectively.
- FIGS. 22A to 22C each show an example of the recommendation candidate image 201 presented by the recommendation image generation UI 21 according to the third embodiment and its explanatory text.
- the recommendation candidate images 201a, 201b, and 201c shown in FIGS. 22A to 22C are originally color images, but are shown as line diagrams for convenience of explanation.
- 22A, 22B, and 22C show recommendation candidate images 201a, 201b, and 201c, which are images of a curtain as an example of an interior product, and their explanatory text.
- The display device 300 has a display for displaying images, characters, and the like.
- The display is, for example, a liquid crystal display, a plasma display, or an organic EL (Electro-Luminescence) display.
- The display device 300 also functions as a UI for receiving input operations by the target person, and includes, for example, a keyboard, a mouse, a touch sensor, or a touch pad.
- In the above, the recommendation system 200 was described as acquiring, for example, the mental image image (sub-mental image image) 2102 in advance from the mental image visualization system 100 (100B) shown in FIG. 7 (FIG. 13) and storing it in the memory 210.
- The recommendation system 200 may passively acquire the mental image image (sub-mental image image) 2102 from the mental image visualization system 100 (100B) shown in FIG. 7 (FIG. 13) in this way, but it is not limited to this.
- The recommendation system 200 may actively acquire the mental image image (sub-mental image image) 2102 by cooperating with the mental image visualization system 100 (100B) shown in FIG. 7 (FIG. 13). That is, the recommendation system 200 may cause the mental image visualization system 100 (100B) shown in FIG. 7 (FIG. 13) to generate the mental image image (sub-mental image image) 2102 through interaction with the target person via the display device 300.
- FIG. 23A is a diagram showing an example of a method by which the recommendation system 200 according to the third embodiment acquires a mental image or a sub-mental image.
- FIG. 23B is a diagram showing an example of how a sample image is presented to and evaluated by the subject according to the third embodiment.
- The same elements as those in FIG. 18 and the like are designated by the same reference numerals, and detailed description thereof is omitted.
- The recommendation system 200 first acquires a plurality of sample images of interior products generated by the mental image visualization system 100 (100B). For example, the recommendation system 200 acquires about 10 sample images.
- The recommendation system 200 then displays the acquired sample images one at a time on the display device 300 and has the subject input how much he or she likes the interior product shown in each displayed sample image.
- FIG. 23B shows, as an example of a sample image displayed on the display device 300 for evaluation, an image Sx of an interior with chairs, desks, curtains, and the like, together with score input buttons with which the target person inputs the degree of liking.
- The recommendation system 200 obtains the scores that the subject input for the plurality of sample images including the image Sx, and inputs these scores to the mental image visualization system 100 (100B) as the evaluation result of the sensitivity evaluation by the psychological inverse correlation method.
- The mental image visualization system 100 (100B) generates a mental image (sub-mental image) from the scores, which are the evaluation result of the sensitivity evaluation by the psychological inverse correlation method, and the feature vectors corresponding to the plurality of sample images output by the mental image visualization device 1A. Since the details of the generation method have been described in the first and second embodiments, the description thereof is omitted here.
- The recommendation system 200 acquires the mental image image (sub-mental image image) generated by the mental image visualization system 100 (100B) and stores it in the memory 210 as the mental image image (sub-mental image image) 2102.
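The score-collection flow above can be pictured with a small sketch. Note that the actual generation method is the one described in the first and second embodiments, which this passage does not reproduce; the code below is not that method. It only illustrates a generic reverse-correlation-style idea, assuming that each sample image has a feature vector and that mean-centered scores are used as weights. The function name, the feature vectors, the scores, and the dictionary standing in for the memory 210 are all hypothetical.

```python
def estimate_mental_image_vector(feature_vectors, scores):
    """Combine sample feature vectors into one estimate, weighted by the scores.

    feature_vectors : list of equal-length lists, one per sample image
    scores          : list of numeric liking scores, one per sample image
    Mean-centering the scores is an assumption of this sketch, so that
    below-average samples push the estimate away from their features.
    """
    if not scores or len(feature_vectors) != len(scores):
        raise ValueError("need one score per sample feature vector")
    mean_score = sum(scores) / len(scores)
    weights = [s - mean_score for s in scores]
    estimate = [0.0] * len(feature_vectors[0])
    for vector, weight in zip(feature_vectors, weights):
        for i, value in enumerate(vector):
            estimate[i] += weight * value
    # Normalize by the number of samples.
    return [e / len(scores) for e in estimate]

# Hypothetical feature vectors for three sample images and the scores entered for them.
feature_vectors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
scores = [5, 3, 1]
estimate = estimate_mental_image_vector(feature_vectors, scores)

# A plain dict stands in for the memory 210 in this sketch.
memory_210 = {"mental_image_2102": estimate}
```

In this toy example the highly rated first sample and the poorly rated third sample point in opposite directions, so the estimate moves toward the first sample along the first axis and stays at zero on the second.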
- In this way, the recommendation system 200 can acquire an image of the mental image (sub-mental image) of each unspecified subject from the mental image visualization system 100 (100B) using about 10 sample images.
- By using an image of the mental image (sub-mental image) possessed by the subject, the recommendation system 200 can recommend an existing product that is close to that mental image (sub-mental image). In other words, by using the image of the target person's mental image (sub-mental image), an existing product that suits the target person's preference can be selected and recommended even without the target person's behavior history information, such as the purchase history, that a conventional recommendation engine requires.
- The recommendation system 200 can also acquire an image of the mental image (sub-mental image) possessed by each unspecified target person by linking with the mental image visualization system 100 (100B). By using that image, an existing product that suits the target person's preference can be selected and recommended even without the behavior history information, such as the purchase history, that a conventional recommendation engine requires. As a result, even for an unspecified target person who visits an EC site, an existing product that suits the target person's preference can be selected and recommended without behavior history information such as a purchase history.
- Although the recommendation system 200 has been described as a system separate from the mental image visualization system 100 (100B), it is not limited to this.
- The recommendation system 200 may include the mental image visualization system 100 (100B) inside.
- Some or all of the components constituting the above-mentioned mental image visualization device, mental image visualization system, or recommendation system may specifically be a computer system composed of a microprocessor, a ROM, a RAM, a hard disk unit, a display unit, a keyboard, a mouse, and the like.
- A computer program is stored in the RAM or the hard disk unit.
- When the microprocessor operates according to the computer program, each device achieves its function.
- The computer program is configured by combining a plurality of instruction codes indicating instructions to the computer in order to achieve a predetermined function.
- The system LSI is a super multifunctional LSI manufactured by integrating a plurality of components on one chip; specifically, it is a computer system including a microprocessor, a ROM, a RAM, and the like. A computer program is stored in the RAM. When the microprocessor operates according to the computer program, the system LSI achieves its function.
- The IC card or the module is a computer system composed of a microprocessor, a ROM, a RAM, and the like.
- The IC card or the module may include the above-mentioned super multifunctional LSI.
- When the microprocessor operates according to a computer program, the IC card or the module achieves its function. The IC card or the module may be tamper-resistant.
- Some or all of the components constituting the above-mentioned mental image visualization device, mental image visualization system, or recommendation system may be distributed and configured as a network structure including a server and cloud storage.
- The data input device and the arithmetic unit may be located separately at remote sites, and a plurality of input devices and arithmetic units may be distributed.
- The present disclosure can be used for mental image visualization methods, mental image visualization devices, and programs, and in particular for methods, devices, and programs that visualize the mental image of a subject such as an individual or a group.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Biodiversity & Conservation Biology (AREA)
- Image Analysis (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022500453A JP7482551B2 (ja) | 2020-02-12 | 2021-02-10 | Mental image visualization method, mental image visualization device and program |
| US17/798,750 US12437186B2 (en) | 2020-02-12 | 2021-02-10 | Mental image visualization method, mental image visualization device and recording medium |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2020-021509 | 2020-02-12 | | |
| JP2020021509 | 2020-02-12 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021162055A1 (ja) | 2021-08-19 |
Family
ID=77291809
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/005052 Ceased WO2021162055A1 (ja) | 2020-02-12 | 2021-02-10 | Mental image visualization method, mental image visualization device and program |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US12437186B2 |
| JP (1) | JP7482551B2 |
| WO (1) | WO2021162055A1 |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12437186B2 (en) * | 2020-02-12 | 2025-10-07 | Osaka University | Mental image visualization method, mental image visualization device and recording medium |
| CN112989904B (zh) * | 2020-09-30 | 2022-03-25 | 北京字节跳动网络技术有限公司 | Style image generation method, model training method, apparatus, device and medium |
| US11989916B2 (en) * | 2021-10-11 | 2024-05-21 | Kyocera Document Solutions Inc. | Retro-to-modern grayscale image translation for preprocessing and data preparation of colorization |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH09101970A (ja) * | 1995-10-06 | 1997-04-15 | Omron Corp | Image retrieval method and image retrieval device |
| JP2007249319A (ja) * | 2006-03-14 | 2007-09-27 | Doshisha | Screen display method |
| JP2018063504A (ja) * | 2016-10-12 | 2018-04-19 | 株式会社リコー | Generative model learning method, device, and program |
| JP6448839B1 (ja) * | 2018-06-20 | 2019-01-09 | 株式会社 ディー・エヌ・エー | Image generation device, image generator, image discriminator, image generation program, and image generation method |
Family Cites Families (24)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9594977B2 (en) * | 2015-06-10 | 2017-03-14 | Adobe Systems Incorporated | Automatically selecting example stylized images for image stylization operations based on semantic content |
| DE102015009981A1 (de) * | 2015-07-31 | 2017-02-02 | Eberhard Karls Universität Tübingen | Method and device for image synthesis |
| US9857953B2 (en) * | 2015-11-17 | 2018-01-02 | Adobe Systems Incorporated | Image color and tone style transfer |
| US9940551B1 (en) * | 2016-06-17 | 2018-04-10 | Google Llc | Image generation using neural networks |
| US10755171B1 (en) * | 2016-07-06 | 2020-08-25 | Google Llc | Hiding and detecting information using neural networks |
| US10147459B2 (en) * | 2016-09-22 | 2018-12-04 | Apple Inc. | Artistic style transfer for videos |
| US10198839B2 (en) * | 2016-09-22 | 2019-02-05 | Apple Inc. | Style transfer-based image content correction |
| CN116823593A (zh) * | 2016-10-21 | 2023-09-29 | 谷歌有限责任公司 | Stylizing input images |
| US10192321B2 (en) * | 2017-01-18 | 2019-01-29 | Adobe Inc. | Multi-style texture synthesis |
| CN108734749A (zh) * | 2017-04-20 | 2018-11-02 | 微软技术许可有限责任公司 | Visual style transformation of images |
| US10565757B2 (en) * | 2017-06-09 | 2020-02-18 | Adobe Inc. | Multimodal style-transfer network for applying style features from multi-resolution style exemplars to input images |
| US10318889B2 (en) * | 2017-06-26 | 2019-06-11 | Konica Minolta Laboratory U.S.A., Inc. | Targeted data augmentation using neural style transfer |
| US10832387B2 (en) * | 2017-07-19 | 2020-11-10 | Petuum Inc. | Real-time intelligent image manipulation system |
| US10789694B1 (en) * | 2017-09-11 | 2020-09-29 | Apple Inc. | Real-time adjustment of temporal consistency constraints for video style |
| KR102814129B1 (ko) * | 2018-08-02 | 2025-05-29 | 삼성전자주식회사 | Image processing apparatus and operating method thereof |
| CN109472270B (zh) * | 2018-10-31 | 2021-09-24 | 京东方科技集团股份有限公司 | Image style conversion method, apparatus and device |
| US11295494B2 (en) * | 2019-06-26 | 2022-04-05 | Adobe Inc. | Image modification styles learned from a limited set of modified images |
| CN110634167B (zh) * | 2019-09-27 | 2021-07-20 | 北京市商汤科技开发有限公司 | Neural network training method and device, and image generation method and device |
| US11210560B2 (en) * | 2019-10-02 | 2021-12-28 | Mitsubishi Electric Research Laboratories, Inc. | Multi-modal dense correspondence imaging system |
| US11158090B2 (en) * | 2019-11-22 | 2021-10-26 | Adobe Inc. | Enhanced video shot matching using generative adversarial networks |
| US11321939B2 (en) * | 2019-11-26 | 2022-05-03 | Microsoft Technology Licensing, Llc | Using machine learning to transform image styles |
| CN114981836B (zh) * | 2020-01-23 | 2025-05-23 | 三星电子株式会社 | Electronic device and control method of electronic device |
| US12437186B2 (en) * | 2020-02-12 | 2025-10-07 | Osaka University | Mental image visualization method, mental image visualization device and recording medium |
| US11823490B2 (en) * | 2021-06-08 | 2023-11-21 | Adobe, Inc. | Non-linear latent to latent model for multi-attribute face editing |
2021
- 2021-02-10: US application US17/798,750, patent US12437186B2 (en), status Active
- 2021-02-10: WO application PCT/JP2021/005052, publication WO2021162055A1 (ja), status Ceased
- 2021-02-10: JP application JP2022500453A, patent JP7482551B2 (ja), status Active
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2021162055A1 | 2021-08-19 |
| US20230086573A1 (en) | 2023-03-23 |
| US12437186B2 (en) | 2025-10-07 |
| JP7482551B2 (ja) | 2024-05-14 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Hsiao et al. | A morphing method for shape generation and image prediction in product design | |
| JP7482551B2 (ja) | Mental image visualization method, mental image visualization device and program | |
| JP2022513858A (ja) | Data processing method for face image generation, data processing device, computer program, and computer device | |
| CN108135520A (zh) | Generating natural language representations of mental content from functional brain images | |
| CN113661520A (zh) | Modifying the appearance of hair | |
| Danckaers et al. | Posture normalisation of 3D body scans | |
| Mennella et al. | Generating a novel synthetic dataset for rehabilitation exercises using pose-guided conditioned diffusion models: A quantitative and qualitative evaluation | |
| Eldar et al. | Ergonomic design visualization mapping-developing an assistive model for design activities | |
| Lee et al. | BIM and individual physical needs-trained model-enabled approach to spatial redesign and visualization | |
| Adilova et al. | Personalized Aesthetic Assessment: integrating fuzzy logic and color preferences | |
| Li et al. | Remodeling of mannequins based on automatic binding of mesh to anthropometric parameters | |
| KR20200046844A (ko) | Personalized recommendation service system for psychological improvement based on psychological state results on a color psychology diagnosis platform | |
| CN116596746A (zh) | Method, system, electronic device and storage medium for personalized customized makeup | |
| WO2017219123A1 (en) | System and method for automatically generating a facial remediation design and application protocol to address observable facial deviations | |
| Modi et al. | Role of eye tracking in human computer interaction | |
| Awan et al. | Estimating perceptual attributes of haptic textures using visuo-tactile data | |
| Farooq et al. | A comparative study on diffusion sampling methods across diverse medical imaging modalities | |
| WO2022064660A1 (ja) | Machine learning program, machine learning method, and estimation device | |
| CN120530424A (zh) | Method for modeling and evaluating the evolution of a user's signs of aging | |
| Zonyfar et al. | E-government in the public health sector: kansei engineering method for redesigning website | |
| Förger et al. | Animating with style: defining expressive semantics of motion | |
| Wu | Gesture recognition in virtual reality | |
| Frutos-Bernal et al. | Tucker3-PCovR: The Tucker3 principal covariates regression model | |
| Malikova et al. | Multisensory analytics: case of visual-auditory analysis of scalar fields | |
| Wang | Creation and Research of Interactive Art Design Works Integrating IoT Technology | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21752983; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022500453; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21752983; Country of ref document: EP; Kind code of ref document: A1 |
| | WWG | Wipo information: grant in national office | Ref document number: 17798750; Country of ref document: US |