JP6924128B2

JP6924128B2 - Morphing image generator and morphing image generation method

Info

Publication number: JP6924128B2
Application number: JP2017225932A
Authority: JP
Inventors: 石川 彰夫; 菅谷 史昭
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2017-11-24
Filing date: 2017-11-24
Publication date: 2021-08-25
Anticipated expiration: 2037-11-24
Also published as: JP2019096130A

Description

The present invention relates to a morphing image generation device and a morphing image generation method that generate a morphing image using a machine learning model.

Devices are known that, based on images showing a subject before and after a change, generate a morphing image that represents stepwise the process by which the subject changes. Patent Document 1 discloses a technique for generating a morphing image by generating images intermediate between the images of the subject before and after the change.

Japanese Unexamined Patent Application Publication No. 2001-076177

In a conventional method of generating a morphing image, vertices in the image before the subject changes and vertices in the image after the subject has changed are extracted as feature points, and an intermediate image is generated by calculating intermediate values between the extracted feature points. However, when the images before and after the change share no clear feature points, it is difficult to extract feature points that can be associated between the two images. As a result, a high-quality morphing image that changes smoothly cannot be generated.

The present invention has been made in view of these points, and an object of the present invention is to provide a morphing image generation device and a morphing image generation method capable of improving the quality of a morphing image.

A morphing image generation device according to a first aspect of the present invention includes: an image acquisition unit that acquires a first image, which is an image before at least a part of a subject changes, and a second image, which is an image after at least a part of the subject has changed; a propagation control unit that propagates each of the first image and the second image through a plurality of processing layers included in a machine learning model capable of outputting, based on an input image, the type of a subject included in that image; an extraction unit that extracts one or more first image outputs and one or more second image outputs that are commonly activated in both a post-stage processing layer selected from the plurality of processing layers and a pre-stage processing layer that is the processing layer immediately preceding the post-stage processing layer, the one or more first image outputs being output from the post-stage processing layer and the pre-stage processing layer based on the first image, and the one or more second image outputs being output from the post-stage processing layer and the pre-stage processing layer based on the second image; a feature point detection unit that detects one or more first image feature points based on the one or more first image outputs and detects one or more second image feature points based on the one or more second image outputs; and an intermediate image generation unit that generates, based on the one or more first image feature points and the one or more second image feature points, one or more intermediate images that represent stepwise the process by which the subject changes.

The extraction unit may include: a post-stage extraction unit that extracts one or more commonly activated post-stage first image outputs and one or more commonly activated post-stage second image outputs from a plurality of post-stage first image outputs, which are output from the post-stage processing layer as a result of the first image propagating through a pre-stage processing layer and a post-stage processing layer, in that order, that are part of the plurality of processing layers, and from a plurality of post-stage second image outputs, which are output from the post-stage processing layer as a result of the second image propagating through the pre-stage processing layer and the post-stage processing layer in that order; and a pre-stage extraction unit that extracts one or more commonly activated pre-stage first image outputs and one or more commonly activated pre-stage second image outputs from among a plurality of pre-stage first image outputs and a plurality of pre-stage second image outputs that were output from the pre-stage processing layer and that caused the one or more post-stage first image outputs and the one or more post-stage second image outputs to be activated.

The pre-stage extraction unit may extract the one or more pre-stage first image outputs and the one or more pre-stage second image outputs from among the plurality of pre-stage first image outputs and the plurality of pre-stage second image outputs based on the magnitude of their activation.

The machine learning model may include a convolutional neural network, and the post-stage processing layer may be any one of an output layer, a fully connected layer, a normalization layer, a pooling layer, and a convolution layer.
The pre-stage processing layer may be any one of a fully connected layer, a normalization layer, a pooling layer, a convolution layer, and an input layer.

When the extraction unit has selected the last layer, that is, the final processing layer among the plurality of processing layers, as the post-stage processing layer, and no first image outputs and second image outputs are commonly activated in the last layer, the extraction unit may extract the one or more first image outputs and the one or more second image outputs that are commonly activated in a processing layer preceding the last layer.

The morphing image generation device may further include a selection unit that selects, from the one or more first image feature points and the one or more second image feature points identified by the feature point detection unit, a subset of the first image feature points and a subset of the second image feature points based on their mutual correspondence, and the intermediate image generation unit may generate, based on the selected first image feature points and the selected second image feature points, one or more intermediate images that represent stepwise the process by which the subject changes.

The image acquisition unit may acquire a plurality of second images in which subjects of the same type as the changed subject but of different shapes are captured, and the intermediate image generation unit may select one second image from the plurality of second images based on the one or more first image feature points and the one or more second image feature points obtained from each of the plurality of second images.

The intermediate image generation unit may select one second image from among those of the plurality of second images for which the number of second image feature points corresponding to first image feature points is equal to or greater than a predetermined reference value.
The morphing image generation device may further include an instruction receiving unit that receives an instruction selecting, from the plurality of processing layers, the processing layer to be used as the post-stage processing layer, and the extraction unit may use the processing layer indicated by the instruction received by the instruction receiving unit as the post-stage processing layer.

After selecting one of the plurality of processing layers as the post-stage processing layer and extracting the one or more first image outputs and the one or more second image outputs, the extraction unit may select the processing layer that had been selected as the pre-stage processing layer as a new post-stage processing layer and extract a further one or more first image outputs and one or more second image outputs.

A morphing image generation method according to a second aspect of the present invention includes the steps of: acquiring a first image, which is an image before at least a part of a subject changes, and a second image, which is an image after at least a part of the subject has changed; propagating each of the first image and the second image through a plurality of processing layers included in a machine learning model capable of outputting, based on an input image, the type of a subject included in that image; extracting one or more first image outputs and one or more second image outputs that are commonly activated in both a post-stage processing layer selected from the plurality of processing layers and a pre-stage processing layer that is the processing layer immediately preceding the post-stage processing layer, the one or more first image outputs being output from the post-stage processing layer and the pre-stage processing layer based on the first image, and the one or more second image outputs being output from the post-stage processing layer and the pre-stage processing layer based on the second image; detecting one or more first image feature points based on the one or more first image outputs and detecting one or more second image feature points based on the one or more second image outputs; and generating, based on the one or more first image feature points and the one or more second image feature points, one or more intermediate images that represent stepwise the process by which the subject changes.

The extracting step may include: a pre-stage extraction step of extracting one or more commonly activated post-stage first image outputs and one or more commonly activated post-stage second image outputs from a plurality of post-stage first image outputs, which are output from the post-stage processing layer as a result of the first image propagating through a pre-stage processing layer and a post-stage processing layer, in that order, that are part of the plurality of processing layers, and from a plurality of post-stage second image outputs, which are output from the post-stage processing layer as a result of the second image propagating through the pre-stage processing layer and the post-stage processing layer in that order; and a post-stage extraction step of extracting one or more commonly activated pre-stage first image outputs and one or more commonly activated pre-stage second image outputs from among a plurality of pre-stage first image outputs and a plurality of pre-stage second image outputs that were output from the pre-stage processing layer and that caused the one or more post-stage first image outputs and the one or more post-stage second image outputs to be activated.

In the morphing image generation method, after the pre-stage extraction step has been executed, the post-stage extraction step may be executed with the one or more pre-stage first image outputs and the one or more pre-stage second image outputs treated as the plurality of post-stage first image outputs and the plurality of post-stage second image outputs.

In the morphing image generation method, the post-stage extraction step and the pre-stage extraction step may be executed for each of the plurality of processing layers.

According to the present invention, the quality of a morphing image can be improved.

FIG. 1 is a diagram for explaining an outline of the process of generating a morphing image.
FIG. 2 is a diagram showing an example of the configuration of the machine learning model.
FIG. 3 is a diagram showing the configuration of the morphing image generation device.
FIGS. 4 to 9 are diagrams for explaining the extraction process performed by the extraction unit.
FIG. 10 is a flowchart showing the flow of processing performed by the morphing image generation device.
FIG. 11 is a flowchart showing the flow of processing performed by the extraction unit.

[Overview of the morphing image generation device 1]
FIG. 1 is a diagram for explaining an outline of the process of generating a morphing image. The morphing image generation device 1 is, for example, a PC (personal computer). The morphing image generation device 1 generates a morphing image using a machine learning model M, based on a plurality of images showing a subject before and after a change. The morphing image A shown in FIG. 1 represents stepwise the process of changing from the human face shown in the first image A1 to the car shown in the second image A3. In the example shown in FIG. 1, the eyes of the person are assumed to correspond to the headlights of the car, and the mouth of the person is assumed to correspond to the radiator grille located between the left and right headlights of the car.

The morphing image generation device 1 acquires a first image A1, which is an image before at least a part of the subject changes, and a second image A3, which is an image after at least a part of the subject has changed ((1) in FIG. 1). The first image A1 shown in FIG. 1 is an image whose subject is a human face. The second image A3 shown in FIG. 1 is an image whose subject is a car.

The morphing image generation device 1 inputs each of the acquired first image A1 and second image A3 to the machine learning model M and propagates it through the plurality of processing layers included in the machine learning model M ((2) in FIG. 1). The machine learning model M is a model trained to output, based on an input image, the type of the subject included in that image.

FIG. 2 is a diagram showing an example of the configuration of the machine learning model M. The machine learning model M includes a convolutional neural network (CNN). In this case, the machine learning model M has an input layer M1, a first convolution layer M2, a second convolution layer M3, a first pooling layer M4, a normalization layer M5, a third convolution layer M6, a second pooling layer M7, a first fully connected layer M8, a second fully connected layer M9, and an output layer M10. In this specification, of two adjacent processing layers, the one on the upstream side when the first image A1 and the second image A3 propagate is referred to as the pre-stage processing layer, and the one on the downstream side is referred to as the post-stage processing layer.

The processing layers that can serve as the post-stage processing layer are the first convolution layer M2, the second convolution layer M3, the first pooling layer M4, the normalization layer M5, the third convolution layer M6, the second pooling layer M7, the first fully connected layer M8, the second fully connected layer M9, and the output layer M10. The processing layers that can serve as the pre-stage processing layer are the input layer M1, the first convolution layer M2, the second convolution layer M3, the first pooling layer M4, the normalization layer M5, the third convolution layer M6, the second pooling layer M7, the first fully connected layer M8, and the second fully connected layer M9. The morphing image generation device 1 inputs an acquired image to the machine learning model M and forward-propagates it through each processing layer from the input layer M1 to the output layer M10, that is, performs inference, so that the model outputs the type of the subject shown in the image.

Returning to FIG. 1, the morphing image generation device 1 detects feature points common to the first image A1 and the second image A3 using the calculation results in each processing layer that led the machine learning model M to output the type of the subject, that is, the highly abstract feature quantities obtained by deep learning ((3) in FIG. 1). Here, the morphing image generation device 1 detects the common feature points in the reverse of the order in which the images were propagated. In this way, the morphing image generation device 1 can detect feature points based on highly abstract feature quantities.

By detecting the common feature points, the morphing image generation device 1 detects that the eyes and mouth of the human face shown in the first image A1 correspond, respectively, to the headlights and radiator grille of the car shown in the second image A3. A correspondence is a relationship in which the pixels of the first image indicated by a first image feature point and the pixels of the second image indicated by a second image feature point match or approximate each other.

The morphing image generation device 1 then generates an intermediate image A2 that represents stepwise the process by which the subject changes, based on the detected corresponding feature points of the first image A1 and the second image A3 ((4) in FIG. 1). In this way, the morphing image generation device 1 can improve the quality of the morphing image.
The details of the morphing image generation device 1 are described below.
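As a rough illustration of step (4), the positions of corresponding feature points can be blended linearly to obtain the feature point layout of each intermediate image. The description does not specify the interpolation, so the linear blend and the names `interpolate_points` and `intermediate_point_sets` are assumptions made only for this sketch; warping and blending of pixel values are omitted.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def interpolate_points(first: List[Point], second: List[Point], t: float) -> List[Point]:
    """Blend corresponding feature points; t=0 gives `first`, t=1 gives `second`."""
    if len(first) != len(second):
        raise ValueError("feature point lists must have the same length")
    return [
        ((1 - t) * x1 + t * x2, (1 - t) * y1 + t * y2)
        for (x1, y1), (x2, y2) in zip(first, second)
    ]


def intermediate_point_sets(first: List[Point], second: List[Point], steps: int) -> List[List[Point]]:
    """Feature point positions for `steps` evenly spaced intermediate images (assumed spacing)."""
    return [interpolate_points(first, second, k / (steps + 1)) for k in range(1, steps + 1)]
```

With `steps=1` the single intermediate layout lies halfway between the two sets of feature points.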

[Configuration of the morphing image generation device 1]
FIG. 3 is a diagram showing the configuration of the morphing image generation device 1. The morphing image generation device 1 includes an operation unit 11, a storage unit 12, and a control unit 13.

The operation unit 11 is an input device that accepts user operations.
The storage unit 12 is a storage medium such as a ROM (Read Only Memory), a RAM (Random Access Memory), or a hard disk. The storage unit 12 stores various programs executed by the control unit 13. The storage unit 12 also stores the first image and the second image.

The control unit 13 is, for example, a CPU (Central Processing Unit). The control unit 13 controls the functions of the morphing image generation device 1 by executing the programs stored in the storage unit 12. By executing the programs, the control unit 13 functions as an image acquisition unit 131, a propagation control unit 132, an extraction unit 133, an instruction receiving unit 136, a feature point detection unit 137, a selection unit 138, and an intermediate image generation unit 139.

The image acquisition unit 131 acquires the first image and the second image stored in the storage unit 12. The image acquisition unit 131 inputs the acquired first image and second image to the propagation control unit 132.

The propagation control unit 132 propagates each of the first image and the second image through the plurality of processing layers included in the machine learning model M. In the example shown in FIG. 2, the propagation control unit 132 propagates each of the first image and the second image, in order, through each processing layer from the input layer M1 to the output layer M10 included in the machine learning model M.
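The propagation can be pictured as passing an image through an ordered list of layers while keeping every layer's output for the later extraction. The sketch below is illustrative only: the toy layers, their names, and the flat list standing in for an image are assumptions, not the CNN of FIG. 2.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A layer is a (name, function) pair; both are hypothetical stand-ins.
Layer = Tuple[str, Callable[[List[float]], List[float]]]


def propagate(image: List[float], layers: Sequence[Layer]) -> Dict[str, List[float]]:
    """Forward-propagate `image` through `layers` in order, recording each output."""
    outputs: Dict[str, List[float]] = {}
    x = image
    for name, fn in layers:
        x = fn(x)
        outputs[name] = x
    return outputs


def relu(x: List[float]) -> List[float]:
    """Toy activation layer: clamp negative values to zero."""
    return [max(0.0, v) for v in x]


def pool_pairs(x: List[float]) -> List[float]:
    """Toy pooling layer: maximum of each adjacent pair of values."""
    return [max(x[i], x[i + 1]) for i in range(0, len(x) - 1, 2)]


toy_layers: List[Layer] = [
    ("input", lambda x: list(x)),
    ("relu", relu),
    ("pool", pool_pairs),
]
```

Calling `propagate` once per image yields two dictionaries of per-layer outputs, which is the form the extraction sketches in this description assume.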

The extraction unit 133 extracts one or more first image outputs and one or more second image outputs that are commonly activated in both a post-stage processing layer selected from the plurality of processing layers and the pre-stage processing layer that immediately precedes it. The first image outputs are output from the post-stage processing layer and the pre-stage processing layer based on the first image, and the second image outputs are output from those layers based on the second image. The extraction process is described in detail later. The extraction unit 133 has a post-stage extraction unit 134 and a pre-stage extraction unit 135. The post-stage extraction unit 134 extracts post-stage first image outputs and post-stage second image outputs, which are the parts of the first image outputs and second image outputs that are commonly activated in the post-stage processing layer. The pre-stage extraction unit 135 extracts pre-stage first image outputs and pre-stage second image outputs, which are the parts of the first image outputs and second image outputs that are commonly activated in the pre-stage processing layer.
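One way to picture the pre-stage extraction for a pair of fully connected layers is sketched below. It assumes that a pre-stage unit counts as a cause of activation when the product of its output and its connection weight to a commonly activated post-stage unit exceeds a threshold, which is one of the activation criteria this description gives. The weight layout `weights[j][i]` (post-stage unit j, pre-stage unit i), the function names, and the threshold value are hypothetical.

```python
from typing import List, Set


def contributing_pre_stage_units(
    pre_outputs: List[float],
    weights: List[List[float]],
    post_units: Set[int],
    threshold: float,
) -> Set[int]:
    """Pre-stage units whose contribution (output * weight) to any of the
    given post-stage units exceeds `threshold`."""
    return {
        i
        for j in post_units
        for i, out in enumerate(pre_outputs)
        if out * weights[j][i] > threshold
    }


def common_pre_stage_units(
    pre_first: List[float],
    pre_second: List[float],
    weights: List[List[float]],
    common_post_units: Set[int],
    threshold: float,
) -> Set[int]:
    """Pre-stage units that contribute for both the first and the second image."""
    first = contributing_pre_stage_units(pre_first, weights, common_post_units, threshold)
    second = contributing_pre_stage_units(pre_second, weights, common_post_units, threshold)
    return first & second
```

Repeating this with the pre-stage layer treated as the new post-stage layer walks the network backward toward the input, in the reverse of the propagation order.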

The first image outputs and second image outputs extracted by the extraction unit 133 are information indicating which of the plurality of units included in a processing layer are activated. A unit is one or more pixels included in an image. A unit may be regarded as activated when, for example, its output value, or the product of its output value and the weight of its connection, exceeds a predetermined threshold, or when the unit falls within a predetermined number or a predetermined proportion of units ranked in descending order of output. In processing layers other than fully connected layers, a unit may also be regarded as activated when, for example, it falls within a predetermined number or a predetermined proportion of units in descending order of output within each channel. A channel is the output of the convolution operation computed for each filter.
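The two activation criteria above, a threshold on the output and a top-ranked count, can be sketched as follows, together with the intersection that gives the commonly activated units. Representing a layer's outputs as a flat list and the function names are assumptions made for illustration.

```python
from typing import List, Set


def active_by_threshold(outputs: List[float], threshold: float) -> Set[int]:
    """Indices of units whose output exceeds `threshold`."""
    return {i for i, v in enumerate(outputs) if v > threshold}


def active_by_top_k(outputs: List[float], k: int) -> Set[int]:
    """Indices of the `k` units with the largest outputs."""
    ranked = sorted(range(len(outputs)), key=lambda i: outputs[i], reverse=True)
    return set(ranked[:k])


def commonly_active(first: Set[int], second: Set[int]) -> Set[int]:
    """Units activated for both the first image and the second image."""
    return first & second
```

For a convolution or pooling layer, the same functions could be applied to each channel separately, matching the per-channel criterion.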

The extraction unit 133 preferably selects the last layer, that is, the final processing layer among the plurality of processing layers, as the post-stage processing layer. However, there may be no first image outputs and second image outputs that are commonly activated in the last layer. In that case, having selected the last layer as the post-stage processing layer, the extraction unit 133 may instead extract one or more first image outputs and one or more second image outputs that are commonly activated in a processing layer preceding the last layer.

For example, suppose that the extraction unit 133 has selected the output layer M10, which is the last layer, as the post-stage processing layer, and that no first image outputs and second image outputs are commonly activated in the output layer M10. In this case, the extraction unit 133 searches the processing layers preceding the output layer M10, one after another, for one or more commonly activated first image outputs and second image outputs. If, for example, the second fully connected layer M9, which immediately precedes the output layer M10, contains one or more commonly activated first image outputs and second image outputs, the extraction unit 133 selects the second fully connected layer M9 as the post-stage processing layer. The extraction unit 133 then extracts the one or more first image outputs and one or more second image outputs that are commonly activated in the second fully connected layer M9. In this way, the extraction unit 133 can associate the subjects shown in the first image and the second image even when the two images have few matching regions.
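The fallback search described above can be sketched as a walk from the last layer toward the input that stops at the first layer with a non-empty set of commonly activated units. The dictionaries mapping layer names to sets of activated unit indices, and the function name, are assumptions made for this sketch.

```python
from typing import Dict, List, Optional, Set, Tuple


def select_post_stage_layer(
    layer_names: List[str],
    first_active: Dict[str, Set[int]],
    second_active: Dict[str, Set[int]],
) -> Optional[Tuple[str, Set[int]]]:
    """Return the latest layer (and its common units) in which the two images
    share at least one activated unit, or None if no layer qualifies."""
    for name in reversed(layer_names):
        common = first_active[name] & second_active[name]
        if common:
            return name, common
    return None
```

With layers `["M8", "M9", "M10"]` and no overlap in `"M10"`, the function falls back to `"M9"` when that layer has a shared activated unit.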

The extraction unit 133 may select a processing layer specified by the user as the post-stage processing layer. Specifically, the instruction receiving unit 136 receives, via the operation unit 11, an instruction selecting the processing layer to be used as the post-stage processing layer from among the plurality of processing layers. The extraction unit 133 then uses the processing layer indicated by that instruction as the post-stage processing layer. In the example shown in FIG. 2, when the user selects the second fully connected layer M9, the extraction unit 133 uses the second fully connected layer M9 indicated by the received instruction as the post-stage processing layer. The extraction unit 133 inputs the extracted first image outputs and second image outputs to the feature point detection unit 137.

The feature point detection unit 137 detects one or more first image feature points based on the one or more first image outputs and detects one or more second image feature points based on the one or more second image outputs. Specifically, the feature point detection unit 137 first searches for corresponding feature points based on the one or more first image outputs and the one or more second image outputs. The feature point detection unit 137 then detects the one or more first image feature points based on the first image outputs and the one or more second image feature points based on the second image outputs that are in correspondence with each other. The feature point detection unit 137 inputs the detected first image feature points and second image feature points to the selection unit 138.

選択部１３８は、特徴点検出部１３７が特定した一以上の第１画像特徴点及び一以上の第２画像特徴点から、相互の対応関係に基づいて一部の第１画像特徴点及び一部の第２画像特徴点を選択する。具体的には、選択部１３８は、誤検出した対応関係を除去し、除去した後の対応関係に基づく一以上の第１画像特徴点及び一以上の第２画像特徴点を選択する。対応関係の誤検出は、対応関係にある第１画像特徴点及び第２画像特徴点で互いに齟齬が生じている状態であり、例えば特徴点の移動経路が中間画像の生成過程において交差する場合である。選択部１３８は、例えば、ＲＡＮＳＡＣ（Random Sampling Consensus）法又は最小２乗メディアン（ＬＭｅｄＳ：Least Median of Square）法に基づいて絞り込みを行うことにより対応関係を除去する。また、選択部１３８は、指示受付部１３６を介して、ユーザが選んだ対応関係を選択してもよい。 From the one or more first image feature points and one or more second image feature points identified by the feature point detection unit 137, the selection unit 138 selects some of the first image feature points and some of the second image feature points based on their mutual correspondence. Specifically, the selection unit 138 removes erroneously detected correspondences and selects the one or more first image feature points and one or more second image feature points that belong to the remaining correspondences. An erroneously detected correspondence is one in which the paired first image feature point and second image feature point are inconsistent with each other, for example when the movement paths of feature points cross during generation of the intermediate images. The selection unit 138 removes such correspondences by narrowing them down using, for example, the RANSAC (Random Sample Consensus) method or the least median of squares (LMedS) method. The selection unit 138 may also select correspondences chosen by the user via the instruction receiving unit 136.
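
The RANSAC-style narrowing described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes, purely for brevity, that correct correspondences share a single 2-D translation, and the function name `ransac_translation` and its parameters `tol`, `iters`, and `seed` are hypothetical.

```python
import random

def ransac_translation(pairs, tol=2.0, iters=100, seed=0):
    """Keep correspondences consistent with one dominant 2-D translation.

    pairs: list of ((x1, y1), (x2, y2)) first/second image feature points.
    The translation-only motion model is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    best = []
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.choice(pairs)  # minimal sample: one pair
        dx, dy = x2 - x1, y2 - y1               # hypothesised translation
        inliers = [
            p for p in pairs
            if abs(p[1][0] - p[0][0] - dx) <= tol
            and abs(p[1][1] - p[0][1] - dy) <= tol
        ]
        if len(inliers) > len(best):            # keep the largest consensus set
            best = inliers
    return best

# Three consistent pairs and one deliberate mismatch (made-up coordinates).
pairs = [((0, 0), (10, 5)), ((4, 1), (14, 6)), ((2, 7), (12, 12)),
         ((5, 5), (40, -3))]
kept = ransac_translation(pairs)
```

A real system would more likely fit an affine or projective model, which needs larger minimal samples; the consensus-counting loop stays the same.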

また、例えば、第２画像が３次元空間の座標系を含むＣＧ（Computer Graphics）画像である場合において、特徴点検出部１３７が検出した第２画像特徴点のうち、４点以上の第２画像特徴点が実空間中において同一直線上にあることが判明しているとする。この場合において、選択部１３８は、まず、対応関係に基づく第１画像特徴点及び第２画像特徴点それぞれにおいて複比を計算し、値が著しく異なっている第１画像特徴点及び第２画像特徴点があるか否かを判定する。そして、選択部１３８は、値が著しく異なっていると判定した第１画像特徴点及び第２画像特徴点を誤検出された特徴点であるとして、他の対応関係に基づく第１画像特徴点及び第２画像特徴点を選択する。 Further, suppose for example that the second image is a CG (Computer Graphics) image that includes a coordinate system of a three-dimensional space, and that four or more of the second image feature points detected by the feature point detection unit 137 are known to lie on the same straight line in real space. In this case, the selection unit 138 first calculates the cross-ratio for the first image feature points and for the second image feature points in the correspondence, and determines whether any first and second image feature points yield significantly different values. The selection unit 138 then treats first and second image feature points whose values are judged to differ significantly as erroneously detected, and selects the first and second image feature points that belong to the other correspondences.
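
As an illustration of the cross-ratio check, the sketch below computes the cross-ratio (AC · BD) / (BC · AD) of four collinear points given in order along the line, and flags a correspondence set whose two cross-ratios differ by more than a relative tolerance. The function names and the tolerance `rel_tol` are assumptions; the source does not specify how "significantly different" is decided.

```python
import math

def cross_ratio(a, b, c, d):
    """Cross-ratio (AC * BD) / (BC * AD) of four collinear 2-D points in order."""
    dist = math.dist
    return (dist(a, c) * dist(b, d)) / (dist(b, c) * dist(a, d))

def cross_ratio_mismatch(pts1, pts2, rel_tol=0.1):
    """True when the two cross-ratios differ by more than rel_tol (assumed threshold)."""
    r1, r2 = cross_ratio(*pts1), cross_ratio(*pts2)
    return abs(r1 - r2) > rel_tol * max(abs(r1), abs(r2))

line_a = [(0, 0), (1, 0), (2, 0), (3, 0)]    # cross-ratio 4/3
line_b = [(0, 0), (2, 2), (4, 4), (6, 6)]    # same spacing pattern, scaled
line_c = [(0, 0), (1, 0), (2, 0), (10, 0)]   # last point displaced

consistent = cross_ratio_mismatch(line_a, line_b)
suspicious = cross_ratio_mismatch(line_a, line_c)
```

The cross-ratio is preserved under projective transformations, which is why a large discrepancy suggests that at least one of the four correspondences is wrong.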

中間画像生成部１３９は、一以上の第１画像特徴点と一以上の第２画像特徴点とに基づいて、被写体が変化する過程を段階的に表した一以上の中間画像を生成する。具体的には、中間画像生成部１３９は、選択部１３８が誤検出を除去した後の対応関係に基づく一部の第１画像特徴点と一部の第２画像特徴点とに基づいて、被写体が変化する過程を段階的に表した一以上の中間画像を生成する。 The intermediate image generation unit 139 generates one or more intermediate images that represent, step by step, the process by which the subject changes, based on the one or more first image feature points and the one or more second image feature points. Specifically, it generates these intermediate images from the subset of first image feature points and second image feature points that belong to the correspondences remaining after the selection unit 138 has removed the false detections.

中間画像生成部１３９は、例えば、第１画像特徴点が示す第１画像の画素における座標と、第１画像特徴点に対応する第２画像特徴点が示す第２画像の画素における座標とに基づいて、変化ステップを計算する。そして、中間画像生成部１３９は、計算した変化ステップに基づいて、一以上の中間画像を生成する。変化ステップの計算方法は、公知の技術を使用できる。 The intermediate image generation unit 139 calculates a change step based on, for example, the coordinates of the pixel in the first image indicated by a first image feature point and the coordinates of the pixel in the second image indicated by the corresponding second image feature point. The intermediate image generation unit 139 then generates one or more intermediate images based on the calculated change step. A known technique can be used to calculate the change step.

中間画像生成部１３９は、所定の条件を満たす場合に、対応する第１画像特徴点と第２画像特徴点との間を補間する補間特徴点を生成してもよい。具体的には、中間画像生成部１３９は、所定の条件を満たす場合に、対応する第１画像特徴点と第２画像特徴点との間を補間することにより補間特徴点を生成し、複数の補間特徴点に基づいて中間画像を生成してもよい。所定の条件は、例えば、第１画像及び第２画像の被写体の種別が異なる場合、又は生成する中間画像の数が多い場合等である。 When a predetermined condition is satisfied, the intermediate image generation unit 139 may generate interpolated feature points that interpolate between corresponding first image feature points and second image feature points. Specifically, when the predetermined condition is satisfied, the intermediate image generation unit 139 may generate interpolated feature points by interpolating between each corresponding pair of first and second image feature points, and generate the intermediate images based on the plurality of interpolated feature points. The predetermined condition is, for example, that the subjects of the first image and the second image are of different types, or that a large number of intermediate images is to be generated.
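
One simple way to obtain change steps and interpolated feature points is linear interpolation of the paired coordinates. The source only says that a known technique can be used, so the linear scheme and the name `interpolate_points` below are assumptions made for illustration.

```python
def interpolate_points(pts1, pts2, steps):
    """Return `steps` intermediate point sets between pts1 and pts2.

    pts1[i] and pts2[i] are assumed to be a corresponding pair of
    first/second image feature points; interpolation is linear.
    """
    frames = []
    for k in range(1, steps + 1):
        t = k / (steps + 1)  # fraction of the way from image 1 to image 2
        frames.append([
            ((1 - t) * x1 + t * x2, (1 - t) * y1 + t * y2)
            for (x1, y1), (x2, y2) in zip(pts1, pts2)
        ])
    return frames

# Two corresponding pairs, three intermediate frames (t = 0.25, 0.5, 0.75).
frames = interpolate_points([(0, 0), (10, 0)], [(4, 8), (10, 4)], steps=3)
```

Each returned point set would then drive the warping of the two images for one intermediate frame; the warping itself is outside this sketch.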

ところで、第１画像及び第２画像の被写体の種別が異なる場合、第２画像として選択された被写体の形状によっては、対応する特徴点を検出できない可能性がある。そこで、中間画像生成部１３９は、複数の第２画像からモーフィングに適した画像を選択してもよい。具体的には、まず、画像取得部１３１は、変化後の被写体と同じ種別であって異なる形状の被写体が撮像された複数の第２画像を取得する。この場合において、中間画像生成部１３９は、特徴点検出部１３７が検出した一以上の第１画像出力に基づく一以上の第１画像特徴点と複数の第２画像それぞれに基づく一以上の第２画像特徴点とに基づいて、複数の第２画像から１つの第２画像を選択する。 By the way, when the types of subjects in the first image and the second image are different, the corresponding feature points may not be detected depending on the shape of the subject selected as the second image. Therefore, the intermediate image generation unit 139 may select an image suitable for morphing from a plurality of second images. Specifically, first, the image acquisition unit 131 acquires a plurality of second images in which subjects of the same type and different shapes as the changed subject are captured. In this case, the intermediate image generation unit 139 includes one or more first image feature points based on one or more first image outputs detected by the feature point detection unit 137 and one or more second images based on each of the plurality of second images. One second image is selected from the plurality of second images based on the image feature points.

中間画像生成部１３９は、例えば、第１画像特徴点に対応する第２画像特徴点の数が所定の基準値以上である複数の第２画像から１つの第２画像を選択する。所定の基準値は、例えば、第１画像及び第２画像の被写体の種別が同じであるか否かによって変わる変動値である。中間画像生成部１３９は、第１画像及び第２画像の被写体の種別が同じである場合、第１画像及び第２画像の被写体の種別が異なる場合に比べて基準値を低く設定する。反対に、中間画像生成部１３９は、第１画像及び第２画像の被写体の種別が異なる場合、第１画像及び第２画像の被写体の種別が同じである場合に比べて基準値を高くする。このようにすることで、中間画像生成部１３９は、モーフィング画像の質を向上させることができる。 The intermediate image generation unit 139 selects, for example, one second image from a plurality of second images in which the number of second image feature points corresponding to the first image feature points is equal to or greater than a predetermined reference value. The predetermined reference value is, for example, a variable value that changes depending on whether or not the types of subjects in the first image and the second image are the same. The intermediate image generation unit 139 sets the reference value lower when the types of the subjects of the first image and the second image are the same, as compared with the case where the types of the subjects of the first image and the second image are different. On the contrary, the intermediate image generation unit 139 raises the reference value when the types of the subjects of the first image and the second image are different, as compared with the case where the types of the subjects of the first image and the second image are the same. By doing so, the intermediate image generation unit 139 can improve the quality of the morphing image.

また、中間画像生成部１３９は、例えば、複数の第２画像のうち、第１画像に基づく第１画像特徴点に対応する第２画像特徴点が最も多い第２画像を選択する。具体的には、ユーザが、第１画像から特定の領域（図１に示す例において、第１画像Ａ１の目又は口の領域）を指定したとする。この場合において、中間画像生成部１３９は、複数の第２画像のうち、指示受付部１３６を介して、ユーザによって指定された第１画像における特定の領域に含まれる第１画像特徴点に対応する第２画像特徴点が最も多い第２画像を選択する。そして、中間画像生成部１３９は、第１画像及び選択した第２画像に基づいて、一以上の中間画像を生成する。このようにすることで、中間画像生成部１３９は、ユーザが意図したモーフィング画像を生成することができる。中間画像生成部１３９は、生成した中間画像を記憶部１２に記憶させる。 Further, the intermediate image generation unit 139 selects, for example, the second image that has the largest number of second image feature points corresponding to the first image feature points of the first image, from among the plurality of second images. Specifically, suppose that the user has designated a specific region of the first image (in the example shown in FIG. 1, the eye or mouth region of the first image A1). In this case, the intermediate image generation unit 139 selects, from among the plurality of second images, the second image with the largest number of second image feature points corresponding to the first image feature points contained in the region of the first image that the user designated via the instruction receiving unit 136. The intermediate image generation unit 139 then generates one or more intermediate images based on the first image and the selected second image. In this way, the intermediate image generation unit 139 can generate the morphing image that the user intended. The intermediate image generation unit 139 stores the generated intermediate images in the storage unit 12.
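
The two selection rules (a reference value that is stricter when the subject types differ, and a preference for the candidate with the most matched feature points) could be combined as in the sketch below. The function name, the image identifiers, and the numeric values `base` and `penalty` are hypothetical; the source gives no concrete reference values.

```python
def select_second_image(match_counts, same_type, base=10, penalty=10):
    """Pick one second image from several candidates.

    match_counts: {image_id: number of second image feature points that
                   correspond to first image feature points}.
    same_type:    True when both subjects are of the same type.
    base/penalty: assumed reference values, not taken from the source.
    """
    threshold = base if same_type else base + penalty  # stricter across types
    eligible = {k: v for k, v in match_counts.items() if v >= threshold}
    if not eligible:
        return None                                    # no suitable candidate
    return max(eligible, key=eligible.get)             # most correspondences

counts = {"B1": 8, "B2": 15, "B3": 23}
choice_same = select_second_image(counts, same_type=True)
choice_diff = select_second_image(counts, same_type=False)
choice_none = select_second_image({"B1": 8, "B2": 15}, same_type=False)
```

To honor a user-designated region, `match_counts` would be computed only over the first image feature points inside that region before calling the function.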

［抽出処理］ [Extraction process]
続いて、抽出部１３３が行う抽出処理について説明する。上述のとおり、抽出部１３３は、後段抽出部１３４及び前段抽出部１３５を有する。後段抽出部１３４は、第１画像が複数の処理層の一部である前段処理層及び後段処理層の順に伝搬したことにより後段処理層から出力された複数の後段第１画像出力、及び第２画像が前段処理層及び後段処理層の順に伝搬したことにより後段処理層から出力された複数の後段第２画像出力から、共通に活性化している一以上の後段第１画像出力及び一以上の後段第２画像出力を抽出する。 Next, the extraction process performed by the extraction unit 133 will be described. As described above, the extraction unit 133 has a post-stage extraction unit 134 and a pre-stage extraction unit 135. The post-stage extraction unit 134 works on two sets of outputs: the plurality of post-stage first image outputs that the post-stage processing layer produces when the first image propagates through the pre-stage processing layer and then the post-stage processing layer, both of which are among the plurality of processing layers, and the plurality of post-stage second image outputs that the post-stage processing layer produces when the second image propagates through the same two layers in the same order. From these, it extracts the one or more post-stage first image outputs and one or more post-stage second image outputs that are activated in common.

前段抽出部１３５は、一以上の後段第１画像出力及び一以上の後段第２画像出力を活性化させる要因となった前段処理層から出力された複数の前段第１画像出力、及び前段処理層から出力された複数の前段第２画像出力のうち、共通に活性化している一以上の前段第１画像出力及び一以上の前段第２画像出力を抽出する。 From the plurality of pre-stage first image outputs and the plurality of pre-stage second image outputs produced by the pre-stage processing layer, which caused the one or more post-stage first image outputs and one or more post-stage second image outputs to be activated, the pre-stage extraction unit 135 extracts the one or more pre-stage first image outputs and one or more pre-stage second image outputs that are activated in common.

図４から図９は、抽出部１３３が行う抽出処理について説明するための図である。図４から図９は、前段処理層から後段処理層に伝搬させた状態を示している。図４から図９において、実線で示すユニットを結合する結合線は、結合するユニットから出力があったことを示し、破線で示す結合線は、結合するユニットから出力が無かったことを示す。また、結合線を示す線の太さは、結合するユニットからの出力の大きさを示す。 4 to 9 are diagrams for explaining the extraction process performed by the extraction unit 133. 4 to 9 show a state of propagation from the pre-stage processing layer to the post-stage processing layer. In FIGS. 4 to 9, the connecting line connecting the units shown by the solid line indicates that there was an output from the connecting unit, and the connecting line shown by the broken line indicates that there was no output from the connecting unit. The thickness of the line indicating the connecting line indicates the magnitude of the output from the unit to be connected.

図４の場合において、後段処理層は、最後尾層（例えば、出力層又は全結合層等）又は抽出部１３３が選択した最後尾層より前の処理層（全結合層又はプーリング層等）であり、前段処理層は、後段処理層の直前の処理層（例えば、全結合層又はプーリング層等）である。図４においては、後段処理層が出力層Ｍ２０であり、前段処理層が全結合層Ｍ１９であるとして説明する。 In the case of FIG. 4, the post-stage processing layer is either the last layer (for example, an output layer or a fully connected layer) or a processing layer, selected by the extraction unit 133, that precedes the last layer (for example, a fully connected layer or a pooling layer), and the pre-stage processing layer is the processing layer immediately before the post-stage processing layer (for example, a fully connected layer or a pooling layer). In FIG. 4, the post-stage processing layer is assumed to be the output layer M20 and the pre-stage processing layer the fully connected layer M19.

図４（ａ）は、抽出前の状態であり、図４（ｂ）は抽出後の状態である。第１画像において、出力層Ｍ２０は、ユニットＵ５、Ｕ８が活性化しており、全結合層Ｍ１９は、ユニットＵ２、Ｕ５、Ｕ６、Ｕ７、Ｕ８が活性化している。第２画像において、出力層Ｍ２０は、ユニットＵ３、Ｕ５が活性化しており、全結合層Ｍ１９は、ユニットＵ２、Ｕ４、Ｕ５、Ｕ８が活性化している。 FIG. 4A is a state before extraction, and FIG. 4B is a state after extraction. In the first image, units U5 and U8 are activated in the output layer M20, and units U2, U5, U6, U7 and U8 are activated in the fully connected layer M19. In the second image, the output layer M20 has the units U3 and U5 activated, and the fully connected layer M19 has the units U2, U4, U5 and U8 activated.

この場合において、後段抽出部１３４は、後段処理層である出力層Ｍ２０から出力された後段第１画像出力であるユニットＵ５、Ｕ８、及び出力層Ｍ２０から出力された後段第２画像出力であるユニットＵ３、Ｕ５を比較する。そして、後段抽出部１３４は、共通に活性化している後段第１画像出力のユニットＵ５及び後段第２画像出力のユニットＵ５を抽出する。 In this case, the post-stage extraction unit 134 compares units U5 and U8, which are the post-stage first image outputs from the output layer M20 (the post-stage processing layer), with units U3 and U5, which are the post-stage second image outputs from the output layer M20. The post-stage extraction unit 134 then extracts unit U5 of the post-stage first image output and unit U5 of the post-stage second image output, which are activated in common.

続いて、前段抽出部１３５は、後段第１画像出力のユニットＵ５を活性化させる要因となった前段処理層である全結合層Ｍ１９から出力された前段第１画像出力であるユニットＵ２、Ｕ５、Ｕ６、及び後段第２画像出力のユニットＵ５を活性化させる要因となった全結合層Ｍ１９から出力された前段第２画像出力であるユニットＵ２、Ｕ５、Ｕ８を比較する。そして、前段抽出部１３５は、共通に活性化している前段第１画像出力のユニットＵ２、Ｕ５、及び前段第２画像出力のユニットＵ２、Ｕ５を抽出する。 Next, the pre-stage extraction unit 135 compares units U2, U5, and U6, which are the pre-stage first image outputs from the fully connected layer M19 (the pre-stage processing layer) that caused unit U5 of the post-stage first image output to be activated, with units U2, U5, and U8, which are the pre-stage second image outputs from the fully connected layer M19 that caused unit U5 of the post-stage second image output to be activated. The pre-stage extraction unit 135 then extracts units U2 and U5 of the pre-stage first image output and units U2 and U5 of the pre-stage second image output, which are activated in common.
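
The FIG. 4 walk-through amounts to intersecting the sets of activated units for the two images, first in the post-stage layer and then among the pre-stage units that contributed to the shared post-stage unit. A minimal sketch, with unit numbers taken from the description and a hypothetical helper name:

```python
def common_units(active_first, active_second):
    """Units activated for both the first and the second image."""
    return sorted(set(active_first) & set(active_second))

# Post-stage layer (output layer M20): units active for each image.
rear_common = common_units([5, 8], [3, 5])

# Pre-stage layer (fully connected layer M19): units that contributed
# to unit U5 of the post-stage layer, for each image.
front_common = common_units([2, 5, 6], [2, 5, 8])
```

Here `rear_common` holds unit 5 and `front_common` holds units 2 and 5, matching the units extracted in the text.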

抽出部１３３は、出力層Ｍ２０から全結合層Ｍ１９までの出力を抽出すると、次の処理層に対する出力を抽出する。具体的には、抽出部１３３は、処理層ごとに、共通に活性化している第１画像出力及び第２画像出力を抽出する処理を、伝搬制御部１３２が伝搬させた順序とは逆の順序で繰り返し行う。より具体的には、抽出部１３３は、複数の処理層のうち一つの層を後段処理層として選択して一以上の第１画像出力及び一以上の第２画像出力を抽出した後に、前段処理層として選択した処理層を後段処理層として選択して、別の一以上の第１画像出力及び一以上の第２画像出力を抽出する。このようにすることで、抽出部１３３は、第１画像及び第２画像に対する比較の精度を高めることができる。 After extracting the outputs for the output layer M20 and the fully connected layer M19, the extraction unit 133 extracts the outputs for the next processing layer. Specifically, the extraction unit 133 repeats, layer by layer, the process of extracting the commonly activated first image outputs and second image outputs, in the reverse of the order in which the propagation control unit 132 propagated the images. More specifically, after selecting one of the plurality of processing layers as the post-stage processing layer and extracting one or more first image outputs and one or more second image outputs, the extraction unit 133 selects the layer that served as the pre-stage processing layer as the new post-stage processing layer and extracts a further set of one or more first image outputs and one or more second image outputs. In this way, the extraction unit 133 can improve the accuracy of the comparison between the first image and the second image.

図５は、第１画像に基づいて、前段処理層から後段処理層に伝搬させた状態を示している。図６は、第２画像に基づいて、前段処理層から後段処理層に伝搬させた状態を示している。図５及び図６の場合において、後段処理層は、全結合層Ｍ１８であり、前段処理層は、全結合層以外の処理層（例えば、プーリング層又は畳み込み層等）である。図５及び図６においては、前段処理層がプーリング層Ｍ１７であるとして説明する。また、図５及び図６において、前段処理層は、３つのチャンネルを有する。上段の第１チャンネルは、ユニットＵ１１、Ｕ１２、Ｕ１３、Ｕ１４、及びＵ１５を含む。中段の第２チャンネルは、ユニットＵ２１、Ｕ２２、Ｕ２３、Ｕ２４、及びＵ２５を含む。下段の第３チャンネルは、ユニットＵ３１、Ｕ３２、Ｕ３３、Ｕ３４、及びＵ３５を含む。 FIG. 5 shows the state in which the first image has been propagated from the pre-stage processing layer to the post-stage processing layer. FIG. 6 shows the same state for the second image. In FIGS. 5 and 6, the post-stage processing layer is the fully connected layer M18, and the pre-stage processing layer is a processing layer other than a fully connected layer (for example, a pooling layer or a convolution layer). In FIGS. 5 and 6, the pre-stage processing layer is assumed to be the pooling layer M17 and has three channels. The first channel, in the upper row, includes units U11, U12, U13, U14, and U15. The second channel, in the middle row, includes units U21, U22, U23, U24, and U25. The third channel, in the lower row, includes units U31, U32, U33, U34, and U35.

第１画像において、プーリング層Ｍ１７は、第１チャンネルに含まれるユニットＵ１３及び第２チャンネルに含まれるユニットＵ２１、Ｕ２４が活性化している。第２画像において、プーリング層Ｍ１７は、第２チャンネルに含まれるユニットＵ２２、Ｕ２４、Ｕ２５及び第３チャンネルに含まれるユニットＵ３２、Ｕ３３が活性化している。 For the first image, unit U13 in the first channel and units U21 and U24 in the second channel of the pooling layer M17 are activated. For the second image, units U22, U24, and U25 in the second channel and units U32 and U33 in the third channel of the pooling layer M17 are activated.

前段抽出部１３５は、後段第１画像出力のユニットＵ５を活性化させる要因となった前段処理層であるプーリング層Ｍ１７から出力された前段第１画像出力、及び後段第２画像出力のユニットＵ５を活性化させる要因となったプーリング層Ｍ１７から出力された前段第２画像出力を比較する。前段抽出部１３５は、活性化しているユニットの有無を調べ、活性化している前段第１画像出力の第１チャンネルに含まれるユニットＵ１３及び第２チャンネルに含まれるＵ２１、Ｕ２４と、前段第２画像出力の第２チャンネルに含まれるユニットＵ２２、Ｕ２４、Ｕ２５及び第３チャンネルに含まれるＵ３２、Ｕ３３とに着目する。 The pre-stage extraction unit 135 compares the pre-stage first image outputs from the pooling layer M17 (the pre-stage processing layer) that caused unit U5 of the post-stage first image output to be activated with the pre-stage second image outputs from the pooling layer M17 that caused unit U5 of the post-stage second image output to be activated. The pre-stage extraction unit 135 checks which units are activated and focuses on unit U13 in the first channel and units U21 and U24 in the second channel of the pre-stage first image output, and on units U22, U24, and U25 in the second channel and units U32 and U33 in the third channel of the pre-stage second image output.

そして、前段抽出部１３５は、前段第１画像出力と前段第２画像出力との両方において活性化しているユニットが存在しているチャンネルが第２チャンネルであることから、前段第１画像出力の第２チャンネルに含まれるユニットＵ２１、Ｕ２４及び前段第２画像出力の第２チャンネルに含まれるユニットＵ２２、Ｕ２４、Ｕ２５を抽出する。 Because the second channel is the channel that contains activated units in both the pre-stage first image output and the pre-stage second image output, the pre-stage extraction unit 135 extracts units U21 and U24 in the second channel of the pre-stage first image output and units U22, U24, and U25 in the second channel of the pre-stage second image output.
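
The channel-level comparison of FIGS. 5 and 6 can be sketched as follows: a channel is kept when it contains at least one activated unit for both images, and all activated units of that channel are then extracted for each image. The dictionary layout and the function name are assumptions made for illustration.

```python
def extract_common_channels(first, second):
    """Keep the channels that have activated units for both images.

    first/second: {channel number: [activated unit names]}.
    Returns the activated units of the shared channels for each image.
    """
    shared = sorted(ch for ch in first if first[ch] and second.get(ch))
    return ({ch: first[ch] for ch in shared},
            {ch: second[ch] for ch in shared})

# Activated units of the pooling layer M17, as listed in the description.
first = {1: ["U13"], 2: ["U21", "U24"]}
second = {2: ["U22", "U24", "U25"], 3: ["U32", "U33"]}
first_kept, second_kept = extract_common_channels(first, second)
```

Only channel 2 survives, so the units kept are U21 and U24 for the first image and U22, U24, and U25 for the second, as in the text.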

図７の場合において、後段処理層は、プーリング層Ｍ１６であり、前段処理層は、プーリング層以外の処理層（例えば、畳み込み層又は正規化層等）である。図７においては、前段処理層が畳み込み層Ｍ１５であるとして説明する。また、図７において、前段処理層は、チャンネルが１つであるとして説明する。第１画像において、プーリング層Ｍ１６は、ユニットＵ５が活性化しており、畳み込み層Ｍ１５は、ユニットＵ３、Ｕ５が活性化している。第２画像において、プーリング層Ｍ１６は、ユニットＵ３が活性化しており、畳み込み層Ｍ１５は、ユニットＵ３、Ｕ４が活性化している。 In the case of FIG. 7, the post-stage processing layer is the pooling layer M16, and the pre-stage processing layer is a processing layer other than a pooling layer (for example, a convolution layer or a normalization layer). In FIG. 7, the pre-stage processing layer is assumed to be the convolution layer M15 and to have a single channel. For the first image, unit U5 is activated in the pooling layer M16, and units U3 and U5 are activated in the convolution layer M15. For the second image, unit U3 is activated in the pooling layer M16, and units U3 and U4 are activated in the convolution layer M15.

ここで、抽出部１３３は、画像の圧縮を行うプーリング層においては、直前の処理層からプーリング層に結合している複数のユニットのうち、チャンネルごとに活性化している程度に基づいて出力を抽出する。具体的には、前段抽出部１３５は、複数の前段第１画像出力及び複数の前段第２画像出力のうち、活性化している大きさに基づいて、一以上の前段第１画像出力及び一以上の前段第２画像出力を抽出する。前段抽出部１３５は、例えば、複数の前段第１画像出力及び複数の前段第２画像出力のうち、チャンネルごとに最も大きく活性化している一以上の前段第１画像出力及び一以上の前段第２画像出力を抽出する。 Here, for a pooling layer, which compresses the image, the extraction unit 133 extracts outputs from among the units of the immediately preceding processing layer that are connected to the pooling layer, based on the degree of activation in each channel. Specifically, the pre-stage extraction unit 135 extracts one or more pre-stage first image outputs and one or more pre-stage second image outputs from the plurality of pre-stage first image outputs and the plurality of pre-stage second image outputs, based on the magnitude of their activation. For example, the pre-stage extraction unit 135 extracts, for each channel, the pre-stage first image outputs and pre-stage second image outputs that are most strongly activated.

この場合において、後段抽出部１３４は、直前の抽出処理において前段処理層として選択したプーリング層Ｍ１６を選択して、プーリング層Ｍ１６から出力された後段第１画像出力のユニットＵ５、及びプーリング層Ｍ１６から出力された後段第２画像出力のユニットＵ３を抽出する。そして、前段抽出部１３５は、前段第１画像出力のユニットＵ３、Ｕ５及び前段第２画像出力のユニットＵ３、Ｕ４のうち、チャンネルごとに最も大きく活性化している前段第１画像出力のユニットＵ５、及び前段第２画像出力のユニットＵ４を抽出する。このようにすることで、前段抽出部１３５は、画像の中で特徴となる領域を特定することができる。 In this case, the post-stage extraction unit 134 selects the pooling layer M16, which was the pre-stage processing layer in the immediately preceding extraction step, and extracts unit U5 of the post-stage first image output and unit U3 of the post-stage second image output from the pooling layer M16. Then, from units U3 and U5 of the pre-stage first image output and units U3 and U4 of the pre-stage second image output, the pre-stage extraction unit 135 extracts the units that are most strongly activated in each channel, namely unit U5 of the pre-stage first image output and unit U4 of the pre-stage second image output. In this way, the pre-stage extraction unit 135 can identify the characteristic region in the image.
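
For a pooling layer, the rule of keeping the most strongly activated unit per channel is an argmax over the connected pre-stage units. The sketch below assumes a single channel as in FIG. 7; the activation values are invented for illustration, since the source does not give any, and the function name is hypothetical.

```python
def strongest_unit_per_channel(activations):
    """Map each channel to its most strongly activated unit.

    activations: {channel: {unit name: activation value}}.
    """
    return {ch: max(units, key=units.get)
            for ch, units in activations.items() if units}

# Convolution layer M15, one channel; the values are made up.
first = {1: {"U3": 0.2, "U5": 0.9}}
second = {1: {"U3": 0.4, "U4": 0.7}}
first_pick = strongest_unit_per_channel(first)
second_pick = strongest_unit_per_channel(second)
```

With these assumed values the picks are U5 for the first image and U4 for the second, which is the outcome the description reports.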

図８の場合において、後段処理層は、畳み込み層Ｍ１４であり、前段処理層は、畳み込み層を含む他の処理層（例えば、正規化層又はプーリング層等）である。図８においては、前段処理層が正規化層Ｍ１３であるとして説明する。また、図８において、前段処理層は、チャンネルが１つであるとして説明する。第１画像において、畳み込み層Ｍ１４は、ユニットＵ５が活性化しており、正規化層Ｍ１３は、ユニットＵ３、Ｕ５、Ｕ６が活性化している。第２画像において、畳み込み層Ｍ１４は、ユニットＵ３が活性化しており、正規化層Ｍ１３は、ユニットＵ３、Ｕ４、Ｕ５が活性化している。 In the case of FIG. 8, the post-stage processing layer is the convolution layer M14, and the pre-stage processing layer is any other processing layer, including another convolution layer (for example, a normalization layer or a pooling layer). In FIG. 8, the pre-stage processing layer is assumed to be the normalization layer M13 and to have a single channel. For the first image, unit U5 is activated in the convolution layer M14, and units U3, U5, and U6 are activated in the normalization layer M13. For the second image, unit U3 is activated in the convolution layer M14, and units U3, U4, and U5 are activated in the normalization layer M13.

この場合において、後段抽出部１３４は、直前の抽出処理において前段処理層として選択した畳み込み層Ｍ１４を選択して、畳み込み層Ｍ１４から出力された後段第１画像出力のユニットＵ５、及び畳み込み層Ｍ１４から出力された後段第２画像出力のユニットＵ３を抽出する。 In this case, the post-stage extraction unit 134 selects the convolution layer M14 selected as the pre-stage processing layer in the immediately preceding extraction process, and from the post-stage first image output unit U5 and the convolution layer M14 output from the convolution layer M14. The unit U3 of the output second image output after the output is extracted.

続いて、前段抽出部１３５は、後段第１画像出力のユニットＵ５を活性化させる要因となった前段処理層である正規化層Ｍ１３から出力された前段第１画像出力、及び後段第２画像出力のユニットＵ３を活性化させる要因となった前段処理層である正規化層Ｍ１３から出力された前段第２画像出力を比較する。ここで、前段抽出部１３５は、後段処理層が畳み込み層である場合、後段抽出部１３４が後段処理層から抽出したユニットに結合する前段処理層の複数のユニットのうち、前段第１画像出力と前段第２画像出力とにおいて位置が相対的に同じであり、かつチャンネルが共通するユニットを抽出する。この場合、前段抽出部１３５は、前段第１画像出力と前段第２画像出力とにおいて位置が相対的に同じであり、かつチャンネルが共通するユニットとして、前段第１画像出力のユニットＵ５、Ｕ６、及び前段第２画像出力のユニットＵ３、Ｕ４を抽出する。 Next, the pre-stage extraction unit 135 compares the pre-stage first image outputs from the normalization layer M13 (the pre-stage processing layer) that caused unit U5 of the post-stage first image output to be activated with the pre-stage second image outputs from the normalization layer M13 that caused unit U3 of the post-stage second image output to be activated. When the post-stage processing layer is a convolution layer, the pre-stage extraction unit 135 extracts, from among the units of the pre-stage processing layer that are connected to the units extracted from the post-stage processing layer by the post-stage extraction unit 134, the units that occupy the same relative position in the pre-stage first image output and the pre-stage second image output and that share a channel. In this case, the units meeting both conditions are units U5 and U6 of the pre-stage first image output and units U3 and U4 of the pre-stage second image output, and the pre-stage extraction unit 135 extracts them.
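
One way to read "same relative position" is as an offset from the extracted post-stage unit: a pre-stage unit is kept when a unit at the same offset is activated for both images. The sketch below applies that reading to the FIG. 8 numbers, treating unit indices as 1-D positions in a single channel; both the interpretation and the function name are assumptions.

```python
def match_relative_positions(rear1, active1, rear2, active2):
    """Keep pre-stage units whose offset from the post-stage unit is shared.

    rear1/rear2:     index of the extracted post-stage unit for each image.
    active1/active2: indices of activated pre-stage units for each image.
    """
    offsets = {u - rear1 for u in active1} & {u - rear2 for u in active2}
    return (sorted(rear1 + o for o in offsets),
            sorted(rear2 + o for o in offsets))

# First image: post-stage unit U5, activated pre-stage units U3, U5, U6.
# Second image: post-stage unit U3, activated pre-stage units U3, U4, U5.
first_units, second_units = match_relative_positions(5, [3, 5, 6], 3, [3, 4, 5])
```

The shared offsets are 0 and +1, giving units 5 and 6 for the first image and units 3 and 4 for the second, consistent with the units named in the text.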

図９の場合において、後段処理層は、正規化層Ｍ１２であり、前段処理層は、正規化層以外の処理層（例えば、畳み込み層又はプーリング層等）である。図９においては、前段処理層がプーリング層Ｍ１１であるとして説明する。また、図９において、前段処理層は、チャンネルが１つであるとして説明する。第１画像において、正規化層Ｍ１２は、ユニットＵ５が活性化している。第２画像において、正規化層Ｍ１２は、ユニットＵ３が活性化している。 In the case of FIG. 9, the post-stage processing layer is the normalization layer M12, and the pre-stage processing layer is a processing layer other than a normalization layer (for example, a convolution layer or a pooling layer). In FIG. 9, the pre-stage processing layer is assumed to be the pooling layer M11 and to have a single channel. For the first image, unit U5 is activated in the normalization layer M12. For the second image, unit U3 is activated in the normalization layer M12.

ここで、抽出部１３３は、画像に対して前処理を行う正規化層においては、後段処理層において活性化しているユニットに結合している前段処理層に含まれる複数のユニットのうち、中心のユニットを抽出する。この場合において、後段抽出部１３４は、後段処理層として選択した正規化層Ｍ１２から出力された後段第１画像出力のユニットＵ５、及び正規化層Ｍ１２から出力された後段第２画像出力のユニットＵ３を抽出する。 Here, for a normalization layer, which preprocesses the image, the extraction unit 133 extracts the central unit among the plurality of units of the pre-stage processing layer that are connected to the unit activated in the post-stage processing layer. In this case, the post-stage extraction unit 134 extracts unit U5 of the post-stage first image output and unit U3 of the post-stage second image output, both output from the normalization layer M12 selected as the post-stage processing layer.

そして、前段抽出部１３５は、正規化層Ｍ１２から出力された後段第１画像出力のユニットＵ５に結合しているプーリング層Ｍ１１のユニットのうち、中心のユニットＵ５を抽出する。同様に、前段抽出部１３５は、正規化層Ｍ１２から出力された後段第２画像出力のユニットＵ３に結合しているプーリング層Ｍ１１のユニットのうち、中心のユニットＵ３を抽出する。 Then, the front-stage extraction unit 135 extracts the central unit U5 from the units of the pooling layer M11 coupled to the unit U5 of the rear-stage first image output output from the normalization layer M12. Similarly, the front-stage extraction unit 135 extracts the central unit U3 from the units of the pooling layer M11 coupled to the unit U3 of the rear-stage second image output output from the normalization layer M12.
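
The normalization-layer rule reduces to taking the middle element of the ordered pre-stage units that feed one activated post-stage unit. The sketch below assumes a connection window of three units centered on the same index; the window size and the function name are hypothetical, since the source does not state them.

```python
def center_unit(connected_units):
    """Return the middle element of the ordered units feeding one post-stage unit."""
    return connected_units[len(connected_units) // 2]

# Assumed windows of pooling layer M11 units connected to M12 units U5 and U3.
first_center = center_unit([4, 5, 6])
second_center = center_unit([2, 3, 4])
```

Under this assumption the extracted units are U5 for the first image and U3 for the second, as described.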

抽出部１３３は、上述の抽出処理を入力層まで繰り返し行うことが好ましい。しかし、抽出部１３３は、抽出処理を最初の処理層まで行わずに、途中の処理層（例えば、プーリング層又は正規化層等）で終了してもよい。このように、抽出部１３３は、伝搬制御部１３２が伝搬させた順序とは逆の順序で抽出処理を行うことにより、抽象度が高い出力を抽出することができる。 It is preferable that the extraction unit 133 repeats the above-mentioned extraction process up to the input layer. However, the extraction unit 133 may end the extraction process at an intermediate process layer (for example, a pooling layer, a normalization layer, or the like) without performing the extraction process up to the first process layer. In this way, the extraction unit 133 can extract an output having a high degree of abstraction by performing the extraction process in the order opposite to the order in which the propagation control unit 132 propagates.

［モーフィング画像生成装置１の処理］ [Processing of morphing image generator 1]
続いて、モーフィング画像生成装置１が行う処理の流れを説明する。図１０は、モーフィング画像生成装置１が行う処理の流れを示すフローチャートである。本フローチャートは、モーフィング画像生成装置１が、記憶部１２に第１画像及び第２画像が格納され、モーフィング画像を生成する処理を実行する操作を受け付けたことを契機として開始する。 Next, the flow of processing performed by the morphing image generator 1 will be described. FIG. 10 is a flowchart showing the flow of processing performed by the morphing image generator 1. This flowchart starts when the first image and the second image have been stored in the storage unit 12 and the morphing image generator 1 receives an operation to execute the process of generating a morphing image.

画像取得部１３１は、記憶部１２に記憶されている第１画像と第２画像とを取得する（Ｓ１）。画像取得部１３１は、取得した第１画像と第２画像とを、伝搬制御部１３２に入力する。伝搬制御部１３２は、画像取得部１３１から入力された第１画像及び第２画像のそれぞれに、機械学習モデルＭに含まれる入力層Ｍ１から出力層Ｍ１０までの複数の処理層を、入力層Ｍ１から順に伝搬させる（Ｓ２）。 The image acquisition unit 131 acquires the first image and the second image stored in the storage unit 12 (S1). The image acquisition unit 131 inputs the acquired first image and second image to the propagation control unit 132. The propagation control unit 132 propagates each of the first image and the second image input from the image acquisition unit 131 through the plurality of processing layers of the machine learning model M, from the input layer M1 to the output layer M10, in order starting from the input layer M1 (S2).

抽出部１３３は、後段処理層及び前段処理層の両方の処理層において共通に活性化している一以上の第１画像出力及び一以上の第２画像出力を抽出する処理を行う（Ｓ３）。図１１は、抽出部１３３が行う処理の流れを示すフローチャートである。抽出部１３３は、指示受付部１３６が、操作部１１を介して、複数の処理層のうち、後段処理層として用いる処理層を選択する指示を受け付けたか否かを判定する（Ｓ３１）。 The extraction unit 133 performs a process of extracting one or more first image outputs and one or more second image outputs that are commonly activated in both the processing layers of the post-stage processing layer and the front-stage processing layer (S3). FIG. 11 is a flowchart showing the flow of processing performed by the extraction unit 133. The extraction unit 133 determines whether or not the instruction receiving unit 136 has received an instruction to select a processing layer to be used as the subsequent processing layer from the plurality of processing layers via the operation unit 11 (S31).

抽出部１３３は、指示受付部１３６が指示を受け付けたと判定した場合、指示受付部１３６が受け付けた指示が示す処理層を、後段処理層として選択する（Ｓ３２）。抽出部１３３は、例えば、指示受付部１３６が第１の全結合層Ｍ８を示す指示を受け付けたと判定した場合、指示受付部１３６が受け付けた指示が示す第１の全結合層Ｍ８を、後段処理層として使用する。一方、抽出部１３３は、指示受付部１３６が指示を受け付けていないと判定した場合、最後尾層（例えば、出力層Ｍ１０）で共通に活性化している一以上の第１画像出力及び一以上の第２画像出力があるか否かを判定する（Ｓ３３）。 When the extraction unit 133 determines that the instruction receiving unit 136 has received the instruction, the extraction unit 133 selects the processing layer indicated by the instruction received by the instruction receiving unit 136 as the subsequent processing layer (S32). When, for example, the extraction unit 133 determines that the instruction receiving unit 136 has received the instruction indicating the first fully connected layer M8, the extraction unit 133 processes the first fully connected layer M8 indicated by the instruction received by the instruction receiving unit 136 in a subsequent stage. Used as a layer. On the other hand, when the extraction unit 133 determines that the instruction reception unit 136 has not received the instruction, the extraction unit 133 has one or more first image outputs and one or more images that are commonly activated in the rearmost layer (for example, the output layer M10). It is determined whether or not there is a second image output (S33).

When the extraction unit 133 determines that the output layer M10 contains one or more first image outputs and one or more second image outputs that are commonly activated, it uses the output layer M10, which is the last layer, as the post-stage processing layer (S34). When it determines that the output layer M10 contains no such outputs, it repeatedly searches each processing layer before the output layer M10 for one or more first image outputs and one or more second image outputs that are commonly activated, and uses a processing layer that has them (for example, the second fully connected layer M9) as the post-stage processing layer (S35). The extraction unit 133 then extracts one or more first image outputs, output from the post-stage processing layer and the pre-stage processing layer based on the first image, and one or more second image outputs, output from those layers based on the second image, that are commonly activated in both the selected post-stage processing layer and the pre-stage processing layer.
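
The selection of the post-stage processing layer in S31 to S35 can be sketched as follows. This is only an illustration under stated assumptions: the description does not define "commonly activated" numerically, so the sketch assumes that a unit counts as commonly activated when its activation exceeds a threshold for both images. The function names, the dictionary layout, and the threshold are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def commonly_active(acts_a: List[float], acts_b: List[float],
                    threshold: float = 0.0) -> Set[int]:
    """Indices of units whose activation exceeds `threshold` for both images.

    Assumption: this threshold test stands in for "commonly activated".
    """
    return {i for i, (a, b) in enumerate(zip(acts_a, acts_b))
            if a > threshold and b > threshold}


def select_post_stage_layer(layer_names: List[str],
                            acts1: Dict[str, List[float]],
                            acts2: Dict[str, List[float]],
                            instructed: Optional[str] = None) -> Optional[str]:
    """Pick the post-stage processing layer.

    layer_names is ordered from the earliest layer to the last layer.
    acts1 / acts2 map a layer name to that layer's outputs for the
    first / second image.
    """
    # S31, S32: a layer chosen by user instruction takes priority.
    if instructed is not None:
        return instructed
    # S33 to S35: start at the last layer and walk backwards until a
    # layer with commonly activated outputs is found.
    for name in reversed(layer_names):
        if commonly_active(acts1[name], acts2[name]):
            return name
    return None  # no layer has commonly activated outputs
```

With activations in which the output layer has no unit active for both images, the function falls back to the nearest earlier layer that does, which mirrors the M10 to M9 example above.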

Specifically, the post-stage extraction unit 134 first extracts, from the plurality of post-stage first image outputs and the plurality of post-stage second image outputs output from the selected post-stage processing layer, one or more post-stage first image outputs and one or more post-stage second image outputs that are commonly activated (S36). The pre-stage extraction unit 135 then extracts, from among the plurality of pre-stage first image outputs and the plurality of pre-stage second image outputs that were output from the pre-stage processing layer and that were factors in activating the outputs extracted by the post-stage extraction unit 134, one or more pre-stage first image outputs and one or more pre-stage second image outputs that are commonly activated (S37).

Next, the extraction unit 133 determines whether another processing layer precedes the pre-stage processing layer (S38). When it determines that another processing layer (for example, the third convolution layer M6) precedes the pre-stage processing layer (for example, the second pooling layer M7), it selects the second pooling layer M7 as the post-stage processing layer (S39) and returns the processing to S36. When it determines that no other processing layer precedes the pre-stage processing layer (for example, the input layer M1), it inputs the extracted one or more first image outputs and one or more second image outputs to the feature point detection unit 137 and ends the extraction process.
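
The backward loop of S36 to S39 can be sketched for a small fully connected network. The description does not specify how a pre-stage output is judged to be a "factor" in activating a post-stage output, so this sketch assumes a simple rule: a pre-stage unit is a factor when its weighted contribution to a selected post-stage unit is positive for both images. The data layout and all names are hypothetical.

```python
from typing import Dict, List, Optional, Set

Matrix = List[List[float]]


def commonly_active(acts_a: List[float], acts_b: List[float]) -> Set[int]:
    """Units with positive activation for both images (assumed criterion)."""
    return {i for i, (a, b) in enumerate(zip(acts_a, acts_b))
            if a > 0 and b > 0}


def back_trace(acts1: List[List[float]], acts2: List[List[float]],
               weights: List[Optional[Matrix]],
               start: int) -> Dict[int, Set[int]]:
    """Trace commonly activated units from layer `start` back to layer 0.

    acts1[k], acts2[k]: outputs of layer k for the first / second image,
    with layer 0 being the input layer.
    weights[k][j][i]: weight from unit i of layer k-1 to unit j of layer k
    (weights[0] is unused).
    """
    # S36: commonly activated outputs of the initial post-stage layer.
    selected = {start: commonly_active(acts1[start], acts2[start])}
    post = start
    while post > 0:                      # S38: stop once the input layer is reached
        pre = post - 1
        keep: Set[int] = set()
        for j in selected[post]:
            for i, w in enumerate(weights[post][j]):
                # S37: keep pre-stage units that contributed positively
                # to unit j for BOTH images (assumed "factor" rule).
                if w * acts1[pre][i] > 0 and w * acts2[pre][i] > 0:
                    keep.add(i)
        selected[pre] = keep
        post = pre                       # S39: the pre-stage layer becomes the post-stage layer
    return selected
```

The units selected at layer 0 correspond to input positions, which is what a feature point detector would consume. Convolution and pooling layers would need their own contribution rule, which this sketch does not cover.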

Returning to FIG. 10, the feature point detection unit 137 searches for corresponding feature points based on the one or more first image outputs and the one or more second image outputs, and detects one or more first image feature points based on the first image outputs and one or more second image feature points based on the second image outputs that correspond to each other (S4). The selection unit 138 then determines whether the first image feature points and second image feature points detected by the feature point detection unit 137 include inappropriate feature points (S5). The selection unit 138 narrows down the feature points using, for example, the RANSAC method.

When the selection unit 138 determines that the first image feature points and second image feature points include inappropriate feature points, it removes them, that is, the first image feature points and second image feature points whose correspondence was erroneously detected (S6), and selects the subset of first image feature points and second image feature points given by the correspondences that remain. When the selection unit 138 determines that there are no inappropriate feature points, or after the erroneously detected correspondences have been removed, the intermediate image generation unit 139 generates one or more intermediate images based on the one or more first image feature points and the one or more second image feature points (S7).
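
The removal of erroneously detected correspondences in S5 and S6 can be illustrated with a minimal RANSAC loop. The description names RANSAC only as an example and does not state which motion model is fitted, so this sketch assumes the simplest one, a pure translation between the two images. The function name, tolerance, and iteration count are hypothetical.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def ransac_translation(pts1: List[Point], pts2: List[Point],
                       tol: float = 2.0, iterations: int = 50,
                       seed: int = 0) -> List[int]:
    """Return indices of correspondences consistent with one translation.

    pts1[i] and pts2[i] are a matched pair of feature points.
    Assumption: a translation model; a real system might fit an affine
    transform or homography instead.
    """
    if not pts1:
        return []
    rng = random.Random(seed)
    best: List[int] = []
    for _ in range(iterations):
        # Minimal sample: one correspondence fully determines a translation.
        k = rng.randrange(len(pts1))
        dx = pts2[k][0] - pts1[k][0]
        dy = pts2[k][1] - pts1[k][1]
        # Correspondences that agree with this translation within `tol` pixels.
        inliers = [i for i, (p, q) in enumerate(zip(pts1, pts2))
                   if (q[0] - p[0] - dx) ** 2 + (q[1] - p[1] - dy) ** 2 <= tol ** 2]
        if len(inliers) > len(best):
            best = inliers
    return best
```

Correspondences whose indices are not returned would be treated as erroneous detections and dropped before intermediate images are generated.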

The intermediate image generation unit 139 calculates a change step based on, for example, the pixel coordinates in the first image indicated by a first image feature point and the pixel coordinates in the second image indicated by the second image feature point corresponding to that first image feature point. It then generates one or more intermediate images based on the calculated change step and stores the generated intermediate images in the storage unit 12.
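
One way to read the change step is as an even division of the displacement between corresponding feature points. The sketch below assumes linear interpolation, which the description does not mandate, and computes only the feature point positions for each intermediate image; warping and blending the pixels are outside its scope. The function name and parameters are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def intermediate_points(pts1: List[Point], pts2: List[Point],
                        n_frames: int) -> List[List[Point]]:
    """Feature point positions for `n_frames` evenly spaced intermediate images.

    pts1[i] in the first image corresponds to pts2[i] in the second image.
    Assumption: each point moves along a straight line in equal steps.
    """
    frames = []
    for k in range(1, n_frames + 1):
        t = k / (n_frames + 1)  # fraction of the way from first to second image
        frames.append([(x1 + (x2 - x1) * t, y1 + (y2 - y1) * t)
                       for (x1, y1), (x2, y2) in zip(pts1, pts2)])
    return frames
```

For a point that moves from (0, 0) to (8, 4) with three intermediate images, the change step is (2, 1) per image.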

[Effects of the Embodiment]
As described above, the morphing image generator 1 propagates each of the acquired first image and second image through the plurality of processing layers included in the machine learning model M. Working in the reverse of the propagation order, the morphing image generator 1 extracts, for each processing layer, one or more first image outputs and one or more second image outputs that are commonly activated in both the post-stage processing layer and the pre-stage processing layer, and detects first image feature points and second image feature points that correspond to each other. The morphing image generator 1 then generates one or more intermediate images based on the one or more first image feature points and one or more second image feature points that remain after erroneously detected correspondences have been removed.

In this way, the morphing image generator 1 uses the machine learning model M, which includes a convolutional neural network, to obtain highly abstract features through deep learning, and can thereby generate intermediate images based on the first image and the second image. That is, the morphing image generator 1 detects mutually corresponding first image feature points and second image feature points without requiring the user to associate a specific region in the first image with a specific region in the second image, and generates intermediate images based on the first image and the second image from those feature points. As a result, the morphing image generator 1 can improve the quality of the morphing image.

For example, when a video looks discontinuous and unnatural at a point where footage has been cut, as in a time-shortened video, the morphing image generator 1 can generate intermediate images based on the frames before and after the discontinuity and thereby make the video natural and continuous. The morphing image generator 1 can also automate the in-betweening ("nakawari") step of animation production by generating one or more intermediate images based on, for example, two original drawings.

The present invention has been described above using embodiments, but its technical scope is not limited to the scope described in those embodiments, and various modifications and changes are possible within the scope of its gist. For example, the specific form in which the device is distributed or integrated is not limited to the above embodiments; all or part of it may be functionally or physically distributed or integrated in arbitrary units. New embodiments produced by any combination of the embodiments are also included in the embodiments of the present invention. The effect of a new embodiment produced by such a combination includes the effects of the original embodiments.

1 Morphing image generator
11 Operation unit
12 Storage unit
13 Control unit
131 Image acquisition unit
132 Propagation control unit
133 Extraction unit
134 Post-stage extraction unit
135 Pre-stage extraction unit
136 Instruction receiving unit
137 Feature point detection unit
138 Selection unit
139 Intermediate image generation unit

Claims

A morphing image generator comprising:
an image acquisition unit that acquires a first image, which is an image before at least a part of a subject changes, and a second image, which is an image after at least a part of the subject has changed;
a propagation control unit that propagates each of the first image and the second image through a plurality of processing layers included in a machine learning model capable of outputting, based on an input image, the type of subject included in the image;
an extraction unit that extracts one or more first image outputs, output from a post-stage processing layer and a pre-stage processing layer based on the first image, and one or more second image outputs, output from the post-stage processing layer and the pre-stage processing layer based on the second image, that are commonly activated in both the post-stage processing layer, which is selected from the plurality of processing layers, and the pre-stage processing layer, which is the processing layer immediately before the post-stage processing layer;
a feature point detection unit that detects one or more first image feature points based on the one or more first image outputs and detects one or more second image feature points based on the one or more second image outputs; and
an intermediate image generation unit that generates, based on the one or more first image feature points and the one or more second image feature points, one or more intermediate images that represent stepwise the process by which the subject changes.

The morphing image generator according to claim 1, wherein the extraction unit has:
a post-stage extraction unit that extracts one or more post-stage first image outputs and one or more post-stage second image outputs that are commonly activated, from a plurality of post-stage first image outputs output from the post-stage processing layer by propagating the first image through the pre-stage processing layer and the post-stage processing layer, which are part of the plurality of processing layers, in that order, and a plurality of post-stage second image outputs output from the post-stage processing layer by propagating the second image through the pre-stage processing layer and the post-stage processing layer in that order; and
a pre-stage extraction unit that extracts one or more pre-stage first image outputs and one or more pre-stage second image outputs that are commonly activated, from among a plurality of pre-stage first image outputs and a plurality of pre-stage second image outputs that are output from the pre-stage processing layer and that were factors in activating the one or more post-stage first image outputs and the one or more post-stage second image outputs.

The morphing image generator according to claim 2, wherein the pre-stage extraction unit extracts the one or more pre-stage first image outputs and the one or more pre-stage second image outputs based on the magnitude of activation of the plurality of pre-stage first image outputs and the plurality of pre-stage second image outputs.

The morphing image generator according to claim 2 or 3, wherein the machine learning model includes a convolutional neural network, and
the post-stage processing layer is any one of an output layer, a fully connected layer, a normalization layer, a pooling layer, and a convolution layer.

The morphing image generator according to claim 4, wherein the pre-stage processing layer is any one of a fully connected layer, a normalization layer, a pooling layer, a convolution layer, and an input layer.

The morphing image generator according to claim 1 or 5, wherein, when the last layer, which is the final processing layer among the plurality of processing layers, is selected as the post-stage processing layer and the last layer has no commonly activated first image outputs and second image outputs, the extraction unit extracts the one or more first image outputs and the one or more second image outputs that are commonly activated in a processing layer before the last layer.

The morphing image generator according to any one of claims 1 to 6, further comprising a selection unit that selects, from the one or more first image feature points and the one or more second image feature points identified by the feature point detection unit, a subset of the first image feature points and a subset of the second image feature points based on their mutual correspondence,
wherein the intermediate image generation unit generates, based on the subset of the first image feature points and the subset of the second image feature points, the one or more intermediate images that represent stepwise the process by which the subject changes.

The morphing image generator according to any one of claims 1 to 7, wherein the image acquisition unit acquires a plurality of second images showing subjects that are of the same type as the changed subject but differ in shape, and
the intermediate image generation unit selects one second image from the plurality of second images based on the one or more first image feature points and the one or more second image feature points obtained from each of the plurality of second images.

The morphing image generator according to claim 8, wherein the intermediate image generation unit selects, from the plurality of second images, one second image for which the number of second image feature points corresponding to the first image feature points is equal to or greater than a predetermined reference value.

The morphing image generator according to any one of claims 1 to 9, further comprising an instruction receiving unit that receives an instruction to select, from the plurality of processing layers, the processing layer to be used as the post-stage processing layer,
wherein the extraction unit uses the processing layer indicated by the instruction received by the instruction receiving unit as the post-stage processing layer.

The morphing image generator according to any one of claims 1 to 10, wherein the extraction unit selects one of the plurality of processing layers as the post-stage processing layer, extracts the one or more first image outputs and the one or more second image outputs, then selects the processing layer that had been selected as the pre-stage processing layer as the post-stage processing layer, and extracts another one or more first image outputs and one or more second image outputs.

A morphing image generation method comprising:
a step of acquiring a first image, which is an image before at least a part of a subject changes, and a second image, which is an image after at least a part of the subject has changed;
a step of propagating each of the first image and the second image through a plurality of processing layers included in a machine learning model capable of outputting, based on an input image, the type of subject included in the image;
a step of extracting one or more first image outputs, output from a post-stage processing layer and a pre-stage processing layer based on the first image, and one or more second image outputs, output from the post-stage processing layer and the pre-stage processing layer based on the second image, that are commonly activated in both the post-stage processing layer, which is selected from the plurality of processing layers, and the pre-stage processing layer, which is the processing layer immediately before the post-stage processing layer;
a step of detecting one or more first image feature points based on the one or more first image outputs and detecting one or more second image feature points based on the one or more second image outputs; and
a step of generating, based on the one or more first image feature points and the one or more second image feature points, one or more intermediate images that represent stepwise the process by which the subject changes.

The morphing image generation method according to claim 12, wherein the step of extracting has:
a post-stage extraction step of extracting one or more post-stage first image outputs and one or more post-stage second image outputs that are commonly activated, from a plurality of post-stage first image outputs output from the post-stage processing layer by propagating the first image through the pre-stage processing layer and the post-stage processing layer, which are part of the plurality of processing layers, in that order, and a plurality of post-stage second image outputs output from the post-stage processing layer by propagating the second image through the pre-stage processing layer and the post-stage processing layer in that order; and
a pre-stage extraction step of extracting one or more pre-stage first image outputs and one or more pre-stage second image outputs that are commonly activated, from among a plurality of pre-stage first image outputs and a plurality of pre-stage second image outputs that are output from the pre-stage processing layer and that were factors in activating the one or more post-stage first image outputs and the one or more post-stage second image outputs.

The morphing image generation method according to claim 13, wherein, after the pre-stage extraction step is executed, the post-stage extraction step is executed using the one or more pre-stage first image outputs and the one or more pre-stage second image outputs as the plurality of post-stage first image outputs and the plurality of post-stage second image outputs.

The morphing image generation method according to claim 13 or 14, wherein the post-stage extraction step and the pre-stage extraction step are executed for each of the plurality of processing layers.