JP7300828B2

JP7300828B2 - Learning data generation system, learning data generation method, learning method for machine learning model

Info

Publication number: JP7300828B2
Application number: JP2018240247A
Authority: JP
Inventors: 瑶子山口; 大地小池; 智也藤井; 高志末永
Original assignee: NTT Data Corp
Current assignee: NTT Data Corp
Priority date: 2018-12-21
Filing date: 2018-12-21
Publication date: 2023-06-30
Anticipated expiration: 2038-12-21
Also published as: JP2020102041A

Description

Application of Article 30, Paragraph 2 of the Patent Act: "Fiscal 2018 Demonstration Research Project on Advancing and Streamlining Prior Figurative Trademark Searches Using Artificial Intelligence Technology, etc.", applied for on July 27, 2018.

The present invention relates to a learning data generation system and a learning data generation method for generating learning data used to train a machine learning model, and to a learning method for a machine learning model.

Conventionally, when a similar-image search is performed on an illustration image (input image) composed of a plurality of types of part images, points of interest, which are local features of the figures in each part image, are detected in both the illustration image and the comparison image against which it is compared.
The degree of matching between the points of interest detected in the illustration image and those detected in the comparison image is then obtained, and similarity is determined based on this degree of matching.
For example, it has been disclosed that features of the outline shape, skeletal shape, constituent elements, and arrangement are extracted from an illustration image and weighted individually to determine the elements that characterize the illustration image as a whole (see, for example, Patent Document 1).

A machine learning model for discrimination (discrimination machine learning model) is sometimes used for this similarity determination, and a large amount of learning data is required to train it.
However, with the determination method described above, which compares points of interest, points of interest may be detected outside the part image that is truly to be compared, that is, the part image truly of interest. This makes it difficult to determine the similarity of the part image truly of interest (the symbolic part image).

Japanese Unexamined Patent Application Publication No. H04-268969 (JP-A-4-268969)

However, when similar images are searched for using a machine learning model, some illustration images have other general figures, such as polygons or circles, or characters overlapping the symbolic part image, and it is difficult to separate the symbolic part image completely from the other images.
That is, when a machine learning model is used to separate or remove general polygons, circular figures, or characters, preparing the large amount of learning data needed to train the model is very laborious, and in portions where other figures overlap, separation or removal from the symbolic part image may be incomplete.

The present invention has been made in view of these circumstances. Its object is to provide a learning data generation system, a learning data generation method, and a learning method for a machine learning model that make it possible to easily generate a large amount of the learning data used to train each machine learning model in machine learning using deep learning or the like.

The present invention has been made to solve the problem described above. The learning data generation system of the present invention generates learning data consisting of a learning input image and a learning modified image, which are used to train a machine learning model that generates a modified image by applying a modification according to a predetermined rule to a given input image. The system comprises: a plurality of image databases storing part images that serve as constituent elements of an image; a selection unit that selects part images from each of the image databases; a composite image generation unit that generates a composite image combining the part images selected by the selection unit and uses it as the learning input image; and a modified image generation unit that generates the learning modified image, which consists of a specific part image among the part images selected by the selection unit.

In the learning data generation system of the present invention, the image databases include a symbolic figure database storing symbolic part images, which are the part images that give the symbolic visual impression perceived when the learning input image is observed, and a general figure database storing general part images, which are general part images that supplement the symbolism of the symbolic part images and have no symbolism of their own.

In the learning data generation system of the present invention, the image databases include a character figure database storing character part images, which are part images representing characters.

In the learning data generation system of the present invention, the learning input image is composed of the symbolic part image together with either or both of the general part image and the character part image, and the learning modified image contains only the symbolic part image of the learning input image.

The learning data generation system of the present invention further comprises a shape extension unit that deforms the shape of each part image.

The learning data generation system of the present invention comprises a generator and a discriminator. The generator is a machine learning model that extracts predetermined elements from the learning input image to generate a learning generated image. The discriminator is a machine learning model that evaluates learning data consisting of the learning input image and the learning modified image, or learning data consisting of the learning input image and the learning generated image.

In the learning data generation system of the present invention, the machine learning model performs semantic segmentation, and the learning modified image is the learning input image modified such that the type of each part image is identified and a label indicating that type is assigned.
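The labelled variant described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: images are small 2D lists, part images are binary masks, and the label ids (`BACKGROUND`, `SYMBOL`, `GENERAL`, `CHARACTER`) and function names are hypothetical. Where parts overlap, the part drawn later simply overwrites the label, which is one possible policy among several.

```python
# Hypothetical label ids; the source only says each part type receives a label.
BACKGROUND, SYMBOL, GENERAL, CHARACTER = 0, 1, 2, 3


def blank(height, width, value=0):
    """Create a height x width image filled with `value`."""
    return [[value] * width for _ in range(height)]


def paste(canvas, mask, top, left, value):
    """Write `value` wherever the binary `mask` is 1, clipped to the canvas."""
    for r, row in enumerate(mask):
        for c, on in enumerate(row):
            y, x = top + r, left + c
            if on and 0 <= y < len(canvas) and 0 <= x < len(canvas[0]):
                canvas[y][x] = value


def make_segmentation_pair(height, width, placements):
    """placements: list of (label, mask, top, left), drawn in the given order.

    Returns (learning input image, per-pixel label map).
    """
    image = blank(height, width)
    labels = blank(height, width, BACKGROUND)
    for label, mask, top, left in placements:
        paste(image, mask, top, left, 1)       # the composite the model sees
        paste(labels, mask, top, left, label)  # the part type of each pixel
    return image, labels


symbol = [[1, 1], [1, 1]]  # stand-in for a symbolic part image
bar = [[1, 1, 1]]          # stand-in for a general part image
image, labels = make_segmentation_pair(
    4, 5, [(SYMBOL, symbol, 0, 0), (GENERAL, bar, 3, 1)]
)
```

Because the generator of the data knows which database each part came from, the label map is obtained without any manual annotation.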

The learning data generation method of the present invention is a method in which a computer system generates learning data consisting of a learning input image and a learning modified image, which are used to train a machine learning model that generates a modified image by applying a modification according to a predetermined rule to a given input image. The method includes: a selection step of selecting part images from each of a plurality of image databases storing part images that serve as constituent elements of an image; a composite image generation step of generating a composite image combining the part images selected in the selection step and using it as the learning input image; and a modified image generation step of generating the learning modified image, which consists of a specific part image among the part images selected in the selection step.

The machine learning model learning method of the present invention is a method in which a computer system trains a machine learning model that generates a modified image by applying a modification according to a predetermined rule to a given input image. The method includes: a selection step of selecting, from each of a plurality of image databases storing part images that serve as constituent elements of an image, the part images used to generate a learning input image, which is learning data for the given input image; a composite image generation step of generating a composite image combining the part images selected in the selection step and using it as the learning input image; a modified image generation step of generating a learning modified image consisting of a specific part image among the part images selected in the selection step; and a learning step of training the machine learning model so that the learning modified image is output when the learning input image is input.

According to the present invention, it is possible to provide a learning data generation system, a learning data generation method, and a learning method for a machine learning model that make it possible to easily generate a large amount of the learning data used to train each machine learning model in machine learning using deep learning or the like.

FIG. 1 is a diagram showing a configuration example of a learning data generation system according to a first embodiment of the present invention.
FIG. 2 is a conceptual diagram illustrating the flow of learning data generation in the learning data generation system of the present embodiment.
FIG. 3 is a diagram showing an example of the learning data stored in the learning image data storage unit 20.
FIG. 4 is a flowchart showing an operation example of generating the learning data used for GAN training.
FIG. 5 is a conceptual diagram illustrating the initial training of the discriminator, which is a machine learning model.
FIG. 6 is a conceptual diagram illustrating the training of the GAN generator and discriminator using the learning data.
FIG. 7 is a diagram showing a configuration example of a learning data generation system according to a second embodiment of the present invention.
FIG. 8 is a conceptual diagram illustrating the flow of learning data generation in the learning data generation system of the present embodiment.
FIG. 9 is a conceptual diagram illustrating a training example of a machine learning model that determines the type of each part image in an input image and assigns a label to each pixel of each part image.

<First embodiment>
A learning data generation system according to a first embodiment of the present invention is described below with reference to the drawings. FIG. 1 shows a configuration example of the learning data generation system according to the first embodiment. In this embodiment, the generation of learning data for a GAN (Generative Adversarial Network) is described as an example.
In FIG. 1, the learning data generation system 1 of this embodiment includes a control unit 11, a part image selection unit 12, a character string generation unit 13, a shape extension unit 14, a composite image generation unit 15, an image display unit 16, a symbolic figure database 17, a general figure database 18, a character figure database 19, and a learning image data storage unit 20. The learning data generation system 1 may also be configured to include a discriminator 501 and a generator 502 (see FIGS. 5 and 6), which are described later.

The control unit 11 outputs control signals received from input means not shown (a keyboard, or on-screen selection with a mouse) to the part image selection unit 12, the character string generation unit 13, the shape extension unit 14, the composite image generation unit 15, and the image display unit 16, according to the control content each signal indicates. The control unit 11 also writes and stores externally supplied symbolic part images, general part images, and character part images in the symbolic figure database 17, the general figure database 18, and the character figure database 19, respectively.

The part image selection unit 12 displays a part image selection screen on the display screen of the image display unit 16. In this embodiment, part images are the images that are combined into a single composite image and used to generate the learning input image of the learning data. The learning data in this embodiment is configured as a pair of image data: a learning input image and a learning modified image.

FIG. 2 is a conceptual diagram illustrating the flow of learning data generation in the learning data generation system of this embodiment. FIG. 2(a) shows the image area 16S for learning data on the display screen of the image display unit 16. In FIG. 2(a), no part image is displayed; only the image area 16S is shown. The size of the image area 16S is specified by giving its height and width as predetermined numbers of pixels in the vertical and horizontal directions.
FIG. 2(b) shows the state in which the operator has selected a symbolic part image 101 and the selected symbolic part image 101 has been placed at a predetermined position in the image area 16S on the display screen of the image display unit 16. A symbolic part image is image data of a symbolic figure that gives the visual impression an observer perceives as the distinguishing feature when viewing an image, and it has a relatively complex shape compared with a general part image. In this embodiment, a plurality of symbolic part images are written and stored in the symbolic figure database 17 in advance.

Next, FIG. 2(c) shows the state in which the operator has selected a general part image 102 and the selected general part image 102 has been placed, together with the symbolic part image 101, at a predetermined position in the image area 16S on the display screen of the image display unit 16. A general part image is image data of a figure that gives the observer an ordinary visual impression, for example a straight line, a curve, a polygon, or a circle. It is combined with the symbolic part image described above when the learning input image is generated as a composite image, and it has a relatively simple shape compared with a symbolic part image. That is, a general part image is a general image that supplements the symbolism of the symbolic part image and has no symbolism of its own. In this embodiment, a plurality of general part images are written and stored in the general figure database 18 in advance.

FIG. 2(d) shows the state in which the operator has selected character part images 103 and the selected character part images 103 have been placed as a character string image, together with the symbolic part image 101 and the general part image 102, at a predetermined position in the image area 16S on the display screen of the image display unit 16. A character part image is an image of a single character, such as a kanji, hiragana, katakana, alphabetic letter, or numeral, and it is combined with the symbolic part image described above when the learning input image is generated as a composite image. One or more character part images 103 are used as a character string representing a meaningful word, but the character string is not limited to this and may be meaningless. In this embodiment, character part images generated from predetermined character codes (text format) with specified character decorations such as font and color are written and stored in the character figure database in advance.

In this embodiment, FIG. 2(b) (that is, FIG. 2(e)) is used as the learning modified image, and FIG. 2(d) is used as the learning input image. That is, the learning data 100 consists of the pair formed by the learning input image shown in FIG. 2(d) and the learning modified image shown in FIG. 2(e).

Returning to FIG. 1, the character string generation unit 13 generates a character string image in the order in which the operator arranges the character codes. For example, the character string image "ニゴニゴハウス" ("Nigonigo House") in FIG. 2(d) is formed by arranging the character part images for "ニ", "ゴ", "ニ", "ゴ", "ハ", "ウ", and "ス" in order. When the characters are arranged as a character string, the character codes to be arranged may be specified at random, in which case a meaningless character string image is formed.
The character part images for "ニ", "ゴ", "ニ", "ゴ", "ハ", "ウ", and "ス" are each selected by the part image selection unit 12 according to the operator's settings. The arrangement of character decorations in the image area 16S may also be specified at random; in this case, the specified number of character string images are formed as character decorations from a single character string.
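The random character string option can be sketched as below. This is only an illustration under stated assumptions: the katakana code range, the function names `random_string` and `layout_string`, and the fixed cell pitch are hypothetical and are not taken from the patent. The sketch picks character codes at random and computes where each single-character part image would be placed, horizontally or vertically (both arrangements appear in the examples of FIG. 3).

```python
import random

# Katakana letters U+30A1..U+30FA; the character set is an illustrative assumption.
KATAKANA = [chr(code) for code in range(0x30A1, 0x30FB)]


def random_string(rng, length, alphabet=KATAKANA):
    """Pick `length` character codes at random; the result is usually meaningless."""
    return "".join(rng.choice(alphabet) for _ in range(length))


def layout_string(text, top, left, cell, vertical=False):
    """Return (char, top, left) for each character, spaced one cell apart."""
    positions = []
    for i, ch in enumerate(text):
        if vertical:
            positions.append((ch, top + i * cell, left))
        else:
            positions.append((ch, top, left + i * cell))
    return positions


rng = random.Random(0)           # seeded so that a data set can be regenerated
text = random_string(rng, 7)     # same length as "ニゴニゴハウス"
horizontal = layout_string(text, top=200, left=16, cell=32)
vertical = layout_string(text, top=16, left=200, cell=32, vertical=True)
```

Rendering each character into an actual part image (font, color) is left out here, since it depends on the imaging library in use.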

The shape extension unit 14 changes the shape of each placed symbolic part image, general part image, and character part image according to conditions entered by the operator. The shape extension unit 14 may also perform this shape-changing processing at random.
For example, for a symbolic part image, it changes the enlargement, reduction, aspect ratio, thickness of the line segments forming the figure, and so on. Likewise, for a general part image, it changes the enlargement, reduction, aspect ratio, thickness of the line segments forming the figure, and so on. For a character part image, it changes the enlargement, reduction, aspect ratio, thickness of the line segments forming the figure, font, color, and so on. The shape extension unit 14 may also perform the shape-changing processing described above at random for each character part image in a character string used as a character decoration.
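A minimal sketch of the random enlargement, reduction, and aspect-ratio change follows. It assumes part images are binary masks stored as 2D lists and uses nearest-neighbour resampling; the function names and the scale range of 0.5 to 2.0 are assumptions made for illustration, and line thickness, font, and color changes are not covered.

```python
import random


def resize_nearest(mask, new_h, new_w):
    """Nearest-neighbour resize of a 2D list to new_h x new_w pixels."""
    old_h, old_w = len(mask), len(mask[0])
    return [
        [mask[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def random_deform(mask, rng, min_scale=0.5, max_scale=2.0):
    """Scale height and width by independent random factors (assumed range).

    Independent factors change the aspect ratio as well as the size.
    """
    new_h = max(1, round(len(mask) * rng.uniform(min_scale, max_scale)))
    new_w = max(1, round(len(mask[0]) * rng.uniform(min_scale, max_scale)))
    return resize_nearest(mask, new_h, new_w)


part = [[1, 0],
        [0, 1]]
doubled = resize_nearest(part, 4, 4)            # uniform 2x enlargement
deformed = random_deform(part, random.Random(3))
```

Applying such a deformation to every selected part before composition multiplies the number of distinct learning input images obtainable from the same databases.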

The shape extension unit 14 changes the position of each symbolic part image, general part image, and character part image placed in the image area 16S according to conditions entered by the operator. This position-changing processing may also be performed at random. When positions are changed at random, the image area 16S may be divided into a plurality of sections, and each image may be placed in a section different from the section in which any other symbolic part image, general part image, or character part image has been placed, so that the symbolic part image, general part image, and character part image do not overlap one another.
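The section-based placement can be sketched as follows, assuming a regular grid of equal sections; the grid shape, the function name `assign_cells`, and the part names are hypothetical. Sampling section indices without replacement guarantees that no two parts share a section.

```python
import random


def assign_cells(part_names, area_h, area_w, rows, cols, rng):
    """Give each part its own grid section of the image area.

    Returns a dict: name -> (top, left, height, width) of the assigned section.
    """
    if len(part_names) > rows * cols:
        raise ValueError("more parts than sections")
    cell_h, cell_w = area_h // rows, area_w // cols
    # sample() draws without replacement, so every part gets a distinct section.
    cells = rng.sample(range(rows * cols), len(part_names))
    return {
        name: ((idx // cols) * cell_h, (idx % cols) * cell_w, cell_h, cell_w)
        for name, idx in zip(part_names, cells)
    }


boxes = assign_cells(
    ["symbol", "general", "text"], 256, 256, rows=2, cols=2, rng=random.Random(0)
)
```

Each part would then be scaled to fit inside its section before being drawn; that step is omitted here.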

The composite image generation unit 15 generates a composite image by combining the symbolic part image, general part image, and character part image that were selected by the part image selection unit 12 and placed in the image area 16S, and uses this composite image as the learning input image.
The composite image generation unit 15 then takes the image in which only the symbolic part image is placed in the image area 16S as the learning modified image, forms learning data from it together with the learning input image, and writes and stores the learning data in the learning image data storage unit 20.
The image display unit 16 is an image display device, for example a color liquid crystal panel.

The symbolic figure database 17 is a database that stores images of figures classified as symbolic part images.
The general figure database 18 is a database that stores images of figures classified as general part images.
The character figure database 19 is a database that stores images of figures classified as character part images. These symbolic part images, general part images, and character part images are used as the constituent elements of a learning input image.

The learning image data storage unit 20 stores the learning data written by the composite image generation unit 15.
FIG. 3 shows an example of the learning data stored in the learning image data storage unit 20. In FIG. 3(a), learning data 201 is formed by a learning input image 201a and a learning modified image 201b. In FIG. 3(b), learning data 202 is formed by a learning input image 202a and a learning modified image 202b. In FIG. 3(c), learning data 203 is formed by a learning input image 203a and a learning modified image 203b. In FIG. 3(d), learning data 204 is formed by a learning input image 204a and a learning modified image 204b.

The learning input images in FIGS. 3(a) and 3(b) are each formed using a symbolic part image, a general part image, and character part images. In FIG. 3(a), however, the character string is formed from character part images arranged horizontally, whereas in FIG. 3(b) it is formed from character part images arranged vertically.
The learning input image in FIG. 3(c) consists of a symbolic part image and a deformed general part image and contains no character part image.
The learning input image in FIG. 3(d) consists of a symbolic part image and a character string formed by arranging character part images (the character string "山小屋ホテル", "Yamagoya Hotel") and contains no general part image.

As described above, the learning input image is formed as a composite image in which a symbolic part image stored in the symbolic figure database 17 serves as the core image, and either or both of a general part image stored in the general figure database 18 and a character part image stored in the character figure database 19 are placed with it.
The learning modified image, on the other hand, is formed as an image in which only the symbolic part image is placed.
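The pairing rule summarized above can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the patented implementation: the three databases are plain lists of binary masks, positions are drawn uniformly at random, and the names `generate_pair`, `blank`, and `paste` are hypothetical. The key point is that the symbolic part is drawn at the same position in both images, while the general and character parts appear only in the learning input image.

```python
import random


def blank(height, width):
    """Create an empty (all-zero) image."""
    return [[0] * width for _ in range(height)]


def paste(canvas, mask, top, left):
    """OR a binary mask onto the canvas, clipping at the borders."""
    for r, row in enumerate(mask):
        for c, on in enumerate(row):
            y, x = top + r, left + c
            if on and 0 <= y < len(canvas) and 0 <= x < len(canvas[0]):
                canvas[y][x] = 1


def generate_pair(symbol_db, general_db, char_db, size, rng):
    """Return (learning input image, learning modified image)."""
    height, width = size

    def random_origin(mask):
        # Keep the part inside the image area when it fits.
        return (rng.randrange(max(1, height - len(mask) + 1)),
                rng.randrange(max(1, width - len(mask[0]) + 1)))

    symbol = rng.choice(symbol_db)
    extras = []
    # Add a general part, a character part, or both (never neither).
    mode = rng.choice(["general", "character", "both"])
    if mode in ("general", "both"):
        extras.append(rng.choice(general_db))
    if mode in ("character", "both"):
        extras.append(rng.choice(char_db))

    modified = blank(height, width)
    paste(modified, symbol, *random_origin(symbol))   # symbolic part only
    composite = [row[:] for row in modified]          # same symbol, same position
    for mask in extras:
        paste(composite, mask, *random_origin(mask))  # extras only in the input
    return composite, modified
```

Calling `generate_pair` in a loop with a seeded `random.Random` yields as many pairs as needed, which reflects the stated aim of producing large amounts of learning data with little manual effort.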

A GAN has two kinds of machine learning models: a generator (generator 502 in FIG. 6) and a discriminator (discriminator 501 in FIGS. 5 and 6). The generator's machine learning model has the function (predetermined rule) of outputting a learning generated image in which only the symbolic portion of the input learning input image has been extracted. The control unit 11 also forms learning data from the learning generated image obtained from the machine learning model of the generator 502 together with the learning input image, and writes and stores it in the learning image data storage unit 20.

The machine learning model of the discriminator 501 receives as input either learning data consisting of a learning input image and a learning modified image, or learning data consisting of a learning input image and a learning generated image. When learning data is input, the discriminator 501 judges it to be "genuine" if it contains a learning modified image (that is, an image consisting only of a symbolic part image stored in the symbolic figure database 17) and judges it to be "fake" if it contains a learning generated image (that is, an image generated by the generator 502).

In the learning process, the machine learning model as a whole (the discriminator 501 and the generator 502) compares the known status of each learning image as "genuine" or "fake" with the degree of "genuine" or "fake" determined by the discriminator 501. The output value of the discriminator 501's determination (the evaluation value described later) is fed back to the generator 502 according to the degree of "fakeness" determined by the discriminator 501, so that the generator 502 comes to generate learning generated images that deceive the discriminator 501 into a "genuine" determination.
In other words, when the discriminator 501 judges a "fake" to be fake-like, the learning generated image produced by the generator 502 (an image in which the portion presumed to be the characteristic part has been extracted from the learning input image) has low accuracy, and the discriminator 501 can easily tell that it differs from the learning modified image (the image of the characteristic part that is ultimately to be extracted). The output value is therefore fed back to improve the extraction accuracy of the generator 502.
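The feedback described above can be illustrated with the binary cross-entropy losses commonly used for GAN training. The source does not state which loss function is used, so this choice, the assumption that the discriminator outputs a score between 0 ("fake") and 1 ("genuine"), and the function names are all assumptions made for illustration.

```python
import math


def bce(score, target, eps=1e-12):
    """Binary cross-entropy of one score in (0, 1) against a 0/1 target."""
    score = min(max(score, eps), 1.0 - eps)  # avoid log(0)
    return -(target * math.log(score) + (1 - target) * math.log(1.0 - score))


def discriminator_loss(score_genuine, score_generated):
    # The discriminator should output 1 for (input, modified image) pairs
    # and 0 for (input, generated image) pairs.
    return bce(score_genuine, 1) + bce(score_generated, 0)


def generator_loss(score_generated):
    # The generator is penalized when its output is scored as fake, so the
    # discriminator's score acts as the feedback signal to the generator.
    return bce(score_generated, 1)


# A generated image that is easily recognized as fake gives a large generator
# loss; one that is scored as genuine gives a small loss.
easy_to_spot = generator_loss(0.1)
convincing = generator_loss(0.9)
```

In an actual training loop these losses would be minimized alternately with a gradient-based optimizer, which requires a deep learning framework and is outside the scope of this sketch.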

As the training of the generator 502 progresses (that is, as it becomes able to generate learning generated images that deceive the discriminator 501 into a "genuine" determination), the learning generated images produced by the generator 502 gradually approach the learning modified images. At this stage, the discriminator 501 can no longer correctly distinguish "genuine" from "fake" in the input learning data and begins to make wrong determinations.
In other words, when the discriminator 501 mistakenly judges a "fake" to be "genuine", the learning generated image produced by the generator 502 is accurate enough that the discriminator 501 cannot be sure it differs from the learning modified image. The generator 502 producing learning generated images that the discriminator 501 mistakes for "genuine" is precisely what signifies that the training of the generator 502 is complete.

If a system for judging image similarity is built using the generator 502, which is the trained model obtained by training the machine learning model as described above, the feature parts of an unknown general image (for example, abstract image portions similar to an abstract parts image) can be extracted with high accuracy.

FIG. 4 is a flowchart showing an example of the operation of generating learning data used for GAN training.
Step S101: The parts image selection unit 12 reads the symbolic parts images stored in the symbolic figure database 17 and displays each of them in an area near the image area 16S on the display screen of the image display unit 16.

Step S102: The operator examines the symbolic parts images displayed on the display screen of the image display unit 16 and selects, from among them, the symbolic parts image to be used for generating learning data.
The operator then moves the selected symbolic parts image to a predetermined position in the image area 16S and places it there.

ステップＳ１０３：次に、パーツ画像選択部１２は、画像表示部１６の表示画面に対して、一般パーツ画像の選択を行なうか否かを選択する通知の表示を行なう。
すなわち、パーツ画像選択部１２は、学習用入力画像に対して、象徴パーツ画像とともに一般パーツ画像を含めるか否かの判定を、作業者に対して促す。
このとき、作業者は、学習用入力画像に一般パーツ画像を含ませないと判断した場合、一般パーツ画像の選択を行なわないことを示す入力を行なう。一方、作業者は、学習用入力画像に一般パーツ画像を含ませると判断した場合、一般パーツ画像の選択を行なうことを示す入力を行なう。 Step S103: Next, the parts image selection unit 12 displays a notification on the display screen of the image display unit 16 to select whether or not to select a general parts image.
That is, the parts image selection unit 12 prompts the operator to determine whether or not to include the general parts image together with the symbolic parts image in the learning input image.
At this time, if the operator determines that the general parts image should not be included in the input image for learning, the operator performs an input indicating that the general parts image is not to be selected. On the other hand, when the operator determines that the general parts image should be included in the learning input image, the operator performs an input indicating selection of the general parts image.

そして、パーツ画像選択部１２は、作業者が一般パーツ画像を学習用入力画像に配置することを選択した場合、処理をステップＳ１０４へ進める。
一方、パーツ画像選択部１２は、作業者が一般パーツ画像を学習用入力画像に配置しないことを選択した場合、処理をステップＳ１０６へ進める。 Then, when the operator selects to arrange the general parts image in the learning input image, the parts image selection unit 12 advances the process to step S104.
On the other hand, when the operator selects not to arrange the general parts image in the learning input image, the parts image selection unit 12 advances the process to step S106.

Step S104: The parts image selection unit 12 reads the general parts images stored in the general figure database 18 and displays each of them in an area near the image area 16S on the display screen of the image display unit 16.

Step S105: The operator examines the general parts images displayed on the display screen of the image display unit 16 and selects, from among them, the general parts image to be used for generating learning data.
The operator then moves the selected general parts image to a predetermined position in the image area 16S, positioning it in relation to the symbolic parts image already placed there.

ステップＳ１０６：パーツ画像選択部１２は、画像表示部１６の表示画面に対して、文字パーツ画像の選択を行なうか否かを選択する通知の表示を行なう。
すなわち、パーツ画像選択部１２は、学習用入力画像に対して、象徴パーツ画像とともに文字パーツ画像を含めるか否かの判定を、作業者に対して促す。
このとき、作業者は、学習用入力画像に文字パーツ画像を含まないと判断した場合、文字パーツ画像の選択を行なわないことを示す入力を行なう。一方、作業者は、学習用入力画像に文字パーツ画像を含ませると判断した場合、文字パーツ画像の選択を行なうことを示す入力を行なう。 Step S106: The parts image selection unit 12 displays a notification on the display screen of the image display unit 16 to select whether or not to select a character parts image.
That is, the parts image selection unit 12 prompts the operator to determine whether or not to include the character parts image together with the symbol parts image in the learning input image.
At this time, when the operator determines that the input image for learning does not include the character part image, the operator performs an input indicating that no character part image is to be selected. On the other hand, when the operator determines to include the character part image in the input image for learning, the operator performs an input indicating selection of the character part image.

そして、パーツ画像選択部１２は、作業者が文字パーツ画像を学習用入力画像に配置することを選択した場合、処理をステップＳ１０７へ進める。
一方、パーツ画像選択部１２は、作業者が文字パーツ画像を学習用入力画像に配置しないことを選択した場合、処理をステップＳ１０９へ進める。 Then, when the operator selects to arrange the character parts image in the input image for learning, the parts image selection unit 12 advances the process to step S107.
On the other hand, when the operator selects not to arrange the character parts image in the input image for learning, the parts image selection unit 12 advances the process to step S109.

Step S107: The parts image selection unit 12 reads the character parts images stored in the character figure database 19 and displays each of them in an area near the image area 16S on the display screen of the image display unit 16.

Step S108: The operator examines the character parts images displayed on the display screen of the image display unit 16 and selects, from among them, one or more character parts images to be used for generating learning data.
The character string generation unit 13 then generates, as a single character string image, the character string formed when the operator moves each selected character parts image to a predetermined position in the image area 16S in relation to the position of the symbolic parts image already placed there (a one-character string when a single character is selected).

ステップＳ１０９：複合画像生成部１５は、画像表示部１６の表示画面における画像領域１６Ｓ内の画像を合成して複合画像を生成し、この複合画像を学習用入力画像とする。
また、複合画像生成部１５は、ステップＳ１０２における象徴パーツ画像が配置された画像領域１６Ｓを、学習用変更画像として象徴パーツ画像のみの画像を生成する。
そして、複合画像生成部１５は、生成した学習用入力画像及び学習用変更画像の各々を組み合わせて、学習用データとして学習用画像データ記憶部２０に対して書き込んで記憶させる。 Step S109: The composite image generation unit 15 generates a composite image by synthesizing the images in the image area 16S on the display screen of the image display unit 16, and uses this composite image as an input image for learning.
In addition, the composite image generating unit 15 generates an image of only the symbolic parts image as a modified image for learning from the image area 16S in which the symbolic parts image is arranged in step S102.
Then, the composite image generation unit 15 combines the generated learning input image and learning modified image, and writes and stores them in the learning image data storage unit 20 as learning data.

Here, the composite image generation unit 15 may be configured to display layout patterns for combining the symbolic parts image, the general parts image, and the character string image, such as their vertical and horizontal placement within the image area 16S and the degree to which the parts images overlap, and to place the symbolic parts image, the general parts image, and the character string image according to the layout pattern selected by the user. The same processing may also be applied to the symbolic parts image.

Step S110: Next, the shape extension unit 14 displays, on the display screen of the image display unit 16, a notification image prompting a decision on whether to perform shape extension processing that transforms the image in the image area 16S.
That is, the shape extension unit 14 prompts the operator to decide whether to apply shape extension processing to the image in the image area 16S, for example deforming the general parts image, deforming the character string image, or moving the placement of the general parts image, the character string image, or the symbolic parts image within the image area 16S. The shape extension processing described above may likewise be applied to the symbolic parts image.

そして、形状拡張部１４は、作業者が画像領域１６Ｓ内の画像に対して形状拡張処理を実行することを選択した場合、処理をステップＳ１１１へ進める。
一方、パーツ画像選択部１２は、作業者が画像領域１６Ｓ内の画像に対して形状拡張処理を実行しないことを選択した場合、処理をステップＳ１１２へ進める。 Then, when the operator selects to execute the shape expansion process on the image within the image area 16S, the shape expansion unit 14 advances the process to step S111.
On the other hand, when the operator selects not to perform the shape expansion process on the image within the image area 16S, the parts image selection unit 12 advances the process to step S112.

Step S111: Next, in response to the operator's operation, the shape extension unit 14 performs data extension processing that transforms the figures of the general parts image, the character parts image, or both, placed in the image area 16S.
As already described, the data extension processing includes deforming the general parts image, deforming the character string image, and moving the placement of the general parts image, the character string image, or the symbolic parts image within the image area 16S.

一般パーツ画像の変形は、例えば、直線の太さや長さを変えたり（拡大や縮小も含む）、破線、波線、一点鎖線などの線種を変更したり、色を変更したり、あるいは多角形であれば形状を変化させたり、象徴パーツ画像に対する相対位置を変化させたり、反転あるいは回転させたりなどの処理、あるいはこれらの処理の組合せが含まれる。
文字列画像の変形は、例えば、文字パーツ画像の各々のフォントを変えたり、文字列画像を屈曲させたり、文字列における文字パーツ画像を削除したり、文字列における文字パーツ画像の各々の拡大縮小を行なったり、象徴パーツ画像に対する相対位置を変化させたり、反転あるいは回転させたり、文字パーツ画像の間隔の調整などの処理、あるいはこれらの処理の組合せが含まれる。 General part images can be transformed, for example, by changing the thickness and length of straight lines (including enlargement and reduction), changing line types such as dashed lines, wavy lines, and dashed-dotted lines, changing colors, and changing polygons. If so, processing such as changing the shape, changing the position relative to the symbol parts image, reversing or rotating, or a combination of these processing is included.
Transformation of the character string image includes, for example, changing the font of each character part image, bending the character string image, deleting the character part image in the character string, and scaling each character part image in the character string. , changing the relative position with respect to the symbol part image, reversing or rotating the character part image, adjusting the interval between the character part images, or a combination of these processes.

Deformation of the symbolic parts image includes enlargement, reduction, inversion, rotation, movement of its placement within the image area 16S, changing the line type used to draw the figure, or a combination of these.
Through the data extension processing described above, many transformed learning input images are generated from the image in the image area 16S that was arranged once. Combining these with the learning modified image makes it easy to generate a large amount of learning data.
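As a rough illustration of the data extension idea, the following sketch treats a parts image as a small 2D grid of pixel values and produces the variants reachable by inversion and 90-degree rotation. The grid representation and the names `flip_horizontal`, `rotate_90`, and `augment` are assumptions made for illustration only; the embodiment does not specify an implementation.

```python
from typing import List

# Hypothetical stand-in for a parts image: 0 = empty pixel, non-zero = drawn pixel.
Grid = List[List[int]]


def flip_horizontal(img: Grid) -> Grid:
    """Mirror the image left to right (inversion)."""
    return [row[::-1] for row in img]


def rotate_90(img: Grid) -> Grid:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def augment(img: Grid) -> List[Grid]:
    """Return the distinct variants produced by flips and 90-degree rotations."""
    variants: List[Grid] = []
    current = img
    for _ in range(4):  # four rotations, each with and without a flip
        for candidate in (current, flip_horizontal(current)):
            if candidate not in variants:
                variants.append(candidate)
        current = rotate_90(current)
    return variants
```

An asymmetric figure yields up to eight variants from a single arrangement, which is the sense in which one placement can be multiplied into many learning input images.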

ステップＳ１１２：複合画像生成部１５は、学習データの生成を終了するか否かの判定を促す通知画像を、画像表示部１６の表示画面に表示する。
そして、形状拡張部１４は、作業者が学習データの生成を終了することを選択した場合、処理を終了する。
一方、パーツ画像選択部１２は、作業者が学習データの生成を終了しないことを選択した場合、処理をステップＳ１０１へ進める。 Step S112: The composite image generation unit 15 displays on the display screen of the image display unit 16 a notification image prompting a determination as to whether or not to end the generation of learning data.
Then, when the operator selects to terminate the generation of learning data, the shape extension unit 14 terminates the processing.
On the other hand, when the operator selects not to finish generating the learning data, the parts image selection unit 12 advances the process to step S101.
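The core of steps S102 to S109 can be sketched as follows: the symbolic parts image alone forms the learning modified image, and the same canvas with the optional general and character parts added forms the learning input image. This is a minimal sketch under stated assumptions; the grid representation, the `Placement` tuple, and the function names are hypothetical and not taken from the embodiment.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]
Placement = Tuple[Grid, int, int]  # (parts image, top row, left column)


def blank(height: int, width: int) -> Grid:
    """Create an empty image area."""
    return [[0] * width for _ in range(height)]


def paste(canvas: Grid, placement: Placement) -> None:
    """Draw the non-zero pixels of a parts image onto the canvas, clipping at the edges."""
    part, top, left = placement
    for r, row in enumerate(part):
        for c, value in enumerate(row):
            y, x = top + r, left + c
            if value and 0 <= y < len(canvas) and 0 <= x < len(canvas[0]):
                canvas[y][x] = value


def build_learning_pair(
    height: int,
    width: int,
    symbol: Placement,
    general: Optional[Placement] = None,
    text: Optional[Placement] = None,
) -> Tuple[Grid, Grid]:
    """Return (learning input image, learning modified image)."""
    modified = blank(height, width)
    paste(modified, symbol)                   # S102: symbolic parts image only
    composite = [row[:] for row in modified]  # copy before adding other parts
    for extra in (general, text):             # S103 to S108: optional parts
        if extra is not None:
            paste(composite, extra)
    return composite, modified                # S109: stored together as learning data
```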

Next, to train the machine learning model in the GAN, the discriminator is first trained using the learning data generated by the processing described above.
FIG. 5 is a conceptual diagram illustrating the initial training of the discriminator, which is a machine learning model. In FIG. 5, the learning input image 201a and the learning modified image 201b (genuine) of the learning data 201 are input to the discriminator 501, and because the pair is genuine, training is performed with the identification information (evaluation value) set to "1". Similarly, the learning input image 202a and the learning modified image 202b (genuine) of the learning data 202 are input to the discriminator 501, and because the pair is genuine, training is performed with the identification information (evaluation value) set to "1". In addition, a learning input image 205a and a learning generated image 205b prepared in advance (fake) are input to the discriminator 501 as learning data 205, and because the pair is fake, training is performed with the identification information (evaluation value) set to "0".
This training of the discriminator 501 is carried out with the large amount of learning data accumulated in the learning image data storage unit 20, which constitutes the initial training.
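The labeling used for this initial training can be sketched as below: pairs containing a learning modified image receive the target "1", and pairs containing a previously prepared learning generated image receive the target "0". The `Sample` type and `make_initial_samples` are hypothetical names, and images are represented only by identifier strings for brevity.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Sample(NamedTuple):
    input_image: str      # e.g. identifier of a learning input image
    candidate_image: str  # learning modified image or learning generated image
    label: int            # 1 = genuine, 0 = fake


def make_initial_samples(
    genuine_pairs: Iterable[Tuple[str, str]],
    fake_pairs: Iterable[Tuple[str, str]],
) -> List[Sample]:
    """Attach the target evaluation value to each pair for discriminator training."""
    samples = [Sample(x, y, 1) for x, y in genuine_pairs]
    samples += [Sample(x, y, 0) for x, y in fake_pairs]
    return samples


samples = make_initial_samples(
    genuine_pairs=[("201a", "201b"), ("202a", "202b")],
    fake_pairs=[("205a", "205b")],
)
```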

FIG. 6 is a conceptual diagram illustrating training of the GAN generator and discriminator using the learning data. In FIG. 6, the generator 502 is a machine learning model like the discriminator 501, but unlike the discriminator 501 it generates images rather than judging authenticity. Specifically, in this embodiment it is a machine learning model that extracts the symbolic parts image from the input image and outputs it as a learning generated image; in other words, it removes the general parts image and the character parts image from the input image. The discriminator 501 is assumed to have already completed the initial training described above.

For GAN training, the learning input image 201a in the learning data 201 is input to the generator 502. The generator 502 applies some modification to the learning input image 201a and outputs the learning generated image 201a' as the modified image.
The discriminator 501 is then supplied with learning data consisting of either the pair of the learning input image 201a and the learning modified image 201b, or the pair of the learning input image 201a and the learning generated image 201a'.
If the supplied learning data contains a learning modified image (that is, an image consisting only of symbolic parts images stored in the symbolic figure database 17), the discriminator 501 judges it "genuine" and outputs an evaluation value close to "1"; if it contains a learning generated image (that is, an image generated by the generator 502), it judges it "fake" and outputs an evaluation value close to "0". The discriminator 501 outputs the evaluation value of the judgment to the generator 502.

The goal of training the generator 502 is for it to produce learning generated images that deceive the discriminator 501, so the evaluation value that the discriminator 501 outputs for a learning generated image should ideally be "1".
Accordingly, in training the generator 502, the parameters of the generator 502 (for example, the parameters of the functions in the machine learning model) are updated so as to reduce the loss, which is the difference between "1" (indicating "genuine") and the evaluation value output by the discriminator 501.
This training uses the learning data (pairs of a learning input image and a learning generated image) stored in the learning image data storage unit 20.
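The loss described here, the difference between "1" and the evaluation value, can be illustrated with a toy example in which the generator has a single scalar parameter and the discriminator's evaluation value is modeled by a sigmoid. Both simplifications, and the names `evaluation_value`, `generator_loss`, and `update_step`, are assumptions for illustration; practical GAN implementations often use logarithmic losses instead.

```python
import math


def evaluation_value(theta: float) -> float:
    """Toy stand-in for the discriminator output in (0, 1) for a generated image."""
    return 1.0 / (1.0 + math.exp(-theta))


def generator_loss(evaluation: float) -> float:
    """Difference between the target "1" (genuine) and the evaluation value."""
    return 1.0 - evaluation


def update_step(theta: float, learning_rate: float = 0.5) -> float:
    """One gradient-descent update of the generator parameter."""
    d = evaluation_value(theta)
    gradient = -d * (1.0 - d)  # derivative of (1 - sigmoid(theta)) w.r.t. theta
    return theta - learning_rate * gradient


theta = 0.0
losses = []
for _ in range(5):
    losses.append(generator_loss(evaluation_value(theta)))
    theta = update_step(theta)
```

Each update moves the parameter in the direction that raises the evaluation value, so the recorded losses decrease step by step.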

Through the GAN training described above, the generator 502 gradually learns to generate learning generated images in which only the symbolic parts image is extracted from the learning input image it receives.
Meanwhile, the probability that the discriminator 501 can identify a learning generated image as fake, when comparing the learning input image with the learning generated image, gradually decreases.
Theoretically, therefore, the discriminator 501 and the generator 502 are trained until, in the trained GAN, the discriminator 501 judges a learning generated image produced by the generator 502 to be genuine or fake with a probability of 50% each. In practice, the difference between the system's output value and the expected value is called the loss; training is repeated so as to reduce this loss, and continues until the decrease in both the loss of the discriminator 501 and the loss of the generator 502 has converged. Alternatively, training of the discriminator 501 and the generator 502 may end once generation of a learning generated image and feedback of its evaluation value have been repeated a predetermined number of times.
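The two termination conditions mentioned above can be sketched as a single check. Here "the decrease has converged" is interpreted as the most recent drop in each loss being smaller than a tolerance, which is only one possible reading; the function name `should_stop` and the default values of `max_iterations` and `tolerance` are hypothetical.

```python
from typing import Sequence


def should_stop(
    discriminator_losses: Sequence[float],
    generator_losses: Sequence[float],
    iteration: int,
    max_iterations: int = 10_000,
    tolerance: float = 1e-3,
) -> bool:
    """Stop after a fixed number of iterations, or when both losses stop changing."""
    if iteration >= max_iterations:
        return True
    if len(discriminator_losses) < 2 or len(generator_losses) < 2:
        return False  # not enough history to measure a decrease

    def settled(losses: Sequence[float]) -> bool:
        return abs(losses[-2] - losses[-1]) < tolerance

    return settled(discriminator_losses) and settled(generator_losses)
```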

上述したように、本実施形態によれば、象徴図形データベース１７、一般図形データベース１８及び文字図形データベース１９の各々から、象徴パーツ画像、一般パーツ画像、文字パーツ画像それぞれを選択し、画像表示部１６の表示画面における画像領域１６Ｓに配置する処理により、学習用入力画像と学習用変更画像とを組とした学習用データを生成することができるため、ＧＡＮにおける識別器５０１及び生成器５０２の機械学習モデルの学習に用いる学習データを容易に大量に生成することができる。 As described above, according to this embodiment, the symbol parts image, the general parts image, and the character parts image are selected from the symbolic figure database 17, the general figure database 18, and the character figure database 19, respectively, and the image display unit 16 By the processing of arranging in the image area 16S on the display screen, it is possible to generate learning data in which the learning input image and the learning modified image are combined. A large amount of learning data used for model learning can be easily generated.

また、本実施形態によれば、画像表示部１６の表示画面における画像領域１６Ｓに象徴パーツ画像、一般パーツ画像及び文字パーツ画像の各々を配置して一つの学習用データを作成した後、この配置のレイアウトの変更、あるいは象徴パーツ画像、一般パーツ画像、文字パーツ画像のそれぞれのデータ拡張処理を行なうことにより、作成した学習用データの変形バージョンを生成することにより、学習用データのバリエーションを容易に増加させ、容易に大量の学習用データを得ることができる。 Further, according to the present embodiment, after each of the symbol parts image, the general parts image, and the character parts image is arranged in the image area 16S on the display screen of the image display unit 16 to create one piece of learning data, this arrangement By changing the layout of or by performing data extension processing for each of the symbol part images, general part images, and character part images, a modified version of the created learning data is generated, making it easy to create variations in the learning data. can be increased, and a large amount of training data can be obtained easily.

In this embodiment, the operator selects each parts image (steps S101, S104, S107), but the selection may instead be automated by adding to the learning data generation system 1 an application that selects parts images according to a preset selection rule. Such a selection rule may, for example, select the images stored in the symbolic figure database 17, the general figure database 18, and the character figure database 19 in order or at random.
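The two example selection rules, in order and at random, might be sketched as generators over the contents of one database. The names `sequential_selector` and `random_selector`, and the use of plain strings as image identifiers, are assumptions for illustration.

```python
import itertools
import random
from typing import Iterator, Optional, Sequence


def sequential_selector(images: Sequence[str]) -> Iterator[str]:
    """Yield images in stored order, wrapping around at the end."""
    return itertools.cycle(images)


def random_selector(images: Sequence[str], seed: Optional[int] = None) -> Iterator[str]:
    """Yield images chosen uniformly at random (seeded for reproducibility)."""
    rng = random.Random(seed)
    while True:
        yield rng.choice(images)


symbolic_images = ["symbol_a", "symbol_b"]  # hypothetical database contents
in_order = sequential_selector(symbolic_images)
first_three = [next(in_order) for _ in range(3)]
at_random = random_selector(symbolic_images, seed=0)
random_pick = next(at_random)
```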

In this embodiment, the operator places each parts image in the image area 16S (steps S102, S105, S108), but the placement may instead be automated by adding to the learning data generation system 1 an application that places parts images according to a preset placement rule. Such a placement rule may, for example, virtually divide the image area 16S into a plurality of sections and place each parts image in a section either in order or at random.
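A placement rule of this kind could look like the following sketch, which divides the image area into equal sections and assigns each parts image the top-left corner of one section, in order or shuffled. The function names, the equal-sized grid, and the part identifiers are assumptions for illustration.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

Cell = Tuple[int, int]  # (top, left) pixel coordinate of a section


def grid_cells(height: int, width: int, rows: int, cols: int) -> List[Cell]:
    """Divide the image area into rows x cols sections and return their origins."""
    cell_h, cell_w = height // rows, width // cols
    return [(r * cell_h, c * cell_w) for r in range(rows) for c in range(cols)]


def assign_cells(
    parts: Sequence[str],
    cells: Sequence[Cell],
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> Dict[str, Cell]:
    """Assign each part to a distinct section, in order or at random."""
    if len(parts) > len(cells):
        raise ValueError("more parts than sections")
    order = list(cells)
    if shuffle:
        random.Random(seed).shuffle(order)
    return dict(zip(parts, order))


cells = grid_cells(100, 200, rows=2, cols=2)
in_order = assign_cells(["symbol", "text"], cells)
shuffled = assign_cells(["symbol", "general", "text"], cells, shuffle=True, seed=1)
```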

また、本実施形態においては、一般パーツ画像の使用、文字パーツ画像の使用の判断（ステップＳ１０３、Ｓ１０６）を作業者が行うものとしたが、システム化する（学習データ生成システム１に対して、上記使用の判断を予め設定された使用判断ルールに基づいて行なう機能のアプリケーションを加える）ことで機械的に行ってもよい。この場合、上記使用判断ルールとしては、例えば、各々のパーツ画像を使用するか否かの判断を、ランダムに行うものとすればよい。 In the present embodiment, the operator determines whether to use the general parts image or the character parts image (steps S103 and S106). The determination of use may be performed mechanically by adding an application having a function of performing the determination of use based on preset usage determination rules. In this case, as the usage determination rule, for example, whether or not to use each part image may be randomly determined.

また、本実施形態においては、一般パーツ画像の使用、文字パーツ画像の使用、データ拡張処理の実施の判断（ステップＳ１０３、Ｓ１０６、Ｓ１１０）を作業者が行うものとしたが、システム化する（学習データ生成システム１に対して、上記実施の判断を予め設定された実施判断ルールに基づいて行なう機能のアプリケーションを加える）ことで機械的に行ってもよい。この場合、上記実施判断ルールとしては、例えば、一般パーツ画像の使用、文字パーツ画像の使用、データ拡張処理の実施の各々の判断を、ランダムに行うものとすればよい。 In this embodiment, the operator determines whether to use general parts images, to use character parts images, and to perform data extension processing (steps S103, S106, and S110). It may be performed mechanically by adding an application having a function of performing the above-described execution judgment based on a preset execution judgment rule to the data generation system 1 . In this case, as the implementation determination rule, for example, the use of general parts images, the use of character parts images, and the execution of data extension processing may be determined at random.
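A random decision rule for these three choices can be as small as the sketch below. The function name `decide_options`, the dictionary keys, and the probability parameter `p` are hypothetical; the embodiment states only that the decisions may be made at random.

```python
import random
from typing import Dict


def decide_options(rng: random.Random, p: float = 0.5) -> Dict[str, bool]:
    """Randomly decide the choices made at steps S103, S106 and S110."""
    return {
        "use_general_parts": rng.random() < p,     # S103
        "use_character_parts": rng.random() < p,   # S106
        "apply_data_extension": rng.random() < p,  # S110
    }


options = decide_options(random.Random(42))
```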

また、本実施形態においては、画像領域１６Ｓのサイズの特定を所定のピクセル数により高さと幅を指定して行なうものとしたが、システム化する（学習データ生成システム１に対して、上記画像領域１６Ｓのサイズを所定のピクセル数により高さと幅を指定する予め設定されたサイズ指定ルールに基づいて行なう機能のアプリケーションを加える）ことで機械的に行ってもよい。この場合、上記サイズ指定ルールとしては、例えば、高さと幅の各々のピクセル数の値を、ランダムに行うものとすればよい。 In the present embodiment, the size of the image area 16S is specified by specifying the height and width using a predetermined number of pixels. The size of 16S may be mechanically performed by adding the application of a function based on preset size specification rules that specify height and width by a predetermined number of pixels. In this case, as the size designation rule, for example, the number of pixels in each of the height and width may be set randomly.
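A size designation rule that picks the height and width at random might be sketched as follows. The bounds `min_px` and `max_px` are assumptions; the embodiment gives no range.

```python
import random
from typing import Tuple


def random_canvas_size(
    rng: random.Random, min_px: int = 64, max_px: int = 512
) -> Tuple[int, int]:
    """Return a random (height, width) in pixels, both within [min_px, max_px]."""
    return rng.randint(min_px, max_px), rng.randint(min_px, max_px)


height, width = random_canvas_size(random.Random(7))
```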

<Second embodiment>
A learning data generation system according to a second embodiment of the present invention will be described. FIG. 7 is a diagram showing a configuration example of the learning data generation system according to the second embodiment of the present invention. In this embodiment, generation of learning data for semantic segmentation is described as an example.
Semantic segmentation is a type of machine learning model that detects images of objects of predetermined types in an input image. Rather than detecting the input image as a whole or a part of it, semantic segmentation labels each pixel of the input image with the meaning that pixel represents, and thereby detects the parts images, which are the images of the individual objects in the input image.
For semantic segmentation, therefore, the learning data must be prepared by labeling the parts images to be detected, that is, by labeling each pixel of a parts image with the type of object it belongs to.
In FIG. 7, the learning data generation system 10 of this embodiment includes a control unit 111, a parts image selection unit 112, a labeling unit 113, a shape extension unit 114, a composite image generation unit 115, an image display unit 116, an animal figure database 117, a building figure database 118, a road figure database 119, and a learning image data storage unit 120.

The control unit 111 outputs control signals input from input means (not shown; a keyboard, or screen selection with a mouse) to the parts image selection unit 112, the labeling unit 113, the shape extension unit 114, the composite image generation unit 115, and the image display unit 116, according to the control content each signal indicates. The control unit 111 also writes externally supplied animal parts images, building parts images, and road parts images to the animal figure database 117, the building figure database 118, and the road figure database 119, respectively, for storage. This embodiment describes generating learning data by combining animal parts images, building parts images, and road parts images, but these are only examples of figures; any kinds of figures may be combined when generating learning data.

パーツ画像選択部１１２は、画像表示部１１６の表示画面に対して、パーツ画像選択画面を表示する。ここで、パーツ画像は、本実施形態において、一例として動物パーツ画像、建物パーツ画像、道路パーツ画像などであり、各々組み合わせて一つの複合画像とする、学習用データにおける学習用入力画像の生成に用いる画像である。本実施形態においては、第１の実施形態と同様に、学習用データは、学習用入力画像及び学習用変更画像の各々の画像データの組として構成されている。 The parts image selection unit 112 displays a parts image selection screen on the display screen of the image display unit 116 . Here, in the present embodiment, the parts images are, for example, animal parts images, building parts images, road parts images, etc., and are combined to form one composite image. image used. In the present embodiment, as in the first embodiment, the learning data is configured as a set of image data of each of an input image for learning and a modified image for learning.

FIG. 8 is a conceptual diagram illustrating the flow of learning data generation in the learning data generation system of this embodiment. FIG. 8(a) shows the image area 116S for learning data on the display screen of the image display unit 116. In FIG. 8(a), no parts image is displayed; only the image area 116S is shown.
FIG. 8(b) shows a state in which an animal parts image 401 has been selected by the operator and placed at a predetermined position in the image area 116S on the display screen of the image display unit 116. An animal parts image is one kind of parts image used to generate the learning input image as a composite image, and is image data of a figure of an animal such as a human, a dog, or a cat.

Next, FIG. 8(c) shows a state in which the operator has selected a building part image 402 and the selected building part image 402 has been placed, together with the animal part image 401, at a predetermined position in the image area 116S on the display screen of the image display unit 116. A building part image is one kind of part image used to generate the learning input image as a composite image, and is image data of a figure such as a house, factory, supermarket, or office building that is combined with the animal part image described above.

FIG. 8(d) then shows a state in which the operator has selected a road part image 403 and the selected road part image 403 has been placed, together with the animal part image 401 and the building part image 402, at a predetermined position in the image area 116S on the display screen of the image display unit 116. A road part image is image data of a figure such as a highway, farm road, sidewalk, pedestrian crossing, or general road (with multiple lane types). In this embodiment, the image of FIG. 8(d) is used as the learning input image, and data obtained by adding a label to each part of this learning input image is used as the learning modified image.

FIG. 8(e) is an image in which a label has been assigned to each of the pixels that make up each part image in FIG. 8(d). In FIG. 8(e), a color is assigned to each pixel as its label: pixels of the animal part image are red, pixels of the building part image are blue, and pixels of the road part image are yellow.
In this way, each part of the learning input image is labeled: each pixel of the animal part image receives a label indicating an animal, each pixel of the building part image receives a label indicating a building, and each pixel of the road part image receives a label indicating a road.
In this embodiment, FIG. 8(e) is the learning modified image, that is, image data in which a label has been assigned to each pixel of each part of the learning input image.

The label assigning unit 113 assigns, to each pixel of a part image selected by the operator (an animal part image, building part image, or road part image), a color set by the operator as a label indicating the type of that part image.

The composite image generation unit 115 combines the animal part images, building part images, and road part images that the operator has placed in the image area 116S on the display screen of the image display unit 116 to generate the learning input image.
The composite image generation unit 115 also combines the labeled animal part images, building part images, and road part images that the operator has placed in the image area 116S on the display screen of the image display unit 116 to generate the learning modified image.
Then, as in the first embodiment, the composite image generation unit 115 pairs the generated learning input image with the generated learning modified image and writes the pair to the learning image data storage unit 120 as learning data for storage.
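
As a non-limiting illustration of the pairing described above, the following Python sketch places part images on a blank canvas and builds the learning input image and the learning modified image in a single pass. The names `Part`, `compose`, `LABEL_COLORS`, and `BACKGROUND`, the white background, and the rule that later parts overwrite earlier ones are assumptions made for illustration and are not specified in this embodiment; the label colors follow the example of FIG. 8(e).

```python
from dataclasses import dataclass

# Label colors per part type, following the example colors of FIG. 8(e).
LABEL_COLORS = {
    "animal": (255, 0, 0),      # red
    "building": (0, 0, 255),    # blue
    "road": (255, 255, 0),      # yellow
}
BACKGROUND = (255, 255, 255)    # assumption: unlabeled pixels are white


@dataclass
class Part:
    kind: str      # "animal", "building", or "road"
    pixels: list   # rows of RGB tuples; None marks a transparent pixel
    top: int       # row of the part's upper-left corner in the image area
    left: int      # column of the part's upper-left corner


def compose(parts, height, width):
    """Return (input_image, label_image) for parts placed on a blank canvas."""
    input_image = [[BACKGROUND] * width for _ in range(height)]
    label_image = [[BACKGROUND] * width for _ in range(height)]
    for part in parts:
        color = LABEL_COLORS[part.kind]
        for dy, row in enumerate(part.pixels):
            for dx, px in enumerate(row):
                y, x = part.top + dy, part.left + dx
                if px is None or not (0 <= y < height and 0 <= x < width):
                    continue  # skip transparent or out-of-area pixels
                input_image[y][x] = px      # pixel of the composite image
                label_image[y][x] = color   # same position, type label only
    return input_image, label_image
```

Because the type of each part is known when it is placed, every pixel of the part receives its label at once, and the two images stay aligned by construction.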

As in the first embodiment, the shape expansion unit 114 performs data augmentation processing on each of the animal part images, building part images, and road part images arranged in the image area 116S, in accordance with the operator's input.
The composite image generation unit 115 then combines the animal part images, building part images, and road part images arranged in the image area 116S after the data augmentation processing by the shape expansion unit 114 to form learning data, and writes the learning data to the learning image data storage unit 120 for storage.
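
The data augmentation processing applied to each part image can be pictured with the following Python sketch. The specific deformations shown (horizontal flip, 90-degree rotation, and nearest-neighbor enlargement) and the function names are assumptions chosen for illustration; this section does not enumerate the deformations performed by the shape expansion unit 114.

```python
def flip_horizontal(pixels):
    """Mirror a part image left to right."""
    return [list(reversed(row)) for row in pixels]


def rotate_90(pixels):
    """Rotate a part image 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


def scale_nearest(pixels, factor):
    """Enlarge a part image by an integer factor (nearest neighbor)."""
    return [
        [px for px in row for _ in range(factor)]
        for row in pixels
        for _ in range(factor)
    ]


def variants(pixels):
    """Return the original part image followed by its deformed versions."""
    return [
        pixels,
        flip_horizontal(pixels),
        rotate_90(pixels),
        scale_nearest(pixels, 2),
    ]
```

If labels are assigned to a part after it has been deformed, the labeled pixels continue to coincide with the pixels of the deformed part, so each variant yields a consistent pair of learning input image and learning modified image.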

The animal figure database 117 is a database in which images of figures classified as animal part images are accumulated.
The building figure database 118 is a database in which images of figures classified as building part images are accumulated.
The road figure database 119 is a database in which images of figures classified as road part images are accumulated.
The learning image data storage unit 120 stores the learning data written to it, each item of which is a pair of a learning input image and a learning modified image.

FIG. 9 is a conceptual diagram illustrating an example of training a machine learning model that determines the types of the part images in an input image and assigns a label to each pixel of each part image. The generator 551 is a semantic segmentation machine learning model that determines the types of the part images in an input image and assigns a label to each pixel of the part image of each determined type (the predetermined rule).
Learning data is read sequentially from the learning image data storage unit 120, the learning input image 421 is input to the generator 551, and training is performed so that the learning modified image 422 is generated as the output.

That is, the generator 551 is trained with the created learning data so that each pixel of the animal part image 401 in the learning input image 421 is output as red, each pixel of the building part image 402 is output as blue, and each pixel of the road part image 403 is output as yellow.
This makes it possible to train the generator 551 to determine the type of each image within an input image and to assign the label corresponding to that type to each pixel that makes up the image.
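
The training objective described above can be sketched in Python as follows. The color-to-class table, the white background class, and the use of a mean per-pixel cross-entropy are assumptions for illustration; this section does not specify the loss function or the architecture used to train the generator 551, and cross-entropy is simply a common choice for semantic segmentation.

```python
import math

# Assumed mapping from label colors (FIG. 8(e)) to class indices.
COLOR_TO_CLASS = {
    (255, 255, 255): 0,  # background (assumption)
    (255, 0, 0): 1,      # animal
    (0, 0, 255): 2,      # building
    (255, 255, 0): 3,    # road
}


def to_class_map(label_image):
    """Convert a color-coded learning modified image to class indices."""
    return [[COLOR_TO_CLASS[px] for px in row] for row in label_image]


def pixel_cross_entropy(probabilities, class_map):
    """Mean negative log-likelihood of the target class over all pixels.

    probabilities[y][x] is a list of per-class probabilities predicted
    for that pixel; class_map[y][x] is the target class index.
    """
    total, count = 0.0, 0
    for prob_row, target_row in zip(probabilities, class_map):
        for probs, target in zip(prob_row, target_row):
            total += -math.log(max(probs[target], 1e-12))  # avoid log(0)
            count += 1
    return total / count
```

A model whose predicted probabilities put all mass on the correct class at every pixel attains a loss of zero under this measure, which corresponds to reproducing the learning modified image exactly.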

As described above, according to this embodiment, learning data that pairs a learning input image with a learning modified image can be generated simply by selecting an animal part image, a building part image, and a road part image from the animal figure database 117, the building figure database 118, and the road figure database 119, respectively, and placing them in the image area 116S on the display screen of the image display unit 116. A large amount of learning data for training the machine learning model of the generator 551 for semantic segmentation can therefore be generated easily.

Further, according to this embodiment, the type of each part image (animal part image, building part image, or road part image) is known in advance, so a label such as a color can be assigned to all pixels of a part image at once. This removes the laborious conventional task of labeling pixel by pixel and makes it easy to assign a label indicating the type to each pixel of a part image.

Further, according to this embodiment, after one item of learning data has been created by placing an animal part image, a building part image, and a road part image in the image area 116S on the display screen of the image display unit 116, modified versions of that learning data can be generated by changing the layout of the placement or by applying data augmentation processing to each of the animal part image, building part image, and road part image. The variety of the learning data can thus be increased easily, and a large amount of learning data can be obtained easily.
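
One way to picture the layout changes mentioned above is the following Python sketch, which draws several placements for the same set of part images. In this embodiment the placement is performed by the operator; the automatic random placement, the function name `random_layouts`, and its parameters are assumptions made purely for illustration.

```python
import random


def random_layouts(part_sizes, canvas_size, count, seed=0):
    """Return `count` layouts mapping part name -> (top, left).

    part_sizes maps a part name to its (height, width); every position
    keeps the whole part inside a canvas of size (height, width).
    """
    rng = random.Random(seed)  # seeded so the layouts are reproducible
    canvas_h, canvas_w = canvas_size
    layouts = []
    for _ in range(count):
        layout = {}
        for name, (h, w) in part_sizes.items():
            if h > canvas_h or w > canvas_w:
                raise ValueError(f"part {name!r} does not fit in the canvas")
            layout[name] = (
                rng.randint(0, canvas_h - h),
                rng.randint(0, canvas_w - w),
            )
        layouts.append(layout)
    return layouts
```

Each layout can then be combined with the same part images to produce a distinct item of learning data without any additional labeling work.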

A program for implementing the functions that generate learning data consisting of a learning input image and a learning modified image in each of the learning data generation system 1 shown in FIG. 1 and the learning data generation system 10 shown in FIG. 7 may be recorded on a computer-readable recording medium, and the learning data generation processing may be performed by causing a computer system to read and execute the program recorded on the recording medium. The "computer system" referred to here includes an OS and hardware such as peripheral devices.

The "computer system" also includes a website providing environment (or display environment) when a WWW system is used.
The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into computer systems. The "computer-readable recording medium" further includes media that hold a program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold the program for a certain period, such as volatile memory inside a computer system acting as a server or client in that case. The program may implement only part of the functions described above, or may implement those functions in combination with a program already recorded in the computer system.

Although embodiments of the present invention have been described in detail above with reference to the drawings, the specific configuration is not limited to these embodiments, and designs and the like that do not depart from the gist of the invention are also included.

Reference Signs List
1, 10 … Learning data generation system
11, 111 … Control unit
12, 112 … Part image selection unit
13 … Character string generation unit
14, 114 … Shape expansion unit
15, 115 … Composite image generation unit
16, 116 … Image display unit
16S, 116S … Image area
17 … Symbolic figure database
18 … General figure database
19 … Character figure database
20, 120 … Learning image data storage unit
117 … Animal figure database
118 … Building figure database
119 … Road figure database

Claims

1. A learning data generation system that generates learning data consisting of a learning input image and a learning modified image, the learning data being used for training a machine learning model that generates a modified image by modifying a predetermined input image according to a predetermined rule, the system comprising:
a plurality of image databases in which part images, which are components of images, are stored;
a selection unit that selects part images from each of the image databases;
a composite image generation unit that generates a composite image by combining the part images selected by the selection unit and uses it as the learning input image; and
a modified image generation unit that generates the learning modified image composed of a specific part image among the part images selected by the selection unit.

2. The learning data generation system according to claim 1, wherein the image databases comprise:
a symbolic figure database in which symbolic part images are accumulated, a symbolic part image being a part image that is symbolic in the visual impression given when the learning input image is observed; and
a general figure database in which general part images are accumulated, a general part image being a part image that has no symbolism in itself and supplements the symbolism of the symbolic part images.

3. The learning data generation system according to claim 2, wherein the image databases further comprise a character figure database in which character part images, which are part images representing characters, are accumulated.

4. The learning data generation system according to claim 3, wherein the learning input image is composed of the symbolic part image and either or both of the general part image and the character part image, and
only the symbolic part image in the learning input image is arranged in the learning modified image.

5. The learning data generation system according to any one of claims 1 to 4, further comprising a shape expansion unit that deforms the shape of each part image.

6. The learning data generation system according to any one of claims 1 to 5, comprising a generator and a discriminator, wherein
the generator comprises a machine learning model that extracts a predetermined element from the learning input image and generates a learning generated image, and
the discriminator comprises a machine learning model that evaluates learning data composed of the learning input image and the learning modified image, or learning data composed of the learning input image and the learning generated image.

7. The learning data generation system according to claim 1, wherein the machine learning model performs semantic segmentation, and
the learning modified image is obtained by selecting the type of each of the part images in the learning input image and adding a label indicating the type.

8. A learning data generation method in which a computer system generates learning data consisting of a learning input image and a learning modified image, the learning data being used for training a machine learning model that generates a modified image by modifying a predetermined input image according to a predetermined rule, the method comprising:
a selection step of selecting a part image from each of a plurality of image databases storing part images that are components of images;
a composite image generation step of generating a composite image by combining the part images selected in the selection step and using it as the learning input image; and
a modified image generation step of generating the learning modified image composed of a specific part image among the part images selected in the selection step.

9. A learning method in which a computer system trains a machine learning model that generates a modified image by modifying a predetermined input image according to a predetermined rule, the method comprising:
a selection step of selecting, from each of a plurality of image databases in which part images that are components of images are stored, the part images to be used for generating a learning input image, which is learning data for the predetermined input image;
a composite image generation step of generating a composite image by combining the part images selected in the selection step and using it as the learning input image;
a modified image generation step of generating a learning modified image composed of a specific part image among the part images selected in the selection step; and
a learning step of training the machine learning model so that it outputs the learning modified image when the learning input image is input.