JP7324475B1 - Information processing device, information processing method and information processing program

Info

Publication number: JP7324475B1
Application number: JP2022168392A
Authority: JP
Inventors: 匡都史太田; 賢加藤
Original assignee: Hotarubi
Current assignee: Hotarubi
Priority date: 2022-10-20
Filing date: 2022-10-20
Publication date: 2023-08-10
Anticipated expiration: 2042-10-20
Also published as: JP2024060845A

Abstract

Kind Code: A1

[Problem] To provide a technique for easily editing, based on original drawings, a comic that can offer the user appeal such as a sense of dynamism and realism. [Solution] An information processing device according to the present disclosure supports the editing of comics. The device includes a control unit that executes: acquiring a partial image that is a part of an original image of a comic and that includes frame border information and/or text information and/or character information; and acquiring a command for a predetermined editing process to be executed on the partial image. The control unit acquires the partial image by inputting the data of the original image into a pre-learning model. The pre-learning model is a neural network model that has an input layer that receives predetermined input image data, an intermediate layer that extracts from the input image data a feature amount representing frame border information and/or text information and/or character information, and an output layer that outputs an identification result based on the feature amount, and it is constructed by learning with image data that contains frame border information and/or text information and/or character information. [Selected drawing] FIG. 3

Description

The present invention relates to an information processing device, an information processing method, and an information processing program that support the editing of comics.

Comics have traditionally been provided in print, in books, magazines, and similar media. With the recent spread of smartphones, tablet terminals, and the like, however, opportunities to read comics as digital content on these electronic devices are increasing.

Even when comics are read as digital content, the comic pictures are still images. Because the images do not move, it can be difficult for the user to get a sense of dynamism or realism. For this reason, techniques are known that draw a plurality of new images from the original comic drawings to generate a moving comic.

For example, Patent Document 1 discloses an image data generation device that cuts out part of an input original comic image as a partial image, generates a plurality of new images based on that partial image, and makes them reproducible in chronological order. This makes it possible to create image data with motion from the images of the original work.

Patent Document 1: JP 2012-18540 A

Adding predetermined motion or the like to comics that have conventionally been still images makes it possible to provide users with more attractive digital content.

When producing such a motion comic (moving comic), however, the data creator must create more image data than for a conventional still comic. The production man-hours increase greatly, which leads to problems such as longer production periods and higher production costs. It might seem that these problems can be alleviated by generating a plurality of new images from the images of the original work, as in the technique described in Patent Document 1. In that case, however, the data creator must use a predetermined input unit to enter, one by one, each region to be cut out of the original image, so the production man-hours still increase. There is thus still room for improvement in techniques for easily editing digital content that is more attractive to users based on original comic images.

An object of the present disclosure is to provide a technique for easily editing, based on original drawings, a comic that can offer the user appeal such as a sense of dynamism and realism.

An information processing device according to the present disclosure supports the editing of comics. The device includes a control unit that executes: acquiring a partial image that is a part of an original image of a comic and that includes frame border information and/or text information and/or character information; and acquiring an editing process command, which is a command for a predetermined editing process to be executed on the partial image. The control unit acquires the partial image by inputting the data of the original image into a pre-learning model. The pre-learning model is a neural network model that has an input layer that receives predetermined input image data, an intermediate layer that extracts from the input image data a feature amount representing frame border information and/or text information and/or character information, and an output layer that outputs an identification result based on the feature amount, and it is constructed by learning with image data that contains frame border information and/or text information and/or character information.

With the above information processing device, the data creator of a motion comic (moving comic) can easily extract partial images by inputting the data of the original comic image into the pre-learning model. There is no need to cut partial images out of the original image by hand, so the man-hours for producing the motion comic can be greatly reduced. As a result, a comic that can offer readers appeal such as a sense of dynamism and realism can be easily edited based on the original drawings.

In the above information processing device, the editing process may be a process of reproducing the partial images in chronological order and/or a process of displaying the partial image while animating a part of it. The control unit may further acquire, as a partial image, an effect image that includes information on a depiction of the movement of an object and/or a depiction of light and/or comic symbols (manpu) and/or onomatopoeia. In this case, the editing process may be a process of displaying the effect image while animating it. This can further heighten the dynamism, realism, and the like of the motion comic (moving comic); in other words, it can further enhance the dramatic effect of the motion comic.

In the information processing device of the present disclosure, the control unit may also automatically generate, as teacher data (training data) for the pre-learning model, a virtual comic in which images of frame borders, text, and characters are randomly arranged, and may train the pre-learning model using image data that contains the frame border information and/or text information and/or character information in the virtual comic. Training the pre-learning model on random virtual comics as teacher data can greatly reduce the work cost of machine learning. In this case, the control unit can generate the virtual comic automatically as follows: it generates a frame border of random size; places the frame shape of that frame border at an arbitrary position on a predetermined background image generated in advance; sets the background image inside the frame shape as a first image within the frame; sets, as a second image within the frame, an image in which a predetermined character image generated in advance is superimposed at a random position on the first image; and sets, as a third image within the frame, an image in which a predetermined text image generated in advance is superimposed at a random position on the second image that does not overlap the character image.
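The generation procedure described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it works only with bounding boxes rather than pixel data, and the names `Box`, `random_box_inside`, and `generate_virtual_panel`, as well as the size ratios, are assumptions introduced here. The returned boxes play the role of labels for the teacher data.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle: top-left corner (x, y), width w, height h."""
    x: int
    y: int
    w: int
    h: int

    def overlaps(self, other: "Box") -> bool:
        return not (
            self.x + self.w <= other.x or other.x + other.w <= self.x
            or self.y + self.h <= other.y or other.y + other.h <= self.y
        )

    def contains(self, other: "Box") -> bool:
        return (
            self.x <= other.x and self.y <= other.y
            and other.x + other.w <= self.x + self.w
            and other.y + other.h <= self.y + self.h
        )


def random_box_inside(outer: Box, w: int, h: int, rng: random.Random) -> Box:
    """Place a w-by-h box at a random position fully inside `outer`."""
    x = rng.randint(outer.x, outer.x + outer.w - w)
    y = rng.randint(outer.y, outer.y + outer.h - h)
    return Box(x, y, w, h)


def generate_virtual_panel(page_w: int, page_h: int, rng: random.Random,
                           max_tries: int = 100) -> dict:
    """Return labelled boxes for one randomly generated panel."""
    page = Box(0, 0, page_w, page_h)  # stands in for the background image
    # Frame border of random size at an arbitrary position on the background.
    frame_w = rng.randint(page_w // 2, page_w)
    frame_h = rng.randint(page_h // 2, page_h)
    frame = random_box_inside(page, frame_w, frame_h, rng)
    # Character image superimposed at a random position inside the frame.
    character = random_box_inside(frame, frame_w // 3, frame_h // 2, rng)
    # Text image at a random position that does not overlap the character.
    for _ in range(max_tries):
        text = random_box_inside(frame, frame_w // 4, frame_h // 4, rng)
        if not text.overlaps(character):
            return {"frame": frame, "character": character, "text": text}
    raise RuntimeError("no non-overlapping text position found")
```

Rejection sampling is used for the text position because it is simple; a production generator would more likely compute the free regions directly.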

The present disclosure can also be understood as an information processing method performed by a computer. That is, the information processing method of the present disclosure is a method that supports the editing of comics, in which a computer executes: a first acquisition step of acquiring a partial image that is a part of an original image of a comic and that includes frame border information and/or text information and/or character information; and a second acquisition step of acquiring an editing process command, which is a command for a predetermined editing process to be executed on the partial image. In the second acquisition step, the partial image is acquired by inputting the data of the original image into a pre-learning model. The pre-learning model is a neural network model that has an input layer that receives predetermined input image data, an intermediate layer that extracts from the input image data a feature amount representing frame border information and/or text information and/or character information, and an output layer that outputs an identification result based on the feature amount, and it is constructed by learning with image data that contains frame border information and/or text information and/or character information.

The present disclosure can also be understood as an information processing program. That is, the information processing program of the present disclosure is a program that supports the editing of comics and causes a computer to execute: a first acquisition step of acquiring a partial image that is a part of an original image of a comic and that includes frame border information and/or text information and/or character information; and a second acquisition step of acquiring an editing process command, which is a command for a predetermined editing process to be executed on the partial image. In the second acquisition step, the computer is caused to acquire the partial image by inputting the data of the original image into a pre-learning model. The pre-learning model is a neural network model that has an input layer that receives predetermined input image data, an intermediate layer that extracts from the input image data a feature amount representing frame border information and/or text information and/or character information, and an output layer that outputs an identification result based on the feature amount, and it is constructed by learning with image data that contains frame border information and/or text information and/or character information.

According to the present disclosure, a comic that can offer the user appeal such as a sense of dynamism and realism can be easily edited based on the original drawings.

FIG. 1 is a diagram showing a schematic configuration of the comic editing support system according to the first embodiment.
FIG. 2 is a diagram showing in more detail the components of the server included in the comic editing support system according to the first embodiment, together with the components of a user terminal that communicates with the server.
FIG. 3 is a diagram illustrating the flow of operations of the comic editing support system according to the first embodiment.
FIG. 4 is a diagram for explaining the identification result obtained from an input to the pre-learning model in the first embodiment and the neural network that constitutes the pre-learning model.
FIG. 5 is a diagram for explaining frame border information extracted as a partial image by the pre-learning model.
FIG. 6 is a diagram for explaining text information extracted as a partial image by the pre-learning model.
FIG. 7 is a diagram for explaining character information extracted as a partial image by the pre-learning model.
FIG. 8 is a diagram illustrating a screen displayed by the interface that the user uses to input correction information.
FIG. 9 is a diagram illustrating a partial image corrected based on the correction information.
FIG. 10 is a diagram illustrating an effect image that the server can further acquire as a partial image.
FIG. 11 is a diagram for explaining a virtual comic generated by the learning unit.
FIG. 12 is a diagram illustrating a virtual comic in which a background image, character images, and text images are randomly arranged.

Embodiments of the present disclosure are described below with reference to the drawings. The configurations of the following embodiments are examples, and the present disclosure is not limited to them.

<First embodiment>
An overview of the comic editing support system according to the first embodiment is described with reference to FIG. 1. FIG. 1 is a diagram showing a schematic configuration of the comic editing support system according to this embodiment. The editing support system 100 according to this embodiment includes a network 200, a server 300, and user terminals 400. The editing support system of the present disclosure is a system that supports the editing of comics, and the editing support is executed by the server 300.

The network 200 is, for example, an IP network. As long as it is an IP network, the network 200 may be wireless, wired, or a combination of the two. In the case of wireless communication, for example, the user terminal 400 may access a wireless LAN access point (not shown) and communicate with the server 300 via a LAN or WAN. The network 200 is not limited to these examples and may be, for example, a public switched telephone network, an optical line, an ADSL line, or a satellite communication network.

The server 300 is connected to the user terminals 400 via the network 200. For simplicity, FIG. 1 shows one server 300 and four user terminals 400, but the numbers are of course not limited to these.

The server 300 may be any electronic device, as long as it is computer equipment with the processing capability for arithmetic and data processing such as acquiring, generating, and updating data; for example, it may be a personal computer, a server, a mainframe, or another electronic device. That is, the server 300 can be configured as a computer that has a processor such as a CPU or GPU, a main storage device such as RAM or ROM, and an auxiliary storage device such as an EPROM, a hard disk drive, or removable media. The removable media may be, for example, a USB memory or a disc recording medium such as a CD or DVD. The auxiliary storage device stores an operating system (OS), various programs, various tables, and the like.

Instead of providing software, hardware, an OS, or the like dedicated to the editing support system 100 according to this embodiment, the server 300 may use SaaS (Software as a Service), PaaS (Platform as a Service), or IaaS (Infrastructure as a Service) provided by a cloud server, as appropriate.

The user terminal 400 may be any electronic device, such as a mobile terminal, owned by a user of the editing support system 100; for example, it may be a mobile terminal, a tablet terminal, a smartphone, a wearable terminal, a personal computer, or another terminal device.

Next, the components of the server 300 in particular are described in detail with reference to FIG. 2. FIG. 2 is a diagram showing in more detail the components of the server 300 included in the editing support system 100 according to the first embodiment, together with the components of the user terminal 400 that communicates with the server 300.

The server 300 has a communication unit 301, a storage unit 302, and a control unit 303 as functional units. It loads a program stored in the auxiliary storage device into the work area of the main storage device and executes it; by controlling each functional unit through execution of the program, it realizes the functions that serve the predetermined purpose of each functional unit. Some or all of the functions may, however, be realized by a hardware circuit such as an ASIC or FPGA.

The communication unit 301 is a communication interface for connecting the server 300 to the network 200. It includes, for example, a network interface board and a wireless communication circuit for wireless communication. The server 300 is communicably connected to the user terminal 400 and other external devices via the communication unit 301.

The storage unit 302 includes a main storage device and an auxiliary storage device. The main storage device is memory into which the program executed by the control unit 303 and the data used by that program are loaded. The auxiliary storage device stores the program executed by the control unit 303 and the data used by that program. The server 300 acquires data transmitted from the user terminal 400 and the like via the communication unit 301, and the original image described later is stored in the storage unit 302 in advance. The storage unit 302 also stores the teacher data and the pre-learning model used to acquire the partial images described later.

The control unit 303 is the functional unit responsible for the control performed by the server 300. It can be realized by an arithmetic processing device such as a CPU. The control unit 303 further has three functional units: an acquisition unit 3031, an editing processing unit 3032, and a learning unit 3033. Each functional unit may be realized by the CPU executing a stored program. Because machine learning involves a large amount of computation, the learning unit 3033 may instead be realized by a GPU executing the stored program; using a GPU for the arithmetic processing of machine learning enables faster processing. For even faster processing, a computer cluster may be built from a plurality of computers equipped with such GPUs, with the computers in the cluster processing in parallel.

The acquisition unit 3031 acquires a partial image that is a part of the original image of a comic. Here, the original image of a comic is the image data of the original comic drawing, and a user of the editing support system 100 can transmit the original image to the server 300 in advance using the user terminal 400. For example, the user can upload the original image to the server 300 via an interface provided by a predetermined application installed in advance on the user terminal 400, or via an interface provided by a predetermined website. The server 300 then stores the original image transmitted from the user terminal 400 in the storage unit 302. Based on the original image stored in the storage unit 302, the acquisition unit 3031 extracts the frame border information and/or text information and/or character information in the original image, and thereby acquires partial images that contain this information.
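As a rough illustration of how detected regions could be turned into partial images, the sketch below crops rectangular regions out of an image held as a nested list of pixel values. The `Region` type, its label strings, and the assumption that the model reports rectangular regions are all hypothetical; the source does not specify the form of the model output.

```python
from typing import Dict, List, NamedTuple

Image = List[List[int]]  # rows of grayscale pixel values (stand-in for real image data)


class Region(NamedTuple):
    """A detected region; the label values are hypothetical."""
    label: str  # e.g. "frame", "text", or "character"
    x: int
    y: int
    w: int
    h: int


def crop(image: Image, region: Region) -> Image:
    """Cut the rectangle described by `region` out of `image`."""
    rows = image[region.y:region.y + region.h]
    return [row[region.x:region.x + region.w] for row in rows]


def extract_partial_images(image: Image, regions: List[Region]) -> Dict[str, List[Image]]:
    """Group the cropped partial images by the kind of information they contain."""
    partial_images: Dict[str, List[Image]] = {}
    for region in regions:
        partial_images.setdefault(region.label, []).append(crop(image, region))
    return partial_images
```

For example, with a 4x4 test image, a `Region("text", 1, 1, 2, 2)` yields the central 2x2 block under the key `"text"`.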

The user terminal 400 in this embodiment has a communication unit 401, an input/output unit 402, and a storage unit 403 as functional units. The communication unit 401 is a communication interface for connecting the user terminal 400 to the network 200 and includes, for example, a network interface board and a wireless communication circuit for wireless communication. The input/output unit 402 is a functional unit for displaying information and the like received from outside via the communication unit 401, and for entering information to be transmitted to the outside via the communication unit 401. Like the storage unit 302 of the server 300, the storage unit 403 includes a main storage device and an auxiliary storage device.

The input/output unit 402 further has a display unit 4021, an operation input unit 4022, and an image/audio input/output unit 4023. The display unit 4021 has a function of displaying various kinds of information and is realized by, for example, an LCD (liquid crystal display), an LED (light-emitting diode) display, or an OLED (organic light-emitting diode) display. The operation input unit 4022 has a function of accepting operation input from the user and is realized, specifically, by soft keys such as a touch panel, or by hard keys. The image/audio input/output unit 4023 has a function of accepting input of images such as still images and video and is realized, specifically, by a camera that uses an image sensor such as a charge-coupled device (CCD), metal-oxide-semiconductor (MOS), or complementary metal-oxide-semiconductor (CMOS) sensor. The image/audio input/output unit 4023 also has a function of accepting audio input and output and is realized, specifically, by a microphone and a speaker.

The user can thus transmit the original image to the server 300 using the user terminal 400 configured in this way.

The editing processing unit 3032 acquires an editing process command, which is a command for a predetermined editing process to be executed on the partial image. The editing process in this embodiment is a process of reproducing the partial images in chronological order and/or a process of displaying the partial image while animating a part of it. A user of the editing support system 100 can transmit the editing process command to the server 300 via an interface provided by a predetermined application installed in advance on the user terminal 400, or via an interface provided by a predetermined website. The editing processing unit 3032 then acquires the transmitted editing process command and executes the editing process of reproducing the partial images in chronological order.
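One simple way to picture the chronological-reproduction editing process is as a timeline that assigns a start time to each partial image. The sketch below is only an illustration under that assumption; the `EditCommand` structure, the `"play_sequence"` command name, and the fixed display duration are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EditCommand:
    """A hypothetical editing process command sent from the user terminal."""
    kind: str                 # only "play_sequence" is handled in this sketch
    seconds_per_image: float  # how long each partial image stays on screen


def build_timeline(partial_image_ids: List[str],
                   command: EditCommand) -> List[Tuple[float, str]]:
    """Return (start_time_in_seconds, partial_image_id) pairs in playback order."""
    if command.kind != "play_sequence":
        raise ValueError(f"unsupported editing process: {command.kind}")
    timeline = []
    start = 0.0
    for image_id in partial_image_ids:
        timeline.append((start, image_id))
        start += command.seconds_per_image
    return timeline
```

A call such as `build_timeline(["p1", "p2", "p3"], EditCommand("play_sequence", 1.5))` schedules the three partial images at 0.0, 1.5, and 3.0 seconds.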

The learning unit 3033 is the functional unit that constructs the pre-learning model used in the processing by the acquisition unit 3031; its details are described later.

The control unit 303 functions as the control unit according to the present disclosure by executing the processing of the acquisition unit 3031, the editing processing unit 3032, and the learning unit 3033.

The flow of operations of the editing support system 100 in this embodiment is described next. FIG. 3 is a diagram illustrating the flow of operations of the editing support system 100 in this embodiment. FIG. 3 is used to describe the flow of operations between the server 300 and the user terminal 400 in the editing support system 100, and the processing executed by the server 300 and the user terminal 400.

In this embodiment, an original image is first input to the user terminal 400 of a user who uses the editing support system 100 to create a moving comic by editing the original image (S101). As described above, the user can upload the original image to the server 300 via, for example, an interface provided by a predetermined application installed in advance on the user terminal 400, or an interface provided by a predetermined website.

The server 300 acquires the original image data transmitted from the user terminal 400 (S102) and stores the acquired original image in the storage unit 302.

The server 300 then acquires partial images based on the original image, as described below.

The server 300 executes processing that calls the pre-learning model (S103). The pre-learning model is a machine learning model used to extract partial images based on the original image. It is constructed in advance by the learning unit 3033 through learning with image data that contains frame border information and/or text information and/or character information.
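The server-side portion of this flow (S102 and S103) can be pictured with the following sketch. The class and method names are hypothetical, the storage unit is reduced to an in-memory dictionary, and the pre-learning model is replaced by a stub function, so this shows only the order of the steps and not an actual model.

```python
from typing import Callable, Dict, List

Image = List[List[int]]  # rows of pixel values (stand-in for real image data)


class EditingSupportServer:
    """Hypothetical stand-in for server 300."""

    def __init__(self, pre_learning_model: Callable[[Image], List[dict]]):
        self._storage: Dict[str, Image] = {}   # plays the role of storage unit 302
        self._model = pre_learning_model       # built in advance by the learning unit

    def receive_original_image(self, image_id: str, image: Image) -> None:
        """S102: acquire the original image and store it."""
        self._storage[image_id] = image

    def acquire_partial_images(self, image_id: str) -> List[dict]:
        """S103: call the pre-learning model on the stored original image."""
        return self._model(self._storage[image_id])


def stub_model(image: Image) -> List[dict]:
    """Stub: reports the whole page as a single frame region."""
    return [{"label": "frame", "x": 0, "y": 0, "w": len(image[0]), "h": len(image)}]
```

Passing the model in as a callable keeps the flow independent of how the model is implemented, which matches the separation between the acquisition unit and the learning unit described above.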

FIG. 4 illustrates the identification result obtained from an input to the pre-trained model in this embodiment, and the neural network that constitutes the model. In this embodiment, a neural network model generated by deep learning is used as the pre-trained model. The pre-trained model 30 has an input layer 31 that receives input image data, an intermediate layer (hidden layer) 32 that extracts, from the image data input to the input layer 31, feature quantities representing frame border information and/or text information and/or character information, and an output layer 33 that outputs an identification result based on the feature quantities. In the example of FIG. 4, the pre-trained model 30 has one intermediate layer 32: the output of the input layer 31 is input to the intermediate layer 32, and the output of the intermediate layer 32 is input to the output layer 33. The number of intermediate layers 32 is not limited to one, however; the pre-trained model 30 may have two or more intermediate layers 32.

As FIG. 4 shows, each of the layers 31 to 33 has one or more neurons. For example, the number of neurons in the input layer 31 can be set according to the input image data, and the number of neurons in the output layer 33 can be set according to the partial image that is the identification result.

Neurons in adjacent layers are connected as appropriate, and a weight (connection weight) is set for each connection based on the result of machine learning. In the example of FIG. 4, each neuron is connected to every neuron in the adjacent layers, but the connections are not limited to this example and can be set as appropriate.
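The fully connected three-layer arrangement of FIG. 4 can be sketched as a forward pass in plain Python. This is only an illustration under stated assumptions: the sigmoid activation, the bias terms, the layer sizes, and the function names are not specified in this description, and the weights here are random rather than learned.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer: every input feeds every output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def sigmoid(values):
    # Assumed activation function; the description does not name one.
    return [1.0 / (1.0 + math.exp(-v)) for v in values]

def forward(pixels, hidden_w, hidden_b, out_w, out_b):
    """Input layer 31 -> intermediate layer 32 -> output layer 33."""
    hidden = sigmoid(dense(pixels, hidden_w, hidden_b))   # feature quantities
    return sigmoid(dense(hidden, out_w, out_b))           # identification result

def random_layer(n_in, n_out, rng):
    """Random placeholder weights; a real model would learn these."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

rng = random.Random(0)
n_in, n_hidden, n_out = 16, 8, 16      # hypothetical: 4x4 image in, 4x4 mask out
hidden_w, hidden_b = random_layer(n_in, n_hidden, rng)
out_w, out_b = random_layer(n_hidden, n_out, rng)
pixels = [rng.random() for _ in range(n_in)]
mask = forward(pixels, hidden_w, hidden_b, out_w, out_b)
```

The input and output sizes follow the remark that the neuron counts depend on the input image data and on the partial image to be output; a practical model would be far larger and would likely use more than one intermediate layer.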

Such a pre-trained model 30 is constructed by supervised learning with, for example, training data that pairs a plurality of original comic images with labels for frame border information and/or text information and/or character information. Specifically, pairs of feature quantities and labels are given to the neural network, and the connection weights between neurons are tuned so that the output of the neural network matches the labels. In this way, the features of the training data are learned, and a pre-trained model for estimating results from inputs is acquired inductively.

Returning to FIG. 3, the server 300 extracts partial images by inputting the original image data into the pre-trained model described above (S104).

FIG. 5 illustrates the frame border information extracted as a partial image by the pre-trained model. FIG. 5(a) shows the original image, and FIG. 5(b) shows the extracted frame border information. The pre-trained model extracts four-cornered frame borders (frame borders A to E in FIG. 5(b)) from the original image. As frame border information, the server 300 then extracts the x and y coordinates of the upper-left, upper-right, lower-right, and lower-left corners of each frame border (for example, A1, A2, A3, and A4 of frame border A).

FIG. 6 illustrates the text information extracted as a partial image by the pre-trained model. FIG. 6(a) shows the original image, and FIG. 6(b) shows the extracted text information. The pre-trained model extracts the text regions inside speech balloons (texts A to H in FIG. 6(b)) from the original image. As text information, the server 300 then extracts the x and y coordinates of the upper-left corner of each text region (for example, A11 of text A) together with the width and height of the region (for example, w1 and h1 of text A).
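The two representations described above (four corner points for a frame border; an upper-left point plus width and height for a text region) could be held in records such as the following. The class and field names are hypothetical, and the sketch assumes image coordinates with the origin at the upper left and y increasing downward.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[int, int]  # (x, y)

@dataclass
class FrameBorder:
    """Frame border information: corners in the order
    upper-left, upper-right, lower-right, lower-left (e.g. A1..A4)."""
    corners: Tuple[Point, Point, Point, Point]

@dataclass
class TextRegion:
    """Text information: upper-left corner (e.g. A11) plus width and height."""
    x: int
    y: int
    width: int
    height: int

    def corners(self) -> Tuple[Point, Point, Point, Point]:
        """Return the region's corners in the same order FrameBorder uses."""
        left, top = self.x, self.y
        right, bottom = self.x + self.width, self.y + self.height
        return ((left, top), (right, top), (right, bottom), (left, bottom))

frame_a = FrameBorder(corners=((0, 0), (400, 0), (400, 300), (0, 300)))
text_a = TextRegion(x=10, y=20, width=30, height=40)
```

Storing four explicit corners for frame borders allows non-rectangular quadrilateral panels, whereas the width and height form for text regions presumes axis-aligned boxes.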

FIG. 7 illustrates the character information extracted as a partial image by the pre-trained model. FIG. 7(a) shows the original image, and FIG. 7(b) shows the extracted character information. The pre-trained model extracts the characters (characters A and B in FIG. 7(b)) from the original image as white images; that is, it generates grayscale data in which the characters are white and everything else is black. The server 300 then makes the black regions of this grayscale data transparent, cuts out the white regions, and extracts them as character information. In the example of FIG. 7(b), as described later, the pre-trained model is assumed to have failed to extract the character in the middle frame border.
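One way to read the cut-out step is as applying the grayscale data as an alpha channel to the original image: black pixels become fully transparent and white pixels stay opaque. The sketch below takes that reading as an assumption; the function name, the nested-list image representation, and the threshold of 128 are all illustrative.

```python
def cut_out_character(image, mask, threshold=128):
    """Apply a grayscale mask to an RGB image and return RGBA rows.

    image: rows of (r, g, b) tuples.
    mask:  rows of gray values (0 = black, 255 = white), same shape as image.
    Pixels whose mask value is below `threshold` become transparent.
    """
    if len(image) != len(mask):
        raise ValueError("image and mask must have the same number of rows")
    result = []
    for img_row, mask_row in zip(image, mask):
        if len(img_row) != len(mask_row):
            raise ValueError("image and mask rows must have the same length")
        result.append([
            (r, g, b, 255 if m >= threshold else 0)   # alpha from the mask
            for (r, g, b), m in zip(img_row, mask_row)
        ])
    return result

image = [[(10, 20, 30), (40, 50, 60)],
         [(70, 80, 90), (1, 2, 3)]]
mask = [[255, 0],
        [0, 200]]
cutout = cut_out_character(image, mask)
```

A threshold is used because a model's grayscale output is rarely exactly 0 or 255; with a strictly binary mask any value between the two would behave the same.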

Returning to FIG. 3, the server 300 transmits the extracted partial images to the user terminal 400, and the user terminal 400 acquires them (S105).

The user of the editing support system 100 can then check the transmitted partial images on the user terminal 400 and, if necessary, input correction information for correcting them (S106).

FIG. 8 illustrates a screen displayed by the interface with which the user inputs correction information. The interface illustrated in FIG. 8 is used to correct character information. The screen SC1 is displayed on the display unit 4021 of the user terminal 400 and shows the extracted partial image SC11, a correction information input field SC12, and a send button SC13. The user inputs correction information in the input field SC12 using the operation input unit 4022 (a touch panel) and presses the send button SC13 to transmit the correction information to the server 300. In the example of FIG. 8, the user encloses, on the touch panel, the region SC121 containing the character in the middle frame border that the pre-trained model failed to extract.

Returning to FIG. 3, the server 300 acquires the correction information transmitted from the user terminal 400 (S107) and can then correct the partial image based on it. FIG. 9 illustrates a partial image corrected based on the correction information: FIG. 9(a) shows the original image, and FIG. 9(b) shows the corrected partial image. In this embodiment, the partial image shown in FIG. 7(b) is corrected so that, in FIG. 9(b), the character that could not be extracted in FIG. 7(b) has been added to the partial image. The server 300 thereby completes the acquisition of the partial images (S108).

In the flow shown in FIG. 3, editing information about the editing processing to be executed on the partial images is next input to the user terminal 400 (S109). As described above, the user can upload the editing information to the server 300 via, for example, an interface provided by a predetermined application installed in advance on the user terminal 400, or an interface provided by a predetermined website. More specifically, the user uploads the editing information by selecting, in that interface, an editing processing command to be executed on the partial images: for example, processing that plays back the partial images in time series, processing that displays a partial image while moving part of it, or both.

The server 300 then acquires the editing processing command transmitted from the user terminal 400 (S110) and executes the editing processing (S111). A motion comic (moving comic) is thereby realized. The editing processing executed by the server 300 is transmitted to the user terminal 400, and the user can check it via the user terminal 400 that has acquired this information (S112).

With the processing described above, the creator of motion comic data can easily extract partial images by inputting the data of the original comic image into the pre-trained model. This eliminates the need to cut partial images out of the original image by hand and greatly reduces the man-hours required to produce a motion comic. A comic that offers readers a sense of dynamism and realism can thus be edited easily on the basis of the original drawings.

In the motion comic described above, the editing processing executed on the partial images extracted from the original comic image is processing that plays back the partial images in time series and/or processing that displays a partial image while moving part of it. The editing processing in this embodiment is not limited to these, however.

In this embodiment, processing that displays an effect image while moving it may also be executed as editing processing in the motion comic described above.

Here, an effect image is an image containing information on a depiction of object movement and/or a depiction of light and/or manpu (comic symbols) and/or onomatopoeia, and it can be acquired by the server 300 as a partial image. In this case, the pre-trained model 30 is trained on image data containing such information, and the intermediate layer (hidden layer) 32 of the pre-trained model 30 extracts feature quantities representing information on the depiction of object movement and/or the depiction of light and/or manpu and/or onomatopoeia.

FIG. 10 illustrates effect images that can additionally be acquired as partial images by the server 300. In this embodiment, an image depicting smoke can be acquired as an effect image containing information on a depiction of object movement. As shown in FIG. 10, images of manpu (comic symbols) and onomatopoeia can also be acquired.

When effect images are additionally acquired as partial images and displayed in motion in this way, the dynamism and realism of the motion comic can be enhanced further. In other words, the staging effect of the motion comic can be heightened.

In this embodiment, the server 300 may also automatically generate the training data used to train the pre-trained model 30. Specifically, the learning unit 3033 of the control unit 303 of the server 300 automatically generates, as the training data, virtual comics in which images of frame borders, text, and characters are randomly arranged. The learning unit 3033 then trains the pre-trained model 30 using image data containing the frame border information and/or text information and/or character information in the virtual comics.

Specifically, the learning unit 3033 generates a frame border of random size, arranges the frame shape of the frame border at an arbitrary position on a predetermined background image generated in advance, and sets the background image inside the frame shape as a first image within the frame border. It then sets, as a second image within the frame border, an image in which a predetermined character image generated in advance is randomly superimposed on the first image. Finally, it sets, as a third image within the frame border, an image in which a predetermined text image generated in advance is randomly superimposed on the second image at a position that does not overlap the character image. The virtual comic is generated automatically in this way, as described below with reference to FIG. 11.

FIG. 11 illustrates the virtual comic generated by the learning unit 3033. First, as shown in FIG. 11(a), the learning unit 3033 arranges a randomly generated frame border at an arbitrary position on an arbitrary background image generated in advance (a random position at which the frame border fits within the background image). The background image may be resized so that the randomly generated frame border does not extend beyond it. The learning unit 3033 then cuts out the background image inside the frame shape of the arranged frame border and sets the cut-out background image as the first image. To generate the frame borders, the learning unit 3033 divides the page, at random heights, into a number of rows determined by a random number from 1 to 3, and divides each row, at random widths, into a number of columns determined by a random number from 1 to 3. From the frame borders generated in this way, the learning unit 3033 can further extract those with at least a certain height and width and divide them at random to generate additional frame borders.
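The frame border generation just described (1 to 3 rows of random height, each split into 1 to 3 columns of random width, with large frames split once more) can be sketched as follows. The function names, the minimum side length `min_len`, the size limit `split_over`, and the choice to split a large frame once along a randomly chosen axis are assumptions made for illustration; the description does not fix these details.

```python
import random

def split_lengths(total, parts, rng, min_len):
    """Split `total` into `parts` random lengths, each at least `min_len`."""
    spare = total - parts * min_len
    if spare < 0:
        raise ValueError("total is too small for the requested parts")
    cuts = sorted(rng.randint(0, spare) for _ in range(parts - 1))
    edges = [0] + cuts + [spare]
    return [min_len + (b - a) for a, b in zip(edges, edges[1:])]

def generate_frames(page_w, page_h, rng, min_len=40, split_over=300):
    """Return frames as (x, y, width, height) tuples that tile the page."""
    frames = []
    y = 0
    for row_h in split_lengths(page_h, rng.randint(1, 3), rng, min_len):
        x = 0
        for col_w in split_lengths(page_w, rng.randint(1, 3), rng, min_len):
            frames.append((x, y, col_w, row_h))
            x += col_w
        y += row_h

    result = []
    for (x, y, w, h) in frames:
        if w >= split_over and h >= split_over:
            # Large frame: split it once along a random axis (assumed rule).
            if rng.random() < 0.5:
                left = rng.randint(min_len, w - min_len)
                result += [(x, y, left, h), (x + left, y, w - left, h)]
            else:
                top = rng.randint(min_len, h - min_len)
                result += [(x, y, w, top), (x, y + top, w, h - top)]
        else:
            result.append((x, y, w, h))
    return result

frames = generate_frames(800, 1200, random.Random(0))
```

Because every frame is produced by cutting a parent rectangle, the frames tile the page exactly, so the frame coordinates can double as frame border labels for the generated page.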

Next, as shown in FIG. 11(b), the learning unit 3033 randomly superimposes a predetermined character image generated in advance on the first image. The character image may be resized so that it fits within the frame border. If the character image does not fit within the frame border, the portion extending beyond the frame border may be deleted. The image in which the background image and the character image are randomly arranged within the frame border is set as the second image. The character image is, for example, an image randomly extracted from a database storing a plurality of arbitrary character images, and it can be extracted at random regardless of the background depicted in the first image.

Next, as shown in FIG. 11(c), the learning unit 3033 randomly superimposes a predetermined text image generated in advance on the second image. The text image can be placed at a random position that does not overlap the character image in the second image. The text image is, for example, arbitrary text drawn together with a speech balloon, or onomatopoeia. The image in which the background image, the character image, and the text image are randomly arranged within the frame border is set as the third image.
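A simple way to find a random text position that avoids the character image is rejection sampling over bounding rectangles, sketched below. This is an assumed technique, not one stated in the description: the function names, the use of axis-aligned bounding boxes as a stand-in for the actual character outline, and the retry limit are all illustrative.

```python
import random

def overlaps(a, b):
    """True if rectangles a and b, given as (x, y, width, height), intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def place_text(frame, character, text_size, rng, max_tries=1000):
    """Pick a random text rectangle inside `frame` that avoids `character`.

    Returns (x, y, width, height), or None if no position was found.
    """
    fx, fy, fw, fh = frame
    tw, th = text_size
    if tw > fw or th > fh:
        return None                      # the text cannot fit in the frame
    for _ in range(max_tries):
        x = rng.randint(fx, fx + fw - tw)
        y = rng.randint(fy, fy + fh - th)
        candidate = (x, y, tw, th)
        if not overlaps(candidate, character):
            return candidate
    return None                          # give up; the caller may skip the text

frame = (0, 0, 200, 200)
character = (50, 50, 100, 100)
text_box = place_text(frame, character, (40, 30), random.Random(1))
```

Returning `None` instead of looping indefinitely lets the generator drop or resize the text image when the character fills most of the frame.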

FIG. 12 illustrates a virtual comic in which background images, character images, and text images are randomly arranged. The learning unit 3033 generates random virtual comics such as the one shown in FIG. 12 and uses them as training data for the pre-trained model 30, which greatly reduces the work cost of machine learning. Because the learning unit 3033 automatically generates virtual comics in large quantities, there is no need to create a large amount of training data by hand.

With the editing support system 100 described above, a comic that offers readers a sense of dynamism and realism can be edited easily on the basis of the original drawings.

<Other Modifications>
The above embodiment is merely an example, and the present disclosure can be implemented with appropriate modifications without departing from its gist. For example, the processing and means described in the present disclosure can be freely combined as long as no technical contradiction arises.

Processing described as being performed by one device may be shared among and executed by a plurality of devices. For example, the editing processing unit 3032 may be formed in an arithmetic processing device separate from the server 300, in which case that device is configured to cooperate suitably with the server 300. Conversely, processing described as being performed by different devices may be executed by a single device. In a computer system, the hardware configuration (server configuration) that implements each function can be changed flexibly.

The present disclosure can also be realized by supplying a computer with a computer program that implements the functions described in the above embodiment, and having one or more processors of the computer read and execute the program. Such a computer program may be provided to the computer on a non-transitory computer-readable storage medium connectable to the system bus of the computer, or via a network. Non-transitory computer-readable storage media include, for example, any type of disk such as a magnetic disk (floppy (registered trademark) disk, hard disk drive (HDD), etc.) or an optical disc (CD-ROM, DVD, Blu-ray Disc, etc.), read-only memory (ROM), random access memory (RAM), EPROM, EEPROM, magnetic cards, flash memory, optical cards, and any other type of medium suitable for storing electronic instructions.

Description of Symbols
100 ... Editing support system
200 ... Network
300 ... Server
301 ... Communication unit
302 ... Storage unit
303 ... Control unit
400 ... User terminal

Claims

An information processing device comprising a control unit that executes:
acquiring a partial image, which is a part of an original image of a comic and contains frame border information and/or text information and/or character information, by inputting data of the original image into a pre-trained model constructed by training with predetermined image data; and
acquiring an editing processing command, which is a command for predetermined editing processing to be executed on the partial image,
wherein the control unit
automatically generates a virtual comic in which images of frame borders, text, and characters are randomly arranged and which serves as training data for training the pre-trained model, and trains the pre-trained model using image data contained in the virtual comic, and,
when automatically generating the virtual comic,
generates a frame border of random size, arranges the frame shape of the frame border at an arbitrary position on a predetermined background image generated in advance, and sets the background image inside the frame shape as a first image within the frame border,
sets, as a second image within the frame border, an image in which a predetermined character image generated in advance is randomly superimposed on the first image, and
sets, as a third image within the frame border, an image in which a predetermined text image generated in advance is randomly superimposed on the second image at a position that does not overlap the character image.

An information processing method in which a computer executes:
a first acquisition step of acquiring a partial image, which is a part of an original image of a comic and contains frame border information and/or text information and/or character information, by inputting data of the original image into a pre-trained model constructed by training with predetermined image data; and
a second acquisition step of acquiring an editing processing command, which is a command for predetermined editing processing to be executed on the partial image,
wherein the computer
automatically generates a virtual comic in which images of frame borders, text, and characters are randomly arranged and which serves as training data for training the pre-trained model, and trains the pre-trained model using image data contained in the virtual comic, and,
when automatically generating the virtual comic,
generates a frame border of random size, arranges the frame shape of the frame border at an arbitrary position on a predetermined background image generated in advance, and sets the background image inside the frame shape as a first image within the frame border,
sets, as a second image within the frame border, an image in which a predetermined character image generated in advance is randomly superimposed on the first image, and
sets, as a third image within the frame border, an image in which a predetermined text image generated in advance is randomly superimposed on the second image at a position that does not overlap the character image.

An information processing program that causes a computer to execute:
a first acquisition step of acquiring a partial image, which is a part of an original image of a comic and contains frame border information and/or text information and/or character information, by inputting data of the original image into a pre-trained model constructed by training with predetermined image data; and
a second acquisition step of acquiring an editing processing command, which is a command for predetermined editing processing to be executed on the partial image,
wherein the program causes the computer to
automatically generate a virtual comic in which images of frame borders, text, and characters are randomly arranged and which serves as training data for training the pre-trained model, and train the pre-trained model using image data contained in the virtual comic, and,
when automatically generating the virtual comic,
generate a frame border of random size, arrange the frame shape of the frame border at an arbitrary position on a predetermined background image generated in advance, and set the background image inside the frame shape as a first image within the frame border,
set, as a second image within the frame border, an image in which a predetermined character image generated in advance is randomly superimposed on the first image, and
set, as a third image within the frame border, an image in which a predetermined text image generated in advance is randomly superimposed on the second image at a position that does not overlap the character image.