JP2020129357A

JP2020129357A - Moving image editing server and program

Info

Publication number: JP2020129357A
Application number: JP2019170076A
Authority: JP
Inventors: 雄康高松; Yuko Takamatsu; 孝弘坪野; Takahiro Tsubono; 尚武石橋; Naotake Ishibashi
Original assignee: Open8 Inc
Current assignee: Open8 Inc
Priority date: 2019-09-19
Filing date: 2019-09-19
Publication date: 2020-08-27

Abstract

To provide a server and a program that enable convenient creation of moving image content. SOLUTION: Provided is a moving image editing server for creating a moving image to be distributed to user terminals, comprising: a recommended article extraction unit for extracting recommended articles including a material moving image on the basis of preference profile information of a user; a template creation unit for creating a template in which multiple scenes provided with scene labels are specified; and a composite moving image creation unit for creating a composite moving image on the basis of the material moving image and the template. The composite moving image creation unit comprises: a moving image splitting unit for splitting the material moving image into multiple split material moving images; a first classifier for adding scene labels to the split material moving images; and a split moving image combining unit for creating a composite moving image by extracting and combining the split material moving images that match the respective scenes in the template on the basis of the scene labels. Also provided is a program therefor. SELECTED DRAWING: Figure 2

Description

The present invention relates to a server and a program that automatically generate a composite moving image to be distributed to user terminals.

Conventionally, moving images have been divided into a plurality of chapters and given metadata.
For example, Patent Document 1 proposes a moving image processing apparatus that efficiently retrieves a scene image at a desired moment from a moving image having a plurality of chapters. The apparatus comprises: a large block division unit that divides the moving image into a plurality of large blocks at predetermined unit time intervals; a complexity analysis unit that quantifies the complexity of image change in each large block; a small block division unit that divides the playback time of each large block into a plurality of small blocks according to the complexity value; and a chapter creation unit that creates chapters by sequentially grouping the small blocks, in time series, into sets of a predetermined number.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2011-130007 (JP 2011-130007 A)

Because creating moving image content takes a great deal of effort, there has been demand for a system that can create moving image content automatically. A further problem is that even content created with great effort will not be played through to the end unless it matches the user's preferences.
There is also a need to turn existing materials such as in-house manuals and product catalogs into videos. However, the barrier to building an in-house video editing environment, including a video server and editors with specialized skills, is high, while outsourcing the work drives up costs.

Accordingly, an object of the present invention is to solve the above problems by providing a server and a program that automatically generate moving image content.

The moving image editing server of the present invention is a server that creates a moving image to be distributed to a user terminal, comprising: a recommended article extraction unit that extracts recommended articles including a material moving image on the basis of the user's preference profile information; a template creation unit that creates a template in which a plurality of scenes with scene labels are defined; and a composite moving image creation unit that creates a composite moving image on the basis of the material moving image and the template. The composite moving image creation unit comprises: a moving image dividing unit that divides the material moving image into a plurality of divided material moving images; a first classifier that attaches scene labels to the divided material moving images; and a divided moving image combining unit that creates the composite moving image by extracting, on the basis of the scene labels, the divided material moving images that match the respective scenes of the template and combining them.

In the above moving image editing server, the composite moving image creation unit may include a second classifier that attaches, to the divided material moving images, object name labels indicating the names of the objects shown in them.
In the above moving image editing server, the divided moving image combining unit may extract the divided material moving images that match the respective scenes of the template on the basis of both the scene labels and the object name labels.
In the above moving image editing server, the recommended article extraction unit may extract the recommended articles with recommendation values based on similarity information attached, and the divided moving image combining unit may give priority to divided material moving images obtained from recommended articles with high recommendation values.
In the above moving image editing server, the composite moving image creation unit may include a telop insertion unit that inserts telops (on-screen captions) into the composite moving image.
In the above moving image editing server, the telop insertion unit may insert into the composite moving image a telop created by summarizing the text contained in the recommended article.
In the above moving image editing server, the template may define an allowable time range for each scene, and the divided moving image combining unit may extract divided material moving images that fall within the allowable time range.
In the above moving image editing server, the moving image dividing unit may have, in addition to a material dividing function for dividing the material moving image into a plurality of divided material moving images, a learning dividing function for creating learning data for the first classifier.

The moving image editing server program of the present invention is a moving image editing program for a server that distributes moving image content to user terminals accessing it via the Internet. The program causes the server to function as: a recommended article extraction unit that extracts recommended articles including a material moving image on the basis of the user's preference profile information; a template creation unit that creates a template in which a plurality of scenes with scene labels are defined; and a composite moving image creation unit that creates a composite moving image on the basis of the material moving image and the template. The composite moving image creation unit comprises: a moving image dividing unit that divides the material moving image into a plurality of divided material moving images; a first classifier that attaches scene labels to the divided material moving images; and a divided moving image combining unit that creates the composite moving image by extracting, on the basis of the scene labels, the divided material moving images that match the respective scenes of the template and combining them.

According to the present invention, it is possible to provide a server and a program that automatically generate moving image content matching the user's preferences. It also becomes possible to easily turn existing materials such as in-house manuals and product catalogs into videos without building an in-house video editing environment.

FIG. 1 is a configuration diagram of the moving image editing system according to the embodiment.
FIG. 2 is a configuration diagram of the moving image editing server according to the embodiment.
FIG. 3 is a diagram illustrating an example of a template.
FIG. 4 is a configuration diagram of the composite moving image creation unit.
FIG. 5 is a diagram illustrating the extraction of divided material moving images that match the conditions of each scene.
FIG. 6 is a flow of the moving image editing process.
FIG. 7 is an explanatory diagram of the moving image dividing process.
FIG. 8 is an explanatory diagram of the divided moving image combining process.
FIG. 9 is a flow of the summary creation process.

<Configuration>
As shown in FIG. 1, the moving image editing system according to the embodiment of the present invention comprises a moving image editing server 1, an administrator terminal 2, and a plurality of user terminals 3. Although FIG. 1 shows the moving image editing server 1 as a single machine, it can also be implemented with multiple servers. From the viewpoint of work efficiency, it is preferable to provide a separate high-spec machine equipped with a GPU for tasks such as the classifier creation described later.

The moving image editing server 1 is built by installing moving image editing software and database software on a server device that has a processing unit with a CPU, a storage unit with a storage device such as an HDD, and a communication unit with a LAN port. The moving image editing software comprises a browsing history storage unit 11, a recommended article extraction unit 12, a template creation unit 13, a classifier creation unit 14, and a composite moving image creation unit 15. The database software manages a browsing record DB 21, a recommended article DB 22, a template DB 23, a learning data DB 24, a composite moving image DB 25, and a label DB 26.

The browsing history storage unit 11 records, in the browsing record DB 21, the browsing records of Web articles that a user has viewed while logged in. Each Web article viewed by the user is given a unique article ID, which is recorded in association with the user ID and the viewing time. Alternatively, a mechanism that records browsing records for anonymous users may be adopted.

The recommended article extraction unit 12 periodically (for example, weekly) collects recommended articles from the Web on the basis of the user's preference profile. In the embodiment, it acquires the user's browsing history information from the browsing record DB 21, uses content-based filtering to extract articles with moving images that are similar to the articles the user has viewed, and records them in the recommended article DB 22 with a recommendation value based on the similarity information. Alternatively, similarity may be determined by collaborative filtering, or hybrid filtering that combines content-based filtering and collaborative filtering may be used. The recommended article DB 22 always holds the several dozen to a hundred and several dozen recommended articles with the highest recommendation values.
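As a rough illustration of content-based filtering, the sketch below scores candidate articles by cosine similarity between bag-of-words vectors of each candidate and of the articles the user has viewed. The function names (`cosine`, `recommend`), the whitespace tokenization, and the choice of the best similarity as the recommendation value are all assumptions made for illustration; the text does not specify how similarity is computed.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(viewed: list[str], candidates: dict[str, str], top_n: int = 100) -> list[tuple[str, float]]:
    """Return (article_id, recommendation_value) pairs, highest value first.

    Hypothetical scoring: a candidate's recommendation value is its best
    similarity to any article in the user's browsing history.
    """
    history = [Counter(text.lower().split()) for text in viewed]
    scored = []
    for article_id, text in candidates.items():
        vec = Counter(text.lower().split())
        value = max((cosine(vec, h) for h in history), default=0.0)
        scored.append((article_id, value))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

Keeping only the `top_n` highest-valued articles mirrors the idea of a recommended article store that always holds the best-scoring candidates.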

The template creation unit 13 creates templates that define the conditions imposed on the scenes making up a moving image (label, order, and allowable time range of each scene) and stores them in the template DB 23. A template consists of a plurality of scenes, and each scene defines the target object information to be shown in a divided moving image and the characteristics of its motion. In other words, a template defines the flow of the moving image content, for example "moving along a building corridor" -> "zooming out of a room in the building" -> "exterior". It is important to define the target object information and motion characteristics so that the scenes do not feel disjointed when joined together. Template creation by the template creation unit 13 is performed periodically (for example, several times a year) or as needed. FIG. 3 shows an example of a template consisting of four scenes: scenes 1 and 2 carry the scene label "cooking", scene 3 carries the scene label "after cooking", and scene 4 carries the scene label "interior/exterior". Each scene of a template can also be given an object name label, described later, but it is preferable to define target object information at a higher conceptual level than the object name label so that many objects can be matched.
In some cases, an allowable range for the number of characters of the telop to be inserted into each scene is also set as a condition.
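A template of this kind can be pictured as an ordered list of scene conditions. The sketch below encodes the four-scene example of FIG. 3. The class and field names are hypothetical, and the time and telop ranges are placeholder values chosen for illustration rather than values taken from the template itself.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SceneCondition:
    scene_label: str                               # e.g. "cooking"
    duration_sec: Tuple[float, float]              # allowable time range (min, max)
    telop_chars: Optional[Tuple[int, int]] = None  # optional telop length range

    def accepts(self, label: str, duration: float) -> bool:
        """True if a segment with this label and duration fits the scene."""
        lo, hi = self.duration_sec
        return label == self.scene_label and lo <= duration <= hi


# Position in the list is the playback order defined by the template.
COOKING_TEMPLATE = [
    SceneCondition("cooking", (2.0, 4.0)),
    SceneCondition("cooking", (2.0, 4.0)),
    SceneCondition("after cooking", (2.0, 4.0), telop_chars=(5, 20)),
    SceneCondition("interior/exterior", (2.0, 4.0)),
]
```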

The classifier creation unit 14 acquires learning data from the learning data DB 24 and performs machine learning on it to create the first classifier 153 and the second classifier 154, which are trained models. The first classifier is created using a dataset of moving images and learns the dependencies among the time-series features of each scene. The second classifier learns the features of objects from a dataset of images. The classifier creation unit 14 creates these trained classifiers, for example, a few times a year. The learning data may be data collected from the Internet or in-house data to which labels have been attached, or a labeled dataset may be procured and used.

As shown in FIG. 4, the composite moving image creation unit 15 comprises a recommended article reading unit 151, a moving image dividing unit 152, a first classifier 153, a second classifier 154, a divided moving image combining unit 155, a music insertion unit 156, and a telop insertion unit 157.
The recommended article reading unit 151 acquires, from the recommended article DB 22, recommended articles that include the material moving images used as material for creating a composite moving image.

The moving image dividing unit 152 breaks the material moving image into frames, determines the dividing positions by treating points where the color changes greatly from the immediately preceding frame as segment boundaries, and creates the divided material moving images (see FIG. 7). More specifically, for example, it quantifies the colors of all pixels in a frame image, calculates the average of those colors, compares it with the average color of the preceding frame, and divides the video at points where the average color has changed greatly. This dividing method is useful for gathering visually continuous scenes into a single divided moving image, and it produces divided material moving images that are relatively long.
The moving image dividing unit 152 is also used to create the finely divided moving images used as learning data. More specifically, for example, it quantifies the colors of all pixels in a frame image, determines from the pixel brightness how the degree of blackness changes, and divides the video where the frame becomes dark to a certain degree, thereby creating divided moving images for learning data.
In this way, the moving image dividing unit 152 of the present embodiment uses different dividing techniques according to the purpose.
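The average-color comparison used for material videos can be sketched as follows. Frames are represented as flat lists of RGB tuples to keep the example free of external libraries. The distance measure (sum of absolute per-channel differences) and the threshold of 40 are assumptions, since the text only says the average color must change "greatly".

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0-255
Frame = Sequence[Pixel]        # one frame flattened to a pixel list


def mean_color(frame: Frame) -> Tuple[float, ...]:
    """Average R, G, B over all pixels of one (non-empty) frame."""
    n = len(frame)
    return tuple(sum(p[c] for p in frame) / n for c in range(3))


def split_points(frames: Sequence[Frame], threshold: float = 40.0) -> List[int]:
    """Indices i at which frame i starts a new segment.

    A cut is placed when the mean color differs from the previous frame's
    by more than `threshold` (sum of absolute per-channel differences).
    """
    cuts = []
    prev = mean_color(frames[0]) if frames else None
    for i in range(1, len(frames)):
        cur = mean_color(frames[i])
        if sum(abs(a - b) for a, b in zip(cur, prev)) > threshold:
            cuts.append(i)
        prev = cur
    return cuts


def split_segments(frames: Sequence[Frame], threshold: float = 40.0) -> List[Tuple[int, int]]:
    """Segments as half-open (start, end) frame index ranges."""
    bounds = [0] + split_points(frames, threshold) + [len(frames)]
    return [(s, e) for s, e in zip(bounds, bounds[1:]) if e > s]
```

A brightness-based variant for learning data could reuse the same loop and cut when the mean brightness falls below a chosen level instead of comparing consecutive frames.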

The first classifier 153 is a classifier based on a recurrent neural network. When a moving image is input, it outputs a label as a classification result that also takes motion into account, and the label is stored in the label DB 26. More specifically, the first classifier 153 is used to attach to each divided material moving image, as a scene label, a word (annotation word) describing the situation of the scene, such as "cooking" or "interior of the restaurant".
The second classifier 154 is a classifier based on a convolutional neural network. When a moving image or an image is input, it outputs object name labels for the objects shown in it, and these are stored in the label DB 26. More specifically, the second classifier 154 is used to attach to each divided material moving image, as an object name label, a word (annotation word) naming an object, such as seafood, grilled meat, person, or furniture. The divided moving images to extract are selected on the basis of the metadata obtained by the second classifier 154. For example, if a divided moving image extracted from the first video is determined to contain seafood, divided moving images showing seafood are preferentially extracted from the second and third videos as well.

The divided moving image combining unit 155 selects the template with the highest affinity for the material moving image, extracts the divided material moving images that match the conditions of each scene of the selected template, and combines the extracted divided material moving images. FIG. 5 illustrates the extraction of divided material moving images that match the conditions of each scene: of the divided material moving images obtained by dividing the material into ten, the sixth through ninth from the start are extracted as matching the conditions.
The divided moving image combining unit 155 extracts divided material moving images from the extracted recommended articles in descending order of recommendation value. If the first recommended article does not yield divided material moving images matching every scene of the template, the unit takes the recommended article with the next highest recommendation value and matches its divided material moving images against the conditions of the scenes left unfilled. In this way it extracts divided material moving images that satisfy all the scenes the template requires. In doing so, the metadata obtained from the second classifier 154 is used so that, among the divided moving images extracted from the videos of the articles, those showing similar objects are selected for each scene.
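The gap-filling behavior described above can be sketched as a loop over articles in descending order of recommendation value. The `Segment` and `Scene` types and the `fill_template` function are hypothetical names, and the sketch matches on scene label and duration only; the preference for similar objects based on the second classifier's labels is left out for brevity.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    article_id: str
    scene_label: str
    duration: float   # seconds


class Scene(NamedTuple):
    scene_label: str
    min_sec: float
    max_sec: float


def fill_template(
    template: List[Scene],
    articles: List[Tuple[float, List[Segment]]],
) -> Optional[List[Segment]]:
    """Fill every scene, visiting articles from highest recommendation value.

    `articles` holds (recommendation_value, segments) pairs. Returns the chosen
    segments in template order, or None if some scene stays unfilled.
    """
    chosen: Dict[int, Segment] = {}
    for _, segments in sorted(articles, key=lambda a: a[0], reverse=True):
        available = list(segments)
        for idx, scene in enumerate(template):
            if idx in chosen:
                continue  # already filled by a higher-ranked article
            for seg in available:
                if (seg.scene_label == scene.scene_label
                        and scene.min_sec <= seg.duration <= scene.max_sec):
                    chosen[idx] = seg
                    available.remove(seg)  # do not reuse the same segment
                    break
        if len(chosen) == len(template):
            break
    if len(chosen) < len(template):
        return None
    return [chosen[i] for i in range(len(template))]
```

With a four-scene template, a first article that only supplies the "after cooking" and "interior/exterior" scenes leaves scenes 1 and 2 open, and the next article is searched for "cooking" segments of acceptable length.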

The music insertion unit 156 inserts music into the composite moving image (without music) produced by the divided moving image combining unit 155.
The telop insertion unit 157 inserts telops into the composite moving image (without telops) produced by the divided moving image combining unit 155. If an allowable range for the number of telop characters is set as a condition for a scene, a telop whose character count satisfies that condition is inserted.

The administrator terminal 2 and the user terminals 3 are computers that have an input unit, a display unit, a processing unit, a storage unit, and a communication unit, for example computers equipped with a Web browser such as smartphones, tablet terminals (tablet PCs), notebook PCs, and desktop PCs.
The administrator uses the administrator terminal 2 to change the settings of the moving image editing server 1, manage database operations, and so on.
A user can access the moving image editing server 1 from a user terminal 3 and view the automatically generated moving image content.

<Operation>
The flow of the moving image editing process is described with reference to FIG. 6.
A moving image generation agent periodically (for example, once a week) runs the automatic moving image generation process of the composite moving image creation unit 15 (STEP 1). The recommended article reading unit 151 of the composite moving image creation unit 15 reads the recommended articles stored in the recommended article DB 22 in descending order of recommendation value (STEP 2).
The moving image dividing unit 152 of the composite moving image creation unit 15 divides the moving image (material moving image) contained in the recommended article being processed into a plurality of moving images (divided material moving images) (STEP 3). The composite moving image creation unit 15 attaches labels to each divided material moving image using the first classifier 153 and the second classifier 154 (STEP 4). The dividing positions of the material moving image may be stored in the recommended article DB 22 and/or the learning data DB 24. The composite moving image creation unit 15 extracts the divided material moving images that match the labels and allowable times of the selected template (STEP 5). For example, in FIG. 8, the third scene condition of the template is "after cooking", "2 to 4 seconds" and the fourth scene condition is "interior/exterior", "2 to 4 seconds", and two divided material moving images satisfying these conditions have been extracted.

If divided material moving images have not been obtained for all scenes, STEPs 2 to 5 are repeated for the next recommended article (STEP 6, STEP 7). For example, in FIG. 8, no divided material moving images satisfying the first and second scene conditions have been extracted, so the search for divided material moving images satisfying those scene conditions must continue with the next recommended article.

Once divided material moving images have been obtained for all scenes, the divided moving image combining unit 155 of the composite moving image creation unit 15 combines the extracted divided material moving images to create a composite moving image (STEP 8). The music insertion unit 156 of the composite moving image creation unit 15 inserts BGM prepared in advance into the composite moving image (STEP 9). The telop insertion unit 157 of the composite moving image creation unit 15 inserts telops into the composite moving image (STEP 10). The moving image with music and telops inserted is stored in the composite moving image DB 25 as a finished moving image, and users can view it within the permissions of their own user IDs (STEP 11).

With the moving image editing system of the embodiment described above, moving image content matching a user's preferences can be generated automatically on the basis of the user's preference profile and delivered periodically as recommended videos. It also becomes possible to create video advertisements and video press releases, and to turn manuals and product catalogs into videos, without having to acquire video editing software, a video server, editors with specialized skills, and so on.

<Modification>
In a preferred aspect, the moving image editing server 1 has a summary creation function for producing text to insert as telops. The summary creation flow is described with reference to FIG. 9.
STEP 91: Paragraph and sentence division
The telop insertion unit 157 divides the input document into paragraphs and divides the text within each paragraph into sentences. A sentence that would be too long to read comfortably if displayed as a telop in a single scene is further divided into multiple sentences at points that satisfy conditions such as a specific part of speech or notation.
STEP 92: Morphological analysis of the document
The telop insertion unit 157 runs each sentence through morphological analysis and extracts tokens, the smallest units for syntactic analysis.
STEP 93: Deletion of unnecessary words and paragraphs
The telop insertion unit 157 deletes sentences and paragraphs defined as invalid by predefined rules for identifying invalid sentences. For example, it deletes lines beginning with specific symbols such as "■" or "▼", lines enclosed in specific symbols, URLs, e-mail addresses, postal addresses, telephone numbers, and the like.
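Rule-based cleanup of this kind is commonly written with regular expressions. The sketch below is one possible reading of STEP 93: the symbol set, the bracket pairs, and the phone number pattern are hypothetical rules, and whether URLs and similar items are removed as whole lines or only as substrings is not stated in the text (substring removal is assumed here).

```python
import re

# Hypothetical rules modeled on the examples given in STEP 93.
_SYMBOL_LINE = re.compile(r"^\s*[■▼●◆□▽]")            # line starts with a marker symbol
_ENCLOSED_LINE = re.compile(r"^\s*[【＜<\[].*[】＞>\]]\s*$")  # whole line wrapped in brackets
_URL = re.compile(r"https?://\S+")
_EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
_PHONE = re.compile(r"\b0\d{1,4}-\d{1,4}-\d{3,4}\b")   # Japanese-style numbers only


def remove_invalid_lines(text: str) -> str:
    """Drop marker and bracketed lines, then strip URLs, e-mails and phone numbers."""
    kept = []
    for line in text.splitlines():
        if _SYMBOL_LINE.match(line) or _ENCLOSED_LINE.match(line):
            continue
        for pattern in (_URL, _EMAIL, _PHONE):
            line = pattern.sub("", line)
        if line.strip():
            kept.append(line.strip())
    return "\n".join(kept)
```

Postal addresses are not handled, since recognizing them reliably needs more than a simple pattern.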

STEP 94: Deletion of stop words and the like
The telop insertion unit 157 removes from the tokens words that carry little meaning (stop words), such as "に" (ni), "から" (kara), "これ" (kore), and "さん" (san), as well as specific parts of speech such as particles.
STEP 95: Creation of token bigrams
Multiple tokens that satisfy a specific condition (for example, a predefined part-of-speech condition) are joined to obtain a token bigram. For example, "2014年" ("2014"; noun, proper noun, general) and "6月" ("June"; noun, proper noun, general) are joined into "2014年6月" ("June 2014"), and "ヴェルディ" ("Verdi"; proper noun) and "協賛" ("sponsorship"; common noun) are joined into "ヴェルディ協賛" ("Verdi sponsorship").
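STEPs 94 and 95 can be sketched on tokens that have already been through morphological analysis. Tokens are modeled as (surface, part-of-speech) pairs. The tag names and the join rule (both neighbors are nouns or proper nouns) are assumptions for illustration; in practice the tags would come from whichever morphological analyzer is used.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]  # (surface form, part-of-speech tag)

STOP_WORDS: Set[str] = {"に", "から", "これ", "さん"}  # examples from STEP 94
DROPPED_POS: Set[str] = {"particle"}                  # hypothetical tag name
JOINABLE_POS: Set[str] = {"noun", "proper_noun"}      # hypothetical join rule


def drop_stop_words(tokens: List[Token]) -> List[Token]:
    """STEP 94: remove stop words and tokens with unwanted parts of speech."""
    return [t for t in tokens if t[0] not in STOP_WORDS and t[1] not in DROPPED_POS]


def token_bigrams(tokens: List[Token]) -> List[str]:
    """STEP 95: join each adjacent pair whose tags both satisfy the join rule."""
    return [
        a[0] + b[0]
        for a, b in zip(tokens, tokens[1:])
        if a[1] in JOINABLE_POS and b[1] in JOINABLE_POS
    ]
```

Because stop words are removed first, tokens that were separated by a particle become adjacent and may be joined. A stricter part-of-speech condition would be needed to avoid unwanted pairs.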

STEP 96: Extraction of important sentences
Based on the tokens and token bigrams, those that serve as feature words are extracted using the TF-IDF score, an index that evaluates the importance of a word. The sentences are then segmented using the word similarity determination described above, and an important sentence is extracted from each segment to form the summary.
STEP 97: Fitting to the template
The summary (the important sentences) is parsed and separated into clauses and a syntax tree. The template described above defines the number of characters required for each scene. Based on the modification relations between clauses, each sentence is cut so that a span that reads naturally as a sentence fits within the corresponding scene of the template, and the result is applied to the template.
The summary creation function described above can support not only Japanese but also multiple other languages, including English.
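The extraction and fitting steps can be sketched as follows, under stated assumptions. Each sentence is treated as one document when computing inverse document frequency, and a sentence is scored by the sum of the TF-IDF weights of its terms. The fitting step is simplified to keeping leading clauses while the character limit is respected. All function names are hypothetical, and the clause-dependency analysis used in the embodiment to find a natural cut point is not reproduced here.

```python
import math
from collections import Counter
from typing import Dict, List

Sentence = List[str]  # a sentence as a list of tokens and token bigrams


def idf_table(sentences: List[Sentence]) -> Dict[str, float]:
    """Inverse document frequency, treating each sentence as one document."""
    n = len(sentences)
    df = Counter(term for s in sentences for term in set(s))
    return {term: math.log(n / count) for term, count in df.items()}


def sentence_score(sentence: Sentence, idf: Dict[str, float]) -> float:
    """Sum of the TF-IDF weights of the terms in one sentence."""
    if not sentence:
        return 0.0
    tf = Counter(sentence)
    return sum((count / len(sentence)) * idf[term] for term, count in tf.items())


def important_sentences(segments: List[List[Sentence]]) -> List[Sentence]:
    """STEP 96 (simplified): pick the highest-scoring sentence per segment."""
    idf = idf_table([s for seg in segments for s in seg])
    return [max(seg, key=lambda s: sentence_score(s, idf)) for seg in segments]


def fit_to_scene(clauses: List[str], max_chars: int) -> str:
    """STEP 97 (simplified): keep leading clauses that fit the scene's limit."""
    text = ""
    for clause in clauses:
        if len(text) + len(clause) > max_chars:
            break
        text += clause
    return text


# Two segments of already tokenized sentences (illustrative data only).
segments = [
    [["server", "creates", "video"], ["server", "server", "server"]],
    [["template", "defines", "scene"], ["server", "scene"]],
]
summary = important_sentences(segments)

# Clauses of one summary sentence and a hypothetical 30-character scene limit.
telop = fit_to_scene(
    ["The server ", "creates a video ", "from recommended articles."], 30
)
print(summary, repr(telop))
```

Cutting only at clause boundaries keeps each telop readable. Choosing which boundary gives the most natural sentence would additionally require the modification relations obtained from the syntax tree.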

Although preferred embodiments of the present invention have been described above, the technical scope of the present invention is not limited to the description of those embodiments. Various modifications and improvements can be made to the embodiments described above, and forms incorporating such modifications or improvements are also included in the technical scope of the present invention.

1 Moving image editing server
2 Administrator terminal
3 User terminal
11 Browsing history storage unit
12 Recommended article extraction unit
13 Template creation unit
14 Classifier creation unit
15 Composite moving image creation unit
21 Browsing record DB
22 Recommended article DB
23 Template DB
24 Learning data DB
25 Composite moving image DB
26 Label DB
151 Recommended article reading unit
152 Moving image dividing unit
153 First classifier
154 Second classifier
155 Divided moving image combining unit
156 Music insertion unit
157 Telop insertion unit

Claims

1. A moving image editing server that creates a moving image to be distributed to a user terminal, comprising:
a recommended article extraction unit that extracts recommended articles including a material moving image based on preference profile information of a user;
a template creation unit that creates a template in which a plurality of scenes with scene labels are defined; and
a composite moving image creation unit that creates a composite moving image based on the material moving image and the template,
wherein the composite moving image creation unit comprises:
a moving image dividing unit that divides the material moving image into a plurality of divided material moving images;
a first classifier that attaches a scene label to each divided material moving image; and
a divided moving image combining unit that extracts, based on the scene labels, the divided material moving images that match the respective scenes of the template and combines them to create the composite moving image.

2. The moving image editing server according to claim 1, wherein the composite moving image creation unit includes a second classifier that attaches, to each divided material moving image, an object name label indicating the name of an object shown in that divided material moving image.

3. The moving image editing server according to claim 2, wherein the divided moving image combining unit extracts the divided material moving images that match the respective scenes of the template based on the scene labels and the object name labels.

4. The moving image editing server according to claim 1, wherein the recommended article extraction unit extracts the recommended articles together with a recommendation value based on similarity information, and the divided moving image combining unit preferentially adopts divided material moving images obtained from a recommended article having a high recommendation value.

5. The moving image editing server according to claim 1, wherein the composite moving image creation unit includes a telop insertion unit that inserts a telop into the composite moving image.

6. The moving image editing server according to claim 5, wherein the telop insertion unit inserts, into the composite moving image, a telop created by summarizing the text information contained in the recommended article.

7. The moving image editing server according to claim 1, wherein the template defines an allowable time range for each scene, and the divided moving image combining unit extracts divided material moving images that fit within the allowable time range.

8. The moving image editing server according to claim 1, wherein the moving image dividing unit has, in addition to a material moving image dividing function for dividing the material moving image into the plurality of divided material moving images, a learning moving image dividing function for creating learning data for the first classifier.

9. A moving image editing program for a server that distributes moving image content to a user terminal accessed via the Internet, the program causing the server to function as:
a recommended article extraction unit that extracts recommended articles including a material moving image based on preference profile information of a user;
a template creation unit that creates a template in which a plurality of scenes with scene labels are defined; and
a composite moving image creation unit that creates a composite moving image based on the material moving image and the template,
wherein the composite moving image creation unit comprises:
a moving image dividing unit that divides the material moving image into a plurality of divided material moving images;
a first classifier that attaches a scene label to each divided material moving image; and
a divided moving image combining unit that extracts, based on the scene labels, the divided material moving images that match the respective scenes of the template and combines them to create the composite moving image.