JP6603929B1 - Movie editing server and program

Info

Publication number: JP6603929B1
Application number: JP2019020726A
Authority: JP
Inventors: 雄康高松; 孝弘坪野; 尚武石橋
Original assignee: Open8 Inc
Current assignee: Open8 Inc
Priority date: 2019-02-07
Filing date: 2019-02-07
Publication date: 2019-11-13
Anticipated expiration: 2039-02-07
Also published as: JP2020129189A

Abstract

[Problem] To provide a server and a program that make it possible to create video content easily. [Solution] A server that creates a video for distribution to a user terminal comprises: a recommended article extraction unit that extracts recommended articles containing material videos based on the user's preference profile information; a template creation unit that creates a template defining a plurality of scenes, each assigned a scene label; and a composite video creation unit that creates a composite video based on the material videos and the template. The composite video creation unit comprises: a video dividing unit that divides a material video into a plurality of divided material videos; a first classifier that assigns scene labels to the divided material videos; and a divided video combining unit that, based on the scene labels, extracts the divided material videos matching each scene of the template and combines them to create the composite video. A program for the video editing server is also provided. [Selected drawing] FIG. 2

Description

The present invention relates to a server and a program that automatically generate a composite video to be distributed to user terminals.

Conventionally, videos have been divided into a plurality of chapters and annotated with metadata.
For example, Patent Document 1 proposes a video processing apparatus that efficiently searches a video having a plurality of chapters for a scene image at a desired moment. The apparatus comprises: a large-block dividing unit that divides the video into a plurality of large blocks at predetermined unit-time intervals; a complexity analysis unit that quantifies the complexity of image change in each large block; a small-block dividing unit that divides the playback time of each large block into a plurality of small blocks according to the complexity value; and a chapter creation unit that creates chapters by grouping the small blocks in chronological order, a predetermined number at a time.

Patent Document 1: JP 2011-130007 A

Because creating video content takes a great deal of effort, there has been demand for a system that can create video content automatically. There is also the problem that even video content created with great effort will not be played to the end unless it matches the user's preferences.
There is also a need to turn existing materials such as in-house manuals and product catalogs into videos. However, the barrier to building an in-house video editing environment, including a video server and editors with specialized skills, is high, while outsourcing the work raises costs.

To solve the above problems, an object of the present invention is to provide a server and a program that automatically generate video content.

The video editing server of the present invention is a server that creates a video for distribution to a user terminal, comprising: a recommended article extraction unit that extracts recommended articles containing material videos based on the user's preference profile information; a template creation unit that creates a template defining a plurality of scenes, each assigned a scene label; and a composite video creation unit that creates a composite video based on the material videos and the template. The composite video creation unit comprises: a video dividing unit that divides a material video into a plurality of divided material videos; a first classifier that assigns scene labels to the divided material videos; and a divided video combining unit that, based on the scene labels, extracts the divided material videos matching each scene of the template and combines them to create the composite video.

In the above video editing server, the composite video creation unit may include a second classifier that assigns object name labels representing the names of objects shown in the divided material videos.
In the above video editing server, each scene of the template may be assigned target-object information defined as a superordinate concept of the object name labels, and the divided video combining unit may extract the divided material videos matching each scene of the template based on the scene labels and the object name labels.
In the above video editing server, the recommended article extraction unit may extract each recommended article together with a recommendation value based on its similarity to articles the user has viewed, and the divided video combining unit may preferentially adopt divided material videos obtained from recommended articles with high recommendation values.
In the above video editing server, the composite video creation unit may include a telop insertion unit that inserts telops (on-screen captions) into the composite video.
In the above video editing server, the telop insertion unit may insert into the composite video a telop created by summarizing the text in the recommended article.
In the above video editing server, the template may define an allowable time range for each scene, and the divided video combining unit may extract divided material videos that fall within the allowable time range.
In the above video editing server, the video dividing unit may have, in addition to a material video dividing function for dividing the material video into a plurality of divided material videos, a training video dividing function for creating training data for the first classifier.

The video editing server program of the present invention is a video editing program for a server that distributes video content to user terminals accessing it via the Internet. The program causes the server to function as: a recommended article extraction unit that extracts recommended articles containing material videos based on the user's preference profile information; a template creation unit that creates a template defining a plurality of scenes, each assigned a scene label; and a composite video creation unit that creates a composite video based on the material videos and the template. The composite video creation unit comprises: a video dividing unit that divides a material video into a plurality of divided material videos; a first classifier that assigns scene labels to the divided material videos; and a divided video combining unit that, based on the scene labels, extracts the divided material videos matching each scene of the template and combines them to create the composite video.

According to the present invention, it is possible to provide a server and a program that automatically generate video content matching the user's preferences. It also becomes possible to turn existing materials such as in-house manuals and product catalogs into videos easily, without building an in-house video editing environment.

FIG. 1 is a configuration diagram of the video editing system according to the embodiment.
FIG. 2 is a configuration diagram of the video editing server according to the embodiment.
FIG. 3 is a diagram illustrating an example of a template.
FIG. 4 is a configuration diagram of the composite video creation unit.
FIG. 5 is a diagram illustrating the extraction of divided material videos that match the conditions of each scene.
FIG. 6 is a flowchart of the video editing process.
FIG. 7 is an explanatory diagram of the video division process.
FIG. 8 is an explanatory diagram of the divided video combining process.
FIG. 9 is a flowchart of the summary creation process.

<Configuration>
As shown in FIG. 1, the video editing system according to the embodiment of the present invention comprises a video editing server 1, an administrator terminal 2, and a plurality of user terminals 3. Although FIG. 1 shows the video editing server 1 as a single machine, it can also be implemented with multiple servers. From the standpoint of work efficiency, it is preferable to provide a separate high-spec machine equipped with a GPU for tasks such as the classifier creation described later.

The video editing server 1 is built by installing video editing software and database software on a server machine that has a processing unit with a CPU, a storage unit with a storage device such as an HDD, and a communication unit with a LAN port. The video editing software comprises a browsing history storage unit 11, a recommended article extraction unit 12, a template creation unit 13, a classifier creation unit 14, and a composite video creation unit 15. The database software manages a browsing record DB 21, a recommended article DB 22, a template DB 23, a training data DB 24, a composite video DB 25, and a label DB 26.

The browsing history storage unit 11 records in the browsing record DB 21 the Web articles that a user viewed while logged in. Each Web article viewed by a user is assigned a unique article ID and is recorded in association with the user ID and the viewing time. Alternatively, a mechanism that records browsing history for anonymous users may be adopted.

The recommended article extraction unit 12 periodically (for example, weekly) collects recommended articles from the Web based on the user's preference profile. In the embodiment, it obtains the user's browsing history from the browsing record DB 21, extracts articles with videos that are similar to the articles the user has viewed by content-based filtering, and records them in the recommended article DB 22 together with a recommendation value based on the similarity. Alternatively, similarity may be determined by collaborative filtering, or by hybrid filtering that combines content-based and collaborative filtering. The recommended article DB 22 always holds the recommended articles with the highest recommendation values, from several dozen to well over one hundred.
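The recommendation value described above can be illustrated with a minimal sketch of content-based filtering. The patent does not specify the similarity measure, so the bag-of-words cosine similarity used here is an assumption, as are the function names `cosine_similarity` and `extract_recommended` and the `(article_id, text, has_video)` candidate format.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def extract_recommended(viewed_texts, candidates, top_n=100):
    """Rank candidate articles by similarity to the user's viewed articles.

    viewed_texts: texts of articles the user has viewed (the preference profile).
    candidates:   iterable of (article_id, text, has_video) tuples (assumed format).
    Returns (article_id, recommendation_value) pairs, highest value first.
    """
    # Build the profile as the summed term counts of all viewed articles.
    profile = Counter()
    for text in viewed_texts:
        profile.update(text.lower().split())

    scored = []
    for article_id, text, has_video in candidates:
        if not has_video:
            continue  # only articles that contain a material video qualify
        value = cosine_similarity(profile, Counter(text.lower().split()))
        scored.append((article_id, value))

    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


viewed = ["grilled seafood restaurant"]
candidates = [
    ("a1", "seafood restaurant opening", True),
    ("a2", "car review", True),
    ("a3", "grilled seafood restaurant", False),  # no video, so it is skipped
]
ranking = extract_recommended(viewed, candidates)
```

Whitespace tokenization is used only to keep the sketch self-contained; Japanese text would need a morphological analyzer to produce the tokens.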

The template creation unit 13 creates templates that define the conditions attached to the scenes making up a video (label, order, and allowable time range of each scene) and stores them in the template DB 23. A template consists of a plurality of scenes, and each scene defines the target-object information to be shown in the divided video and the characteristics of its movement. In other words, a template defines the flow of the video content, for example "moving along a building corridor" -> "zooming out in a room of the building" -> "exterior". It is important to define the target-object information and movement characteristics so that the scenes do not feel unnatural when joined. Template creation by the template creation unit 13 is executed periodically (for example, several times a year) or when needed. FIG. 3 shows an example of a template with four scenes: scenes 1 and 2 carry the scene label "cooking", scene 3 carries "after cooking", and scene 4 carries "interior/exterior". The object name labels described later can also be attached to each scene of a template, but it is preferable to define target-object information that is a superordinate concept of the object name labels so that many objects can match. An allowable character-count range for the telop inserted into each scene may also be set as a condition.
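A template of this kind could be represented as a small data structure, sketched below. The class and field names, the object-to-category mapping, the "food" target for the cooking scenes, and the time ranges of scenes 1 to 3 are assumptions for illustration; only the scene labels and the "2 to 4 seconds" range of scenes 3 and 4 appear in the description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SceneCondition:
    scene_label: str                      # e.g. "cooking"
    min_seconds: float                    # allowable time range, lower bound
    max_seconds: float                    # allowable time range, upper bound
    target_object: Optional[str] = None   # superordinate concept of object names
    telop_chars: Optional[Tuple[int, int]] = None  # allowable telop length


@dataclass(frozen=True)
class Template:
    name: str
    scenes: Tuple[SceneCondition, ...]    # order of the tuple is the scene order


# Hypothetical mapping from object name labels to their superordinate concept.
OBJECT_TO_CATEGORY = {"seafood": "food", "yakiniku": "food", "furniture": "interior"}

# Modeled on the four-scene example of FIG. 3 (values partly assumed).
FIG3_TEMPLATE = Template(
    name="restaurant",
    scenes=(
        SceneCondition("cooking", 2.0, 4.0, target_object="food"),
        SceneCondition("cooking", 2.0, 4.0, target_object="food"),
        SceneCondition("after cooking", 2.0, 4.0, target_object="food"),
        SceneCondition("interior/exterior", 2.0, 4.0),
    ),
)


def scene_accepts(cond: SceneCondition, scene_label: str,
                  object_name: str, seconds: float) -> bool:
    """Check one divided material video against one scene condition."""
    if cond.scene_label != scene_label:
        return False
    if not (cond.min_seconds <= seconds <= cond.max_seconds):
        return False
    if cond.target_object is None:
        return True  # the scene places no restriction on the object
    return OBJECT_TO_CATEGORY.get(object_name) == cond.target_object
```

Matching on the superordinate concept rather than the object name itself means that both "seafood" and "yakiniku" clips can fill a "food" scene, which is the flexibility the description aims for.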

The classifier creation unit 14 obtains training data from the training data DB 24 and performs machine learning to create the first classifier 153 and the second classifier 154, which are trained models. The first classifier is created from a video dataset and learns the time-series dependencies among the features of each scene. The second classifier learns object features from an image dataset. The classifier creation unit 14 creates these trained models, for example, several times a year. The training data may be data collected from the Internet or in-house data to which labels have been added, or a labeled dataset may be procured and used.

As shown in FIG. 4, the composite video creation unit 15 comprises a recommended article reading unit 151, a video dividing unit 152, a first classifier 153, a second classifier 154, a divided video combining unit 155, a music insertion unit 156, and a telop insertion unit 157.
The recommended article reading unit 151 obtains from the recommended article DB 22 the recommended articles containing the material videos used to create the composite video.

The video dividing unit 152 splits the material video into frames, determines the division positions by treating points where the color changes greatly from the immediately preceding frame as segment boundaries, and creates the divided material videos (see FIG. 7). More specifically, for example, it quantifies the colors of all pixels in a frame image, calculates the average color of those pixels, and divides the video at points where this average differs greatly from the average color of the previous frame. This method is useful for keeping visually continuous footage together as one divided video, and it produces relatively long divided material videos.
The video dividing unit 152 is also used to create the finely divided videos used as training data. More specifically, for example, it quantifies the colors of all pixels in a frame image, determines from the pixel brightness how close the frame is to black, and divides the video wherever the frame becomes dark beyond a certain degree, thereby creating divided videos for training data.
In this way, the video dividing unit 152 of the present embodiment uses different video division techniques depending on the purpose.
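The two division methods can be sketched as follows. This is a simplified illustration, not the patented implementation: frames are modeled as plain lists of `(r, g, b)` pixel tuples, the Euclidean distance between average colors and the threshold values are assumptions, and dropping the dark frames themselves is a design choice made only for this sketch.

```python
def mean_color(frame):
    """Average (r, g, b) over all pixels of one frame."""
    n = len(frame)
    return tuple(sum(pixel[c] for pixel in frame) / n for c in range(3))


def split_on_color_change(frames, threshold=60.0):
    """Material-video division: cut where the average color jumps.

    Returns (start, end) frame-index ranges with end exclusive.
    """
    if not frames:
        return []
    boundaries = [0]
    previous = mean_color(frames[0])
    for i in range(1, len(frames)):
        current = mean_color(frames[i])
        distance = sum((a - b) ** 2 for a, b in zip(current, previous)) ** 0.5
        if distance > threshold:
            boundaries.append(i)  # a new segment starts at frame i
        previous = current
    boundaries.append(len(frames))
    return list(zip(boundaries[:-1], boundaries[1:]))


def split_on_dark_frames(frames, dark_level=16.0):
    """Training-data division: cut wherever a frame is close to black.

    Dark frames act as separators and are not included in any segment.
    """
    segments = []
    start = None
    for i, frame in enumerate(frames):
        r, g, b = mean_color(frame)
        is_dark = (r + g + b) / 3 < dark_level
        if is_dark:
            if start is not None:
                segments.append((start, i))
                start = None
        elif start is None:
            start = i
    if start is not None:
        segments.append((start, len(frames)))
    return segments


# Tiny synthetic frames of four pixels each.
red = [(200, 0, 0)] * 4
blue = [(0, 0, 200)] * 4
black = [(0, 0, 0)] * 4

material_segments = split_on_color_change([red, red, blue, blue, blue])
training_segments = split_on_dark_frames([red, red, black, blue, black])
```

With real footage the frames would come from a video decoder, and both thresholds would need tuning against the material at hand.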

The first classifier 153 is a classifier based on a recurrent neural network. When a video is input, it outputs a label as a classification result that also takes motion into account, and stores the label in the label DB 26. More specifically, the first classifier 153 is used to assign to each divided material video a scene label, that is, a word (annotation word) describing the situation of the scene, such as "cooking" or "restaurant interior".
The second classifier 154 is a classifier based on a convolutional neural network. When a video or image is input, it outputs object name labels for the objects shown in the video or image and stores them in the label DB 26. More specifically, the second classifier 154 is used to assign to each divided material video an object name label, that is, a word (annotation word) naming an object, such as seafood, yakiniku, person, or furniture. The divided videos to be extracted are selected based on the metadata obtained by the second classifier 154. For example, if the divided video extracted from the first video is judged to contain seafood, divided videos showing seafood are preferentially extracted from the second and third videos as well.

The divided video combining unit 155 selects the template with the highest affinity for the material video, extracts the divided material videos that match the conditions of each scene of the selected template, and combines the extracted divided material videos. FIG. 5 illustrates this extraction: of the ten divided material videos, the sixth to ninth from the start are extracted as matching the conditions.
For the extracted recommended articles, the divided video combining unit 155 extracts divided material videos in descending order of recommendation value. If the first recommended article does not yield divided material videos for every scene of the template, the unit matches divided material videos from the article with the next highest recommendation value against the conditions of the scenes that remain unfilled, and in this way extracts the divided material videos needed to fill all scenes required by the template. In doing so, it uses the metadata obtained from the second classifier 154 so that, among the divided videos extracted from each article's video, those showing similar objects are selected for each scene.
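The fallback across recommended articles can be sketched as a greedy fill. The `Segment` class, the tuple formats, and the rule of preferring the object label of the first adopted segment are assumptions made for illustration; the description only states that higher recommendation values take priority and that similar objects are favored.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    article_id: str
    scene_label: str    # output of the first classifier
    object_label: str   # output of the second classifier
    seconds: float


def fill_template(scenes, articles):
    """Fill every template scene with a matching divided material video.

    scenes:   list of (scene_label, min_seconds, max_seconds) in scene order.
    articles: list of (recommendation_value, [Segment, ...]).
    Returns the chosen segments in scene order, or None if a scene stays empty.
    """
    chosen = [None] * len(scenes)
    preferred_object = None

    # Process articles from the highest recommendation value downwards.
    for _, segments in sorted(articles, key=lambda a: a[0], reverse=True):
        used = set()
        for i, (label, low, high) in enumerate(scenes):
            if chosen[i] is not None:
                continue  # this scene was already filled by an earlier article
            matches = [
                (j, seg) for j, seg in enumerate(segments)
                if j not in used
                and seg.scene_label == label
                and low <= seg.seconds <= high
            ]
            if not matches:
                continue
            # Stable sort: segments showing the preferred object come first.
            matches.sort(key=lambda m: m[1].object_label != preferred_object)
            j, seg = matches[0]
            chosen[i] = seg
            used.add(j)
            if preferred_object is None:
                preferred_object = seg.object_label
        if all(c is not None for c in chosen):
            return chosen
    return None


scenes = [
    ("cooking", 2.0, 4.0),
    ("cooking", 2.0, 4.0),
    ("after cooking", 2.0, 4.0),
    ("interior/exterior", 2.0, 4.0),
]
article_a = (0.9, [
    Segment("A", "after cooking", "seafood", 3.0),
    Segment("A", "interior/exterior", "furniture", 2.5),
    Segment("A", "cooking", "seafood", 9.0),   # too long for any scene
])
article_b = (0.7, [
    Segment("B", "cooking", "yakiniku", 3.0),
    Segment("B", "cooking", "seafood", 3.5),
    Segment("B", "cooking", "seafood", 2.0),
])
result = fill_template(scenes, [article_b, article_a])
```

In this example the higher-ranked article A fills only scenes 3 and 4, so article B supplies scenes 1 and 2, and its seafood clips are chosen ahead of the yakiniku clip because seafood was adopted first.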

The music insertion unit 156 inserts music into the composite video (without music) produced by the divided video combining unit 155.
The telop insertion unit 157 inserts telops into the composite video (without telops) produced by the divided video combining unit 155. If an allowable character-count range for the telop of each scene is set as a condition, it inserts a telop whose length satisfies that condition.

The administrator terminal 2 and the user terminals 3 are computers with an input unit, a display unit, a processing unit, a storage unit, and a communication unit, for example smartphones, tablet terminals (tablet PCs), notebook PCs, or desktop PCs equipped with a Web browser.
The administrator uses the administrator terminal 2 to change the settings of the video editing server 1 and to operate and manage the databases.
Users access the video editing server 1 from a user terminal 3 to view the automatically generated video content.

<Operation>
The flow of the video editing process is described with reference to FIG. 6.
A video generation agent periodically (for example, once a week) runs the automatic video generation process of the composite video creation unit 15 (STEP 1). The recommended article reading unit 151 of the composite video creation unit 15 reads the recommended articles stored in the recommended article DB 22 in descending order of recommendation value (STEP 2).
The video dividing unit 152 of the composite video creation unit 15 divides the video (material video) contained in the recommended article being processed into a plurality of videos (divided material videos) (STEP 3). The composite video creation unit 15 labels each divided material video using the first classifier 153 and the second classifier 154 (STEP 4). The division positions of the material video may be stored in the recommended article DB 22 and/or the training data DB 24. The composite video creation unit 15 extracts the divided material videos that match the labels and allowable times of the selected template (STEP 5). For example, in FIG. 8 the third scene condition of the template is "after cooking", "2 to 4 seconds" and the fourth scene condition is "interior/exterior", "2 to 4 seconds", and two divided material videos satisfying these conditions are extracted.

If divided material videos have not been obtained for all scenes, STEPs 2 to 5 are repeated for the next recommended article (STEP 6, STEP 7). For example, in FIG. 8 no divided material videos satisfying the first and second scene conditions have been extracted, so the search for divided material videos satisfying those conditions must continue with the next recommended article.

Once divided material videos have been obtained for all scenes, the divided video combining unit 155 of the composite video creation unit 15 combines the extracted divided material videos to create the composite video (STEP 8). The music insertion unit 156 of the composite video creation unit 15 inserts background music prepared in advance into the composite video (STEP 9). The telop insertion unit 157 of the composite video creation unit 15 inserts telops into the composite video (STEP 10). The video with music and telops inserted is stored in the composite video DB 25 as the finished video, and the user can view it within the permissions of the user's own user ID (STEP 11).

With the video editing system of the embodiment described above, video content matching a user's preferences can be generated automatically from the user's preference profile and distributed periodically as recommended videos. It also becomes possible to create video advertisements and video press releases, and to turn manuals and product catalogs into videos, without having to assemble video editing software, a video server, and editors with specialized skills in-house.

<Modification>
In a preferred aspect, the video editing server 1 has a summary creation function for generating the text inserted as telops. The summary creation flow is described with reference to FIG. 9.
STEP 91: Paragraph division and sentence division
The telop insertion unit 157 divides the input document into paragraphs and divides the text in each paragraph into sentences. A sentence that would be too long to read comfortably if displayed in one scene as a telop is further divided into multiple sentences at points that satisfy conditions such as a specific part of speech or notation.
STEP 92: Morphological analysis of the text
The telop insertion unit 157 applies morphological analysis to each sentence and extracts tokens, the smallest units for syntactic analysis.
STEP 93: Deletion of unnecessary words and paragraphs
The telop insertion unit 157 deletes sentences and paragraphs defined as invalid under predefined invalid-sentence rules. For example, it deletes lines that begin with specific symbols such as "■" or "▼", lines enclosed in specific symbols, URLs, e-mail addresses, and postal addresses and telephone numbers.
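A rule set of the kind described for STEP 93 might look like the sketch below. The regular expressions and the decision to drop a whole line when it contains a URL, e-mail address, or telephone number are assumptions; postal addresses are left out because they cannot be detected reliably with a simple pattern.

```python
import re

# Symbols that mark a line as a heading or decoration (from the examples above).
INVALID_LINE_SYMBOLS = ("■", "▼")

# Hypothetical patterns for content that should not appear in a telop.
INVALID_PATTERNS = (
    re.compile(r"https?://\S+"),                    # URL
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),    # e-mail address
    re.compile(r"\d{2,4}-\d{2,4}-\d{3,4}"),         # hyphenated phone number
)


def remove_invalid_lines(text: str) -> list:
    """Return the lines of `text` that survive the invalid-line rules."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if stripped.startswith(INVALID_LINE_SYMBOLS):
            continue  # also covers lines enclosed in these symbols
        if any(pattern.search(stripped) for pattern in INVALID_PATTERNS):
            continue
        kept.append(stripped)
    return kept


article = "\n".join([
    "■ Store information",
    "The new restaurant opened in June.",
    "https://example.com/shop",
    "Contact: info@example.com",
    "TEL 03-1234-5678",
    "Grilled seafood is the specialty.",
])
valid_lines = remove_invalid_lines(article)
```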

STEP 94: Deletion of stop words and the like
The telop insertion unit 157 deletes from the tokens words that carry little meaning (stop words), such as "に" (to), "から" (from), "これ" (this), and "さん" (an honorific suffix), as well as specific parts of speech such as particles.
STEP 95: Creation of token bigrams
Multiple tokens that satisfy specific conditions (for example, predefined part-of-speech conditions) are joined to obtain token bigrams. For example, "2014年" (the year 2014; noun, proper noun, general) and "6月" (June; noun, proper noun, general) are joined into "2014年6月" (June 2014), and "ヴェルディ" (Verdy; proper noun) and "協賛" (sponsorship; common noun) are joined into "ヴェルディ協賛" (Verdy sponsorship).
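STEPs 94 and 95 can be sketched on tokens that have already been produced by a morphological analyzer. The `(surface, part_of_speech)` token format, the part-of-speech tag names, and the set of joinable tag pairs are assumptions for illustration; a real analyzer would supply its own tag set.

```python
# Stop words taken from the examples in STEP 94.
STOP_WORDS = {"に", "から", "これ", "さん"}

# Parts of speech removed wholesale (hypothetical tag name).
DROPPED_POS = {"particle"}

# Part-of-speech pairs that may be joined into a token bigram (assumed rules).
JOINABLE_POS_PAIRS = {
    ("proper_noun", "proper_noun"),
    ("proper_noun", "common_noun"),
}


def remove_stop_tokens(tokens):
    """STEP 94: drop stop words and tokens with an excluded part of speech."""
    return [
        (surface, pos) for surface, pos in tokens
        if surface not in STOP_WORDS and pos not in DROPPED_POS
    ]


def make_token_bigrams(tokens):
    """STEP 95: join adjacent tokens whose part-of-speech pair is joinable."""
    return [
        first + second
        for (first, first_pos), (second, second_pos) in zip(tokens, tokens[1:])
        if (first_pos, second_pos) in JOINABLE_POS_PAIRS
    ]


tokens = [
    ("2014年", "proper_noun"),
    ("6月", "proper_noun"),
    ("から", "particle"),
    ("始まる", "verb"),
    ("ヴェルディ", "proper_noun"),
    ("協賛", "common_noun"),
]
filtered = remove_stop_tokens(tokens)
bigrams = make_token_bigrams(filtered)
```

Because the bigrams are built after the stop words are removed, two tokens that were separated only by a particle become adjacent and may be joined; whether that is desirable depends on the part-of-speech rules chosen.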

ＳＴＥＰ９６：重要文の抽出
トークンおよびトークンバイグラムを元にＴＦ−ＩＤＦスコア単語の重要度を評価する指標から特徴語となるトークンおよびトークンバイアグラムを抽出し、前述の単語類似度判定からセンテンスのセグメンテーションを行い、各セグメントから重要文を抽出することで要約とする。
ＳＴＥＰ９７：テンプレートへの当てはめ
要約（重要文）を構文解析にかけ、文節と構文木に別ける。上述のテンプレートは各シーンに求める文字数が定義されているところ、文節間の修飾関係から、文章として自然な区間が各テンプレートに収まるように文を切り、テンプレートに当てはめる。
以上に説明した要約文作成機能は、日本語のみならず、英語はじめとする多言語に対応が可能である。 STEP 96: Extraction of important sentences TF-IDF score based on tokens and token bigrams Tokens and token viagrams as feature words are extracted from an index for evaluating the importance of words, and sentence segmentation is performed from the above word similarity determination. The summary is obtained by extracting important sentences from each segment.
STEP 97: Fitting to the template
The summary (the important sentences) is parsed and broken down into clauses and a syntax tree. The template described above defines the number of characters required for each scene. Based on the modification relationships between clauses, each sentence is cut so that a span that reads naturally fits within each scene of the template, and the resulting pieces are applied to the template.
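The fitting in STEP 97 can be approximated with the sketch below. It replaces the analysis of modification relationships with a simpler stand-in: the sentence is cut only at clause boundaries, and consecutive clauses are packed greedily into each scene up to that scene's character limit. The function name `fit_to_template` and the representation of a template as a list of per-scene character limits are assumptions.

```python
def fit_to_template(
    clauses: list[str], scene_limits: list[int]
) -> tuple[list[str], list[str]]:
    """Pack consecutive clauses into scenes without exceeding each limit.

    Returns the telop text for each scene and any clauses left over.
    """
    telops = []
    i = 0
    for limit in scene_limits:
        text = ""
        # Add whole clauses while the scene's character limit still holds.
        while i < len(clauses) and len(text) + len(clauses[i]) <= limit:
            text += clauses[i]
            i += 1
        telops.append(text)
    return telops, clauses[i:]
```

Returning the leftover clauses lets the caller decide whether to shorten the summary or pick a template with more scenes when the text does not fit.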
The summary sentence creation function described above can be applied not only to Japanese but also to multiple languages such as English.

Preferred embodiments of the present invention have been described above, but the technical scope of the present invention is not limited to the description of those embodiments. Various modifications and improvements can be made to the embodiments described above, and forms incorporating such modifications or improvements are also included in the technical scope of the present invention.

1 Video editing server
2 Administrator terminal
3 User terminal
11 Browsing history storage unit
12 Recommended article extraction unit
13 Template creation unit
14 Classifier creation unit
15 Composite video creation unit
21 Browsing record DB
22 Recommended article DB
23 Template DB
24 Learning data DB
25 Composite video DB
26 Label DB
151 Recommended article reading unit
152 Video dividing unit
153 First classifier
154 Second classifier
155 Divided video combining unit
156 Music insertion unit
157 Telop insertion unit

Claims

1. A video editing server that creates a video for distribution to a user terminal, comprising:
a recommended article extraction unit that extracts a recommended article including a material video based on user preference profile information;
a template creation unit that creates a template in which a plurality of scenes with scene labels are defined; and
a composite video creation unit that creates a composite video based on the material video and the template,
wherein the composite video creation unit comprises:
a video dividing unit that divides the material video into a plurality of divided material videos;
a first classifier that attaches a scene label to each divided material video; and
a divided video combining unit that extracts the divided material videos that match each scene of the template based on the scene labels and combines them to create the composite video.

2. The video editing server according to claim 1, wherein the composite video creation unit includes a second classifier that attaches an object name label representing the name of an object displayed in the divided material video.

3. The video editing server according to claim 2, wherein each scene of the template is assigned object information defined as a superordinate concept of the object name label, and
the divided video combining unit extracts the divided material video that matches each scene of the template based on the scene label and the object name label.

4. The video editing server according to claim 1, wherein the recommended article extraction unit extracts the recommended article together with a recommendation value based on similarity information with an article viewed by a user, and
the divided video combining unit preferentially adopts a divided material video obtained from a recommended article having a high recommendation value.

5. The video editing server according to claim 1, wherein the composite video creation unit includes a telop insertion unit that inserts a telop into the composite video.

6. The video editing server according to claim 5, wherein the telop insertion unit inserts, into the composite video, a telop created by summarizing the text information contained in the recommended article.

7. The video editing server according to claim 1, wherein the template defines an allowable time range for each scene, and
the divided video combining unit extracts a divided material video that matches the allowable time range.

8. The video editing server according to claim 1, wherein the video dividing unit has, in addition to a material video dividing function for dividing the material video into a plurality of divided material videos, a learning video dividing function for creating learning data for the first classifier.

9. A video editing program for a server that distributes video content to a user terminal that accesses the server via the Internet, the program causing the server to function as:
a recommended article extraction unit that extracts a recommended article including a material video based on user preference profile information;
a template creation unit that creates a template in which a plurality of scenes with scene labels are defined; and
a composite video creation unit that creates a composite video based on the material video and the template,
wherein the composite video creation unit comprises:
a video dividing unit that divides the material video into a plurality of divided material videos;
a first classifier that attaches a scene label to each divided material video; and
a divided video combining unit that extracts the divided material videos that match each scene of the template based on the scene labels and combines them to create the composite video.