JP2020140624A

JP2020140624A - Attribute estimation device, attribute estimation method, and program

Info

Publication number: JP2020140624A
Application number: JP2019037552A
Authority: JP
Inventors: Hisashi Nagata (永田 尚志)
Original assignee: Nippon Telegraph and Telephone West Corp
Current assignee: Nippon Telegraph and Telephone West Corp
Priority date: 2019-03-01
Filing date: 2019-03-01
Publication date: 2020-09-03
Anticipated expiration: 2039-03-01
Also published as: JP6701400B1

Abstract

To estimate the attributes of users of a content distribution service from information available to the content distributor. SOLUTION: A reproduction frequency count unit of an attribute estimation device counts, for each content provider, the number of times content was reproduced on the terminal of a content user, using provider information indicating the content provider of each content and log data indicating the content reproduced on the terminal of the content user. A model application unit inputs input data indicating the counted reproduction frequency for each content provider to an estimation model, which is a neural network that takes the reproduction frequency for each content provider as input and outputs, for each attribute value, a value indicating the probability that the attribute of the content user has that value. An estimation unit estimates the attribute value of the content user based on the result output by the estimation model in response to the input data. SELECTED DRAWING: Figure 5

Description

The present invention relates to an attribute estimation device, an attribute estimation method, and a program.

Content distributors that provide services for distributing content via the Internet deliver notices, advertisements, and the like (hereinafter referred to as "advertisements, etc.") to the terminals of users of those services. Advertisements, etc. that match a user's attributes are expected to be more effective than those that do not.

Meanwhile, there is a technique for estimating information about a user's tastes and personality from the profile image the user uses on an SNS (social networking service) (see, for example, Non-Patent Document 1).

Non-Patent Document 1: Yudai Yamashita and Junichiro Mori, "Estimating poster attributes from SNS profile images using deep learning", [online], 2016, Proceedings of the 30th Annual Conference of the Japanese Society for Artificial Intelligence (2016), [retrieved February 21, 2019], Internet <URL: https://www.jstage.jst.go.jp/article/pjsai/JSAI2016/0/JSAI2016_4K14/_pdf>

One conceivable way to acquire a user's attributes in order to select the advertisements, etc. to deliver to that user is to have the user complete a membership registration that includes attribute information. However, many users want to use a service casually without going through a cumbersome registration procedure. Another conceivable way is to acquire access logs for, or information entered into, sites from which attributes can be inferred. However, it is difficult for a content distributor to collect such information broadly for sites other than its own. The technique of Non-Patent Document 1 requires either that users also register a profile image at membership registration or that profile images on other sites be made available for reference.

In view of the above circumstances, an object of the present invention is to provide an attribute estimation device, an attribute estimation method, and a program capable of estimating the attributes of a user of a content distribution service using information available to the content distributor.

One aspect of the present invention is an attribute estimation device comprising: a playback count counting unit that uses provider information indicating the content provider of each of a plurality of contents, and log data indicating the content played on the terminal of a content user whose attribute is to be estimated, to count, for each content provider, the number of times content was played on the terminal of that content user; a model application unit that inputs input data indicating the per-provider playback counts counted by the playback count counting unit to an estimation model, the estimation model being a neural network that takes per-provider playback counts as input and outputs, for each attribute value, a value representing the probability that the content user's attribute has that value; and an estimation unit that estimates the attribute value of the content user based on the result output by the estimation model in response to the input data.

One aspect of the present invention is the attribute estimation device described above, wherein the log data includes information on the playback date and time of the content, and the model application unit converts the per-provider playback counts in a first period indicated by the log data of the content user into playback counts in a second period shorter than the first period, and uses as the input data the per-provider playback counts obtained when a predetermined number of playbacks is selected from the converted per-provider playback counts in the second period.

One aspect of the present invention is the attribute estimation device described above, further comprising a ranking determination unit that counts, for each content provider, the number of times content was played on the terminals of a plurality of content users and assigns a rank to each content provider based on the counted playback counts, wherein the model application unit uses as the input data the playback count of each content provider ranked higher than a predetermined rank, together with a total playback count obtained by summing the playback counts of the content providers ranked at or below the predetermined rank.

One aspect of the present invention is the attribute estimation device described above, further comprising: a learning playback count counting unit that uses the provider information and the log data of each of a plurality of content users for learning to count, for each content provider, the number of times content was played on the terminal of each of those content users; and a learning processing unit that trains the estimation model using the correct attribute value of each of the content users for learning and input data indicating the per-provider playback counts of each of those content users.

One aspect of the present invention is the attribute estimation device described above, wherein the content is video and the attribute is age.

One aspect of the present invention is an attribute estimation method executed by an attribute estimation device, the method comprising: a playback count counting step in which the attribute estimation device uses provider information indicating the content provider of each of a plurality of contents, and log data indicating the content played on the terminal of a content user whose attribute is to be estimated, to count, for each content provider, the number of times content was played on the terminal of that content user; a model application step in which the attribute estimation device inputs input data indicating the per-provider playback counts counted in the playback count counting step to an estimation model, the estimation model being a neural network that takes per-provider playback counts as input and outputs, for each attribute value, a value representing the probability that the content user's attribute has that value; and an estimation step of estimating the attribute value of the content user based on the result output by the estimation model in response to the input data.

One aspect of the present invention is a program for causing a computer to function as any of the attribute estimation devices described above.

According to the present invention, the attributes of a user of a content distribution service can be estimated using information available to the content distributor.

FIG. 1 is a configuration diagram of an attribute estimation system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of a video viewing terminal according to the embodiment.
FIG. 3 is a block diagram showing the configuration of a video viewing server according to the embodiment.
FIG. 4 is a block diagram showing the configuration of a data collection server according to the embodiment.
FIG. 5 is a block diagram showing the configuration of a deep learning server according to the embodiment.
FIG. 6 is a diagram showing an example of an estimation model according to the embodiment.
FIG. 7 is a diagram showing input data to the estimation model according to the embodiment.
FIG. 8 is a diagram showing a method of acquiring video view counts according to the embodiment.
FIG. 9 is a flow diagram showing estimation model learning processing of the attribute estimation system according to the embodiment.
FIG. 10 is a flow diagram showing estimation processing of the attribute estimation system according to the embodiment.
FIG. 11 is a diagram showing an example of oversampling according to the embodiment.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. The attribute estimation device of the present embodiment estimates the attributes of users of a content distribution service provided by a content distributor. The following description takes as an example the case where the content distributed by the content distributor is video data (hereinafter simply referred to as "video") and the user attribute to be estimated is age.

FIG. 1 is a configuration diagram of an attribute estimation system 1 according to the present embodiment. The attribute estimation system 1 includes a video viewing terminal 3, a video viewing server 4, a data collection server 5, and a deep learning server 6. The video viewing terminal 3, the video viewing server 4, and the data collection server 5 are connected to a network 9. The network 9 is, for example, the Internet. Although the figure shows one video viewing server 4 and one data collection server 5, there may be a plurality of each. Also, although the data collection server 5 and the deep learning server 6 are directly connected in the figure, they may be connected via the network 9. Some or all of the video viewing server 4, the data collection server 5, and the deep learning server 6 may be implemented on the same computer server.

FIG. 2 is a block diagram showing the configuration of the video viewing terminal 3. The video viewing terminal 3 is a computer terminal used by a video viewer, such as a personal computer, a tablet terminal, or a smartphone. A video viewer is a user of the video distribution service. The video viewing terminal 3 includes an input unit 31, a video viewing function unit 32, and an output unit 33. The input unit 31 accepts operations by the video viewer. The video viewing function unit 32 transmits a terminal ID and a video distribution request to the video viewing server 4 in accordance with the video viewer's operation of the input unit 31. The terminal ID is information that uniquely identifies the video viewing terminal 3. Address information assigned to the video viewing terminal 3 can be used as the terminal ID. A user ID may also be used as the terminal ID. The user ID is information that uniquely identifies the video viewer. The video distribution request includes information identifying the video whose distribution the video viewer requests. The video viewing function unit 32 outputs the video distributed from the video viewing server 4 in response to the video distribution request to the output unit 33. The output unit 33 is a display and a speaker.

FIG. 3 is a block diagram showing the configuration of the video viewing server 4. The video viewing server 4 is used for a video distribution service provided by a video distributor. The video viewing server 4 includes a storage unit 41, a distribution unit 42, and a log generation unit 43.

The storage unit 41 stores posted video information. The posted video information includes a video ID, a video, video contributor information, and additional video information. The video ID is information that uniquely identifies the video. The video contributor information includes information such as a video contributor ID and a video contributor name. The video contributor ID is information that uniquely identifies the video contributor. A video contributor is an individual, company, organization, or the like that has provided a video to be distributed through the video distribution service. The video contributor name and the video contributor ID may be identical. The additional video information is information about the video, and includes the title of the video and the description and keywords used for searching.

The distribution unit 42 receives the terminal ID and the video distribution request from the video viewing terminal 3. The distribution unit 42 reads the video requested by the video distribution request from the storage unit 41 and distributes it to the video viewing terminal 3 that sent the request. The log generation unit 43 generates a video viewing log. The video viewing log is data that includes the terminal ID of the video viewing terminal 3; the video ID, the video contributor information, and the video title of the video distributed to the video viewing terminal 3; and the viewing start time and viewing end time. The log generation unit 43 transmits the generated video viewing log to the data collection server 5.

FIG. 4 is a block diagram showing the configuration of the data collection server 5. The data collection server 5 collects video viewing logs and provides them to the deep learning server 6. The data collection server 5 includes a log collection function unit 51 and a log extraction function unit 52. The log collection function unit 51 collects video viewing logs from the video viewing server 4 and stores them. The log extraction function unit 52 reads the video viewing logs stored by the log collection function unit 51 and outputs them to the deep learning server 6.

FIG. 5 is a block diagram showing the configuration of the deep learning server 6. The deep learning server 6 estimates the age group of a video viewer using an estimation model trained by deep learning. The estimation model is a neural network with multiple hidden layers. Each node of the input layer corresponds to a rank obtained when the video contributors are ordered by the number of video views across all video viewing terminals 3. Each node of the output layer corresponds to an age group.

The deep learning server 6 includes a ranking creation processing unit 61, a learning data creation processing unit 62, a deep learning learning processing unit 63, a learning model function unit 64, and an age group estimation function unit 65. The ranking creation processing unit 61 counts, based on the video viewing logs of all video viewing terminals 3, the playback count of each video contributor, that is, the number of times that contributor's videos were played on all video viewing terminals 3. The ranking creation processing unit 61 may instead sample some of the video viewing terminals 3 and count the playback count of each video contributor based on the video viewing logs of the sampled terminals. The ranking creation processing unit 61 assigns ranks to the video contributors in descending order of playback count. This rank is referred to as the "ranking". The learning data creation processing unit 62 creates the learning data used for deep learning of the estimation model. The deep learning learning processing unit 63 trains the estimation model by deep learning using the learning data created by the learning data creation processing unit 62. The learning model function unit 64 generates input data for the video viewer whose age group is to be estimated and inputs it to the trained estimation model. The age group estimation function unit 65 estimates the age group of the video viewer based on the output of the estimation model and outputs the estimation result.
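As a non-limiting illustration, the counting and ranking performed by the ranking creation processing unit 61 might be sketched in Python as follows. The function name `rank_contributors`, the `(terminal_id, contributor_id)` log layout, and the tie-breaking rule are assumptions made for this sketch and are not specified in the embodiment.

```python
from collections import Counter


def rank_contributors(view_logs):
    """Return {contributor_id: rank}, where rank 1 is the most-played contributor.

    view_logs: iterable of (terminal_id, contributor_id) pairs, one per playback.
    """
    # Count playbacks per contributor across all terminals.
    play_counts = Counter(contributor for _terminal, contributor in view_logs)
    # Sort by descending playback count; ties are broken by contributor ID
    # so the result is deterministic (the tie rule is an assumption).
    ordered = sorted(play_counts.items(), key=lambda item: (-item[1], item[0]))
    return {contributor: rank
            for rank, (contributor, _count) in enumerate(ordered, start=1)}


# Hypothetical logs from two terminals "a" and "b".
logs = [("a", "A"), ("a", "A"), ("b", "A"), ("a", "B"), ("b", "B"), ("b", "C")]
ranking = rank_contributors(logs)
print(ranking)
```

Sampling a subset of terminals, as the embodiment permits, would amount to filtering `view_logs` by terminal ID before calling the function.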

The deep learning server 6 may be implemented by a plurality of computer devices connected to a network. In this case, which of these computer devices implements each functional unit of the deep learning server 6 may be chosen freely. For example, the ranking creation processing unit 61, the learning data creation processing unit 62, and the deep learning learning processing unit 63 may be implemented on one computer device, and the learning model function unit 64 and the age group estimation function unit 65 on another.

FIG. 6 is a diagram showing an example of the estimation model. The neural network used as the estimation model consists of an input layer, hidden layers, and an output layer. Although FIG. 6 shows three hidden layers, the number of hidden layers may be two, or four or more.

The input layer consists of M nodes. The nodes of the input layer are labeled 1 to M. The label numbers correspond to the ranking. The value input to the node labeled m (m is an integer from 1 to M-1) is the number of times the video viewer viewed videos of the video contributor ranked m. The value input to the node labeled M is the number of times the video viewer viewed videos of video contributors ranked M or lower.

The output layer consists of K nodes. Each node of the output layer corresponds to an age group. For example, the nodes of the output layer correspond to under 10 years old, 10s, 20s, and so on. The age range covered by one node of the output layer may be set freely, for example to 1 year, 2 years, or 5 years. The age ranges covered by the nodes may all be the same, or some or all of them may differ. The value of each node of the output layer represents the probability that the viewer belongs to the age group corresponding to that node.
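As a non-limiting illustration, reading an age group off the K output values might look like the following Python sketch. Selecting the node with the largest value is an assumption made here for illustration; the labels in `AGE_GROUPS` and the function name `estimate_age_group` are likewise hypothetical.

```python
# Illustrative labels for K = 7 output nodes (an assumption, not from the embodiment).
AGE_GROUPS = ["under 10", "10s", "20s", "30s", "40s", "50s", "60 and over"]


def estimate_age_group(output_values, labels=AGE_GROUPS):
    """Return (label, value) for the output node with the largest value."""
    if len(output_values) != len(labels):
        raise ValueError("one output value is required per age group")
    best = max(range(len(labels)), key=lambda k: output_values[k])
    return labels[best], output_values[best]


# Hypothetical output of the estimation model for one viewer.
label, probability = estimate_age_group([0.05, 0.10, 0.50, 0.20, 0.10, 0.03, 0.02])
print(label, probability)
```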

In the estimation model shown in FIG. 6, each hidden layer and the output layer are fully connected layers. With the input layer as the first layer, the hidden layers as the second to fourth layers, and the output layer as the fifth layer, each node of the l-th layer (l is an integer from 2 to 5) is connected to all nodes of the (l-1)-th layer. Training the estimation model by deep learning determines the connection strength (weight) between each node of the l-th layer and each node of the (l-1)-th layer.
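As a non-limiting illustration, a forward pass through such a five-layer fully connected network might be sketched in pure Python as follows. The embodiment specifies only full connectivity and learned weights; the bias terms, the ReLU activation in the hidden layers, the softmax at the output, the random initial weights, and the small layer sizes are all assumptions made for this sketch.

```python
import math
import random


def init_layers(sizes, seed=0):
    """Create (weights, biases) for each pair of adjacent layers."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        # weights[j][i] connects node i of layer l-1 to node j of layer l.
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def softmax(values):
    """Turn raw scores into values that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, inputs):
    """Propagate view counts through every fully connected layer."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        pre = [sum(w * a for w, a in zip(row, activations)) + b
               for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        # ReLU in hidden layers, softmax at the output (both assumptions).
        activations = softmax(pre) if is_output else [max(0.0, z) for z in pre]
    return activations


# M = 6 input nodes, three hidden layers of 8 nodes, K = 4 output nodes.
layers = init_layers([6, 8, 8, 8, 4])
probs = forward(layers, [5, 2, 0, 3, 1, 0])
print(probs)
```

With untrained weights the output is close to uniform; training would adjust the weights so that the outputs reflect the age groups of the viewers in the learning data.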

FIG. 7 is a diagram showing input data to the estimation model. The ranking creation processing unit 61 refers to the video viewing logs of all video viewing terminals 3 and totals the number of video playbacks for each video contributor. The ranking creation processing unit 61 assigns rankings to the video contributors in descending order of playback count. It is preferable to use the view counts of all video contributors as input data, but when the number of video contributors is very large, this consumes a large amount of the resources of the deep learning server 6. In addition, information on video contributors whose videos are viewed many times is considered useful mainly for distinguishing broad audience groups, whereas information on video contributors whose videos are viewed few times is closely tied to niche tastes and is considered useful for distinguishing individuals. Therefore, when generating the input data used to train the estimation model, the deep learning learning processing unit 63 collectively assigns ranking M to all video contributors ranked at the threshold M or lower. The figure shows an example with M = 1001. The threshold M can be determined freely. For example, an estimation model may be trained for each of several different thresholds M, and the smallest M among those whose correct answer rate is at or above a predetermined value may be used. The threshold M may be a value determined as a proportion of the total number of contributors, or may be a fixed value.
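As a non-limiting illustration, choosing the smallest threshold M whose correct answer rate meets a predetermined value might be sketched in Python as follows. The function name `choose_threshold` is hypothetical, and the accuracy figures are made-up placeholders for illustration only, not measured results.

```python
def choose_threshold(accuracy_by_m, min_accuracy):
    """Return the smallest M whose correct answer rate is at least min_accuracy.

    accuracy_by_m: {M: correct answer rate of the model trained with that M}.
    Returns None when no candidate reaches min_accuracy.
    """
    eligible = [m for m, accuracy in accuracy_by_m.items()
                if accuracy >= min_accuracy]
    return min(eligible) if eligible else None


# Placeholder values only: one trained model per candidate threshold M.
made_up_accuracy = {101: 0.58, 501: 0.66, 1001: 0.71, 2001: 0.72}
chosen_m = choose_threshold(made_up_accuracy, min_accuracy=0.65)
print(chosen_m)
```

Preferring the smallest eligible M keeps the input layer, and therefore the resource consumption of the deep learning server 6, as small as the accuracy requirement allows.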

As described above, the input data to the estimation model is the number of times the video viewer viewed videos of the video contributor at each of rankings 1 to M. The deep learning learning processing unit 63 and the learning model function unit 64 obtain, from the video viewer's video viewing log, the view count for the video contributor at each ranking and use these as input data. That is, where Pm is the number of times videos of the contributor at ranking m (m is an integer from 1 to M) were viewed, the input data is expressed as (P1, P2, ..., PM). Pm is the input value to the node labeled m in the input layer. For example, as shown in FIG. 7, suppose the video contributors at rankings 1, 2, 3, ..., 998, 999, 1000 are A, B, C, ..., X, Y, Z. Suppose further that the video viewing log of terminal a, one of the video viewing terminals 3, shows that videos of contributors A, A, A, A, A, B, B, ..., X, X, X, Y, Z, Z were viewed. In this case, the input data used to estimate the age group of the video viewer who viewed videos on terminal a is (5, 2, 0, ..., 0, 3, 1, 2, 0).
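As a non-limiting illustration, the construction of (P1, P2, ..., PM) described above might be sketched in Python as follows, using a small M for brevity. The function name `build_input_vector` and the treatment of contributors missing from the ranking (pooled into slot M) are assumptions made for this sketch.

```python
def build_input_vector(viewed_contributors, ranking, m):
    """Build (P_1, ..., P_M) for one viewer.

    viewed_contributors: contributor IDs, one entry per video the viewer watched.
    ranking: {contributor_id: rank} over all terminals (rank 1 = most played).
    m: threshold M; contributors ranked M or lower share the last slot.
    """
    vector = [0] * m
    for contributor in viewed_contributors:
        # Contributors absent from the ranking go to the pooled slot (assumption).
        rank = ranking.get(contributor, m)
        vector[min(rank, m) - 1] += 1
    return vector


# Five ranked contributors, M = 4: ranks 4 and 5 are pooled into slot 4.
ranking = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
views = ["A", "A", "A", "A", "A", "B", "B", "D", "E"]
input_vector = build_input_vector(views, ranking, m=4)
print(input_vector)
```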

FIG. 8 is a diagram showing a method of acquiring the video view counts used for the input data. FIG. 8(a) shows the video contributors of the videos a video viewer viewed in the past. In the figure, a period in which the viewer was not actually watching a video is still counted as part of a contributor's viewing period if it lies between times at which the viewer watched videos of that same contributor. In general, video viewers tend to watch videos of the same contributor in succession. For example, a viewer watching a drama consisting of ten episodes may watch those videos consecutively over one to several days. Consequently, if input data is generated from the view counts for only the past one to several days of the video viewing log, the view count for a particular video contributor becomes disproportionately large, and the accuracy of age group estimation may decrease.

Therefore, the deep learning learning processing unit 63 and the learning model function unit 64 shuffle the video viewing log for a long period (period T1), for example, in units of the target period over which view counts are obtained (period T2 < period T1). FIG. 8(b) shows the video viewing log after shuffling. Shuffling means calculating the proportion of view counts for each video contributor from the video viewing log of period T1, and distributing each contributor's view counts across the periods T2 while maintaining the calculated proportions. In other words, shuffling converts each contributor's view count over a long period (period T1, for example one year, half a year, or several months) into a view count per target period (period T2, for example one day or one week).
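As a non-limiting illustration, converting view counts over period T1 into view counts per period T2 while preserving each contributor's proportion might be sketched in Python as follows. The function name `counts_per_target_period`, the use of days as the time unit, and the use of fractional counts are assumptions made for this sketch.

```python
def counts_per_target_period(counts_t1, t1_days, t2_days):
    """Convert view counts over T1 into average view counts per T2.

    Every contributor is scaled by the same factor T2 / T1, so each
    contributor's share of the total is unchanged.
    """
    if not 0 < t2_days < t1_days:
        raise ValueError("T2 must be positive and shorter than T1")
    return {contributor: count * t2_days / t1_days
            for contributor, count in counts_t1.items()}


# Hypothetical counts for one viewer over T1 = 30 days, converted to T2 = 1 day.
per_day = counts_per_target_period({"A": 30, "B": 90, "C": 60},
                                   t1_days=30, t2_days=1)
print(per_day)
```

Because the scaling factor is common to all contributors, a short burst of consecutive viewing of one contributor no longer dominates any single target period.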

For example, suppose that people who watch many videos by video contributors A and C are in their thirties and people who watch many videos by video contributor B are in their twenties. Suppose further that information on a predetermined number of videos watched by the video viewer during the past one day (period T2) is randomly acquired from the pre-shuffle video viewing log shown in FIG. 8(a), and that the input data is created from that information. In this case, even though the viewer actually watches many videos by contributor B, the input data shows a larger view count for contributor A. As a result, the viewer may be estimated to be in their thirties even though the correct answer is the twenties.

In contrast, after the shuffling shown in FIG. 8(b), the video viewing log for the past one day reflects the shares of view counts over the past long period (period T1). Thus, when information on a predetermined number of videos watched during the past one day (period T2) is randomly acquired from the shuffled log, the view count for contributor B is larger than that for contributor A. Input data created from this information represents the long-term viewing tendency, so the viewer is estimated to be in their twenties. In this way, temporally shuffling the video viewing log when generating the input data removes bias from the input data and improves estimation accuracy.

FIG. 9 is a flow chart showing the estimation model learning process in the attribute estimation system 1. In the learning data creation process, the log extraction function unit 52 acquires the video viewing logs of all (or a sample of) the video viewing terminals 3 that the log collection function unit 51 has collected and stored. The log extraction function unit 52 may acquire only the video viewing logs for a predetermined past period. The log extraction function unit 52 outputs the acquired video viewing logs to the deep learning server 6 (step S105).

The ranking creation processing unit 61 of the deep learning server 6 refers to all the video viewing logs received from the log extraction function unit 52 and counts the number of times each video contributor's videos were viewed on the video viewing terminals 3. The ranking creation processing unit 61 creates a ranking by assigning ranks to the video contributors in descending order of the counted view counts (step S110). The ranking creation processing unit 61 associates each video contributor's video contributor ID with its rank and outputs the result to the deep learning learning processing unit 63.
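The ranking step (S110) amounts to counting views per contributor and sorting in descending order. The sketch below assumes each log entry is a dict with a hypothetical `contributor_id` key, and breaks ties by contributor ID; the source does not say how ties are handled.

```python
from collections import Counter


def rank_contributors(view_logs):
    """Return {contributor_id: rank}, rank 1 being the most viewed.

    view_logs: iterable of dicts with a 'contributor_id' key
               (hypothetical log format for this example).
    """
    counts = Counter(entry["contributor_id"] for entry in view_logs)
    # Sort by descending view count; ties broken by ID (an assumption).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {cid: rank for rank, (cid, _) in enumerate(ordered, start=1)}


logs = [{"contributor_id": c} for c in ["A", "C", "A", "B", "C", "A"]]
ranking = rank_contributors(logs)
```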

The learning data creation processing unit 62 processes the video viewing logs into a form suitable for deep learning (step S115). For example, it deletes data that is unnecessary for deep learning (such as video titles) from the video viewing logs. The learning data creation processing unit 62 associates the terminal ID of each video viewing terminal 3 serving as teacher data with information indicating the correct age group, and outputs the result to the deep learning learning processing unit 63 (step S120). The correct age group for each video viewing terminal 3 may be entered through an input unit (not shown) of the deep learning server 6. Alternatively, the learning data creation processing unit 62 may read it from another device, such as a database server that stores member information.

The learning data creation processing unit 62 totals, for each video viewing terminal 3, the number of video playbacks per video contributor during the video log acquisition period (for example, one year, half a year, or several months). The learning data creation processing unit 62 generates learning data by adding, to the video viewing log processed for deep learning, information that associates each video contributor's video contributor ID with the number of playbacks (step S125). The video viewing terminals 3 for which learning data is generated include not only the terminals serving as teacher data but also terminals to which no age group information has been assigned. The learning data creation processing unit 62 outputs the generated learning data to the deep learning learning processing unit 63.

In the model creation process, the deep learning learning processing unit 63 initializes the information required for the process (step S205). First, it initializes the variables i and j to 0. The variable i indicates the number of times the estimation model has been trained. The variable j is a value representing the accuracy of the model. The deep learning learning processing unit 63 also acquires the upper limit N on the number of training iterations and the threshold p% for the accuracy of the estimation model. These values may be entered through an input unit (not shown) of the deep learning server 6 or read from a storage unit (not shown).

The deep learning learning processing unit 63 cuts off the lower-ranked video contributors (step S205). That is, it refers to the ranking information associated with each video contributor ID and selects a predetermined proportion or a predetermined number of the top-ranked video contributor IDs. It assigns the rank as a label to each of the (M-1) selected video contributor IDs and assigns the label M to all video contributor IDs that were not selected (step S210).
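The cut-off and labelling can be expressed in a few lines. In this sketch, `ranking` is assumed to be a dict from contributor ID to rank (1 = most viewed), and `m` is the number of labels M; both names are chosen for the example only.

```python
def assign_labels(ranking, m):
    """Label the top (m - 1) contributors with their rank, all others with m."""
    if m < 2:
        raise ValueError("m must be at least 2")
    return {cid: (rank if rank <= m - 1 else m) for cid, rank in ranking.items()}


ranking = {"A": 1, "C": 2, "B": 3, "D": 4}
labels = assign_labels(ranking, m=3)  # A and C keep their ranks; B and D share label 3
```

Collapsing the long tail into a single label keeps the input dimension fixed at M regardless of how many contributors exist.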

The deep learning learning processing unit 63 shuffles the long-term video viewing log included in the learning data for each video viewer, removing bias in the target period from which view counts are acquired (step S215). It then generates input data for each video viewer from the shuffled video viewing log (step S220). That is, for each video viewer, it randomly acquires the video IDs of a predetermined number of views from the target period of the shuffled log and identifies the label of the video contributor ID associated with each acquired video ID. It counts the views for each label and arranges the counts in label order to form the input data.
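Input data generation (step S220) can be sketched as sampling a fixed number of views and counting them per label. The function and parameter names below are hypothetical, the shuffled log is assumed to be a list of video IDs with one entry per view, and sampling without replacement is an assumption, since the source only says the views are acquired randomly.

```python
import random
from collections import Counter


def build_input_vector(viewed_video_ids, video_to_contributor, labels, m,
                       n_samples, rng=None):
    """Return a length-m list of view counts ordered by label 1..m.

    viewed_video_ids: video IDs in the target period, one entry per view.
    video_to_contributor: maps video ID to contributor ID.
    labels: maps contributor ID to a label in 1..m; unknown contributors get m.
    """
    rng = rng or random.Random()
    n = min(n_samples, len(viewed_video_ids))
    sampled = rng.sample(viewed_video_ids, n)  # random views, no replacement
    counts = Counter(labels.get(video_to_contributor[v], m) for v in sampled)
    return [counts.get(label, 0) for label in range(1, m + 1)]


views = ["v1", "v1", "v2", "v3", "v4", "v4"]
video_to_contributor = {"v1": "A", "v2": "B", "v3": "C", "v4": "D"}
labels = {"A": 1, "B": 2}  # C and D fall into the catch-all label 3
vector = build_input_vector(views, video_to_contributor, labels, m=3,
                            n_samples=4, rng=random.Random(0))
```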

The deep learning learning processing unit 63 determines whether the amount of teacher data is uniform across age groups (step S225). For example, it reads the age group corresponding to each terminal ID from the teacher data and counts the number of terminal IDs per age group to obtain the number of teacher data items. It compares these numbers across age groups and determines that the data is not uniform if any age group has fewer items than the others by a predetermined amount or more.

When the deep learning learning processing unit 63 determines that the data is not uniform (step S225: NO), it performs oversampling to increase the teacher data for any age group whose number of teacher data items is smaller than that of the other age groups by a predetermined amount or more (step S230). Oversampling supplements the missing teacher data on the basis of the small number of existing teacher data items.

FIG. 11 is a diagram showing an example of oversampling. In this embodiment, a statistical method is used to supplement the missing teacher data evenly. FIG. 11(a) shows a plot of the teacher data (input data whose age group is known) before oversampling. Although FIG. 11 is drawn in two dimensions for simplicity, the data is actually plotted in M dimensions. As shown in FIG. 11(a), the numbers of teacher data items for age groups B, C, and D are smaller than that for age group A by a predetermined amount or more. The deep learning learning processing unit 63 therefore supplements the teacher data for each of age groups B, C, and D.

For example, to supplement the teacher data for age group B, the deep learning learning processing unit 63 selects one teacher data item of age group B and randomly generates input data between the selected item and other age group B teacher data items in its vicinity. The generated input data is used as supplemented teacher data for age group B. The deep learning learning processing unit 63 repeats this selection and supplementation until the difference between the number of teacher data items for age group A and the total of the original and supplemented teacher data items for age group B falls within a predetermined range. It performs the same processing for age groups C and D. FIG. 11(b) shows a plot of the teacher data after oversampling.
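The supplementation described here resembles interpolation-based oversampling in the style of SMOTE, although the source does not name a specific algorithm. The sketch below is one possible reading: it picks a sample, picks one of its `k` nearest neighbours within the same age group, and generates a point at a random position on the line segment between them. The function name, the parameter `k`, and the use of a fixed `target_count` instead of a tolerance against the majority group are assumptions.

```python
import random


def oversample(samples, target_count, k=3, rng=None):
    """Grow one age group's samples to target_count by interpolation.

    samples: list of equal-length numeric vectors for a single age group.
    Returns the original samples followed by the synthetic ones.
    """
    rng = rng or random.Random()
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    result = [list(s) for s in samples]
    while len(result) < target_count:
        base = rng.choice(samples)
        # Nearest neighbours of the base sample by squared Euclidean distance.
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # random position between base and neighbour
        result.append([a + t * (b - a) for a, b in zip(base, other)])
    return result


group_b = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
augmented = oversample(group_b, target_count=10, k=2, rng=random.Random(0))
```

Note that the synthetic vectors are generally non-integer, so a model trained this way has to accept real-valued view counts.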

As shown in FIG. 9, the deep learning learning processing unit 63 performs the process of step S235 when it determines in step S225 that the amount of teacher data is uniform across age groups (step S225: YES) or after it performs oversampling (step S230). That is, it selects one teacher data item and uses it to train the estimation model by a deep learning method (step S235). Specifically, it inputs the input data indicated by the selected teacher data item into the current estimation model and obtains the value of each node of the output layer as the estimation result. Using a loss function, it calculates the difference between the estimation result and the output layer values that would be obtained if the age group indicated by the teacher data were estimated correctly. In those correct output layer values, the node corresponding to the correct age group has the value 100% and the nodes corresponding to the other age groups have the value 0%. The deep learning learning processing unit 63 updates the connection strengths between the nodes of the estimation model using a weighted loss function, which weights the result of the loss function according to the amount of data.
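The source does not specify the loss function or the weighting scheme. As one common choice, the sketch below assumes cross-entropy against a one-hot target (100% for the correct age group, 0% elsewhere) with weights inversely proportional to the number of teacher data items per age group. Both function names are hypothetical, and every age group is assumed to have at least one item.

```python
import math


def class_weights(counts):
    """Inverse-frequency weights: rarer age groups get larger weights."""
    total = sum(counts)
    k = len(counts)
    return [total / (k * c) for c in counts]


def weighted_cross_entropy(probs, true_index, weights, eps=1e-12):
    """Cross-entropy against a one-hot target, scaled by the class weight.

    probs: output-layer values as probabilities (summing to 1).
    true_index: index of the node for the correct age group.
    """
    return -weights[true_index] * math.log(max(probs[true_index], eps))


weights = class_weights([80, 20])                       # majority vs minority group
loss = weighted_cross_entropy([0.5, 0.5], 1, weights)   # error on the minority group
```

With this weighting, a mistake on an under-represented age group contributes more to the update than the same mistake on a well-represented one.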

The deep learning learning processing unit 63 adds 1 to the training count i and sets j to the loss function value p' calculated in the immediately preceding step S235 (step S240). It then determines whether the training count i has reached the upper limit N or the loss function value j has become equal to or greater than the accuracy threshold p% (step S245). Here, a larger loss function value is taken to indicate better accuracy. If instead a smaller loss function value indicates better accuracy, the condition "the loss function value j is equal to or greater than the accuracy threshold p%" is replaced with "the loss function value j is equal to or less than the accuracy threshold p%".

When the deep learning learning processing unit 63 determines that the training count i has not reached the upper limit N and the loss function value j is below the accuracy threshold p% (step S245: NO), it performs semi-supervised learning (step S250). That is, of the input data generated in step S220, it inputs the input data that is not teacher data (input data with no correct age group assigned) into the current estimation model being trained and obtains the output layer values as the estimation result. When the estimation result shows that exactly one of the output layer nodes is at or above a predetermined probability, the input data that produced that result is added as teacher data. The correct age group of the added teacher data is the age group corresponding to the node whose value is at or above the predetermined probability. For example, suppose that the output layer of the estimation model consists of six nodes corresponding to teens or younger, twenties, thirties, forties, fifties, and sixties or older, and that the node values obtained are 99%, 0.1%, 0.5%, 0.2%, 0.1%, and 0.1%. In this case, the deep learning learning processing unit 63 adds the input data to the teacher data with teens or younger, the age group of the node whose value of 99% is at or above the predetermined probability, as the correct age group. The deep learning learning processing unit 63 then repeats the processing from step S225.
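The selection rule for step S250 can be sketched as a small pseudo-labelling helper. The function name `pseudo_label` and the threshold value of 0.9 in the example are assumptions; the source only speaks of a predetermined probability.

```python
def pseudo_label(probs, threshold):
    """Return the index of the single confident node, or None.

    An unlabelled input is promoted to teacher data only when exactly one
    output node is at or above the threshold.
    """
    confident = [i for i, p in enumerate(probs) if p >= threshold]
    return confident[0] if len(confident) == 1 else None


# Node order: teens or younger, 20s, 30s, 40s, 50s, 60s or older.
outputs = [0.99, 0.001, 0.005, 0.002, 0.001, 0.001]
label = pseudo_label(outputs, threshold=0.9)  # index 0, i.e. teens or younger
```

Requiring exactly one node above the threshold means that ambiguous outputs are never promoted to teacher data.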

When the deep learning learning processing unit 63 determines that the training count i has reached the upper limit N or that the loss function value j is equal to or greater than the accuracy threshold p% (step S245: YES), it outputs the trained estimation model to the learning model function unit 64 (step S255). The learning model function unit 64 saves the trained estimation model (step S260).
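The control flow of steps S235 to S250 can be summarised as a loop with two exit conditions. This is only a skeleton under stated assumptions: the three callables are hypothetical stand-ins for the training step, the accuracy evaluation, and the semi-supervised step, and larger values of j are taken to mean better accuracy, as in the description.

```python
def train_until_done(train_step, evaluate, pseudo_label_step, max_iters, threshold):
    """Repeat training until i reaches max_iters or j reaches threshold.

    train_step: performs one training update (S235).
    evaluate: returns the current accuracy value j (set in S240).
    pseudo_label_step: adds confidently predicted inputs as teacher data (S250).
    Returns the final (i, j).
    """
    i, j = 0, 0.0
    while True:
        train_step()
        i += 1
        j = evaluate()
        if i >= max_iters or j >= threshold:  # S245: YES, stop training
            return i, j
        pseudo_label_step()                   # S245: NO, extend teacher data


# Stub callables that simulate accuracy improving over three iterations.
accuracies = iter([10.0, 50.0, 95.0])
added = []
result = train_until_done(lambda: None, lambda: next(accuracies),
                          lambda: added.append(1), max_iters=10, threshold=90.0)
```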

FIG. 10 is a flow chart showing the estimation process in the attribute estimation system 1. The video viewing terminal 3 accesses the video viewing server 4 via the network 9 (step S305). The video viewing function unit 32 of the video viewing terminal 3 transmits a video distribution request to the video viewing server 4 in accordance with an operation performed by the video viewer through the input unit 31. The distribution unit 42 of the video viewing server 4 reads the requested video from the storage unit 41 and distributes it to the video viewing terminal 3. The video viewing function unit 32 plays the video distributed from the video viewing server 4 and outputs it to the output unit 33. The log generation unit 43 of the video viewing server 4 generates a video viewing log containing the terminal ID of the video viewing terminal 3, the video ID of the distributed video, the video contributor information and title of the distributed video, and the viewing start time and viewing end time, and transfers the log to the data collection server 5 (step S310). The log collection function unit 51 of the data collection server 5 stores the transferred video viewing log.

The learning model function unit 64 of the deep learning server 6 refers to the video viewing log of the video viewer whose age group is to be estimated, as collected by the log collection function unit 51 of the data collection server 5, and detects that enough of the log has been collected to allow age estimation (step S315). For example, age estimation becomes possible when the number of video views reaches a predetermined number or more.

The learning model function unit 64 generates input data from the video viewing log of the video viewer whose age group is to be estimated, using the same processing as the deep learning learning processing unit 63. That is, it totals the number of video playbacks per video contributor during the video log acquisition period and then shuffles the video viewing log. It randomly acquires the video IDs of a predetermined number of views from the target period of the shuffled log and identifies the label of the video contributor ID associated with each acquired video ID. It counts the views for each label and arranges the counts in label order to form the input data (step S320). The learning model function unit 64 inputs the input data into the input layer of the trained estimation model and calculates the estimation result (step S325).

The age estimation function unit 65 estimates the age group of the video viewer on the basis of the calculation result obtained in step S325 (step S330). That is, it takes as the estimation result the age group corresponding to the output layer node with the largest value. The age estimation function unit 65 outputs the estimated age group. The output may be shown on a display, transmitted to a device connected via a network, or recorded on a recording medium.
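Step S330 is an argmax over the output layer. In the sketch below, the six age group names follow the example given for the output layer in the description; the constant `AGE_GROUPS` and the function name are chosen for illustration.

```python
AGE_GROUPS = ["teens or younger", "20s", "30s", "40s", "50s", "60s or older"]


def estimate_age_group(outputs, groups=AGE_GROUPS):
    """Return the age group whose output node has the largest value."""
    if len(outputs) != len(groups):
        raise ValueError("one output value is required per age group")
    best = max(range(len(outputs)), key=outputs.__getitem__)
    return groups[best]


estimated = estimate_age_group([0.05, 0.10, 0.60, 0.15, 0.05, 0.05])
```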

In the above description the video viewing server 4 generates the video viewing log. If the video viewing terminal 3 is capable of generating the video viewing log, the data collection server 5 may instead collect the logs from each video viewing terminal 3, either directly or via the video viewing server 4.

As described above, the attribute estimation system 1 has a platform for accumulating the video viewing logs of video viewers. It analyzes the video viewing logs using deep learning and estimates the age group of each video viewer. To estimate the age group by deep learning, the video viewing log must be put into a form suitable as input data. How the teacher data is learned is also important in deep learning. The attribute estimation system 1 therefore calculates a ranking of video contributors from all the video viewing logs and replaces the video contributor names or video contributor IDs in past video viewing logs with their ranks. It arranges the per-contributor view counts of the video viewer whose age group is to be estimated in the order of the ranking derived from all viewers' logs, and inputs them into the estimation model. The view counts that are input are those obtained by shuffling the video viewing log. The attribute estimation system 1 displays the age group estimated with the estimation model.

By using an estimation model trained by deep learning, the attribute estimation system 1 can estimate the age group of a video viewer. The attribute estimation system 1 also aggregates the video viewing logs in ranking form: for the top-ranked video contributors, which affect accuracy, it uses the individual view counts as input data, and for all other video contributors it uses the sum of their view counts. This reduces the resources required for learning. Because the attribute estimation system 1 shuffles the video viewing log at input time, it removes bias from the input data and improves estimation accuracy. Thus, once a video viewer has watched a number of videos, the attribute estimation system 1 can estimate that viewer's age group. In this way, an operator providing a video distribution service can estimate the age groups of video viewers from the logs of video distribution to video viewing terminals.

According to the embodiment described above, the attribute estimation device includes a reproduction count counting unit, a model application unit, and an estimation unit. For example, the attribute estimation device is the deep learning server 6, the reproduction count counting unit and the model application unit are the learning model function unit 64, and the estimation unit is the age estimation function unit 65. The reproduction count counting unit uses provider information indicating the content provider of each of a plurality of contents and log data indicating the content played on the terminal of the content user whose attribute is to be estimated, and counts, for each content provider, the number of times content was played on that terminal. The estimation model is a neural network that takes the number of playbacks for each content provider as input and outputs values representing the probability that the content user's attribute is each attribute value. The model application unit inputs, into the estimation model, input data indicating the number of playbacks for each content provider counted by the reproduction count counting unit. The estimation unit estimates the attribute value of the content user on the basis of the result output by the estimation model in response to the input data.

For example, the content is a video, the content provider is a video contributor, the content user is a video viewer, the provider information is the video contributor information, the content user's terminal is the video viewing terminal 3, and the log data is the video viewing log. Also, for example, the attribute is age and the attribute value is an age group.

The log data may include information on the date and time at which content was played. The model application unit converts the number of playbacks for each content provider in a first period (period T1) indicated by the content user's log data into the number of playbacks in a second period (period T2) shorter than the first period. The model application unit uses as input data the number of playbacks for each content provider obtained when a predetermined number of playbacks is selected from the converted per-provider playbacks in the second period.

The attribute estimation device may further include a ranking determination unit. The ranking determination unit is, for example, the ranking creation processing unit 61. The ranking determination unit uses the provider information and the log data of a plurality of content users to count, for each content provider, the number of times content was played on the terminals of the plurality of content users, and assigns ranks to the content providers on the basis of the counted numbers of playbacks. The model application unit uses as input data the number of playbacks of each content provider ranked higher than a predetermined rank, together with the total number of playbacks summed over all content providers ranked at or below the predetermined rank.

The attribute estimation device may further include a learning reproduction count counting unit and a learning processing unit. For example, the learning reproduction count counting unit and the learning processing unit are the deep learning learning processing unit 63. The learning reproduction count counting unit uses the provider information and the log data of each of a plurality of content users for learning to count, for each content provider, the number of times content was played on the terminal of each of those content users. The learning processing unit trains the estimation model using the correct attribute value of each of the plurality of content users for learning and input data indicating that user's number of playbacks for each content provider.

The functions of the video viewing terminal 3, the video viewing server 4, the data collection server 5, and the deep learning server 6 in the embodiment described above are realized by a computer. In that case, they may be realized by recording a program for implementing these functions on a computer-readable recording medium and causing a computer system to read and execute the program recorded on that medium. The term "computer system" as used here includes an OS and hardware such as peripheral devices. A "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or to a storage device such as a hard disk built into a computer system. A "computer-readable recording medium" may also include a medium that holds the program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and a medium that holds the program for a certain period, such as volatile memory inside a computer system serving as the server or client in that case. The program may implement only some of the functions described above, or may implement those functions in combination with a program already recorded in the computer system.

An embodiment of the present invention has been described in detail above with reference to the drawings. The specific configuration is not limited to this embodiment, however, and also includes designs and the like that do not depart from the gist of the invention.

1 ... attribute estimation system, 3 ... video viewing terminal, 4 ... video viewing server, 5 ... data collection server, 6 ... deep learning server, 9 ... network, 31 ... input unit, 32 ... video viewing function unit, 33 ... output unit, 41 ... storage unit, 42 ... distribution unit, 43 ... log generation unit, 51 ... log collection function unit, 52 ... log extraction function unit, 61 ... ranking creation processing unit, 62 ... learning data creation processing unit, 63 ... deep learning training processing unit, 64 ... learning model function unit, 65 ... age estimation function unit

One aspect of the present invention is the attribute estimation device described above, wherein the log data includes information on the playback date and time of the content, and the playback count counting unit converts the playback period of each content provider in a first period indicated by the log data of the content user into a playback ratio in a second period shorter than the first period, and counts the playback count of each content provider obtained when a predetermined number of playbacks is randomly selected according to the converted playback ratios of the content providers in the second period.
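One possible reading of this conversion and random selection is sketched below. It is an interpretation under stated assumptions rather than the method of the embodiment: each provider's playback ratio is taken to be its share of the user's plays in the first period, the second period is represented only through those ratios because the draw depends only on relative shares, and the name `resample_counts` and its parameters are hypothetical.

```python
import random
from collections import Counter


def resample_counts(first_period_counts, n_plays, rng=None):
    """Draw a fixed number of plays according to each provider's share of plays.

    first_period_counts: dict mapping a provider to its playback count in the first period.
    n_plays:             the predetermined number of playbacks to select.
    rng:                 optional random.Random instance, for reproducible draws.
    """
    if rng is None:
        rng = random.Random()
    providers = sorted(first_period_counts)
    total = sum(first_period_counts.values())
    if total == 0:
        # No plays were logged, so there is nothing to sample from.
        return {provider: 0 for provider in providers}
    # Convert the first-period counts into playback ratios.
    ratios = [first_period_counts[provider] / total for provider in providers]
    # Randomly select n_plays playbacks, each provider weighted by its ratio.
    drawn = Counter(rng.choices(providers, weights=ratios, k=n_plays))
    return {provider: drawn[provider] for provider in providers}
```

Because every user ends up with exactly `n_plays` sampled playbacks, heavy and light users yield input vectors on the same scale, which appears to be the purpose of selecting a predetermined number of playbacks.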

Claims

An attribute estimation device comprising:
a playback count counting unit that, using provider information indicating the content provider that provided each of a plurality of contents and log data indicating the content played on the terminal of a content user who is an attribute estimation target, counts, for each content provider, the number of times content has been played on the terminal of the content user who is the attribute estimation target;
a model application unit that inputs input data indicating the playback count for each content provider counted by the playback count counting unit into an estimation model, the estimation model being a neural network that receives the playback count for each content provider as input and outputs values each representing the probability that the attribute of the content user is a respective attribute value; and
an estimation unit that estimates the attribute value of the content user who is the attribute estimation target based on the result output by the estimation model in response to the input of the input data.

The attribute estimation device according to claim 1, wherein
the log data includes information on the playback date and time of the content, and
the model application unit converts the playback count of each content provider in a first period indicated by the log data of the content user into a playback count in a second period shorter than the first period, and uses, as the input data, the playback count of each content provider obtained when a predetermined number of playbacks is selected from the converted playback counts of the content providers in the second period.

The attribute estimation device according to claim 1 or 2, further comprising
a ranking determination unit that, using the provider information and the log data of a plurality of content users, counts, for each content provider, the number of times content has been played on the terminals of the plurality of content users, and assigns a rank to each content provider based on the counted playback counts, wherein
the model application unit uses, as the input data, the playback count of each content provider ranked higher than a predetermined rank and a total playback count obtained by summing the playback counts of the content providers ranked at or below the predetermined rank.

The attribute estimation device according to any one of claims 1 to 3, further comprising:
a learning playback count counting unit that, using the provider information and the log data of each of a plurality of content users for learning, counts, for each content provider, the number of times content has been played on the terminal of each of the plurality of content users for learning; and
a learning processing unit that trains the estimation model using the correct attribute value of each of the plurality of content users for learning and the input data indicating the playback count for each content provider of each of the plurality of content users for learning.

The attribute estimation device according to any one of claims 1 to 4, wherein
the content is a video, and
the attribute is an age group.

An attribute estimation method executed by an attribute estimation device, the method comprising:
a playback count counting step in which the attribute estimation device, using provider information indicating the content provider that provided each of a plurality of contents and log data indicating the content played on the terminal of a content user who is an attribute estimation target, counts, for each content provider, the number of times content has been played on the terminal of the content user who is the attribute estimation target;
a model application step in which the attribute estimation device inputs input data indicating the playback count for each content provider counted in the playback count counting step into an estimation model, the estimation model being a neural network that receives the playback count for each content provider as input and outputs values each representing the probability that the attribute of the content user is a respective attribute value; and
an estimation step of estimating the attribute value of the content user who is the attribute estimation target based on the result output by the estimation model in response to the input of the input data.

A program for causing a computer to function as the attribute estimation device according to any one of claims 1 to 5.