JP6719399B2

JP6719399B2 - Analysis device, analysis method, and program

Info

Publication number: JP6719399B2
Application number: JP2017022775A
Authority: JP
Inventors: 江森　正
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2017-02-10
Filing date: 2017-02-10
Publication date: 2020-07-08
Anticipated expiration: 2037-02-10
Also published as: JP2018128942A

Description

The present invention relates to an analysis device, an analysis method, and a program.

Conventionally, a technique is known that automatically learns the relevance of data from past search behavior and uses what was learned to assist future search behavior (see, for example, Patent Document 1). A technique is also known that uses a neural network to perform feature extraction and machine learning on the extracted features together (see, for example, Patent Document 2).

Patent Document 1: JP 2006-285982 A
Patent Document 2: JP 2010-519642 A (published Japanese translation of a PCT application)

However, with the conventional techniques, learning accuracy was in some cases insufficient.

The present invention has been made in view of the above problem, and its object is to provide an analysis device, an analysis method, and a program capable of improving learning accuracy.

One aspect of the present invention is an analysis device comprising: a learning processing unit that trains a classification system, in which independently configurable classifiers are arranged in multiple layers, so that second data output from a first classifier included in the classification system when first data is input approaches first teacher data, and that then trains the classification system, based on the result of the learning with the first teacher data, so that third data output from a second classifier located in a deeper layer than the first classifier when the first data is input approaches second teacher data; and a classification processing unit that classifies unlearned data into a predetermined category based on the classification system trained by the learning processing unit.

According to one aspect of the present invention, an analysis device, an analysis method, and a program capable of improving learning accuracy can be provided.

FIG. 1 is a diagram showing an example of the analysis system 1 including the analysis device 100 according to the embodiment.
FIG. 2 is a diagram showing an example of the configuration of the analysis device 100 according to the embodiment.
FIG. 3 is a diagram showing an example of the action history information 132.
FIG. 4 is a diagram showing an example of a questionnaire.
FIG. 5 is a diagram showing an example of the questionnaire information 134.
FIG. 6 is a diagram showing an example of the analysis information 136.
FIG. 7 is a diagram showing an example of the DNN configuration information 138.
FIG. 8 is a diagram schematically showing the deep neural network DNN generated based on the DNN configuration information 138.
FIG. 9 is a flowchart showing an example of the processing executed by the learning processing unit 114.
FIG. 10 is a diagram showing an example of the per-layer parameter information 140.
FIG. 11 is a diagram schematically showing the content of the loop processing in the flowchart of FIG. 9.
FIG. 12 is a flowchart showing an example of the processing executed by the classification processing unit 116.
FIG. 13 is a diagram showing the learning accuracy of the training method for the deep neural network DNN in this embodiment and of a method given as a comparative example.
FIG. 14 is a diagram showing an example of the hardware configuration of the analysis device 100 according to the embodiment.

An analysis device, an analysis method, and a program to which the present invention is applied are described below with reference to the drawings.

[Overview]
The analysis device is realized by one or more processors. The analysis device first trains a classification system, in which independently configurable classifiers are arranged in multiple layers, so that the output data h (an example of second data) output from a first classifier included in the classification system when input data v (an example of first data) is input approaches first teacher data. It then trains the classification system, based on the result of the learning with the first teacher data, so that the output data h (an example of third data) output from a second classifier located in a deeper layer than the first classifier when the input data v is input approaches second teacher data.

For example, a "classifier" is a neural network, and the "classification system" is a deep neural network DNN composed of a plurality of neural networks. In this case, the "first classifier" is the neural network of an arbitrary layer among the plurality of layers included in the hidden layer of the deep neural network DNN, and the "second classifier" is the neural network of a layer, among those hidden layers, that is at least deeper than the layer used as the first classifier. A "deep layer" means a layer close to the output layer described later. Each of the "first classifier" and the "second classifier" may refer to the neural network of a single layer or to the neural networks of a plurality of layers.

The analysis device then classifies unlearned data into a predetermined category based on the trained classification system. This improves learning accuracy. In the following, the classification system is described, as an example, as a deep neural network DNN composed of a plurality of neural networks.
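The two-stage training outlined in this overview can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the embodiment: the layer sizes, the sigmoid units, the squared-error loss, the learning rate, the randomly generated data, and the choice to keep the first classifier fixed during the second stage are all hypothetical and are not specified in this part of the description.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_layer(inputs, targets, weights, lr=0.5, steps=500):
    """Plain gradient descent on squared error for a single sigmoid layer."""
    for _ in range(steps):
        out = sigmoid(inputs @ weights)
        grad = inputs.T @ ((out - targets) * out * (1.0 - out)) / len(inputs)
        weights = weights - lr * grad
    return weights

def mse(a, b):
    return float(np.mean((a - b) ** 2))

# Hypothetical data: 64 users with 8 input features each (input data v).
v = rng.random((64, 8))
teacher1 = sigmoid(v @ rng.normal(size=(8, 4)))         # first teacher data (assumed)
teacher2 = sigmoid(teacher1 @ rng.normal(size=(4, 2)))  # second teacher data (assumed)

# Stage 1: train the shallower (first) classifier so its output approaches teacher1.
w1_init = rng.normal(scale=0.1, size=(8, 4))
w1 = train_layer(v, teacher1, w1_init)

# Stage 2: reuse the stage-1 result and train the deeper (second) classifier,
# which is fed the first classifier's output, so its output approaches teacher2.
h1 = sigmoid(v @ w1)
w2_init = rng.normal(scale=0.1, size=(4, 2))
w2 = train_layer(h1, teacher2, w2_init)

loss1_before = mse(sigmoid(v @ w1_init), teacher1)
loss1_after = mse(h1, teacher1)
loss2_before = mse(sigmoid(h1 @ w2_init), teacher2)
loss2_after = mse(sigmoid(h1 @ w2), teacher2)
```

The point of the sketch is only the ordering: the deeper classifier starts from a network whose shallower part has already been fitted to its own teacher data.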

[Overall configuration]
FIG. 1 is a diagram showing an example of the analysis system 1 including the analysis device 100 according to the embodiment. The analysis system 1 includes, for example, one or more terminal devices 10, a service providing device 20, a log acquisition device 30, an advertisement distribution server device 40, and the analysis device 100. These devices are connected via a network NW. Some or all of the devices included in the analysis system 1 may be integrated into a single analysis device 100.

Each device shown in FIG. 1 transmits and receives various kinds of information via the network NW. The network NW includes, for example, wireless base stations, Wi-Fi access points, communication lines, providers, and the Internet. Not every combination of the devices shown in FIG. 1 needs to be able to communicate with each other, and the network NW may partly include a local network.

The terminal device 10 is a device used by a user. The terminal device 10 is, for example, a computer device such as a mobile phone (for example, a smartphone), a tablet terminal, or a personal computer. For example, on receiving a predetermined operation from the user, the terminal device 10 may communicate with the service providing device 20 via a pre-installed application and acquire content to be displayed or played back in the application. The content is, for example, video data, image data, audio data, or text data. The application may be, for example, one for Internet shopping or a search service, or one for an SNS (Social Networking Service), a mail service, or an information providing service (for example, news or weather forecasts).

On receiving a predetermined operation from the user, the terminal device 10 may also access a website provided by the service providing device 20 via a predetermined web browser. The website provided by the service providing device 20 offers, for example, the same kinds of services as those offered through the applications described above.

The service providing device 20 may be a web server device that provides websites such as shopping sites and search sites on the Internet, or an application server device that communicates with a terminal device 10 on which an application has been started and exchanges various kinds of information with it.

The log acquisition device 30 acquires the action history of a user of the terminal device 10 by collecting, for example, cookies and caches managed per web browser and log files of HTTP (Hypertext Transfer Protocol) requests from the terminal device 10 or the service providing device 20. The action history is, for example, a purchase history in Internet shopping (the type, price, and quantity of the purchased products or services), a search history on search sites (the search engine used and the search queries entered), a viewing history on video sites (the type of video and the viewing time), or HTTP referrers (a URL history) that identify the web page from which an access originated.

In the same way as for the user's action history on the Internet via a web browser, the log acquisition device 30 may acquire, as the user's action history, the purchase history, search history, viewing history, and the like of each application started on the terminal device 10.

Further, when an application started on the terminal device 10 provides services that use the position information of the terminal device 10 (for example, GPS (Global Positioning System) positioning coordinates), such as introducing nearby physical stores or providing weather information for the area, the log acquisition device 30 may acquire the position information accumulated in the terminal device 10 as the user's action history.

The user action history acquired by the log acquisition device 30 may be kept per terminal device 10, per OS (Operating System) account on a single terminal device 10, or per web browser or application account.

Instead of, or in addition to, acquiring the user's action history from the service providing device 20, the log acquisition device 30 may acquire it from another service providing device (not shown). The log acquisition device 30 transmits information indicating the acquired action history to the analysis device 100.

The advertisement distribution server device 40 distributes advertisements, for example based on the analysis result of the analysis device 100, when the service providing device 20 provides a service to the terminal device 10. For example, when the terminal device 10 receives from the service providing device 20 a web page containing an advertisement slot, or an equivalent style sheet for an application, it transmits an advertisement distribution request to the advertisement distribution server device 40 based on an ad tag or SDK (Software Development Kit) provided in advance in the advertisement slot. In response, the advertisement distribution server device 40 transmits to the terminal device 10 an advertisement image (for example, an image with an embedded URL) through which the website of a contracted advertiser can be accessed. The advertisement image is thereby displayed on the screen of the terminal device 10.

The analysis device 100 uses the deep neural network DNN to classify each user into a predetermined category based on that user's action history acquired by the log acquisition device 30. As described above, the deep neural network DNN is a neural network with at least two layers. The analysis device 100 then provides the per-user category classification result to, for example, the advertisement distribution server device 40. This allows the advertisement distribution server device 40 to change the advertisements it distributes according to the assigned category.

[Configuration of the analysis device]
The configuration of the analysis device 100 is described below with reference to the drawings. FIG. 2 is a diagram showing an example of the configuration of the analysis device 100 according to the embodiment. As illustrated, the analysis device 100 includes, for example, a reception unit 102, a display unit 104, a communication unit 106, a control unit 110, and a storage unit 130.

The reception unit 102 is an input interface, such as a touch panel, keyboard, or mouse, that accepts operation input from a user. The display unit 104 is a display device such as a liquid crystal display.

The communication unit 106 includes, for example, a communication interface such as a NIC (Network Interface Card) and a DMA (Direct Memory Access) controller. The communication unit 106 communicates with the log acquisition device 30, the advertisement distribution server device 40, and other devices via the network NW. For example, the communication unit 106 receives a user's action history from the log acquisition device 30 and stores it in the storage unit 130 as the action history information 132.

FIG. 3 is a diagram showing an example of the action history information 132. As in the illustrated example, the action history information 132 associates each time with, for example, the address corresponding to the position of the terminal device 10, the name of the search engine used together with the search query entered into it, or the title of a browsed website together with its URL. The action history information 132 may also associate each time with the name of a purchased product or service and the URL of the website where it was purchased. Naturally, a user does not take every conceivable action for a given event; a user typically takes one, or two or three, and at most some tens to hundreds of actions. For example, it is extremely rare for a single user to purchase every product sold on a shopping site; usually a user purchases only a few. The user action history indicated by the action history information 132 therefore tends to be sparse data with little information, in the sense that it does not exhaustively cover every conceivable action.
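The sparsity described above can be illustrated with a small sketch. The catalogue size, the item names, and the index-list encoding below are hypothetical choices for illustration only; the description does not specify how the action history is encoded.

```python
# Hypothetical catalogue of possible actions (here: products that could be purchased).
catalogue = ["item_%04d" % i for i in range(10000)]
index_of = {name: i for i, name in enumerate(catalogue)}

# Hypothetical action history of one user: only three purchases.
purchases = ["item_0007", "item_0420", "item_9001"]

# Sparse encoding: store only the indices of the actions that actually occurred.
sparse_history = sorted({index_of[name] for name in purchases})

def to_dense(indices, size):
    """Expand a sparse index list into a dense 0/1 vector."""
    dense = [0] * size
    for i in indices:
        dense[i] = 1
    return dense

dense_history = to_dense(sparse_history, len(catalogue))
density = sum(dense_history) / len(dense_history)  # fraction of non-zero entries
```

In this assumed setting only 3 of 10,000 entries are non-zero, which is the sense in which the history is sparse.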

The control unit 110 includes, for example, an acquisition unit 112, a learning processing unit 114, and a classification processing unit 116. These components are realized, for example, by a processor such as a CPU (Central Processing Unit) or GPU (Graphics Processing Unit) executing a program stored in the storage unit 130. Some or all of the components of the control unit 110 may instead be realized by hardware such as an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array), or by software and hardware working together.

The storage unit 130 is realized by, for example, an HDD (Hard Disc Drive), flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), ROM (Read Only Memory), or RAM (Random Access Memory). In addition to various programs such as firmware and application programs and the action history information 132 described above, the storage unit 130 stores questionnaire information 134, analysis information 136, DNN configuration information 138, per-layer parameter information 140, and the like. These are described later.

The acquisition unit 112, for example, communicates with an external server (not shown) using the communication unit 106 to acquire information indicating the results of a questionnaire (survey) stored on that server, and stores it in the storage unit 130 as the questionnaire information 134.

When the service providing device 20 provides a questionnaire monitor site on which questionnaires can be answered, or provides a questionnaire monitor service through an application, the acquisition unit 112 may acquire the questionnaire information 134 from the service providing device 20 using the communication unit 106. The questionnaire information 134 may also be stored in the storage unit 130 in advance.

Further, when, for example, an administrator who manages the analysis device 100 enters questionnaire results through the reception unit 102, the acquisition unit 112 may use the information entered through the reception unit 102 as the questionnaire information 134.

FIG. 4 is a diagram showing an example of a questionnaire. As in the illustrated example, the questionnaire provides predetermined answer options for standardized survey items (Q1 to Q4 and so on in the figure). Five options are provided in the illustrated example, but the number is not limited to this; there may be two, three, or four options, or six or more.

FIG. 5 is a diagram showing an example of the questionnaire information 134. As in the illustrated example, the questionnaire information 134 associates each survey item with a score corresponding to the option the user selected. The users in the questionnaire information 134 and the users in the action history information 132 are assumed to overlap at least in part. That is, the action history of each user who answered the questionnaire is assumed to have been acquired as action history information 132.

For example, in the questionnaire described above, the more affirmative an answer option is with respect to the survey item, the higher its score. Affirmative options such as "strongly agree" and "somewhat agree" are given higher scores than negative options such as "disagree" and "strongly disagree". This relationship is only an example; the scores may instead be set so that more negative answer options receive higher scores.
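A minimal sketch of this option-to-score mapping is shown below. The numeric values 1 to 5, the English option labels, the middle option "neither", and the `reverse` flag are assumptions made for illustration; the description only states that affirmative options score higher (or, alternatively, lower) than negative ones.

```python
# Assumed five-point scale; only the ordering is taken from the description.
SCORES = {
    "strongly agree": 5,
    "somewhat agree": 4,
    "neither": 3,            # assumed middle option
    "disagree": 2,
    "strongly disagree": 1,
}

def score_answers(answers, reverse=False):
    """Map {survey item: selected option} to {survey item: score}.

    reverse=True flips the scale so that negative options score higher.
    """
    top = max(SCORES.values()) + min(SCORES.values())
    return {
        item: (top - SCORES[option] if reverse else SCORES[option])
        for item, option in answers.items()
    }

answers = {"Q1": "strongly agree", "Q2": "disagree", "Q3": "neither"}
scored = score_answers(answers)
scored_reversed = score_answers(answers, reverse=True)
```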

The acquisition unit 112 also acquires information indicating the result of analyzing the questionnaire described above and stores it in the storage unit 130 as the analysis information 136. For example, when the device from which the questionnaire information 134 is acquired performs the analysis, the acquisition unit 112 may acquire the analysis information 136 from that device. When, for example, an analyst who analyzes the questionnaire results enters the analysis result through the reception unit 102, the acquisition unit 112 may use the information entered through the reception unit 102 as the analysis information 136.

For example, the analysis of the questionnaire results takes into account (1) how each survey item in the questionnaire was designed, (2) which analysis method is used for quantitative evaluation, and (3) how the users who answered the questionnaire are to be classified as a result of the analysis. For (1), it is necessary to consider whether the survey items are answered in a multiple-choice format or a free-text format and, if multiple-choice, whether the options are expressed as text or as numerical values. For (2), it is necessary to consider, for example by performing factor analysis on the scores corresponding to the users' answers, which factors contribute to the characteristics of the users who answered the questionnaire and how much each factor contributes to those characteristics. When principal component analysis is performed instead of factor analysis, it is necessary to consider which of the variables presumed to represent user characteristics become principal components. For (3), it is necessary to consider, as the final analysis result, which users are regarded as having the same tendency based on the characteristics of each respondent, and whether each user is assigned to one category or to several. In this way, an analyst or the like must analyze the results appropriately in light of the purpose for which the questionnaire was conducted and derive parameters that serve as statistical classification indices for classifying the respondents at a certain granularity. Such parameters are treated as hyperparameters that the analyst or the like needs to set appropriately.

For example, when the categories chosen as the hyperparameter for (3) are "high annual income" and "low annual income", the hyperparameters (analysis indices) for (2) may be set to factors common to high-income users, factors common to low-income users, and so on, among the users who answered the questionnaire.

FIG. 6 is a diagram showing an example of the analysis information 136. The illustrated example shows the result of performing factor analysis on the questionnaire results with items (1) to (3) above taken into account. For example, the analysis information 136 associates each factor considered in the factor analysis (FAC1 and so on in the figure) with a factor score for each user who answered the questionnaire. A factor score is, for example, an index value indicating the degree of correlation between the factor and the user's characteristics. In the illustrated example, for factor FAC1, the factor score of user A is 1.113, that of user B is 0.380, and that of user C is 0.397. These factor scores may be calculated, for example, from the scores corresponding to the answer options the user selected among the options provided for the survey items.

As in the illustrated example, the analysis information 136 may also include a classification result indicating the category into which each respondent was classified. The category to which a user is assigned may be derived, for example, by jointly considering several classification indices such as the factor scores described above. For example, a user whose factor scores for factors FAC1 to FAC4 are all positive may be assigned to type 1, a user with exactly one negative factor score to type 2, a user with exactly two negative factor scores to type 3, and a user with three or more negative factor scores to type 4.
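The example rule above (type 1 to type 4 by the number of negative factor scores) can be written as a short function. This is a sketch of that one example rule only. Treating a score of exactly zero as non-negative is an assumption, since the description does not cover that case, and all factor scores below other than user A's FAC1 value of 1.113 are hypothetical.

```python
def categorize(factor_scores):
    """Return the type (1 to 4) from the number of negative factor scores.

    0 negatives -> type 1, 1 -> type 2, 2 -> type 3, 3 or more -> type 4.
    A score of exactly 0 is treated as non-negative (assumption).
    """
    negatives = sum(1 for score in factor_scores if score < 0)
    return min(negatives, 3) + 1

# FAC1..FAC4 per user; only user A's FAC1 value comes from the description.
users = {
    "A": [1.113, 0.250, 0.731, 0.402],
    "B": [0.380, -0.512, 0.118, 0.904],
    "C": [0.397, -0.220, -1.034, 0.077],
    "D": [-0.615, -0.048, -0.973, -1.201],
}
categories = {name: categorize(scores) for name, scores in users.items()}
```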

The learning processing unit 114 generates (constructs) the deep neural network DNN to be trained based on the DNN configuration information 138, and performs pre-training on this deep neural network DNN using the action history information 132, the questionnaire information 134, and the analysis information 136 described above.

FIG. 7 is a diagram showing an example of the DNN configuration information 138. As in the illustrated example, the DNN configuration information 138 associates each layer with a number of units, which represents the number of neural networks in that layer. In general, a deep neural network DNN is composed of three kinds of layers: an input layer, a hidden layer (intermediate layer), and an output layer. The data that the deep neural network DNN is to learn is input to the input layer. The result learned by the deep neural network DNN is output from the output layer. The hidden layer performs the processing at the core of the learning. For example, the hidden layer is expressed by a function called an activation function (transfer function) and returns an output according to its input. The activation function is, for example, a rectified linear function (ReLU), a sigmoid function, or a step function, but is not limited to these, and any function may be used. The neural network of the input layer is an example of an "input device", and the neural network of the output layer is an example of an "output device".
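For reference, the three activation functions named above can be written as follows. These are the standard textbook definitions; the threshold of 0 for the step function and its value at exactly 0 are conventions assumed here, not taken from the description.

```python
import math

def relu(x):
    """Rectified linear unit: passes positive inputs, clips negatives to 0."""
    return max(0.0, x)

def sigmoid(x):
    """Logistic sigmoid: squashes any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def step(x, threshold=0.0):
    """Step function: 1 at or above the threshold (assumed convention), else 0."""
    return 1.0 if x >= threshold else 0.0
```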

The DNN configuration information 138 defines the number of units in each of these layers. In the illustrated example, the input layer has one unit and the output layer has s units. There are at least two hidden layers; in the illustrated example, there are n hidden layers (n≧2), each with r units. The number of units does not have to be r in every hidden layer and may differ from layer to layer. The number of hidden layers and the number of units in each layer are hyperparameters and may be changed arbitrarily. For example, when analysis of the questionnaire results leads to a decision to classify users into five categories, the number of units in the output layer may be set to five. The number of units in the hidden layers may also be increased in order to improve the learning accuracy.

By referring to this DNN configuration information 138, the learning processing unit 114 generates, at the time of pre-learning, a deep neural network DNN whose configuration is determined in advance.
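As a minimal sketch of how a network could be generated from such configuration information, the following Python fragment builds one weight matrix and one bias vector per pair of adjacent layers. The dictionary layout, the function name `build_dnn`, the concrete unit counts, and the random initialization are assumptions made for illustration and are not taken from the embodiment.

```python
import random

# Hypothetical stand-in for the DNN configuration information 138:
# unit counts per layer (the concrete numbers are illustrative only).
dnn_config = {"input": 1, "hidden": [4, 4, 4], "output": 5}

def build_dnn(config, seed=0):
    """Return a list of layers, each holding a weight matrix W and a bias vector b."""
    rng = random.Random(seed)
    sizes = [config["input"], *config["hidden"], config["output"]]
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        # W is stored as n_in rows of n_out columns; small random start values.
        W = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
        b = [0.0] * n_out
        layers.append({"W": W, "b": b})
    return layers

dnn = build_dnn(dnn_config)
layer_shapes = [(len(layer["W"]), len(layer["W"][0])) for layer in dnn]
```

With three hidden layers, the sketch yields four parameter sets: input to first layer, two hidden-to-hidden transitions, and last hidden layer to output.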

FIG. 8 is a diagram schematically showing the deep neural network DNN generated based on the DNN configuration information 138. As in the illustrated example, in the deep neural network DNN, edges modeled on a neural transmission network connect the one or more units (neural networks) of the input layer to each of the plurality of units in the shallowest of the n hidden layers. The “shallowest layer” is the layer closest to the input layer among the layers included in the hidden layers. In the present embodiment, being close to the input layer side is expressed as “shallow”, and being close to the output layer side is expressed as “deep”. In the above example, the shallowest layer is the first layer. Each unit in the first layer is connected by an edge to each of the plurality of units in the next shallowest hidden layer (hereinafter referred to as the second layer). Each unit in the second layer is connected by an edge to each of the plurality of units in the next shallowest hidden layer after the second layer (hereinafter referred to as the third layer). Likewise, the units of the third and subsequent layers are connected by edges to the units of the deeper layers. Each unit of the deepest, n-th layer is connected by an edge to one or more units of the output layer. In this way, the deep neural network DNN is expressed as a state probability model in which the units of the respective layers are connected to one another by edges, as in a restricted Boltzmann machine (RBM), for example. The deep neural network DNN may also be expressed as a state probability model in which units belonging to the same layer are additionally connected by edges, as in an unrestricted Boltzmann machine.

As the pre-learning, the learning processing unit 114 initializes the generated deep neural network DNN by training the hidden layers one stage at a time.

[Pre-learning]
The pre-learning process is described below with reference to a flowchart. FIG. 9 is a flowchart showing an example of the processing executed by the learning processing unit 114.

First, the learning processing unit 114 determines the i-th layer of the n hidden layers as the layer to be processed (S100). Here, i is a temporary parameter computed during the processing and takes a natural number in the range from 1 to n.

For example, the learning processing unit 114 determines the first layer (i=1) as the layer to be processed. The learning processing unit 114 may instead determine a layer other than the first layer, such as the second layer or the third layer, as the layer to be processed. In the following description, the first layer is assumed to be determined as the layer to be processed first.

Next, the learning processing unit 114 extracts, from the generated deep neural network DNN, the input layer and the output layer, all the layers determined as layers to be processed in the previous rounds of processing, and the i-th layer determined as the layer to be processed in the current round (S102). In the first round, for example, the learning processing unit 114 extracts the input layer, the output layer, and the first layer from the deep neural network DNN.

Next, in the neural network composed of the extracted layers, the learning processing unit 114 inputs to the input layer the action history information 132 indicating the action history of a user who answered the questionnaire (S104). At this time, the learning processing unit 114 may convert the user's action history into information that can be processed by the processor and use the result as the input data v for the input layer.

For example, the learning processing unit 114 refers to the action history information 132, derives values such as the number of views of a web page or the number of times a predetermined search query was entered, and uses these as the input data v. The learning processing unit 114 may also vectorize the character strings (which may include letters, numbers, symbols, and the like) associated with individual action history entries, such as the product code of a purchased product or the URL of a browsed site, and use the resulting vector as the input data v for the input layer. Based on the activation function of each unit of the extracted hidden layer, the learning processing unit 114 derives the output data h that will be output from the output layer. The output data h can be expressed, for example, by the following expression (1).

h = σ(Wᵀv + b) … (1)

Here, σ represents the activation function of each unit in each layer, W represents the weight applied to the output data when data is output from a unit of one layer to a unit of a deeper layer, and b represents the bias component specific to each layer.
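Expression (1) can be illustrated for a single layer with the following Python sketch. The sigmoid is chosen here only as one of the activation functions the description permits, and the function names and numeric values are assumptions made for illustration.

```python
import math

def sigmoid(x):
    """One possible activation function sigma; ReLU or a step function would also fit."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(W, b, v, activation=sigmoid):
    """Compute h = activation(W^T v + b) for one layer.

    W is stored as len(v) rows of len(b) columns, so column j of W holds
    the weights on the edges leading into unit j of the deeper layer.
    """
    return [
        activation(sum(W[i][j] * v[i] for i in range(len(v))) + b[j])
        for j in range(len(b))
    ]

# Illustrative values: two input units feeding two units of the next layer.
W = [[0.5, -1.0], [0.25, 2.0]]
b = [0.0, 0.5]
v = [2.0, 4.0]
h = layer_forward(W, b, v)  # pre-activations are 2.0 and 6.5
```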

Next, the learning processing unit 114 extracts the data to be used as the teacher data t from the questionnaire information 134 and the analysis information 136, according to the depth of the layer selected as the processing target in the deep neural network DNN (S106). The teacher data t is the reference that the output data from the output layer should match in the neural network composed of the extracted layers (in the above example, a three-layer neural network consisting of the input layer, the first layer, and the output layer).

For example, the deeper the layer to be processed, the further along in the analysis of the questionnaire results is the index value that the learning processing unit 114 extracts as the teacher data t. For example, a state in which multivariate analysis has been performed on the questionnaire results can be regarded as a state in which the analysis has progressed further than a state in which no multivariate analysis has been performed. Likewise, a state in which the categories for classifying users have been determined based on the results of the multivariate analysis can be regarded as a state in which the analysis has progressed further than the state immediately after the multivariate analysis.

Under this premise, for example, when the first layer is the processing target, the learning processing unit 114 refers to the questionnaire information 134, which is raw data that has not been analyzed, and extracts the scores of the options selected in answering the questionnaire as the teacher data t. When the second layer is the processing target, the learning processing unit 114 refers, for example, to the analysis information 136, which is data on which factor analysis has been performed, and extracts the factor scores as the teacher data t. When the third layer is the processing target, the learning processing unit 114 refers, for example, to the analysis information 136 and extracts a numerical value indicating the category determined based on the factor scores as the teacher data t. That is, the learning processing unit 114 uses as the teacher data t the hyperparameter of item (1) considered when analyzing the questionnaire results, described above, when the first layer is the processing target, the hyperparameter of item (2) when the second layer is the processing target, and the hyperparameter of item (3) when the third layer is the processing target. In this way, the learning processing unit 114 uses mutually different kinds of data (index values) as the teacher data t of the respective layers. As a result, the knowledge obtained as the analysis progresses can be learned step by step by the neural networks of the hidden layers.
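The depth-dependent choice of teacher data can be sketched as a simple lookup. The dictionary keys (`choice_scores`, `factor_scores`, `category_value`), the sample values, and the function name are hypothetical; the embodiment only states which kind of data is used for the first, second, and third layers, so the sketch rejects other depths rather than guessing.

```python
def select_teacher_data(layer_index, questionnaire_info, analysis_info):
    """Pick teacher data t according to the depth of the layer being processed."""
    if layer_index == 1:
        # Raw questionnaire answers: scores of the selected options.
        return questionnaire_info["choice_scores"]
    if layer_index == 2:
        # Result of factor analysis: factor scores.
        return analysis_info["factor_scores"]
    if layer_index == 3:
        # Category decided from the factor scores, as a numerical value.
        return analysis_info["category_value"]
    raise ValueError("no teacher data defined in this sketch for layer %d" % layer_index)

# Illustrative records for a single user (all values are made up).
questionnaire_info = {"choice_scores": [4.0, 2.0, 5.0]}
analysis_info = {"factor_scores": [0.8, -0.3], "category_value": [2.0]}

t1 = select_teacher_data(1, questionnaire_info, analysis_info)
t2 = select_teacher_data(2, questionnaire_info, analysis_info)
t3 = select_teacher_data(3, questionnaire_info, analysis_info)
```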

Next, in the neural network composed of the extracted layers, the learning processing unit 114 derives, as an evaluation function I, the error between the output data h output from the output layer and the teacher data t extracted according to the depth of the layer to be processed (S108). For example, the evaluation function I can be expressed as (1/2)×(h−t)². The evaluation function I may be the sum of the errors (squared errors) between the data output from each of the units in the output layer and the teacher data t.

Next, the learning processing unit 114 uses the error backpropagation method to determine the weight W and the bias b of each layer so as to minimize the evaluation function I (S110). For example, the learning processing unit 114 derives the gradient ∂I/∂W of the evaluation function I by computing the partial derivative of I with respect to the weight W, and determines the weight W and the bias b based on the gradient ∂I/∂W.
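The squared-error evaluation function and a gradient-based update can be sketched for one sigmoid layer as follows. The plain gradient descent update, the learning rate, the number of steps, and all numeric values are assumptions for illustration; the description only says that W and b are determined from the gradient ∂I/∂W.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(W, b, v):
    """h = sigmoid(W^T v + b), with W stored as len(v) rows of len(b) columns."""
    return [sigmoid(sum(W[i][j] * v[i] for i in range(len(v))) + b[j])
            for j in range(len(b))]

def evaluation(h, t):
    """I = (1/2) * sum over output units of (h - t)^2."""
    return 0.5 * sum((hj - tj) ** 2 for hj, tj in zip(h, t))

def gradient_step(W, b, v, t, lr=0.5):
    """Update W and b in place along -dI/dW and -dI/db; return I before the update."""
    h = forward(W, b, v)
    # dI/dz_j = (h_j - t_j) * sigmoid'(z_j), where sigmoid'(z_j) = h_j * (1 - h_j).
    delta = [(hj - tj) * hj * (1.0 - hj) for hj, tj in zip(h, t)]
    for i in range(len(v)):
        for j in range(len(b)):
            W[i][j] -= lr * delta[j] * v[i]  # dI/dW_ij = delta_j * v_i
    for j in range(len(b)):
        b[j] -= lr * delta[j]                # dI/db_j = delta_j
    return evaluation(h, t)

W = [[0.1, -0.2], [0.3, 0.4]]
b = [0.0, 0.0]
v = [1.0, 0.5]
t = [1.0, 0.0]
history = [gradient_step(W, b, v, t) for _ in range(200)]
```

In this toy setting the recorded values of I shrink as the steps are repeated, which is the behavior step S110 aims for.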

Next, the learning processing unit 114 associates the weight W and the bias b of each layer with the layer to be processed, and stores the associated information in the storage unit 130 as the layer-by-layer parameter information 140 (S112).

FIG. 10 is a diagram showing an example of the layer-by-layer parameter information 140. As in the illustrated example, the layer-by-layer parameter information 140 is information in which the determined weight W and bias b are associated with the layer to be processed.
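One simple way to hold such layer-by-layer parameter information is a mapping from layer index to the determined W and b. The following sketch, which also round-trips the mapping through JSON as a stand-in for writing to a storage unit, uses a hypothetical layout and hypothetical names that do not come from the embodiment.

```python
import json

# Hypothetical in-memory form of the layer-by-layer parameter information 140.
layer_params = {}

def store_layer(params, layer_index, W, b):
    """Associate the determined weight W and bias b with the processed layer."""
    params[layer_index] = {"W": W, "b": b}

store_layer(layer_params, 1, [[0.12, -0.40]], [0.01, 0.00])
store_layer(layer_params, 2, [[0.30, 0.25], [-0.10, 0.05]], [0.00, 0.02])

# Persist and restore; JSON object keys are strings, so convert them back to int.
serialized = json.dumps(layer_params)
restored = {int(k): p for k, p in json.loads(serialized).items()}
```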

Next, the learning processing unit 114 determines whether all of layers 1 to n of the generated deep neural network DNN have been determined as layers to be processed (S114).

If not all the layers have been determined as layers to be processed, the learning processing unit 114 adds 1 to the temporary parameter i (S116) and returns the processing to S100 described above. As a result, a layer different from any of the layers determined so far is determined as the layer to be processed. In the above example, the first layer is determined as the layer to be processed in the first round, so in the second round, for example, the second layer is determined as the layer to be processed.

On the other hand, if all the layers have been determined as layers to be processed, the learning processing unit 114 ends the processing of this flowchart.

In the processing of the flowchart described above, the learning processing unit 114 determines a single i-th layer among the n hidden layers as the layer to be processed in S100, but the processing is not limited to this. For example, the learning processing unit 114 may determine a plurality of the n hidden layers (for example, the first layer and the second layer) together as the layers to be processed. In this case, the learning processing unit 114 may learn the parameters of the plurality of layers determined as the layers to be processed together.

FIG. 11 is a diagram schematically showing the loop processing of the flowchart shown in FIG. 9. In the example of FIG. 10, for simplicity, each layer is assumed to have one unit, and only the weight W is shown as the parameter of each layer. For example, in the first round of processing, the input layer, the first hidden layer, and the output layer are extracted. The input data v input to the input layer is, for example, data obtained by quantifying a user's action history. In this case, first teacher data t₁ corresponding to the depth of the first layer is extracted in order to determine the weight W₁ of the first layer. For example, as described above, the scores of the options selected in answering the questionnaire may be adopted as the first teacher data t₁. Here, the questionnaire results are assumed to be the answers given by the user whose actions make up the action history input as the input data v. For example, when the action history of user A is input to the input layer, the questionnaire results of user A are used as the first teacher data t₁.

In the second round of processing, for example, the input layer, the first and second hidden layers, and the output layer are extracted. As the input data v input to the input layer, data obtained by quantifying the user's action history is adopted, as in the first round. For example, if data quantifying the action history of a certain user A was input to the input layer in the first round, the input data v input to the input layer in the second round is also data quantifying the action history of user A. In the second round, the learning processing unit 114 determines the weight W₂ of the newly added second layer so as to reduce the error between the second teacher data t₂ and the output data h output from the output layer, while maintaining the parameters of the first layer determined in the first round. The second teacher data t₂ is data corresponding to the depth of the second layer; for example, as described above, the factor scores obtained by factor analysis may be used. For example, in the second round, the learning processing unit 114 maintains the parameters of the first layer by setting the first-layer weight W₁# to the weight W₁ determined in the first round.

In the third round of processing, for example, the input layer, the first to third hidden layers, and the output layer are extracted. As the input data v input to the input layer, the data input to the input layer in the first and second rounds is adopted. In the third round, the learning processing unit 114 determines the weight W₃ of the newly added third layer so as to reduce the error between the third teacher data t₃ and the output data h output from the output layer, while maintaining the parameters of the first layer determined in the first round and the parameters of the second layer determined in the second round. The third teacher data t₃ is data corresponding to the depth of the third layer; for example, as described above, a numerical value indicating the category determined based on the factor scores may be used. For example, in the third round, the learning processing unit 114 maintains the parameters of the first and second layers by setting the first-layer weight W₁# to the weight W₁ determined in the first round and the second-layer weight W₂# to the weight W₂ determined in the second round.

In this way, a hidden layer is added with each round of processing, and data corresponding to the depth of the added hidden layer is used as the teacher data, so that the deep neural network DNN can learn each of the hidden layers with high accuracy. In addition, because the parameters of each layer determined in the previous round are maintained in the next round, the occurrence of internal covariate shift can be suppressed. This in turn keeps the learning time from becoming longer because of internal covariate shift. If internal covariate shift is tolerated, the learning processing unit 114 may re-determine the parameters determined in the previous round by error backpropagation in the current round. In this case, the learning processing unit 114 may update the parameters of each layer in the layer-by-layer parameter information 140.
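The round-by-round procedure described above (add one hidden layer, keep the earlier layers fixed, and fit to the teacher data for that depth) might be sketched as follows. This is a toy illustration under several assumptions: sigmoid units, a fresh output layer trained in every round, plain gradient descent, and made-up data and sizes. None of these details are specified by the embodiment, and all names are hypothetical.

```python
import copy
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(layer, x):
    W, b = layer
    return [sigmoid(sum(W[i][j] * x[i] for i in range(len(x))) + b[j])
            for j in range(len(b))]

def new_layer(n_in, n_out, rng):
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    return (W, [0.0] * n_out)

def update(layer, x, delta, lr):
    W, b = layer
    for i in range(len(x)):
        for j in range(len(delta)):
            W[i][j] -= lr * delta[j] * x[i]
    for j in range(len(delta)):
        b[j] -= lr * delta[j]

def train_round(frozen, hidden, out, data, lr=0.5, epochs=200):
    """Fit only `hidden` and `out`; layers in `frozen` are used but never updated."""
    for _ in range(epochs):
        for v, t in data:
            x = v
            for layer in frozen:
                x = forward(layer, x)
            a = forward(hidden, x)
            h = forward(out, a)
            d_out = [(hj - tj) * hj * (1 - hj) for hj, tj in zip(h, t)]
            # Backpropagate through the output weights before they are updated.
            d_hid = [sum(out[0][k][j] * d_out[j] for j in range(len(d_out)))
                     * a[k] * (1 - a[k]) for k in range(len(a))]
            update(out, a, d_out, lr)
            update(hidden, x, d_hid, lr)

rng = random.Random(0)
inputs = [[0.0, 1.0], [1.0, 0.0]]      # quantified action histories (made up)
teachers_by_depth = [                   # t1, t2, t3 for the two users (made up)
    [[0.2, 0.8], [0.9, 0.1]],
    [[0.3, 0.7], [0.6, 0.4]],
    [[0.0, 1.0], [1.0, 0.0]],
]
hidden_layers = []
for depth, teachers in enumerate(teachers_by_depth, start=1):
    n_in = 3 if hidden_layers else len(inputs[0])
    hidden, out = new_layer(n_in, 3, rng), new_layer(3, 2, rng)
    train_round(hidden_layers, hidden, out, list(zip(inputs, teachers)))
    hidden_layers.append(hidden)
    if depth == 1:
        first_layer_after_round1 = copy.deepcopy(hidden)
```

Because earlier layers are only read inside `train_round`, the first-layer parameters after the third round equal those saved after the first round, mirroring the "maintain the parameters" behavior. Allowing earlier layers to be re-determined would amount to passing an empty `frozen` list and updating every layer.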

By referring to the DNN configuration information 138 and the layer-by-layer parameter information 140, the classification processing unit 116 generates (reconstructs) a deep neural network DNN in which the initial values of the weight W and the bias b of each layer have been determined by the pre-learning, and, based on the action history of an unlearned user, classifies the user corresponding to that action history into a predetermined category. “Unlearned” means, for example, not having been used for learning in the pre-learning stage.

FIG. 12 is a flowchart showing an example of the processing executed by the classification processing unit 116. The processing of this flowchart is repeated, for example, at a predetermined cycle.

First, the classification processing unit 116 determines whether new action history information 132 has been acquired by the acquisition unit 112 (S200). If new action history information 132 has been acquired, the classification processing unit 116 generates, based on the DNN configuration information 138 and the layer-by-layer parameter information 140, a deep neural network DNN in which the initial values of the parameters (weight W and bias b) of each layer have been determined (S202).

Next, in the generated deep neural network DNN, the classification processing unit 116 quantifies the action history information 132 indicating the action history of the user who answered the questionnaire and inputs it to the input layer (S204).

Next, the classification processing unit 116 outputs the output data h output from the output layer of the deep neural network DNN as the learning result of the deep neural network DNN (S206).

For example, the classification processing unit 116 uses the communication unit 106 to transmit the learning result of the deep neural network DNN to the service providing device 20 or the advertisement distribution server device 40. This allows the service providing device 20, for example, to transmit sale information or the like matched to the user category assigned by the deep neural network DNN to the terminal device 10 used by the user. Similarly, the advertisement distribution server device 40 can, for example, transmit an advertisement matched to the user category assigned by the deep neural network DNN to the terminal device 10 used by the user.
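The classification flow (rebuild the network from stored per-layer parameters, feed in a quantified action history, and read off a category) can be sketched as below. Choosing the category by the largest output value is an assumption of this sketch, as are the sigmoid units, the parameter values, and the category labels; the embodiment only states that the output data h is output as the result.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward_all(layers, v):
    """Propagate v through every stored (W, b) pair in order."""
    x = v
    for W, b in layers:
        x = [sigmoid(sum(W[i][j] * x[i] for i in range(len(x))) + b[j])
             for j in range(len(b))]
    return x

def classify(layers, v, categories):
    """Return the category whose output unit has the largest value, plus h."""
    h = forward_all(layers, v)
    best = max(range(len(h)), key=h.__getitem__)
    return categories[best], h

# Hypothetical parameters as they might be restored from storage.
layer_params = [
    ([[2.0, -2.0]], [0.0, 0.0]),                 # 1 input unit -> 2 hidden units
    ([[3.0, -3.0], [-3.0, 3.0]], [0.0, 0.0]),    # 2 hidden units -> 2 output units
]
categories = ["category_A", "category_B"]

label_pos, h_pos = classify(layer_params, [1.0], categories)
label_neg, h_neg = classify(layer_params, [-1.0], categories)
```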

FIG. 13 is a diagram showing the learning accuracy of the learning method for the deep neural network DNN in the present embodiment and of a method given as a comparative example. The method of the comparative example is, for example, a method of training the parameters of each hidden layer so that the output data h of that layer approaches the input data v of that layer. That is, the method of the comparative example trains the parameters of the i-th hidden layer so that the output data h output from the i-th layer approaches the output data h of the (i−1)-th layer (= the input data v to the i-th layer). The top and middle records in the figure show the learning accuracy of the method of the comparative example, and the bottom record shows the learning accuracy of the method of the present embodiment.

The top record shows the learning accuracy of the method of the comparative example when using a deep neural network DNN with 1 unit in the input layer, 1000 units in the first layer, 1000 units in the second layer, and 5 units in the output layer. The middle record shows the learning accuracy of the method of the comparative example when using a deep neural network DNN with 1 unit in the input layer, 1000 units in the first layer, 1000 units in the second layer, a predetermined number of units in an X-th layer, and 5 units in the output layer. The bottom record shows the learning accuracy of the method of the present embodiment when using a deep neural network DNN with 1 unit in the input layer, 1000 units in the first layer, 1000 units in the second layer, a Y-th layer with the same number of units as the X-th layer described above, and 5 units in the output layer.

As the illustrated example shows, comparing the top and middle records reveals that increasing the number of hidden layers does not necessarily improve the learning accuracy. In general, a deep neural network DNN is prone to problems such as overfitting and the learning information (error) failing to propagate to the deeper layers. Known countermeasures include using an enormous amount of data as the input data v (for example, in the case of image data, at least tens of thousands to hundreds of thousands of dense samples) and performing pre-learning.

However, when data that is sparse as information, such as an action history on the Internet, is used as the input data v, a model that returns appropriate learning results tends not to be obtained. In addition, if the learning data from which the input data v is derived is sparse, the accuracy of error backpropagation and similar methods also tends to decrease. Thus, with the method of the comparative example, the learning accuracy tends to depend on the input data v used for the pre-learning.

In contrast, in the pre-learning stage, the method of the present embodiment does not train the deep neural network DNN so that the output data h approaches the input data v (that is, it does not use the input data v as the teacher data t). Instead, it uses as the teacher data t index values obtained separately from the questionnaire results, changing them according to the depth of each layer. This makes it possible to improve the learning accuracy even when sparse data is used as the input data v.

According to the embodiment described above, the analysis device includes the learning processing unit 114 and the classification processing unit 116. The learning processing unit 114 first trains a classification system in which independently configurable classifiers are arranged in multiple layers (for example, the deep neural network DNN) so that, when the input data v is input, the output data h output from a first classifier included in the classification system (for example, the neural network of the first hidden layer) approaches the first teacher data t. The learning processing unit 114 then trains the classification system, based on the learning result obtained using the first teacher data, so that, when the same input data v is input, the output data h output from a second classifier in a deeper layer than the first classifier (for example, the neural network of the second hidden layer) approaches the second teacher data. The classification processing unit 116 classifies unlearned data into a predetermined category based on the classification system trained by the learning processing unit 114. With this configuration, the parameters of each classifier can be set appropriately. As a result, the learning accuracy can be improved.

Further, according to the embodiment described above, parameters are derived for each classifier in the pre-learning stage. Therefore, when some classifiers of the classification system are changed or added, the already derived (learned) parameters can be reused for the classifiers that are not changed. For example, when the classification result of the deep neural network DNN is changed from 5 patterns to 10 patterns, only the number of units in the output layer needs to be changed, and the hidden layers can be reused as they are. As a result, a highly versatile deep neural network DNN can be constructed.
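The reuse of learned hidden layers described above might look like the following sketch. The dictionary layout of the model, the layer sizes, and the helper names `new_layer` and `replace_output_layer` are assumptions made for illustration only; the hidden layers here are randomly initialized stand-ins for parameters that would already have been learned.

```python
import random

def new_layer(n_in, n_out, rng):
    """Hypothetical dense layer: n_out rows of n_in weights plus biases."""
    return {"w": [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)],
            "b": [0.0] * n_out}

def replace_output_layer(model, n_classes, rng):
    """Return a model that shares the hidden layers and has a fresh output layer."""
    n_in = len(model["output"]["w"][0])  # width of the last hidden layer
    return {"hidden": model["hidden"], "output": new_layer(n_in, n_classes, rng)}

rng = random.Random(0)
# Stand-in for a network with two learned hidden layers and 5 output patterns.
model_5 = {"hidden": [new_layer(8, 6, rng), new_layer(6, 4, rng)],
           "output": new_layer(4, 5, rng)}

# Switch to 10 output patterns; the hidden layers are carried over unchanged.
model_10 = replace_output_layer(model_5, 10, rng)
```

In this sketch only the new output layer would still need to be trained; the hidden-layer parameters are the same objects as in the original model.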

<Application example>
Application examples of the above-described embodiment are described below. For example, to generate a summary of a web page provided as a search result for a certain search query, the analysis device 100 in the above-described embodiment uses the text data constituting the web page as the input data v and trains the parameters of each hidden layer so that the deep neural network DNN outputs a summary of the text data. In this case, the learning processing unit 114 of the analysis device 100 uses, as the teacher data t for the output data h of each layer, text data whose number of words decreases stepwise with the depth of the hidden layer. For example, the learning processing unit 114 sets the teacher data t_1 for the output data h of the first layer, which is the shallowest layer (closest to the input layer), to text data having a first predetermined number of words. The first predetermined number is smaller than the number of words of the text data used as the input data v. Further, the learning processing unit 114 sets the teacher data t_2 for the output data h of the second layer to text data having a second predetermined number of words, which is smaller than the first predetermined number. By reducing the number of words of the text data used as the teacher data t as the layer becomes deeper in this way, a summary of the web page can be generated more accurately.
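The word-count condition on the teacher data can be checked with a short sketch such as the one below. The function names and example sentences are hypothetical, and counting words by splitting on whitespace is an assumption that suits English text; Japanese text would need a tokenizer instead. How the shorter teacher texts are produced is not specified in the description and is outside this sketch.

```python
def word_count(text):
    """Count words by whitespace splitting (an assumption for English text)."""
    return len(text.split())

def check_teacher_schedule(input_text, teacher_texts):
    """True if word counts strictly decrease from the input through t_1, t_2, ..."""
    counts = [word_count(input_text)] + [word_count(t) for t in teacher_texts]
    return all(a > b for a, b in zip(counts, counts[1:]))

page_text = "the shop sells fresh bread and cakes every morning near the station"
t_1 = "the shop sells fresh bread and cakes every morning"  # first predetermined number
t_2 = "shop sells bread and cakes"                           # second predetermined number

schedule_ok = check_teacher_schedule(page_text, [t_1, t_2])
schedule_bad = check_teacher_schedule(page_text, [t_2, t_1])
```

Here `schedule_ok` is true because the counts fall from 12 to 9 to 5 words, while `schedule_bad` is false because the second teacher text is longer than the first.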

The analysis device 100 in the above-described embodiment may also, for recognition of images such as characters or photographs, use a certain image as the input data v and train the parameters of each hidden layer so that the deep neural network DNN outputs an appropriate recognition result. In this case, the learning processing unit 114 of the analysis device 100 uses as the teacher data t for the output data h of each layer not the correct answer data but recognition results from an unspecified large number of users, obtained by crowdsourcing or the like. For example, a predetermined image whose correct answer is determined in advance (for example, an image of a cat) is shown to an unspecified large number of users, and their recognition results for the image are used as the teacher data t. A user A may recognize the image as a "cat", while a user B may recognize it as a "dog". When the same image is shown to each of many unspecified users in this way, the recognition of some users may differ from the correct one. The teacher data t therefore tends to fluctuate with respect to correctness and to include both correct and incorrect recognition results. For example, the learning processing unit 114 sets the teacher data t_1 for the output data h of the first layer to the recognition result with the largest fluctuation in correctness among the recognition results obtained from the unspecified large number of users (for example, correct:incorrect = 0.5:0.5), sets the teacher data t_2 for the output data h of the second layer to a recognition result with smaller fluctuation than the teacher data t_1 (for example, correct:incorrect = 0.6:0.4), and sets the teacher data t_3 for the output data h of the third layer to a recognition result with smaller fluctuation than the teacher data t_2 (for example, correct:incorrect = 0.7:0.3). In this way, data containing more correct recognition results is used as the teacher data t as the layer becomes deeper, and the teacher data t_n for the output data h of the deepest n-th layer is the correct answer data (correct:incorrect = 1:0), whereby the accuracy of image recognition can be improved.
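One way to picture the layer-dependent teacher data is the sketch below. The description only requires that the share of correct recognition results grows with depth and reaches 1:0 at the deepest layer; the linear spacing used here, the two-class setup, and the function names are assumptions for illustration.

```python
def correct_ratio_schedule(n_layers, start=0.5):
    """Raise the share of correct results linearly from `start` to 1.0 (assumed spacing)."""
    if n_layers < 2:
        raise ValueError("at least two layers are needed for a schedule")
    step = (1.0 - start) / (n_layers - 1)
    return [round(start + k * step, 6) for k in range(n_layers)]

def soft_teacher(correct_index, wrong_index, n_classes, ratio):
    """Teacher vector holding the correct:incorrect split for one image."""
    t = [0.0] * n_classes
    t[correct_index] = ratio
    t[wrong_index] = round(1.0 - ratio, 6)
    return t

classes = ["cat", "dog"]              # the image is a cat; some users answered "dog"
ratios = correct_ratio_schedule(6)    # one ratio per layer, shallowest first
teachers = [soft_teacher(0, 1, len(classes), r) for r in ratios]
```

With six layers this reproduces the ratios in the example for the first three layers (0.5, 0.6, 0.7) and ends with the correct answer data at the deepest layer.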

<Hardware configuration>
Among the plurality of devices included in the analysis system 1 of the above-described embodiment, at least the analysis device 100 is realized by, for example, the hardware configuration shown in FIG. 14. FIG. 14 is a diagram illustrating an example of the hardware configuration of the analysis device 100 of the embodiment.

The analysis device 100 has a configuration in which a NIC 100-1, a CPU 100-2, a GPU (Graphics Processing Unit) 100-3, a RAM 100-4, a ROM 100-5, a secondary storage device 100-6 such as a flash memory or an HDD, and a drive device 100-7 are connected to one another by an internal bus or a dedicated communication line. A portable storage medium such as an optical disk is mounted in the drive device 100-7. A program stored in the secondary storage device 100-6, or in the portable storage medium mounted in the drive device 100-7, is loaded into the RAM 100-4 by a DMA controller (not shown) or the like and executed by the CPU 100-2 or the GPU 100-3, thereby realizing the control unit 110. The program referenced by the control unit 110 may be downloaded from another device via the network NW.

Although modes for carrying out the present invention have been described above using the embodiment, the present invention is in no way limited to such an embodiment, and various modifications and substitutions can be made without departing from the gist of the present invention.

1... analysis system, 10... terminal device, 20... service providing device, 30... log acquisition device, 40... advertisement distribution server device, 100... analysis device, 102... reception unit, 104... display unit, 106... communication unit, 110... control unit, 112... acquisition unit, 114... learning processing unit, 116... classification processing unit, 130... storage unit, 132... action history information, 134... questionnaire information, 136... analysis information, 138... DNN configuration information, 140... per-layer parameter information

Claims

An analysis device comprising:
a learning processing unit that trains a classification system, in which independently configurable classifiers are arranged in multiple layers, so that second data output from a first classifier included in the classification system when first data is input approaches first teacher data, and that thereafter trains the classification system, based on a learning result obtained using the first teacher data, so that third data output from a second classifier in a deeper layer than the first classifier when the first data is input approaches second teacher data; and
a classification processing unit that classifies unlearned data into a predetermined category based on the classification system trained by the learning processing unit.

The analysis device according to claim 1, wherein
the first data is data indicating a behavior history of a user,
the first teacher data is data based on an answer of the user, and
the second teacher data is data obtained by analyzing the answer of the user.

The analysis device according to claim 1, wherein
the first data is text data,
the first teacher data is text data having fewer words than the first data, and
the second teacher data is text data having fewer words than the first teacher data.

The analysis device according to claim 1, wherein
the first data is image data,
the first teacher data is data indicating recognition results of the image data by each of an unspecified large number of users, and
the second teacher data is data indicating recognition results of the image data by each of an unspecified large number of users, and contains more correct recognition results than the first teacher data.

The analysis device according to any one of claims 1 to 4, wherein the learning processing unit
determines parameters of the first classifier, using an input device and an output device among the plurality of classifiers included in the classification system together with the first classifier, so as to reduce an error of the second data output from the output device with respect to the first teacher data, and
determines parameters of the second classifier, using the input device, the output device, the first classifier, and the second classifier, so as to reduce an error of the third data output from the output device with respect to the second teacher data,
while maintaining the determined parameters of the first classifier.

An analysis method in which a computer:
trains a classification system, in which independently configurable classifiers are arranged in multiple layers, so that second data output from a first classifier included in the classification system when first data is input approaches first teacher data, and thereafter trains the classification system, based on a learning result obtained using the first teacher data, so that third data output from a second classifier in a deeper layer than the first classifier when the first data is input approaches second teacher data; and
classifies unlearned data into a predetermined category based on the trained classification system.

A program for causing a computer to execute:
a process of training a classification system, in which independently configurable classifiers are arranged in multiple layers, so that second data output from a first classifier included in the classification system when first data is input approaches first teacher data, and thereafter training the classification system, based on a learning result obtained using the first teacher data, so that third data output from a second classifier in a deeper layer than the first classifier when the first data is input approaches second teacher data; and
a process of classifying unlearned data into a predetermined category based on the trained classification system.