JP2018128942A

JP2018128942A - Analyzing apparatus, analyzing method, and program

Info

Publication number: JP2018128942A
Application number: JP2017022775A
Authority: JP
Inventors: Tadashi Emori (江森 正)
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2017-02-10
Filing date: 2017-02-10
Publication date: 2018-08-16
Anticipated expiration: 2037-02-10
Also published as: JP6719399B2

Abstract

PROBLEM TO BE SOLVED: To provide an analysis apparatus, an analysis method, and a program that can improve the accuracy of learning. SOLUTION: An analysis apparatus includes a learning processing unit and a classification processing unit. The learning processing unit first trains a classification system, in which independently configurable classifiers are arranged in multiple layers, so that second data output from a first classifier included in the classification system when first data is input approaches first teacher data. It then trains the classification system, based on the learning result obtained with the first teacher data, so that third data output from a second classifier in a deeper layer than the first classifier when the first data is input approaches second teacher data. The classification processing unit classifies unlearned data into a predetermined category based on the classification system trained by the learning processing unit. SELECTED DRAWING: Figure 2

Description

The present invention relates to an analysis apparatus, an analysis method, and a program.

Conventionally, techniques are known that automatically learn the relevance of data from past search behavior and use what has been learned to assist future search behavior (see, for example, Patent Document 1). Techniques are also known that use a neural network to perform feature extraction and machine learning on the extracted features in a single step (see, for example, Patent Document 2).

Patent Document 1: JP 2006-285982 A
Patent Document 2: JP 2010-519642 A

However, with the conventional techniques, the accuracy of learning is not always sufficient.

The present invention has been made in view of the above problem, and an object thereof is to provide an analysis apparatus, an analysis method, and a program that can improve the accuracy of learning.

One aspect of the present invention is an analysis apparatus including a learning processing unit and a classification processing unit. The learning processing unit trains a classification system, in which independently configurable classifiers are arranged in multiple layers, so that second data output from a first classifier included in the classification system when first data is input approaches first teacher data. After that, the learning processing unit trains the classification system, based on the learning result obtained with the first teacher data, so that third data output from a second classifier in a deeper layer than the first classifier when the first data is input approaches second teacher data. The classification processing unit classifies unlearned data into a predetermined category based on the classification system trained by the learning processing unit.

According to one aspect of the present invention, it is possible to provide an analysis apparatus, an analysis method, and a program that can improve the accuracy of learning.

FIG. 1 is a diagram showing an example of an analysis system 1 including an analysis apparatus 100 according to the embodiment.
FIG. 2 is a diagram showing an example of the configuration of the analysis apparatus 100 according to the embodiment.
FIG. 3 is a diagram showing an example of action history information 132.
FIG. 4 is a diagram showing an example of a questionnaire.
FIG. 5 is a diagram showing an example of questionnaire information 134.
FIG. 6 is a diagram showing an example of analysis information 136.
FIG. 7 is a diagram showing an example of DNN configuration information 138.
FIG. 8 is a diagram schematically showing a deep neural network DNN generated based on the DNN configuration information 138.
FIG. 9 is a flowchart showing an example of processing executed by a learning processing unit 114.
FIG. 10 is a diagram showing an example of layer-by-layer parameter information 140.
FIG. 11 is a diagram schematically showing the content of the loop processing in the flowchart shown in FIG. 9.
FIG. 12 is a flowchart showing an example of processing executed by a classification processing unit 116.
FIG. 13 is a diagram showing the learning accuracy of the learning method for the deep neural network DNN in the present embodiment and of a method given as a comparative example.
FIG. 14 is a diagram showing an example of the hardware configuration of the analysis apparatus 100 according to the embodiment.

Hereinafter, an analysis apparatus, an analysis method, and a program to which the present invention is applied will be described with reference to the drawings.

[Overview]
The analysis apparatus is implemented by one or more processors. The analysis apparatus first trains a classification system, in which independently configurable classifiers are arranged in multiple layers, so that output data h (an example of second data) output from a first classifier included in the classification system when certain input data v (an example of first data) is input approaches first teacher data. It then trains the classification system, based on the learning result obtained with the first teacher data, so that output data h (an example of third data) output from a second classifier in a deeper layer than the first classifier when the input data v is input approaches second teacher data.

For example, a “classifier” is a neural network, and the “classification system” is a deep neural network DNN composed of a plurality of neural networks. In this case, the “first classifier” is the neural network at any one of the plurality of layers included in the hidden layer of the deep neural network DNN, and the “second classifier” is the neural network at a layer that is at least deeper than the layer of the neural network serving as the first classifier. A “deeper layer” means a layer closer to the output layer described later. Each of the “first classifier” and the “second classifier” may refer to the neural network of a single layer or to the neural networks of a plurality of layers.

The analysis apparatus then classifies unlearned data into a predetermined category based on the trained classification system. This improves the accuracy of learning. In the following, as an example, the classification system is described as a deep neural network DNN composed of a plurality of neural networks.
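The two-stage procedure outlined in this overview can be illustrated with a minimal NumPy sketch. It is not the embodiment's implementation: the toy data, the single sigmoid layer per stage, the mean-squared-error objective, the decision to keep the first layer fixed during the second stage, and the names `train_layer`, `v`, `t1`, and `t2` are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_layer(x, target, n_steps=300, lr=0.5, seed=0):
    """Fit one sigmoid layer so that sigmoid(x @ W + b) approaches `target`.

    Hypothetical helper: plain gradient descent on mean squared error.
    Returns the weights, the bias, and the loss recorded at every step.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(x.shape[1], target.shape[1]))
    b = np.zeros(target.shape[1])
    losses = []
    for _ in range(n_steps):
        h = sigmoid(x @ W + b)
        err = h - target
        losses.append(float(np.mean(err ** 2)))
        # Gradient of the squared error w.r.t. the pre-activation
        # (constant factors are folded into the learning rate).
        grad_z = err * h * (1.0 - h) / len(x)
        W -= lr * (x.T @ grad_z)
        b -= lr * grad_z.sum(axis=0)
    return W, b, losses

rng = np.random.default_rng(42)
# Toy input data v: 40 users x 8 sparse binary behaviour features (assumed).
v = (rng.random((40, 8)) < 0.25).astype(float)
# Toy first teacher data t1: 3 values in (0, 1) per user (assumed).
t1 = sigmoid(v @ rng.normal(size=(8, 3)))
# Toy second teacher data t2: one-hot category derived from t1 (assumed).
mean_t1 = t1.mean(axis=1)
labels = (mean_t1 > np.median(mean_t1)).astype(int)
t2 = np.eye(2)[labels]

# Stage 1: train the shallower classifier so that its output approaches t1.
W1, b1, losses1 = train_layer(v, t1)
# Stage 2: reuse the stage-1 result and train the deeper classifier so
# that its output approaches t2 (the first layer is kept fixed here).
h1 = sigmoid(v @ W1 + b1)
W2, b2, losses2 = train_layer(h1, t2, seed=1)

print(f"stage 1 loss: {losses1[0]:.4f} -> {losses1[-1]:.4f}")
print(f"stage 2 loss: {losses2[0]:.4f} -> {losses2[-1]:.4f}")
```

Keeping the first layer fixed in stage 2 is only the simplest way to build on the stage-1 result; a variant could equally initialize from `W1` and `b1` and continue to update them.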

[Overall configuration]
FIG. 1 is a diagram showing an example of an analysis system 1 including an analysis apparatus 100 according to the embodiment. The analysis system 1 in the embodiment includes, for example, one or more terminal devices 10, a service providing device 20, a log acquisition device 30, an advertisement distribution server device 40, and the analysis apparatus 100. These devices are connected via a network NW. Some or all of the devices included in the analysis system 1 may be integrated into a single analysis apparatus 100.

Each device shown in FIG. 1 transmits and receives various kinds of information via the network NW. The network NW includes, for example, radio base stations, Wi-Fi access points, communication lines, providers, and the Internet. Not every combination of the devices shown in FIG. 1 needs to be able to communicate with each other, and the network NW may partly include a local network.

The terminal device 10 is a device used by a user. The terminal device 10 is, for example, a mobile phone such as a smartphone, a tablet terminal, or a computer device such as a personal computer. For example, on receiving a predetermined operation from the user, the terminal device 10 may communicate with the service providing device 20 via a preinstalled application and acquire content to be displayed or played back in the application. The content is, for example, video data, image data, audio data, or text data. The application may be, for example, one that gives access to Internet shopping or a search service, or one that gives access to an SNS (Social Networking Service), a mail service, an information providing service (for example, news or weather forecasts), or the like.

On receiving a predetermined operation from the user, the terminal device 10 may also access a website provided by the service providing device 20 via a predetermined web browser. The website provided by the service providing device 20 offers, for example, services similar to those provided by the various applications described above.

The service providing device 20 may be a web server device that provides websites such as shopping sites and search sites on the Internet, or it may be an application server device that communicates with a terminal device 10 on which an application has been started and exchanges various kinds of information with it.

The log acquisition device 30 acquires the action history of a user of the terminal device 10 by collecting, from the terminal device 10 or the service providing device 20, items such as cookies and caches managed for each web browser and log files of HTTP (Hypertext Transfer Protocol) requests. The action history is, for example, a purchase history in Internet shopping (the type, price, and quantity of purchased products or services), a search history on search sites (the search engine used and the search queries entered), a viewing history on video sites (the type of video and the viewing time), or HTTP referers (a history of URLs) that identify which web page an access came from.

In the same way as for the user's action history on the Internet via a web browser, the log acquisition device 30 may also acquire, as the user's action history, the purchase history, search history, viewing history, and the like for each application started on the terminal device 10.

In addition, when an application started on the terminal device 10 provides services that use the position information of the terminal device 10 (for example, positioning coordinates from GPS (Global Positioning System)), such as introducing nearby physical stores or providing weather information for the area, the log acquisition device 30 may acquire the position information accumulated in the terminal device 10 as the user's action history.

The user action history acquired by the log acquisition device 30 may be kept per terminal device 10, per account at the OS (Operating System) level on a single terminal device 10, or per account at the web browser or application level.

Instead of, or in addition to, acquiring the user's action history from the service providing device 20, the log acquisition device 30 may acquire the user's action history from another service providing device (not shown). The log acquisition device 30 transmits information indicating the acquired user action history to the analysis apparatus 100.

The advertisement distribution server device 40 distributes advertisements, for example based on the analysis result from the analysis apparatus 100, as the service providing device 20 provides services to the terminal device 10. For example, when the terminal device 10 receives from the service providing device 20 a web page containing an advertising slot, or an equivalent style sheet for an application, it transmits an advertisement distribution request to the advertisement distribution server device 40 based on an ad tag or SDK (Software Development Kit) provided in advance in the advertising slot. In response, the advertisement distribution server device 40 transmits to the terminal device 10, as the response to the advertisement distribution request, an advertisement image (for example, an image with an embedded URL) through which the website of an advertiser under prior contract can be accessed. The advertisement image is thereby displayed on the screen of the terminal device 10.

Using the deep neural network DNN, the analysis apparatus 100 classifies the user corresponding to an action history into a predetermined category based on the user action history acquired by the log acquisition device 30. As described above, the deep neural network DNN is a neural network including at least two layers. The analysis apparatus 100 then provides the per-user classification result to, for example, the advertisement distribution server device 40. This allows the advertisement distribution server device 40 to change the advertisements it distributes according to the assigned category.

[Configuration of the analysis apparatus]
The configuration of the analysis apparatus 100 is described below with reference to the drawings. FIG. 2 is a diagram showing an example of the configuration of the analysis apparatus 100 according to the embodiment. As illustrated, the analysis apparatus 100 includes, for example, a reception unit 102, a display unit 104, a communication unit 106, a control unit 110, and a storage unit 130.

The reception unit 102 is an input interface, such as a touch panel, a keyboard, or a mouse, that accepts operation input from a user. The display unit 104 is a display device such as a liquid crystal display.

The communication unit 106 includes, for example, a communication interface such as a NIC (Network Interface Card) and a DMA (Direct Memory Access) controller. The communication unit 106 communicates with the log acquisition device 30, the advertisement distribution server device 40, and the like via the network NW. For example, the communication unit 106 receives a user's action history from the log acquisition device 30 and stores it in the storage unit 130 as action history information 132.

FIG. 3 is a diagram showing an example of the action history information 132. As in the illustrated example, the action history information 132 associates, with each time, items such as the address corresponding to the position of the terminal device 10, the name of the search engine used together with the search query entered into it, or the title of a browsed website together with its URL. The action history information 132 may also associate, with each time, the name of a purchased product or service and the URL of the website where it was purchased. Naturally, a user does not take every conceivable action with respect to a given matter; the user usually takes one action or a few, and at most several tens to several hundreds. For example, it is extremely rare for a single user to purchase every product sold on a shopping site; normally the user purchases only a few of them. The user action history indicated by the action history information 132 therefore tends to be sparse data that carries little information, in the sense that it does not exhaustively cover every conceivable action.
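The sparsity described above can be made concrete with a small Python sketch. The catalogue size, the item names, and the purchase list are hypothetical and chosen only to show how few entries of a per-user vector are non-zero.

```python
# Hypothetical catalogue of every product a user could purchase.
catalogue = [f"item_{i:03d}" for i in range(1000)]
index = {name: i for i, name in enumerate(catalogue)}

# Hypothetical purchase history of one user: only a handful of items.
purchases = ["item_007", "item_120", "item_121", "item_845"]

# Sparse representation: store only the positions that are non-zero.
sparse_vector = {index[name]: 1 for name in purchases}

def to_dense(sparse, size):
    """Expand the sparse mapping into a dense 0/1 list of length `size`."""
    dense = [0] * size
    for position, value in sparse.items():
        dense[position] = value
    return dense

dense_vector = to_dense(sparse_vector, len(catalogue))
density = sum(dense_vector) / len(dense_vector)
print(f"non-zero entries: {sum(dense_vector)} of {len(dense_vector)} ({density:.1%})")
```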

The control unit 110 includes, for example, an acquisition unit 112, a learning processing unit 114, and a classification processing unit 116. These components are implemented, for example, by a processor such as a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit) executing a program stored in the storage unit 130. Some or all of the components of the control unit 110 may instead be implemented by hardware such as an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array), or by software and hardware working together.

The storage unit 130 is implemented by, for example, an HDD (Hard Disc Drive), flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), ROM (Read Only Memory), or RAM (Random Access Memory). In addition to various programs such as firmware and application programs and the action history information 132 described above, the storage unit 130 stores questionnaire information 134, analysis information 136, DNN configuration information 138, layer-by-layer parameter information 140, and the like. These items are described later.

The acquisition unit 112, for example, communicates with an external server (not shown) using the communication unit 106 to acquire information indicating the results of a questionnaire (survey) stored on that server, and stores it in the storage unit 130 as the questionnaire information 134.

When the service providing device 20 provides a questionnaire monitor site on which questionnaires can be answered, or provides a questionnaire monitor service via an application, the acquisition unit 112 may acquire the questionnaire information 134 from the service providing device 20 using the communication unit 106. The questionnaire information 134 may also be stored in the storage unit 130 in advance.

In addition, when, for example, an administrator who manages the analysis apparatus 100 enters questionnaire results into the reception unit 102, the acquisition unit 112 may use the information entered into the reception unit 102 as the questionnaire information 134.

FIG. 4 is a diagram showing an example of a questionnaire. As in the illustrated example, the questionnaire provides predetermined answer options for standardized survey items (Q1 to Q4 and so on in the figure). Five options are provided in the illustrated example, but the number is not limited to this; there may be two, three, or four options, or six or more.

FIG. 5 is a diagram showing an example of the questionnaire information 134. As in the illustrated example, the questionnaire information 134 associates, with each survey item, a score corresponding to the option selected by the user. It is assumed that the users in the questionnaire information 134 and the users in the action history information 132 overlap at least in part. In other words, the action history of each user who answered the questionnaire is assumed to have been acquired as action history information 132.

For example, in the questionnaire described above, answer options that are more affirmative toward a survey item are assigned higher scores. Affirmative options such as “strongly agree” or “somewhat agree” are assigned higher scores than negative options such as “disagree” or “strongly disagree”. This relationship is only an example; more negative answer options may instead be assigned higher scores.
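The scoring convention described above might be sketched as follows. The wording of the middle option, the numeric values 1 to 5, and the function name `score_answers` are assumptions for illustration; the source only states that more affirmative options receive higher scores and that the relationship may be reversed.

```python
# Assumed 5-point scale; labels and numeric values are hypothetical.
SCORES = {
    "strongly agree": 5,
    "somewhat agree": 4,
    "neither agree nor disagree": 3,
    "disagree": 2,
    "strongly disagree": 1,
}

def score_answers(answers, reverse=False):
    """Map each survey item's selected option to a score.

    With reverse=True the scale is flipped, so that negative options
    receive the higher scores instead.
    """
    top = max(SCORES.values()) + min(SCORES.values())
    return {
        item: (top - SCORES[option]) if reverse else SCORES[option]
        for item, option in answers.items()
    }

# Hypothetical answers of one user to survey items Q1 to Q3.
answers = {"Q1": "strongly agree", "Q2": "disagree", "Q3": "somewhat agree"}
scores = score_answers(answers)
reversed_scores = score_answers(answers, reverse=True)
print(scores)
print(reversed_scores)
```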

The acquisition unit 112 also acquires information indicating the result of analyzing the questionnaire described above and stores it in the storage unit 130 as the analysis information 136. For example, when the device from which the questionnaire information 134 is acquired performs the analysis, the acquisition unit 112 may acquire the analysis information 136 from that device. When, for example, an analyst who analyzes the questionnaire results enters the analysis result into the reception unit 102, the acquisition unit 112 may use the information entered into the reception unit 102 as the analysis information 136.

For example, the questionnaire results are analyzed with the following matters in mind: (1) how each survey item in the questionnaire was designed, (2) which analysis method is used for quantitative evaluation, and (3) how the users who answered the questionnaire are to be classified as a result of the analysis. For (1), it is necessary to consider whether each survey item is answered by selecting an option or by free description and, if options are used, whether they are expressed in words or as numerical values. For (2), it is necessary to consider, for example by performing factor analysis on the scores corresponding to the users' answers, which factors contribute to the characteristics of the users who answered the questionnaire and how much each factor contributes to those characteristics. When principal component analysis is performed instead of factor analysis, it is necessary to consider which of the variables that are likely to represent the users' characteristics become principal components. For (3), it is necessary to consider, for the final analysis result, which users are regarded as having the same tendency based on the characteristics of the users who answered the questionnaire, and whether each user is assigned to one category or to several. In this way, an analyst or the like who analyzes the questionnaire results needs to carry out an analysis suited to the purpose for which the questionnaire was conducted and derive parameters that serve as statistical classification indices for classifying the users who answered the questionnaire at a certain granularity. Such parameters are treated as hyperparameters that the analyst or the like needs to set appropriately.

For example, when the categories “high annual income” and “low annual income” are used as the hyperparameters for (3), the hyperparameters (analysis indices) for (2) may be factors common to the users with high annual income, factors common to the users with low annual income, and so on, among the users who answered the questionnaire.

FIG. 6 is a diagram showing an example of the analysis information 136. The illustrated example shows the result of a factor analysis performed on the questionnaire results with matters (1) to (3) above taken into account. For example, the analysis information 136 associates a factor score for each user who answered the questionnaire with each factor considered in the factor analysis (FAC1 and so on in the figure). A factor score is, for example, an index value representing the degree of correlation between a factor and a user's characteristics. In the illustrated example, for factor FAC1, the factor score of user A is 1.113, that of user B is 0.380, and that of user C is 0.397. These factor scores may be calculated, for example, based on the scores corresponding to the answer options that the user selected from among the options provided for the survey items of the questionnaire.

As in the illustrated example, the analysis information 136 may also indicate a classification result showing the category into which each user who answered the questionnaire was classified. The category to which a user is assigned may be derived by considering several classification indices, such as the factor scores described above, together. For example, the classification may be such that a user whose factor scores for FAC1 to FAC4 are all positive is assigned to type 1, a user with exactly one negative factor score to type 2, a user with exactly two negative factor scores to type 3, and a user with three or more negative factor scores to type 4.
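The example rule above (counting the negative factor scores among FAC1 to FAC4) might be written as the following Python sketch. The function name `categorize` is hypothetical, and the treatment of a score of exactly zero as non-negative is an assumption, since the text does not address that case. The FAC1 values for users A to C are the ones quoted for FIG. 6; the FAC2 to FAC4 values are made up for illustration.

```python
def categorize(factor_scores):
    """Return the category type (1 to 4) from the number of negative scores.

    0 negative scores -> type 1, 1 -> type 2, 2 -> type 3, 3 or more -> type 4.
    A score of exactly zero is treated as non-negative (assumption).
    """
    negatives = sum(1 for score in factor_scores if score < 0)
    return min(negatives + 1, 4)

# FAC1 values as quoted for FIG. 6; FAC2 to FAC4 values are hypothetical.
users = {
    "A": [1.113, 0.52, 0.07, 0.91],
    "B": [0.380, -0.44, 0.15, 0.63],
    "C": [0.397, -1.02, -0.31, -0.08],
}

categories = {name: categorize(scores) for name, scores in users.items()}
print(categories)
```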

The learning processing unit 114 generates (constructs) a deep neural network DNN to be trained based on the DNN configuration information 138, and performs pre-training on this deep neural network DNN using the action history information 132, the questionnaire information 134, and the analysis information 136 described above.

FIG. 7 is a diagram illustrating an example of the DNN configuration information 138. As in the illustrated example, the DNN configuration information 138 associates each layer with a number of units, which represents the number of neural networks in that layer. In general, the deep neural network DNN is composed of three kinds of layers: an input layer, hidden layers (intermediate layers), and an output layer. Data to be learned by the deep neural network DNN is input to the input layer. The result learned by the deep neural network DNN is output from the output layer. The hidden layers perform the processing that forms the core of learning. For example, a hidden layer is expressed by a function called an activation function (transfer function) and returns an output corresponding to its input. The activation function is, for example, a rectified linear function (ReLU function), a sigmoid function, or a step function, but is not limited to these, and any function may be used. A neural network in the input layer is an example of an "input device", and a neural network in the output layer is an example of an "output device".

The DNN configuration information 138 defines the number of units in each of these layers. The illustrated example indicates that the input layer has one unit and the output layer has s units. There are at least two hidden layers; in the illustrated example there are n hidden layers (n ≥ 2), each with r units. The number of units need not be the same r for every hidden layer and may differ from layer to layer. The number of hidden layers and the number of units in each layer are hyperparameters and may be changed arbitrarily. For example, when analysis of the questionnaire results leads to a decision to classify users into five categories, the number of units in the output layer may be set to five. The number of units in the hidden layers may also be increased to improve learning accuracy.

By referring to such DNN configuration information 138, the learning processing unit 114 generates, at the time of pre-training, a deep neural network DNN whose configuration is determined in advance.
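As a rough illustration of generating a network from per-layer unit counts, the following Python sketch builds one weight matrix and one bias vector per pair of adjacent layers. The dictionary layout, the function name `build_dnn`, the small random initial weights, and the concrete unit counts are all assumptions made for the example; the description only states that the configuration gives a unit count for each layer.

```python
import random


def build_dnn(config, seed=0):
    """Create weights and biases for each pair of adjacent layers.

    config is a hypothetical layout:
    {"input": units, "hidden": [units per hidden layer], "output": units}
    """
    rng = random.Random(seed)
    sizes = [config["input"]] + list(config["hidden"]) + [config["output"]]
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        # W has shape (n_in, n_out) so that W^T v yields n_out values.
        W = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
        b = [0.0] * n_out
        layers.append({"W": W, "b": b})
    return layers


# One input unit, n = 2 hidden layers of r = 4 units, s = 5 output units.
dnn = build_dnn({"input": 1, "hidden": [4, 4], "output": 5})
```

Changing the hyperparameters then only means changing the lists in the configuration, not the construction code.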

FIG. 8 is a diagram schematically showing the deep neural network DNN generated based on the DNN configuration information 138. As in the illustrated example, in the deep neural network DNN, edges modeled on a neural transmission network connect one or more units (neural networks) of the input layer to each of the plurality of units in the shallowest of the n hidden layers. The "shallowest layer" is the layer closest to the input layer among the layers included in the hidden layers. In the present embodiment, being close to the input layer side is expressed as "shallow", and being close to the output layer side is expressed as "deep". In the example described above, the shallowest layer is the first layer. From each unit in the first layer, edges are connected to each of the plurality of units in the next shallowest hidden layer after the first layer (hereinafter referred to as the second layer). From each unit in the second layer, edges are connected to each of the plurality of units in the next shallowest hidden layer after the second layer (hereinafter referred to as the third layer). Likewise, units in the third and subsequent layers are connected by edges to units in deeper layers. From each unit in the deepest, n-th layer, edges are connected to one or more units of the output layer. In this way, the deep neural network DNN is expressed as a state probability model in which units of the respective layers are connected by edges, as in a restricted Boltzmann machine (RBM), for example. The deep neural network DNN may also be expressed as a state probability model in which units belonging to the same layer are additionally connected by edges, as in an unrestricted Boltzmann machine.

As pre-training, the learning processing unit 114 initializes the generated deep neural network DNN by training the hidden layers stage by stage.

[Pre-training]
Hereinafter, the pre-training process is described with reference to a flowchart. FIG. 9 is a flowchart illustrating an example of processing executed by the learning processing unit 114.

First, the learning processing unit 114 determines the i-th layer of the n hidden layers as the processing target layer (S100). Here, i is a temporary parameter computed during processing and takes a natural number in the range from 1 to n.

For example, the learning processing unit 114 determines the first layer (i = 1) as the processing target layer. The learning processing unit 114 may instead determine a layer other than the first layer, such as the second or third layer, as the processing target layer. In the following description, the first layer is determined as the processing target layer first.

Next, the learning processing unit 114 extracts, from the generated deep neural network DNN, the input layer and the output layer, all layers determined as processing target layers in the processing up to the previous iteration, and the i-th layer determined as the processing target layer in the current iteration (S102). In the first iteration, for example, the learning processing unit 114 extracts the input layer, the output layer, and the first layer from the deep neural network DNN.

Next, in the neural network consisting of the extracted layers, the learning processing unit 114 inputs to the input layer the action history information 132 indicating the action history of a user who answered the questionnaire (S104). In doing so, the learning processing unit 114 may convert the user's action history into information that a processor can process and use the result as input data v for the input layer.

For example, the learning processing unit 114 refers to the action history information 132 to derive the number of views of a web page, the number of times a predetermined search query was entered, and the like, and uses these as the input data v. The learning processing unit 114 may also vectorize character strings (which may include letters, digits, symbols, and the like) associated with individual action history entries, such as the product code of a purchased product or the URL of a browsed site, and use the resulting vector as the input data v for the input layer. The learning processing unit 114 derives the output data h that will be output from the output layer, based on the activation function of each unit in the extracted hidden layers. The output data h can be expressed, for example, by the following formula (1).

h = σ(W^T v + b)   … (1)

Here, σ represents the activation function of the units in each layer, W represents the weights applied to output data when data is passed from a unit in one layer to a unit in a deeper layer, and b represents the bias component specific to each layer.
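Formula (1) can be made concrete with a minimal Python sketch for a single layer. The function names are hypothetical, the sigmoid is just one of the activation functions the description allows, and W is assumed to be stored with one row per input component so that W^T v has one entry per unit of the layer.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(W, v, b, activation=sigmoid):
    """Compute h = activation(W^T v + b); W has shape (len(v), len(b))."""
    return [
        activation(sum(W[i][j] * v[i] for i in range(len(v))) + b[j])
        for j in range(len(b))
    ]


# Two inputs feeding two units, using the identity function to expose W^T v + b.
pre_activation = layer_output(
    [[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0], [0.5, -0.5], activation=lambda x: x
)
```

With the identity activation, `pre_activation` is `[4.5, 5.5]`, that is, (1 + 3 + 0.5) and (2 + 4 − 0.5).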

Next, the learning processing unit 114 extracts data to be used as teacher data t from the questionnaire information 134 and the analysis information 136, according to the depth of the layer selected as the processing target in the deep neural network DNN (S106). The teacher data t is the data that serves as the reference for the output data output from the output layer in the neural network consisting of the extracted layers (in the example above, a three-layer neural network consisting of the input layer, the first layer, and the output layer).

For example, the deeper the processing target layer, the more advanced the stage of questionnaire analysis from which the learning processing unit 114 takes the index value extracted as the teacher data t. For example, a state in which multivariate analysis has been performed on the questionnaire results can be regarded as a more advanced stage of analysis than a state in which it has not. Likewise, a state in which the categories for classifying users have been determined from the results of the multivariate analysis can be regarded as a still more advanced stage of analysis than the state immediately after the multivariate analysis.

On this premise, when the first layer is the processing target, for example, the learning processing unit 114 refers to the questionnaire information 134, which is raw data that has not been analyzed, and extracts as the teacher data t the scores of the options selected in answering the questionnaire. When the second layer is the processing target, for example, the learning processing unit 114 refers to the analysis information 136, which is data on which factor analysis has been performed, and extracts the factor scores as the teacher data t. When the third layer is the processing target, for example, the learning processing unit 114 refers to the analysis information 136 and extracts as the teacher data t a numerical value indicating the category determined from the factor scores. That is, the learning processing unit 114 uses as the teacher data t the hyperparameter of item (1) considered in analyzing the questionnaire results when the first layer is the processing target, the hyperparameter of item (2) when the second layer is the processing target, and the hyperparameter of item (3) when the third layer is the processing target. In this way, the learning processing unit 114 uses mutually different types of data (index values) as the teacher data t for the respective layers. As a result, the knowledge gained as the analysis progresses can be learned stage by stage by the neural networks in the respective hidden layers.
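The depth-dependent choice of teacher data for the three layers named above might be sketched as follows. The dictionary keys, the function name, and the sample values are hypothetical stand-ins for the questionnaire information 134 and the analysis information 136; the description does not specify how these records are stored, nor which data would be used for layers deeper than the third.

```python
def select_teacher_data(layer_index, questionnaire, analysis):
    """Pick teacher data t for the hidden layer being trained (1-based index)."""
    if layer_index == 1:
        # Raw, unanalyzed answers: scores of the selected options.
        return list(questionnaire["option_scores"])
    if layer_index == 2:
        # Result of the factor analysis: factor scores.
        return list(analysis["factor_scores"])
    if layer_index == 3:
        # Category derived from the factor scores, as a numerical value.
        return [float(analysis["category"])]
    raise ValueError("teacher data is only described for layers 1 to 3")


# Hypothetical records for one user.
questionnaire_a = {"option_scores": [3.0, 1.0, 4.0]}
analysis_a = {"factor_scores": [1.113, -0.2, 0.5, 0.3], "category": 2}
t3 = select_teacher_data(3, questionnaire_a, analysis_a)
```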

Next, in the neural network consisting of the extracted layers, the learning processing unit 114 derives, as an evaluation function I, the error between the output data h output from the output layer and the teacher data t extracted according to the depth of the processing target layer (S108). For example, the evaluation function I can be expressed as (1/2) × (h − t)^2. The evaluation function I may be the sum of the errors (squared errors) between the data output from each of the units in the output layer and the teacher data t.
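The summed form of the evaluation function can be written as a short Python function. The name `evaluation` and the sample vectors are hypothetical.

```python
def evaluation(h, t):
    """I = (1/2) * sum over output units of (h_j - t_j)^2."""
    return 0.5 * sum((hj - tj) ** 2 for hj, tj in zip(h, t))


# Two output units: 0.5 * (1.0**2 + 2.0**2)
error = evaluation([1.0, 2.0], [0.0, 0.0])
```

For these values `error` is 2.5, and the function returns 0.0 when h equals t.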

Next, the learning processing unit 114 uses error backpropagation to determine the weight W and the bias b of each layer so as to minimize the evaluation function I (S110). For example, the learning processing unit 114 derives the gradient ∂I/∂W of the evaluation function I by computing the partial derivative of I with respect to the weight W, and determines the weight W and the bias b based on the gradient ∂I/∂W.
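One common way to act on the gradient ∂I/∂W is a plain gradient-descent step, sketched below for a single sigmoid layer. The description does not state the update rule, so the learning rate `lr`, the use of gradient descent, and all function names are assumptions for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(W, b, v):
    """h = sigmoid(W^T v + b) for one layer."""
    return [sigmoid(sum(W[i][j] * v[i] for i in range(len(v))) + b[j])
            for j in range(len(b))]


def evaluation(h, t):
    return 0.5 * sum((hj - tj) ** 2 for hj, tj in zip(h, t))


def gradient_step(W, b, v, t, lr=0.5):
    """Return (W, b) after one gradient-descent update on I."""
    h = forward(W, b, v)
    # delta_j = dI/da_j with a = W^T v + b, using sigmoid'(a_j) = h_j * (1 - h_j)
    delta = [(h[j] - t[j]) * h[j] * (1.0 - h[j]) for j in range(len(b))]
    # dI/dW[i][j] = v[i] * delta[j] and dI/db[j] = delta[j]
    new_W = [[W[i][j] - lr * v[i] * delta[j] for j in range(len(b))]
             for i in range(len(v))]
    new_b = [b[j] - lr * delta[j] for j in range(len(b))]
    return new_W, new_b


W0, b0, v0, t0 = [[0.0, 0.0]], [0.0, 0.0], [1.0], [1.0, 0.0]
W1, b1 = gradient_step(W0, b0, v0, t0)
```

After the step, the weight toward the unit whose target is 1 has increased and the weight toward the unit whose target is 0 has decreased, so the evaluation function is smaller than before.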

Next, the learning processing unit 114 associates the weight W and the bias b of each layer with the processing target layer, and stores the associated information in the storage unit 130 as layer-by-layer parameter information 140 (S112).

FIG. 10 is a diagram illustrating an example of the layer-by-layer parameter information 140. As in the illustrated example, the layer-by-layer parameter information 140 associates each processing target layer with the weight W and the bias b determined for it.

Next, the learning processing unit 114 determines whether all of layers 1 to n of the generated deep neural network DNN have been determined as processing target layers (S114).

If not all layers have been determined as processing target layers, the learning processing unit 114 adds 1 to the temporary parameter i (S116) and returns the processing to S100 described above. As a result, a layer different from any layer determined so far is determined as the processing target layer. In the example above, the first layer is determined as the processing target layer in the first iteration, so in the second iteration the second layer, for example, is determined as the processing target layer.

On the other hand, if all layers have been determined as processing target layers, the learning processing unit 114 ends the processing of this flowchart.

In the flowchart processing described above, the learning processing unit 114 determines a single i-th layer among the n hidden layers as the processing target layer in S100, but the processing is not limited to this. For example, the learning processing unit 114 may determine a plurality of the n hidden layers (for example, the first and second layers) together as the processing target layers. In this case, the learning processing unit 114 may learn the parameters of the plurality of layers determined as processing targets together.

FIG. 11 is a diagram schematically showing the loop processing of the flowchart shown in FIG. 9. In the example of FIG. 10, each layer is assumed for simplicity to have a single unit, and only the weight W is shown as the parameter of each layer. For example, in the first iteration, the input layer, the first hidden layer, and the output layer are extracted. The input data v input to the input layer is, for example, data obtained by quantifying a user's action history. In this case, to determine the first-layer weight W_1, first teacher data t_1 corresponding to the depth of the first layer is extracted. As described above, the scores of the options selected in answering the questionnaire may be used as the first teacher data t_1, for example. Here, the questionnaire results are assumed to be the answers given by the user who performed the actions in the action history input as the input data v. For example, when the action history of user A is input to the input layer, the questionnaire results of user A are used as the first teacher data t_1.

In the second iteration, for example, the input layer, the first and second hidden layers, and the output layer are extracted. As the input data v input to the input layer, data obtained by quantifying the user's action history is used, as in the first iteration. For example, if data quantifying the action history of a user A was input to the input layer in the first iteration, the input data v in the second iteration is also data quantifying the action history of user A. In the second iteration, the learning processing unit 114 determines the weight W_2 of the newly added second layer so as to reduce the error between second teacher data t_2 and the output data h output from the output layer, while maintaining the first-layer parameters determined in the first iteration. The second teacher data t_2 is data corresponding to the depth of the second layer; for example, the factor scores obtained by factor analysis may be used, as described above. For example, in the second iteration the learning processing unit 114 maintains the first-layer parameters by setting the first-layer weight W_1# to the weight W_1 determined in the first iteration.

In the third iteration, for example, the input layer, the first to third hidden layers, and the output layer are extracted. As the input data v input to the input layer, the data input to the input layer in the first and second iterations is used. In the third iteration, the learning processing unit 114 determines the weight W_3 of the newly added third layer so as to reduce the error between third teacher data t_3 and the output data h output from the output layer, while maintaining the first-layer parameters determined in the first iteration and the second-layer parameters determined in the second iteration. The third teacher data t_3 is data corresponding to the depth of the third layer; for example, a numerical value indicating the category determined from the factor scores may be used, as described above. For example, in the third iteration the learning processing unit 114 maintains the parameters of the first and second layers by setting the first-layer weight W_1# to the weight W_1 determined in the first iteration and the second-layer weight W_2# to the weight W_2 determined in the second iteration.

In this way, by adding a hidden layer with each iteration and using data corresponding to the depth of the added hidden layer as teacher data, each hidden layer of the deep neural network DNN can be trained accurately, one layer at a time. In addition, because the parameters of each layer determined in a previous iteration are maintained in the next iteration, the occurrence of internal covariate shift can be suppressed. This prevents the learning time from growing longer because of internal covariate shift. If internal covariate shift is tolerated, the learning processing unit 114 may re-determine the parameters determined in a previous iteration through the error backpropagation of the current iteration. In this case, the learning processing unit 114 may update the parameters of each layer in the layer-by-layer parameter information 140.
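The loop of S100 to S116, with earlier layers kept fixed, can be sketched end to end in plain Python. This is a toy illustration under several assumptions that are not in the description: sigmoid activations throughout, online gradient descent with a fixed learning rate, a fresh output layer for each stage, teacher data scaled into the range (0, 1), and hypothetical names and sample values.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(W, b, x):
    """h = sigmoid(W^T x + b), with W of shape (len(x), len(b))."""
    return [sigmoid(sum(W[i][j] * x[i] for i in range(len(x))) + b[j])
            for j in range(len(b))]


def init_layer(n_in, n_out, rng):
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    return W, [0.0] * n_out


def sgd_update(W, b, x, delta, lr):
    """In-place gradient step for one layer, given its input x and delta = dI/da."""
    for j in range(len(b)):
        for i in range(len(x)):
            W[i][j] -= lr * x[i] * delta[j]
        b[j] -= lr * delta[j]


def pretrain(v_samples, teachers_per_stage, n_hidden=3, epochs=500, lr=0.5, seed=0):
    """Greedy layer-wise pre-training; layers from earlier stages stay fixed."""
    rng = random.Random(seed)
    frozen = []    # parameters of layers 1..i-1
    history = []   # (first-epoch loss, last-epoch loss) for each stage
    for teachers in teachers_per_stage:               # S100: layer i = 1, 2, ...
        n_in = n_hidden if frozen else len(v_samples[0])
        Wi, bi = init_layer(n_in, n_hidden, rng)      # newly added hidden layer i
        Wo, bo = init_layer(n_hidden, len(teachers[0]), rng)  # output layer
        first = last = None
        for _ in range(epochs):
            total = 0.0
            for v, t in zip(v_samples, teachers):
                x = v
                for W, b in frozen:                   # S102/S104: fixed layers
                    x = dense(W, b, x)
                a = dense(Wi, bi, x)
                h = dense(Wo, bo, a)
                total += 0.5 * sum((hj - tj) ** 2 for hj, tj in zip(h, t))  # S108
                # S110: backpropagate through the output layer and layer i only
                d_out = [(h[j] - t[j]) * h[j] * (1 - h[j]) for j in range(len(h))]
                d_hid = [a[k] * (1 - a[k])
                         * sum(Wo[k][j] * d_out[j] for j in range(len(h)))
                         for k in range(len(a))]
                sgd_update(Wo, bo, a, d_out, lr)
                sgd_update(Wi, bi, x, d_hid, lr)
            first = total if first is None else first
            last = total
        frozen.append((Wi, bi))                       # S112, then S114/S116
        history.append((first, last))
    return frozen, history


# Toy data (hypothetical): two users, with t_1 and t_2 scaled into (0, 1).
v_samples = [[0.0], [1.0]]
t1 = [[0.2, 0.8], [0.8, 0.2]]
t2 = [[0.1], [0.9]]
layers, history = pretrain(v_samples, [t1, t2])
```

Because the second stage never updates the entries already stored in `frozen`, the first-layer parameters after two stages are identical to those obtained by running the first stage alone. Allowing internal covariate shift, as mentioned above, would correspond to also updating the earlier layers during later stages.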

By referring to the DNN configuration information 138 and the layer-by-layer parameter information 140, the classification processing unit 116 generates (reconstructs) a deep neural network DNN in which the initial values of the weight W and the bias b of each layer have been determined by pre-training, and, based on the action history of an unlearned user, classifies the user corresponding to that action history into a predetermined category. "Unlearned" means, for example, not having been used for learning in the pre-training stage.

FIG. 12 is a flowchart illustrating an example of processing executed by the classification processing unit 116. The processing of this flowchart is repeated at a predetermined cycle, for example.

First, the classification processing unit 116 determines whether new action history information 132 has been acquired by the acquisition unit 112 (S200). If new action history information 132 has been acquired, the classification processing unit 116 generates, based on the DNN configuration information 138 and the layer-by-layer parameter information 140, a deep neural network DNN in which the initial values of the parameters (weight W and bias b) of each layer have been determined (S202).

Next, the classification processing unit 116 quantifies the action history information 132 indicating the action history of a user who answered the questionnaire and inputs it to the input layer of the generated deep neural network DNN (S204).

Next, the classification processing unit 116 outputs the output data h output from the output layer of the deep neural network DNN as the learning result of the deep neural network DNN (S206).

For example, the classification processing unit 116 uses the communication unit 106 to transmit the learning result of the deep neural network DNN to the service providing apparatus 20 and the advertisement distribution server apparatus 40. This allows the service providing apparatus 20, for example, to transmit sale information and the like corresponding to the user category classified by the deep neural network DNN to the terminal apparatus 10 used by the user. Similarly, the advertisement distribution server apparatus 40 can, for example, transmit an advertisement corresponding to the user category classified by the deep neural network DNN to the terminal apparatus 10 used by the user.
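The description does not say how the output data h is turned into a category. One common convention, assumed here purely for illustration, is to give the output layer one unit per category and pick the unit with the largest output. The function name and category labels are hypothetical.

```python
def classify(output_h, categories):
    """Return the category whose output unit has the largest value (assumed rule)."""
    best = max(range(len(output_h)), key=lambda j: output_h[j])
    return categories[best]


labels = ["type 1", "type 2", "type 3", "type 4"]
category = classify([0.10, 0.70, 0.15, 0.05], labels)
```

A downstream service could then look up sale information or advertisements keyed by the returned category.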

FIG. 13 is a diagram showing the learning accuracy of the deep neural network DNN learning method of the present embodiment and of a method given as a comparative example. The comparative method is, for example, a method that trains the parameters of each hidden layer so that the output data h of that layer approaches the input data v of that layer. That is, the comparative method trains the parameters of the i-th hidden layer so that the output data h output from the i-th layer approaches the output data h of the (i−1)-th layer (= the input data v to the i-th layer). In the figure, the top and middle records show the learning accuracy of the comparative method, and the bottom record shows the learning accuracy of the method of the present embodiment.

The top record shows the learning accuracy of the comparative method using a deep neural network DNN with 1 input-layer unit, 1000 first-layer units, 1000 second-layer units, and 5 output-layer units. The middle record shows the learning accuracy of the comparative method using a deep neural network DNN with 1 input-layer unit, 1000 first-layer units, 1000 second-layer units, a predetermined number of X-th-layer units, and 5 output-layer units. The bottom record shows the learning accuracy of the method of the present embodiment using a deep neural network DNN with 1 input-layer unit, 1000 first-layer units, 1000 second-layer units, the same number of Y-th-layer units as the X-th layer described above, and 5 output-layer units.

As the illustrated example shows, comparing the top and middle records reveals that increasing the number of hidden layers does not necessarily improve learning accuracy. In general, a deep neural network DNN is prone to problems such as overfitting and the failure of learning information (errors) to propagate to deeper layers. Known countermeasures include using an enormous amount of data as the input data v (for example, at least tens of thousands to hundreds of thousands of dense samples in the case of image data) and performing pre-training.

However, when data that is sparse as information, such as action histories on the Internet, is used as the input data v, a model that returns appropriate learning results tends not to be produced. In addition, when the training data from which the input data v is derived is sparse, the accuracy of error backpropagation and similar methods also tends to fall. Thus, with the comparative method, learning accuracy tends to depend on the input data v used for pre-training.

In contrast, in the pre-training stage the method of the present embodiment does not train the deep neural network DNN so that the output data h approaches the input data v (that is, it does not use the input data v as the teacher data t). Instead, it uses index values obtained separately from the questionnaire results as the teacher data t, changing them according to the depth of each layer. This makes it possible to improve learning accuracy even when sparse data is used as the input data v.

According to the embodiment described above, the analysis apparatus 100 includes the learning processing unit 114 and the classification processing unit 116. The learning processing unit 114 first trains a classification system in which independently configurable classifiers are arranged in multiple layers (for example, the deep neural network DNN) so that, when the input data v is input, the output data h from a first classifier included in the classification system (for example, the neural network of the first hidden layer) approaches first teacher data t. It then trains the classification system, based on the learning result obtained with the first teacher data, so that, when the same input data v is input, the output data h from a second classifier at a deeper level than the first classifier (for example, the neural network of the second hidden layer) approaches second teacher data. The classification processing unit 116 classifies unlearned data into predetermined categories based on the classification system trained by the learning processing unit 114. With this configuration, the parameters of each classifier can be set appropriately, and as a result learning accuracy can be improved.
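The two-stage learning summarized above can be illustrated with a minimal sketch. It assumes a toy sigmoid network, squared error, plain stochastic gradient descent, and a temporary output layer attached to the hidden layer being trained; the names `Layer`, `train_stage`, `t1`, and `t2`, the layer sizes, and the toy teacher data are all hypothetical and are not taken from the embodiment.

```python
import copy
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Layer:
    """Fully connected sigmoid layer (one 'classifier' in the stack)."""

    def __init__(self, n_in, n_out, rng):
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        self.b = [0.0] * n_out

    def forward(self, x):
        return [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + b)
                for row, b in zip(self.w, self.b)]


def run(layers, x):
    """Feed x through the given layers in order."""
    for layer in layers:
        x = layer.forward(x)
    return x


def mse(frozen, hidden, out, data, targets):
    """Mean squared error of the stack frozen + [hidden, out]."""
    total = 0.0
    for v, t in zip(data, targets):
        y = run(frozen + [hidden, out], v)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t)) / len(t)
    return total / len(data)


def train_stage(frozen, hidden, out, data, targets, epochs=500, lr=0.5):
    """Fit `hidden` and the temporary output layer `out`; `frozen` is never updated."""
    for _ in range(epochs):
        for v, t in zip(data, targets):
            x = run(frozen, v)
            h = hidden.forward(x)
            y = out.forward(h)
            # Backpropagate the squared error through the two trainable layers only.
            d_out = [(yi - ti) * yi * (1.0 - yi) for yi, ti in zip(y, t)]
            d_hid = [hj * (1.0 - hj) * sum(d * out.w[k][j] for k, d in enumerate(d_out))
                     for j, hj in enumerate(h)]
            for k, d in enumerate(d_out):
                out.w[k] = [w - lr * d * hj for w, hj in zip(out.w[k], h)]
                out.b[k] -= lr * d
            for j, d in enumerate(d_hid):
                hidden.w[j] = [w - lr * d * xi for w, xi in zip(hidden.w[j], x)]
                hidden.b[j] -= lr * d


rng = random.Random(0)
data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]  # toy input data v
t1 = [[v[0]] for v in data]          # first teacher data (hypothetical)
t2 = [[v[0] * v[1]] for v in data]   # second teacher data (hypothetical)

# Stage 1: train the first hidden layer against the first teacher data.
layer1, out1 = Layer(2, 3, rng), Layer(3, 1, rng)
loss_before = mse([], layer1, out1, data, t1)
train_stage([], layer1, out1, data, t1)
loss_after = mse([], layer1, out1, data, t1)

# Stage 2: keep layer1 fixed and train the second hidden layer against t2.
snapshot = copy.deepcopy(layer1.w)
layer2, out2 = Layer(3, 3, rng), Layer(3, 1, rng)
train_stage([layer1], layer2, out2, data, t2)
```

In this sketch the first layer's parameters are simply excluded from the update in the second stage, which is one straightforward way to build on the learning result of the first stage.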

In addition, according to the embodiment described above, parameters are derived for each classifier in the pre-training stage. Therefore, when some classifiers of the classification system are changed or added, the already derived (learned) parameters can be reused for the classifiers that are not changed. For example, when the classification result of the deep neural network DNN is changed from 5 patterns to 10 patterns, only the number of units in the output layer needs to be changed, and the hidden layers can be reused as they are. As a result, a highly versatile deep neural network DNN can be constructed.
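As a rough illustration of this reuse, the following sketch swaps a 5-unit output layer for a 10-unit one while sharing the hidden-layer parameters. The dictionary layout, the helper names `init_layer` and `with_new_output`, and the layer sizes are assumptions made for the example only.

```python
import random


def init_layer(n_in, n_out, rng):
    """Create randomly initialized parameters for one fully connected layer."""
    return {
        "w": [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }


def with_new_output(model, n_classes, rng):
    """Return a model that shares the hidden layers and has a fresh output layer."""
    n_hidden = len(model["hidden"][-1]["b"])
    return {"hidden": model["hidden"], "output": init_layer(n_hidden, n_classes, rng)}


rng = random.Random(0)
model5 = {
    "hidden": [init_layer(8, 6, rng), init_layer(6, 4, rng)],  # assumed already trained
    "output": init_layer(4, 5, rng),                           # 5 patterns
}
model10 = with_new_output(model5, 10, rng)                     # 10 patterns
```

Only the new output layer would then need to be trained; the hidden layers are the same objects in both models.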

<Application example>
Application examples of the embodiment described above are explained below. For example, to generate a summary of a web page provided as a search result for a search query, the analysis apparatus 100 of the embodiment described above uses the text data constituting the web page as the input data v and trains the parameters of each hidden layer of the deep neural network DNN so that it outputs a summary of the text data. In this case, the learning processing unit 114 of the analysis apparatus 100 uses, as the teacher data t for the output data h of each layer, text data whose word count decreases stepwise with the depth of the hidden layer. For example, the learning processing unit 114 uses, as the teacher data t_1 for the output data h of the first layer, which is the shallowest (closest to the input layer), text data whose word count is a first predetermined number. The first predetermined number is smaller than the word count of the text data used as the input data v. The learning processing unit 114 also uses, as the teacher data t_2 for the output data h of the second layer, text data whose word count is a second predetermined number smaller than the first predetermined number. Reducing the word count of the text data used as the teacher data t as the layers become deeper in this way makes it possible to generate a summary of the web page more accurately.
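The word-count condition on the teacher data can be checked with a short sketch. It assumes whitespace-separated words (Japanese text would need a proper tokenizer) and teacher texts prepared in advance; the function names and sample strings are hypothetical.

```python
def word_count(text):
    # Assumption: words are whitespace-separated.
    return len(text.split())


def is_valid_schedule(input_text, teacher_texts):
    """True if word counts strictly decrease from the input through t_1, t_2, ..."""
    counts = [word_count(input_text)] + [word_count(t) for t in teacher_texts]
    return all(a > b for a, b in zip(counts, counts[1:]))


page = "the quick brown fox jumps over the lazy dog near the river bank"
teachers = [
    "quick brown fox jumps over lazy dog near river",  # t_1: 9 words
    "fox jumps over dog near river",                   # t_2: 6 words
    "fox jumps over dog",                              # t_3: 4 words
]
ok = is_valid_schedule(page, teachers)
```

Here `ok` is true because the counts run 13, 9, 6, 4; passing the teacher texts in reverse order would fail the check.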

The analysis apparatus 100 of the embodiment described above may also, for image recognition of characters, photographs, and the like, use an image as the input data v and train the parameters of each hidden layer of the deep neural network DNN so that it outputs an appropriate recognition result. In this case, the learning processing unit 114 of the analysis apparatus 100 uses, as the teacher data t for the output data h of each layer, not the correct-answer data but recognition results from an unspecified large number of users, obtained by crowdsourcing or the like. For example, a predetermined image whose correct answer is determined in advance (for example, an image of a cat) is shown to an unspecified large number of users, and their recognition results for that image are used as the teacher data t. A user A may recognize the image as a "cat", while a user B may recognize it as a "dog". When the same image is shown to each of many unspecified users in this way, some users may recognize it differently from the correct answer. The teacher data t therefore tends to fluctuate with respect to correctness and to contain both correct and incorrect recognition results. For example, the learning processing unit 114 uses, as the teacher data t_1 for the output data h of the first layer, the recognition result with the largest fluctuation between correct and incorrect (for example, correct:incorrect = 0.5:0.5) among the recognition results obtained from the unspecified large number of users; as the teacher data t_2 for the output data h of the second layer, a recognition result with less fluctuation than the teacher data t_1 (for example, correct:incorrect = 0.6:0.4); and as the teacher data t_3 for the output data h of the third layer, a recognition result with less fluctuation than the teacher data t_2 (for example, correct:incorrect = 0.7:0.3). By using teacher data t that contains more correct recognition results as the layers become deeper, and by using the correct-answer data (correct:incorrect = 1:0) as the teacher data t_n for the output data h of the deepest, n-th layer, the accuracy of image recognition can be improved.
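One way to picture this ordering of teacher data is the sketch below, which sorts crowdsourced vote sets by their share of correct answers and places the correct-answer data last. How the vote sets are collected and grouped is not specified here; the function names and the sample votes are assumptions for illustration.

```python
def correct_ratio(votes, truth):
    """Fraction of votes that match the predetermined correct answer."""
    return sum(v == truth for v in votes) / len(votes)


def teacher_schedule(vote_sets, truth):
    """Order vote sets from most to least ambiguous, ending with the correct answer."""
    ordered = sorted(vote_sets, key=lambda votes: correct_ratio(votes, truth))
    return ordered + [[truth]]


# Hypothetical crowdsourced recognition results for one image of a cat.
votes_a = ["cat"] * 7 + ["dog"] * 3
votes_b = ["cat"] * 5 + ["dog"] * 5
votes_c = ["cat"] * 6 + ["dog"] * 4

schedule = teacher_schedule([votes_a, votes_b, votes_c], "cat")
ratios = [correct_ratio(votes, "cat") for votes in schedule]
```

With these sample votes, `ratios` is `[0.5, 0.6, 0.7, 1.0]`, matching the progression from the first layer to the deepest layer in the example above.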

<Hardware configuration>
Among the devices included in the analysis system 1 of the embodiment described above, at least the analysis apparatus 100 is realized by, for example, the hardware configuration shown in FIG. 14. FIG. 14 is a diagram showing an example of the hardware configuration of the analysis apparatus 100 of the embodiment.

The analysis apparatus 100 has a configuration in which a NIC 100-1, a CPU 100-2, a GPU (Graphics Processing Unit) 100-3, a RAM 100-4, a ROM 100-5, a secondary storage device 100-6 such as a flash memory or an HDD, and a drive device 100-7 are interconnected by an internal bus or dedicated communication lines. A portable storage medium such as an optical disc is loaded into the drive device 100-7. A program stored in the secondary storage device 100-6 or in a portable storage medium loaded in the drive device 100-7 is loaded into the RAM 100-4 by a DMA controller (not shown) or the like and executed by the CPU 100-2 or the GPU 100-3, thereby realizing the control unit 110. The program referred to by the control unit 110 may be downloaded from another device via the network NW.

Although modes for carrying out the present invention have been described above using embodiments, the present invention is in no way limited to these embodiments, and various modifications and substitutions can be made without departing from the gist of the present invention.

DESCRIPTION OF SYMBOLS 1 ... analysis system, 10 ... terminal device, 20 ... service providing device, 30 ... log acquisition device, 40 ... advertisement distribution server device, 100 ... analysis apparatus, 102 ... reception unit, 104 ... display unit, 106 ... communication unit, 110 ... control unit, 112 ... acquisition unit, 114 ... learning processing unit, 116 ... classification processing unit, 130 ... storage unit, 132 ... behavior history information, 134 ... questionnaire information, 136 ... analysis information, 138 ... DNN configuration information, 140 ... per-layer parameter information

Claims

An analysis apparatus comprising:
a learning processing unit that trains a classification system, in which independently configurable classifiers are arranged in multiple layers, so that second data output from a first classifier included in the classification system when first data is input approaches first teacher data, and that then trains the classification system, based on a learning result using the first teacher data, so that third data output from a second classifier at a deeper level than the first classifier when the first data is input approaches second teacher data; and
a classification processing unit that classifies unlearned data into a predetermined category based on the classification system trained by the learning processing unit.

The analysis apparatus according to claim 1, wherein
the first teacher data is data of a different type from the first data and the second data, and
the second teacher data is data of a different type from the first data and the third data.

The analysis apparatus according to claim 2, wherein
the first data is data indicating a behavior history of a user,
the first teacher data is data based on answers given by the user, and
the second teacher data is data obtained by analyzing the answers given by the user.

The analysis apparatus according to claim 2, wherein
the first data is text data,
the first teacher data is text data having fewer words than the first data, and
the second teacher data is text data having fewer words than the first teacher data.

The analysis apparatus according to claim 2, wherein
the first data is image data,
the first teacher data is data indicating recognition results of the image data by each of an unspecified large number of users, and
the second teacher data is data indicating recognition results of the image data by each of an unspecified large number of users and contains more correct recognition results than the first teacher data.

The analysis apparatus according to any one of claims 1 to 5, wherein the learning processing unit:
determines parameters of the first classifier, using an input device and an output device among the plurality of classifiers included in the classification system together with the first classifier, so as to reduce the error of the second data output from the output device with respect to the first teacher data; and
determines parameters of the second classifier while maintaining the determined parameters of the first classifier, using the input device and the output device together with the first classifier and the second classifier, so as to reduce the error of the third data output from the output device with respect to the second teacher data.

An analysis apparatus comprising:
a learning processing unit that trains a classification system, in which independently configurable classifiers are arranged in multiple layers, so that second data output from the classification system when first data is input approaches third data that differs from the first data and the second data; and
a classification processing unit that classifies unlearned data into a predetermined category based on the classification system trained by the learning processing unit.

An analysis method in which a computer:
trains a classification system, in which independently configurable classifiers are arranged in multiple layers, so that second data output from a first classifier included in the classification system when first data is input approaches first teacher data, and then trains the classification system, based on a learning result using the first teacher data, so that third data output from a second classifier at a deeper level than the first classifier when the first data is input approaches second teacher data; and
classifies unlearned data into a predetermined category based on the trained classification system.

A program that causes a computer to execute:
a process of training a classification system, in which independently configurable classifiers are arranged in multiple layers, so that second data output from a first classifier included in the classification system when first data is input approaches first teacher data, and then training the classification system, based on a learning result using the first teacher data, so that third data output from a second classifier at a deeper level than the first classifier when the first data is input approaches second teacher data; and
a process of classifying unlearned data into a predetermined category based on the trained classification system.