JP2020064537A

JP2020064537A - Recommendation system and recommendation method

Info

Publication number: JP2020064537A
Application number: JP2018197228A
Authority: JP
Inventors: シャイマダヒリ; Chaima Dhahri; 啓一郎帆足; Keiichiro Hoashi
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2018-10-19
Filing date: 2018-10-19
Publication date: 2020-04-23
Anticipated expiration: 2038-10-19
Also published as: JP7026032B2

Abstract

To provide a recommendation system and a recommendation method capable of recommending an action suited to a specific individual. SOLUTION: The recommendation system S includes: a general-purpose model creating unit 11 that creates a general-purpose machine learning model for presenting recommended action content to a user based on a data set indicating a relationship between the action contents and moods of a plurality of persons; a personal model creating unit 12 that creates a personal machine learning model for a specific user based on the general-purpose machine learning model; a mood identifying unit 13 that identifies a mood of the specific user; and a recommendation unit 14 that recommends the recommended action output from the personal machine learning model when mood information indicating the mood of the specific user, identified by the mood identifying unit 13, is input to the personal machine learning model. The personal model creating unit 12 updates the personal machine learning model based on feedback information indicating a satisfaction level of the specific user. SELECTED DRAWING: Figure 2

Description

The present invention relates to a recommendation system and a recommendation method for recommending actions to a user.

A method is known in which a user's preferences are estimated using a GAN (Generative Adversarial Network) and an action the user should take is recommended based on the estimation result (see, for example, Non-Patent Document 1).

Non-Patent Document 1: Jaeyoon Yoo et al., "Energy-Based Sequence GANs for Recommendation and Their Connection to Imitation Learning," July 2017.

Using a GAN provides a flexible learning environment. For example, with a large amount of training data indicating the relationship between the action contents and moods of many people, it is possible to create a machine learning model that outputs a recommended action when data indicating a person's mood is input.

However, because the relationship between action content and mood differs from person to person, recommending an action that suits a specific individual requires training data indicating the relationship between that individual's action contents and mood. It is difficult to acquire a large amount of such training data for a specific individual. There is therefore a need to estimate the state of a specific individual effectively, and to recommend an action suited to that individual, using sporadic and small amounts of training data.

The present invention has been made in view of these points, and an object thereof is to provide a recommendation system and a recommendation method capable of recommending an action suited to a specific individual.

A recommendation system according to a first aspect of the present invention includes: a general-purpose model creating unit that creates a general-purpose machine learning model for presenting recommended action content to a user, based on a data set indicating the relationship between the action contents and moods of a plurality of people; a personal model creating unit that creates a personal machine learning model for a specific user based on the general-purpose machine learning model; a mood identifying unit that identifies the mood of the specific user; and a recommendation unit that recommends the recommended action output from the personal machine learning model when mood information indicating the mood of the specific user, identified by the mood identifying unit, is input to the personal machine learning model. The mood identifying unit inputs, to the personal model creating unit, feedback information indicating the satisfaction level of the specific user after the recommendation unit has recommended the recommended action to the specific user, and the personal model creating unit updates the personal machine learning model based on the feedback information.

The personal model creating unit may update the personal machine learning model each time the mood identifying unit generates a predetermined number of pieces of feedback information.

The personal model creating unit may update the personal machine learning model based on feedback information indicating that the mood of the user has changed.

The feedback information may include information indicating the mood of the specific user before the specific user performs the recommended action, the content of the recommended action that the recommendation unit recommended to the specific user, and the mood of the specific user after the specific user has performed the recommended action. The personal model creating unit may update the personal machine learning model based on the relationship, indicated by the feedback information, among the mood of the specific user before the recommendation unit recommended the action, the content of the recommended action, and the mood of the specific user after performing the recommended action.

The recommendation system may further include a setting reception unit that accepts a setting of an update sensitivity. The update sensitivity is an index indicating the degree to which the personal machine learning model is changed relative to the magnitude of the difference between the expected mood change resulting from the user performing the recommended action and the user's actual mood change. The personal model creating unit may update the personal machine learning model based on the magnitude of the value obtained by multiplying the difference by the update sensitivity.

The mood identifying unit may identify the mood information by estimating the mood of the specific user based on the behavior history of the specific user.

The general-purpose model creating unit may create the general-purpose machine learning model using GAIL, and the personal model creating unit may create the personal machine learning model without using GAIL.

A recommendation method according to a second aspect of the present invention includes the steps of: creating a general-purpose machine learning model for presenting recommended action content to a user, based on a data set indicating the relationship between the action contents and moods of a plurality of people; creating a personal machine learning model for a specific user based on the general-purpose machine learning model; acquiring mood information indicating the mood of the specific user; recommending the recommended action output from the personal machine learning model when the acquired mood information of the specific user is input to the personal machine learning model; and updating the personal machine learning model based on feedback information, acquired after the recommended action has been recommended to the specific user, indicating the satisfaction level of the specific user.

The present invention provides a recommendation system and a recommendation method capable of recommending an action suited to a specific individual.

FIG. 1 is a diagram showing an outline of the recommendation system according to the present embodiment.
FIG. 2 is a block diagram showing the functional configuration of the recommendation system.
FIG. 3 is a diagram showing an example of a message transmission/reception screen displayed on a user terminal.
FIG. 4 is a diagram showing a configuration example of the general-purpose machine learning system.
FIG. 5 is a diagram showing a configuration example of the personal machine learning system.
FIG. 6 is a diagram showing an outline of the meta-learning algorithm.
FIG. 7 is a diagram conceptually showing a user's behavior history data.

[Outline of Recommendation System S]
FIG. 1 is a diagram showing an outline of the recommendation system S according to the present embodiment. The recommendation system S estimates a user's mood and, based on the estimated mood, recommends an action to the user. For example, when the recommendation system S estimates that a user who tends to cheer up after eating good food is in a bad mood, it recommends that the user go to a restaurant with a close friend.

The recommendation system S includes a general-purpose machine learning system S1 and a personal machine learning system S2. The general-purpose machine learning system S1 creates a general-purpose machine learning model for presenting recommended action content to a user, based on a data set indicating the relationship between the action contents and moods of a plurality of people. For example, the general-purpose machine learning system S1 learns from a data set acquired from many people using GAIL (Generative Adversarial Imitation Learning), estimates the mood of a typical user, and determines a recommended action based on the estimation result.

The personal machine learning system S2 creates a personal machine learning model for a specific user, for example without using GAIL, based on the general-purpose machine learning model created by the general-purpose machine learning system S1. In the following description, the specific user for whom the personal machine learning system S2 creates the personal machine learning model is referred to as user U.

The personal machine learning system S2 uses the general-purpose machine learning model acquired from the general-purpose machine learning system S1 as the initial personal machine learning model. The personal machine learning system S2 then updates the personal machine learning model by learning with training data that indicates the relationship between an action performed by user U and the satisfaction level after the action was performed.
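The initialization and update described above can be illustrated with a minimal Python sketch. It assumes a deliberately simple table that maps a (mood, action) pair to an expected satisfaction level; this table stands in for the machine learning models and is not the GAIL or meta-learning approach of the embodiment. The names `create_personal_model`, `update_personal_model`, `recommend`, and the `step` parameter are hypothetical.

```python
import copy
from typing import Dict, Tuple

# Hypothetical stand-in for a model: (mood, action) -> expected satisfaction.
Model = Dict[Tuple[str, str], float]


def create_personal_model(general_model: Model) -> Model:
    """Use a copy of the general-purpose model as the initial personal model."""
    return copy.deepcopy(general_model)


def update_personal_model(model: Model, mood: str, action: str,
                          satisfaction: float, step: float = 0.5) -> None:
    """Move the stored expectation toward the satisfaction reported by user U."""
    key = (mood, action)
    old = model.get(key, 0.0)
    model[key] = old + step * (satisfaction - old)


def recommend(model: Model, mood: str) -> str:
    """Return the action with the highest expected satisfaction for this mood."""
    candidates = {a: s for (m, a), s in model.items() if m == mood}
    if not candidates:
        raise ValueError(f"no known action for mood {mood!r}")
    return max(candidates, key=candidates.get)


# Illustrative values only; they do not come from the source.
general = {("tired", "eat_out"): 0.6, ("tired", "walk"): 0.5}
personal = create_personal_model(general)
# User U reports low satisfaction after eating out while tired.
update_personal_model(personal, "tired", "eat_out", satisfaction=0.0)
```

In this sketch the general-purpose table still recommends `eat_out` for a tired user, while the personal copy now prefers `walk`, because only the personal copy was changed by user U's feedback.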

The flow of processing in the recommendation system S is described below with reference to FIG. 1. First, the general-purpose machine learning system S1 provides the personal machine learning system S2 with the general-purpose machine learning model it created from a data set on many users (for example, a data set indicating the relationship between the action contents and moods of many users) ((1) in FIG. 1). The general-purpose machine learning system S1 may provide the general-purpose machine learning model to the personal machine learning system S2 periodically, or may provide the updated model each time the general-purpose machine learning model is updated.

Next, the personal machine learning system S2 acquires, from the user terminal 15 used by user U (for example, a smartphone, tablet, or computer), user state information indicating the location and other attributes of user U in association with mood information indicating the mood of user U ((2) in FIG. 1). The personal machine learning system S2 creates the initial personal machine learning model based on the acquired information about user U ((3) in FIG. 1).

Thereafter, when the personal machine learning system S2 acquires mood information of user U, it inputs the acquired mood information into the personal machine learning model and recommends to user U the recommended action output from the personal machine learning model ((4) in FIG. 1). Instead of acquiring mood information of user U, the personal machine learning system S2 may estimate the mood of user U based on the user state information of user U.

After recommending the action, the personal machine learning system S2 acquires, from the user terminal 15 of user U, user state information indicating the state of user U or mood information indicating the mood of user U ((5) in FIG. 1). The personal machine learning system S2 identifies the satisfaction level of user U based on the user state information or the mood information. The personal machine learning system S2 then updates the personal machine learning model by inputting into it, as training data, the mood of user U before the recommendation, the recommended action, and the identified satisfaction level ((6) in FIG. 1).

As described above, the personal machine learning system S2 recommends an action suited to the mood of user U using the personal machine learning model created from the general-purpose machine learning model provided by the general-purpose machine learning system S1. It then updates the personal machine learning model based on the content of the action recommended to user U and the satisfaction level of user U after the recommendation. The personal machine learning system S2 may identify the change in the user's mood before and after the recommendation based on the satisfaction level of user U, and update the personal machine learning model based on that change.

For example, the personal machine learning system S2 updates the personal machine learning model when the degree to which the mood of user U improved after the recommendation is greater than a first threshold, or when that degree is smaller than a second threshold that is equal to or less than the first threshold. The personal machine learning system S2 does not update the personal machine learning model when the degree of improvement is within the expected range (for example, from the second threshold to the first threshold, inclusive). With this configuration, the recommendation system S can create a highly accurate personal machine learning model in a short period even when a large amount of training data on the relationship between recommended actions and mood is not available for a specific individual.
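The two-threshold rule above can be sketched as follows. This is only an illustration: the function name `should_update`, the numeric scale of the improvement, and the example threshold values are assumptions, since the source does not specify them.

```python
def should_update(improvement: float,
                  first_threshold: float,
                  second_threshold: float) -> bool:
    """Return True when the mood improvement falls outside the expected range.

    The expected range is [second_threshold, first_threshold], inclusive,
    so values on either boundary do not trigger an update.
    """
    if second_threshold > first_threshold:
        raise ValueError("second_threshold must not exceed first_threshold")
    return improvement > first_threshold or improvement < second_threshold


# Hypothetical thresholds on an arbitrary improvement scale.
FIRST, SECOND = 0.7, 0.2
decisions = [should_update(x, FIRST, SECOND) for x in (0.9, 0.5, 0.1)]
```

Here an unexpectedly large improvement (0.9) and an unexpectedly small one (0.1) both trigger an update, while an improvement inside the expected range (0.5) leaves the model unchanged.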

[Configuration of Recommendation System S]
FIG. 2 is a block diagram showing the functional configuration of the recommendation system S. The recommendation system S includes a general-purpose model creating unit 11, a personal model creating unit 12, a mood identifying unit 13, a recommendation unit 14, a user terminal 15, and a setting reception unit 16. The general-purpose model creating unit 11 is included, for example, in the general-purpose machine learning system S1 of FIG. 1. The personal model creating unit 12, the mood identifying unit 13, and the recommendation unit 14 are included, for example, in the personal machine learning system S2 of FIG. 1.

The general-purpose model creating unit 11 is a unit that creates a general-purpose machine learning model for presenting recommended action content to a user, based on a data set indicating the relationship between the action contents and moods of a plurality of people, and that stores the created general-purpose machine learning model. The general-purpose model creating unit 11 creates the general-purpose machine learning model using, for example, GAIL. The general-purpose model creating unit 11 provides the stored general-purpose machine learning model to the personal model creating unit 12.

The personal model creating unit 12 is a unit that creates a personal machine learning model for user U based on the general-purpose machine learning model and stores the created personal machine learning model. The personal model creating unit 12 creates the personal machine learning model by updating the general-purpose machine learning model provided by the general-purpose model creating unit 11. The personal model creating unit 12 creates the personal machine learning model, for example, without using GAIL. For example, the personal model creating unit 12 uses meta-learning with an LSTM (Long Short-Term Memory) to optimize the general-purpose machine learning model created by the general-purpose machine learning system S1 for user U, thereby creating the personal machine learning model.

The mood identifying unit 13 identifies the mood of user U, the specific user. The mood identifying unit 13 identifies the mood of user U based on, for example, mood information that user U has entered on the user terminal 15. The mood identifying unit 13 may also identify the mood of user U by estimating it from information indicating the behavior history of user U, such as information indicating the position of the user terminal 15 transmitted from the user terminal 15, information indicating the weather around the user terminal 15, and images captured by the user terminal 15.

The mood identifying unit 13 inputs the identified mood of user U to the personal model creating unit 12. For example, the mood identifying unit 13 inputs to the personal model creating unit 12 feedback information indicating the mood of user U identified after the recommendation unit 14 has recommended an action to user U.

Various methods can be applied for the mood identifying unit 13 to identify the mood of user U. For example, the mood identifying unit 13 converts the content of user U's actions into numerical form using a one-hot encoder (OHE) model or a category vector model (Cat2Vec). The mood identifying unit 13 can then identify the mood of user U by inputting the numerical representation into a machine learning model composed of a recurrent neural network that has been trained in advance on the relationship between user actions and emotions.
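As a small illustration of the one-hot step mentioned above, the following Python sketch converts a short action history into one-hot vectors. The action vocabulary and the function name `one_hot_encode` are hypothetical, and the recurrent neural network that would consume these vectors is outside the scope of the sketch.

```python
from typing import List, Sequence


def one_hot_encode(action: str, vocabulary: Sequence[str]) -> List[int]:
    """Return a vector with a single 1 at the position of `action`."""
    if action not in vocabulary:
        raise ValueError(f"unknown action: {action!r}")
    return [1 if candidate == action else 0 for candidate in vocabulary]


# Hypothetical action vocabulary and behavior history of user U.
ACTIONS = ["eat_out", "walk", "sleep", "watch_movie"]
history = ["walk", "eat_out"]
encoded = [one_hot_encode(action, ACTIONS) for action in history]
```

Each action in the history becomes a vector whose length equals the size of the vocabulary, which keeps categories unordered, at the cost of vectors that grow with the number of distinct actions.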

The recommendation unit 14 inputs mood information indicating the mood of user U, identified by the mood identifying unit 13, into the personal machine learning model and recommends to user U the recommended action output from the model. The recommendation unit 14 includes, for example, an AI agent. The recommendation unit 14 may additionally acquire user state information indicating the state of user U, such as the location of user U and the weather at that location, and input both the user state information and the mood information into the personal machine learning model. The recommendation unit 14 notifies the user terminal 15 of the recommended action output from the personal machine learning model.

The user terminal 15 is an information terminal used by user U and has a display for displaying information, an operation device (for example, a touch panel) for entering information, and a communication device for transmitting information. The user terminal 15 displays a screen on which user U enters mood information indicating his or her mood, and transmits the mood information entered by user U to the recommendation unit 14. The user terminal 15 may accept the input of mood information through messaging application software that exchanges messages with the recommendation unit 14 in a chat format, and transmit the entered mood information.

FIG. 3 is a diagram showing an example of a message transmission/reception screen displayed on the user terminal 15. In the example shown in FIG. 3, the recommendation unit 14 is assumed to have a chatbot function, and messages sent by the recommendation unit 14 to the user terminal 15 are displayed alternately with messages entered by user U. When the mood identifying unit 13 identifies that user U is in a bad mood, the recommendation unit 14 transmits to the user terminal 15 the content of a recommended action for improving the mood. In the example shown in FIG. 3, the recommendation unit 14 has estimated that user U is tired from working long hours and recommends going to restaurant X with person A as an action likely to relieve that fatigue.

The setting reception unit 16 accepts a setting of the update sensitivity of the personal machine learning model. The update sensitivity is an index indicating the degree to which the personal machine learning model is changed relative to the magnitude of the difference between the expected mood change resulting from user U performing the recommended action and the actual mood change of user U. The setting reception unit 16 notifies the personal model creating unit 12 of the accepted update sensitivity. The personal model creating unit 12 updates the personal machine learning model based on, for example, the magnitude of the value obtained by multiplying the difference by the update sensitivity.

When the update sensitivity is high, the personal model creating unit 12 updates the personal machine learning model even when the difference is small, so the model can be updated frequently. When the update sensitivity is low, the personal model creating unit 12 updates the personal machine learning model less often, which prevents the model from being updated inappropriately because of an exceptional event.
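The effect of the update sensitivity can be sketched in Python as follows. The source states only that the update is based on the magnitude of the difference multiplied by the update sensitivity; comparing that product with a fixed `trigger_level` is an assumption made for this sketch, as are the function names and the numeric values.

```python
def scaled_difference(expected_change: float,
                      actual_change: float,
                      update_sensitivity: float) -> float:
    """Magnitude of (expected - actual) multiplied by the update sensitivity."""
    if update_sensitivity < 0:
        raise ValueError("update_sensitivity must be non-negative")
    return abs(expected_change - actual_change) * update_sensitivity


def update_is_triggered(expected_change: float,
                        actual_change: float,
                        update_sensitivity: float,
                        trigger_level: float = 1.0) -> bool:
    """Assumed rule: update when the scaled difference reaches trigger_level."""
    scaled = scaled_difference(expected_change, actual_change, update_sensitivity)
    return scaled >= trigger_level


# Same difference (0.5) evaluated with a high and a low sensitivity.
high = update_is_triggered(2.0, 1.5, update_sensitivity=4.0)
low = update_is_triggered(2.0, 1.5, update_sensitivity=1.0)
```

With the same difference of 0.5, the high sensitivity triggers an update and the low sensitivity does not, which matches the behavior described above: a low setting makes the model less likely to react to a one-off event.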

[Updating the Personal Machine Learning Model]
The personal model creating unit 12 updates the personal machine learning model based on the feedback information input from the mood identifying unit 13. For example, the personal model creating unit 12 updates the personal machine learning model each time the mood identifying unit 13 generates a predetermined number of pieces of feedback information. The feedback information includes, for example, information indicating the satisfaction level of user U after user U has performed the recommended action. The feedback information may include information indicating the mood of user U before performing the recommended action, the content of the action that the recommendation unit 14 recommended to user U, and the mood of user U after performing the recommended action. The feedback information may also include information indicating the amount of change in the mood of user U before and after performing the recommended action.
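A minimal Python sketch of the feedback record and the "update every predetermined number of feedback items" behavior is shown below. The class names `Feedback` and `FeedbackBuffer`, the numeric mood scale, and the `on_batch` callback are hypothetical; the callback stands in for whatever retraining the personal model creating unit 12 performs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Feedback:
    """One feedback item: mood before, recommended action, mood after."""
    mood_before: float
    action: str
    mood_after: float

    @property
    def mood_change(self) -> float:
        """Amount of mood change across the recommended action."""
        return self.mood_after - self.mood_before


@dataclass
class FeedbackBuffer:
    """Collects feedback and calls on_batch once batch_size items are stored."""
    batch_size: int
    on_batch: Callable[[List[Feedback]], None]
    _pending: List[Feedback] = field(default_factory=list, init=False)

    def __post_init__(self) -> None:
        if self.batch_size < 1:
            raise ValueError("batch_size must be at least 1")

    def add(self, feedback: Feedback) -> None:
        self._pending.append(feedback)
        if len(self._pending) >= self.batch_size:
            # Hand over the full batch and start collecting a new one.
            batch, self._pending = self._pending, []
            self.on_batch(batch)


# Record the batches that would be passed to the model update.
batches: List[List[Feedback]] = []
buffer = FeedbackBuffer(batch_size=2, on_batch=batches.append)
buffer.add(Feedback(mood_before=1.0, action="walk", mood_after=3.0))
buffer.add(Feedback(mood_before=2.0, action="eat_out", mood_after=2.5))
```

Batching in this way means a single unusual feedback item does not immediately change the model, which is one plausible reason for updating only after a predetermined number of items.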

The personal model creating unit 12 updates the personal machine learning model based on the relationship among the mood of user U before the recommendation unit 14 recommended the action, the content of the action recommended by the recommendation unit 14, and the satisfaction level of user U after performing the recommended action. The personal model creating unit 12 may instead update the personal machine learning model based on the relationship among the mood of user U before the recommendation, the content of the recommended action that user U actually performed, and the satisfaction level of user U after performing it.

The personal model creating unit 12 may also update the personal machine learning model based on feedback information indicating that the mood of user U has changed. For example, when the mood identifying unit 13 detects that the mood of user U has changed after the time estimated to be required to perform the recommended action has elapsed since the recommendation unit 14 transmitted the recommended action to the user terminal 15, it inputs to the personal model creating unit 12, as training data, the mood of user U before the recommended action was transmitted, the mood of user U after it was transmitted, and the content of the recommended action. The personal model creating unit 12 updates the personal machine learning model by retraining on the input training data.

The personal model creating unit 12 may update the personal machine learning model when the amount by which the mood of the user U has changed, once the time estimated to be required to execute the recommended action has elapsed since the recommendation unit 14 transmitted the recommended action to the user terminal 15, is smaller than the amount of change assumed in advance. To this end, when the difference between the mood of the user U before the recommendation unit 14 transmits the recommended action and the mood of the user U after the recommended action is transmitted is smaller than a predetermined amount, the mood identifying unit 13 inputs the mood of the user U before the recommended action was transmitted, the mood of the user U after the recommended action was transmitted, and the content of the recommended action to the personal model creating unit 12 as teacher data. In this way, the personal machine learning model can be improved so that it outputs recommended actions that are highly effective in improving the mood of the user U.
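The condition described above can be illustrated with a minimal Python sketch. It is not taken from the specification: it assumes, purely for illustration, that mood is encoded as a numeric score in which a higher value means a better mood, and the names `TeacherSample`, `build_teacher_sample`, and `expected_change` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TeacherSample:
    """One teacher-data record (hypothetical structure)."""
    mood_before: float  # mood score before the recommended action was transmitted
    mood_after: float   # mood score after the estimated execution time elapsed
    action: str         # content of the recommended action


def build_teacher_sample(
    mood_before: float,
    mood_after: float,
    action: str,
    expected_change: float,
) -> Optional[TeacherSample]:
    """Return teacher data only when the observed mood change is smaller
    than the change assumed in advance; otherwise return None."""
    observed_change = mood_after - mood_before  # assumption: signed difference
    if observed_change < expected_change:
        return TeacherSample(mood_before, mood_after, action)
    return None


# The mood barely improved, so a teacher sample is produced for retraining.
weak = build_teacher_sample(0.2, 0.3, "take a walk", expected_change=0.5)
# The mood improved more than expected, so no teacher sample is produced.
strong = build_teacher_sample(0.2, 0.9, "take a walk", expected_change=0.5)
```

In this sketch, only recommendations that underperform the expected mood change feed the retraining step, which matches the stated aim of steering the model toward more effective actions.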

[Configuration example of general-purpose machine learning system S1 and personal machine learning system S2]
FIG. 4 is a diagram showing a configuration example of the general-purpose machine learning system S1. FIG. 5 is a diagram showing a configuration example of the personal machine learning system S2. The general-purpose machine learning system S1 uses the GAIL algorithm. The personal machine learning system S2, on the other hand, uses an imitation learning algorithm within a reinforcement learning (RL) framework.

The general-purpose machine learning system S1 aims to learn a general-purpose policy for matching the mood of a typical user (happy, sad, neutral) with specific factors. A specific factor is, for example, at least one of place, social environment, date and time, and action content. The policy in the general-purpose machine learning system S1 is learned from the data of all users (C3) using the GAIL algorithm.

The general-purpose machine learning system S1 has two main functional blocks: a discriminator (C5) of an adversarial game and a general-purpose machine learning model (C4). The goal of GAIL is to learn by imitating expert demonstrations. The demonstrations are represented by history data collected from many users. The history data is, for example, a data set indicating the relationship between past action content and mood. GAIL is a model-free imitation learning algorithm, and it performs significantly better than conventional model-free methods in imitating complex behavior in high-dimensional environments.

The purpose of the personal machine learning system S2 is to update the general-purpose machine learning model acquired from the general-purpose machine learning system S1 based on a small number of samples obtained from a specific user U. The personal machine learning system S2 is mainly composed of four elements. A small amount of the user's action history data (C8) is used to update the network learned by the general-purpose machine learning system S1. The personal machine learning model (C6) is updated sequentially using an imitation learner (C7) that performs meta-learning. The personal machine learning model (C6) and the imitation learner (C7) correspond to the personal model creating unit 12 shown in FIG. 2.

FIG. 6 is a diagram showing an outline of the meta-learning algorithm. FIG. 7 is a diagram conceptually showing the action history data of the user. The meta-learning model is defined by the derivative of the action history data and a loss function (Loss). The standard optimization algorithm used to train the deep neural network used by the personal machine learning model (C6) is expressed by the following equation.
$$\theta_t = \theta_{t-1} - \alpha_t \nabla_{\theta_{t-1}} L_{t-1}$$
Here, $\alpha_t$ is a coefficient corresponding to the update sensitivity described above. The white squares in FIG. 6 are optimizers that perform this optimization.
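As a rough illustration of the update rule above, the following Python sketch applies one gradient step per iteration to a small parameter vector. The quadratic loss and the names `gradient_step` and `quadratic_loss_grad` are assumptions chosen only to keep the example self-contained; the specification does not state the actual loss function or network.

```python
from typing import Callable, List, Sequence


def gradient_step(
    theta: Sequence[float],
    grad_fn: Callable[[Sequence[float]], Sequence[float]],
    alpha: float,
) -> List[float]:
    """Compute theta_t = theta_{t-1} - alpha_t * grad L_{t-1}(theta_{t-1})."""
    grad = grad_fn(theta)
    return [p - alpha * g for p, g in zip(theta, grad)]


def quadratic_loss_grad(theta: Sequence[float]) -> List[float]:
    """Gradient of the illustrative loss L(theta) = sum(theta_i ** 2)."""
    return [2.0 * p for p in theta]


theta = [1.0, -2.0]
for _ in range(3):
    # alpha plays the role of the update sensitivity alpha_t (held constant here).
    theta = gradient_step(theta, quadratic_loss_grad, alpha=0.25)
```

With this illustrative loss and `alpha=0.25`, each step halves every parameter, so a larger `alpha` would move the model further per feedback sample and a smaller one would change it more cautiously.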

Here, the action history data (C8) is history data of the user U in different contexts. If the mood of the user U is taken as the context, the history data of the user U has the structure shown in FIG. 7. Each mood is associated with a small number of samples relating to different items, such as the place where the user U is, the action content of the user U, the surrounding people, and the date and time.
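One possible in-memory representation of this mood-keyed history is sketched below in Python. The field names, the mood labels, and the `group_by_mood` helper are hypothetical; FIG. 7 itself is not reproduced here, so the layout is an assumption based only on the items listed in the text.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class HistorySample:
    """One sample of the user's action history (hypothetical fields)."""
    place: str
    action: str
    companions: str  # surrounding people
    timestamp: str   # date and time, kept as a plain string for simplicity


def group_by_mood(
    records: Iterable[Tuple[str, HistorySample]],
) -> Dict[str, List[HistorySample]]:
    """Group (mood, sample) pairs so that each mood maps to its samples."""
    grouped: Dict[str, List[HistorySample]] = defaultdict(list)
    for mood, sample in records:
        grouped[mood].append(sample)
    return dict(grouped)


records = [
    ("happy", HistorySample("park", "jogging", "friend", "2018-10-01 07:00")),
    ("sad", HistorySample("office", "overtime work", "alone", "2018-10-02 21:00")),
    ("happy", HistorySample("cafe", "reading", "alone", "2018-10-03 15:00")),
]
history = group_by_mood(records)
```

Treating the mood as the dictionary key mirrors the idea of using the mood as the context, with only a few samples stored under each key.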

The AI agent (C9) corresponds to the recommendation unit 14 shown in FIG. 2 and generates the recommended action. For example, the AI agent (C9) selects the recommended action to recommend from a list L of a plurality of recommended action candidates associated with one or more items included in the history data of the user U (that is, place, action content, surrounding people, and date and time). The list L of recommended action candidates is determined based on the updated policy transmitted from the personal machine learning model (C6). When the AI agent (C9) predicts that the user U is in a bad mood, it selects a recommended action candidate that can improve that mood.
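The selection step can be sketched in Python as follows. The specification does not say how candidates are ranked, so scoring each candidate with a per-action value and picking the maximum is an assumption, as are the names `select_recommendation`, `policy_scores`, and `bad_moods`.

```python
from typing import Dict, FrozenSet, List, Optional


def select_recommendation(
    predicted_mood: str,
    candidates: List[str],
    policy_scores: Dict[str, float],
    bad_moods: FrozenSet[str] = frozenset({"sad"}),
) -> Optional[str]:
    """Pick the candidate with the highest (hypothetical) policy score
    when the predicted mood is bad; otherwise recommend nothing."""
    if predicted_mood not in bad_moods or not candidates:
        return None
    # Candidates missing from the score table are treated as score 0.0.
    return max(candidates, key=lambda action: policy_scores.get(action, 0.0))


candidate_list = ["take a walk", "call a friend"]
scores = {"take a walk": 0.4, "call a friend": 0.7}
chosen = select_recommendation("sad", candidate_list, scores)
skipped = select_recommendation("happy", candidate_list, scores)
```

Here `chosen` is the higher-scoring candidate and `skipped` is `None`, reflecting that a recommendation is issued only when a bad mood is predicted.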

[Effects of the recommendation system S]
As described above, the recommendation system S includes the general-purpose model creating unit 11, which creates, based on the moods of a plurality of users, a general-purpose machine learning model that presents recommended action content based on an acquired mood, and the personal model creating unit 12, which creates a personal machine learning model for a specific user based on the general-purpose machine learning model.

The recommendation unit 14 recommends the recommended action that the personal machine learning model outputs when mood information indicating the mood of the specific user, identified by the mood identifying unit 13, is input to the personal machine learning model. The personal model creating unit 12 updates the personal machine learning model based on feedback information indicating the satisfaction level of the specific user. Because the recommendation system S is configured in this way, a personal machine learning model that can be used to recommend actions suited to an individual can be created even when a large amount of data indicating the action history of the specific user U is not available.

Although the present invention has been described above using an embodiment, the technical scope of the present invention is not limited to the scope described in the above embodiment, and various modifications and changes are possible within the scope of its gist. For example, the specific form of distribution and integration of the devices is not limited to the above embodiment, and all or part of them may be functionally or physically distributed or integrated in arbitrary units. New embodiments resulting from any combination of a plurality of embodiments are also included in the embodiments of the present invention. The effects of a new embodiment resulting from such a combination include the effects of the original embodiments.

11 general-purpose model creating unit
12 personal model creating unit
13 mood identifying unit
14 recommendation unit
15 user terminal
16 setting reception unit

Claims

A recommendation system comprising:
a general-purpose model creating unit that creates a general-purpose machine learning model for presenting recommended action content to a user, based on a data set indicating a relationship between the action contents of a plurality of persons and their moods;
a personal model creating unit that creates a personal machine learning model for a specific user based on the general-purpose machine learning model;
a mood identifying unit that identifies a mood of the specific user; and
a recommendation unit that recommends a recommended action output from the personal machine learning model in response to input, to the personal machine learning model, of mood information indicating the mood of the specific user identified by the mood identifying unit,
wherein the mood identifying unit inputs, to the personal model creating unit, feedback information indicating a satisfaction level of the specific user after the recommendation unit has recommended the recommended action to the specific user, and
the personal model creating unit updates the personal machine learning model based on the feedback information.

The recommendation system according to claim 1, wherein
the personal model creating unit updates the personal machine learning model each time the mood identifying unit generates a predetermined number of pieces of the feedback information.

The recommendation system according to claim 1, wherein
the personal model creating unit updates the personal machine learning model based on the feedback information indicating that the mood of the specific user has changed.

The recommendation system according to any one of claims 1 to 3, wherein
the feedback information includes information indicating the mood of the specific user before the specific user executes the recommended action, the content of the recommended action recommended to the specific user by the recommendation unit, and the mood of the specific user after the specific user has executed the recommended action, and
the personal model creating unit updates the personal machine learning model based on the relationship, indicated by the feedback information, among the mood of the specific user before the recommendation unit recommends the recommended action, the recommended action recommended by the recommendation unit, and the mood of the specific user after the specific user has executed the recommended action.

The recommendation system according to claim 4, further comprising a setting reception unit that receives a setting of an update sensitivity, the update sensitivity being an index indicating the degree to which the personal machine learning model is changed relative to the magnitude of the difference between the expected mood change caused by the user executing the recommended action and the actual mood change of the user, wherein
the personal model creating unit updates the personal machine learning model based on the magnitude of a value obtained by multiplying the difference by the update sensitivity.

The recommendation system according to any one of claims 1 to 5, wherein
the mood identifying unit identifies the mood information by estimating the mood of the specific user based on the action history of the specific user.

The recommendation system according to any one of claims 1 to 6, wherein
the general-purpose model creating unit creates the general-purpose machine learning model by using GAIL, and
the personal model creating unit creates the personal machine learning model without using GAIL.

A recommendation method comprising:
creating a general-purpose machine learning model for presenting recommended action content to a user, based on a data set indicating a relationship between the action contents of a plurality of persons and their moods;
creating a personal machine learning model for a specific user based on the general-purpose machine learning model;
acquiring mood information indicating a mood of the specific user;
recommending a recommended action output from the personal machine learning model in response to input of the acquired mood information of the specific user to the personal machine learning model; and
updating the personal machine learning model based on feedback information indicating a satisfaction level of the specific user, acquired after the recommended action has been recommended to the specific user.