JP6250852B1

JP6250852B1 - Determination program, determination apparatus, and determination method

Info

Publication number: JP6250852B1
Application number: JP2017051796A
Authority: JP
Inventors: 純也笹本
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2017-03-16
Filing date: 2017-03-16
Publication date: 2017-12-20
Anticipated expiration: 2037-03-16
Also published as: JP2018155882A

Abstract

[Problem] To perform session management with high usability. [Solution] A determination program according to the present application causes a computer to execute a collection procedure and a determination procedure. The collection procedure collects ambient environmental sounds. The determination procedure determines, based on the continuity of the environmental sounds collected by the collection procedure, whether a predetermined action of the user is included in one session. For example, the collection procedure collects ambient environmental sound that is emitted steadily, and the determination procedure determines that a predetermined action performed by the user while that steady environmental sound is being collected is included in one session. [Selected drawing] FIG. 1

Description

The present invention relates to a determination program, a determination apparatus, and a determination method.

In communication over the Internet and similar networks, a series of user actions (a session) is acquired by a server or the like as session information. For example, the server refers to the user's cookie or the like and determines that a series of actions on a certain shopping site is being performed by a single user (in other words, by the terminal device operated by that user).

As a technique related to sessions, there is known, for example, a technique for a virtual private network connecting a client and a server that prevents exhaustion of session-information resources at low cost while ensuring stable communication.

JP 2005-110302 A

When a user makes a purchase on a shopping site or the like using voice recognition technology, it can be difficult to have the series of actions intended by the user recognized as one session. For example, in voice recognition, when the voice uttered by the user breaks off partway, the session may be closed on the assumption that the user's action has ended. That is, unless the user utters the intended speech continuously in one go, the user may be asked to repeat the utterance or to log in to the shopping site again. In that case, the user has to go to the trouble of making extra, unintended voice input. Thus, at present, session management with high usability is not being performed when voice recognition technology is used over a network.

The present application has been made in view of the above, and its object is to provide a determination program, a determination apparatus, and a determination method capable of performing session management with high usability.

The determination program according to the present application causes a computer to execute a collection procedure of collecting ambient environmental sounds, and a determination procedure of determining, based on the continuity of the environmental sounds collected by the collection procedure, whether a predetermined action of a user is included in one session.

According to one aspect of the embodiment, session management with high usability can be performed.

FIG. 1 is a diagram illustrating an example of a determination process according to the embodiment.
FIG. 2 is a diagram illustrating a configuration example of a determination processing system according to the embodiment.
FIG. 3 is a diagram illustrating a configuration example of the determination apparatus according to the embodiment.
FIG. 4 is a diagram illustrating an example of the session storage unit according to the embodiment.
FIG. 5 is a diagram illustrating an example of an action table according to the embodiment.
FIG. 6 is a diagram illustrating an example of an effect table according to the embodiment.
FIG. 7 is a diagram illustrating a configuration example of a user terminal according to the embodiment.
FIG. 8 is a flowchart illustrating a processing procedure according to the embodiment.
FIG. 9 is a diagram illustrating a configuration example of a determination apparatus according to a modification.
FIG. 10 is a diagram illustrating an example of a registration table according to the modification.
FIG. 11 is a diagram illustrating an example of a collation table according to the modification.
FIG. 12 is a flowchart illustrating a processing procedure according to the modification.
FIG. 13 is a hardware configuration diagram illustrating an example of a computer that realizes the functions of the determination apparatus.

Hereinafter, modes for carrying out the determination program, determination apparatus, and determination method according to the present application (hereinafter referred to as “embodiments”) will be described in detail with reference to the drawings. The determination program, determination apparatus, and determination method according to the present application are not limited by these embodiments. The embodiments can be combined as appropriate to the extent that their processing contents do not contradict each other. In the following embodiments, the same parts are denoted by the same reference numerals, and redundant description is omitted.

[1. Example of determination process]
First, an example of the determination process according to the embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram illustrating an example of a determination process according to the embodiment. FIG. 1 illustrates the flow in which the determination process according to the embodiment is performed by the determination apparatus 100, a server apparatus that operates according to the determination program of the present application. More specifically, FIG. 1 shows an example in which the determination apparatus 100 according to the present application determines whether a predetermined action of a user on the network is included in one session.

In the embodiment, a session refers to a series of user actions on the network. For example, a session means a period during which the user acts with a predetermined purpose (intention). A session also means that the series of actions is being performed by the same user.

For example, a session may be defined as a preset period (for example, 30 minutes), or as the period from when the user logs in to a shopping site until the user logs off. A session may also be defined as the period from when the user logs in to the shopping site until a predetermined action is performed (for example, completing the order procedure for a product). Alternatively, a session may be defined so that it is determined to be continuing if the next action is performed before a predetermined time (for example, 5 minutes) has elapsed since the previous action, and determined to have ended if no next action is performed in that time. Specifically, the determination apparatus 100 may determine that the session is continuing while the user keeps operating the user terminal 10 without leaving an interval of 5 minutes or more between operations. During a session, if 5 minutes or more pass after the user's last operation without any further operation, the determination apparatus 100 may regard the session as ended and start a new session.
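The inactivity-based definition above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_into_sessions` and the representation of actions as plain timestamps are assumptions, and only the 5-minute gap comes from the text.

```python
from datetime import datetime, timedelta


def split_into_sessions(timestamps, max_gap=timedelta(minutes=5)):
    """Group action timestamps into sessions.

    An action that follows the previous one by less than max_gap stays in
    the same session; a gap of max_gap or more starts a new session.
    (Hypothetical helper; only the 5-minute example is from the text.)
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < max_gap:
            sessions[-1].append(ts)   # session is continuing
        else:
            sessions.append([ts])     # previous session ended; start a new one
    return sessions


if __name__ == "__main__":
    t0 = datetime(2017, 3, 16, 10, 0)
    actions = [t0, t0 + timedelta(minutes=2), t0 + timedelta(minutes=7)]
    for session in split_into_sessions(actions):
        print([ts.strftime("%H:%M") for ts in session])
```

With a purely time-based rule like this, a pause of exactly 5 minutes already splits the session, which is the limitation the environmental-sound approach is meant to relax.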

In the embodiment, the determination apparatus 100 recognizes voice transmitted from a user and provides a predetermined service. In the example of FIG. 1, the determination apparatus 100 provides a shopping service that recognizes the transmitted voice and accepts an order for a given product. Such a service may be provided not by the determination apparatus 100 itself but by a separate server (for example, a web server) that communicates with the determination apparatus 100.

For example, when the voice transmitted from the user includes a product name and a word expressing an intent to order, the determination apparatus 100 recognizes that the user has ordered that product. Specifically, when the determination apparatus 100 recognizes the voice as “Buy rice.”, it recognizes “rice” as the product name and “buy” as the word expressing the intent to order. That is, the determination apparatus 100 recognizes that the user has ordered rice in the shopping service, and carries out the procedure for the order. Since various known techniques can be used for voice recognition, its description is omitted.

In general, for an order that does not use voice recognition, the server that provides the service determines the user's session based on predetermined identification information. For example, the server acquires the cookie information of the user U1 who has accessed the shopping page, and determines the session based on that cookie information. The server can thereby determine that actions such as logging in to the shopping site, putting rice in the cart, and ordering the rice in the cart are a series of actions performed by the user U1.

On the other hand, when an order is placed using voice recognition, the server that provides the service may be unable to determine whether a predetermined action of the user is included in one session. For example, suppose that after saying “rice”, the user does something else, so that the voice breaks off for a predetermined time, and that the user then utters “Buy.” after that time. In this case, it is difficult for the server to determine whether the earlier utterance and the later utterance are included in one session. Specifically, the server cannot identify the product name to which the command “Buy.” applies, so the user has to make the voice input again. In some cases, the session is determined to have ended and the user is asked to log in again. Thus, when sessions cannot be determined appropriately, the usability of the service may be degraded.

The determination apparatus 100 according to the embodiment therefore uses the determination process according to the embodiment to determine whether a predetermined action of the user is included in one session. Specifically, the determination apparatus 100 collects environmental sounds around the user via the user terminal 10, the terminal device used by the user. The determination apparatus 100 then determines, based on the continuity of the collected environmental sounds, whether the predetermined action of the user is included in one session. Because the determination apparatus 100 thereby determines sessions appropriately according to the user's situation, it can perform session management with high usability. The determination process according to the embodiment is described below step by step with reference to FIG. 1.

The user terminal 10 shown in FIG. 1 is an information processing terminal used by a user. In FIG. 1, the user terminal 10 is, for example, a smartphone. In the example of FIG. 1, the user terminal 10 is used by the user U1, who is an example of a user. In the following, “user” may be read as “user terminal 10”. For example, the statement “the user U1 transmits voice” may in practice mean “the user terminal 10 used by the user U1 transmits voice”.

In the example of FIG. 1, the user terminal 10 detects ambient environmental sounds and voice input from the user U1 (step S01). For example, the user terminal 10 detects these sounds with a built-in microphone. In the following, information about sounds such as environmental sounds and voice may be collectively referred to as sound information.

In the embodiment, environmental sound refers to any sound other than voice that the user U1 intentionally inputs to the user terminal 10. One example of environmental sound is sound that can form the background noise of the place where the user U1 is based. Specifically, environmental sound is the operating sound of the air conditioner 60 or the sound output by the television 70 in the home 50 of the user U1. Environmental sound may also be household sound in the home 50, such as water running from a tap or a door opening and closing. Environmental sound may be sound from the natural environment, such as rain or wind outside that can be detected inside the home 50. Environmental sound may also be artificially produced sound, such as the sound of the user U1 clicking a mouse or pressing keys on a keyboard.

Voice input from the user U1 refers to voice that the user U1 intentionally inputs to the user terminal 10. In the example of FIG. 1, the voice input from the user U1 is, for example, voice uttered in order to use, via the user terminal 10, the shopping service provided by the determination apparatus 100. The voice input from the user U1 may also be a spoken instruction to the user terminal 10. For example, the voice input from the user U1 may be voice for starting up the user terminal 10, voice instructing access to a service provided by the determination apparatus 100, or voice for being authenticated by the user terminal 10 (for example, by voiceprint authentication) in order to log in to the service.

The user terminal 10 transmits the detected sound information to the determination apparatus 100 via the network (step S02). The determination apparatus 100 collects the transmitted sound information (step S03). In the example of FIG. 1, the determination apparatus 100 collects, together with the environmental sounds around the user terminal 10, the sound information “rice...”, which is voice input from the user U1.

Assume that a predetermined time elapses after step S03 (step S04). For example, assume that after uttering “rice...”, the user U1 looks at a memo stored on the user terminal 10 to check whether there is anything else to buy, and that a predetermined time (for example, several minutes) passes as a result. Assume also that during this time the user U1 does not speak and is reading the memo displayed on the user terminal 10.

After step S04, the user terminal 10 again detects ambient environmental sounds and voice input from the user U1 (step S05). In step S05, the user terminal 10 detects the voice input “Buy.” from the user U1. Assume also that, as throughout steps S01 to S05, the user terminal 10 detects environmental sounds such as the operating sound of the air conditioner 60 and the sound output from the television 70.

As in step S02, the user terminal 10 transmits the detected sound information to the determination apparatus 100 (step S06). The determination apparatus 100 collects the transmitted sound information.

The determination apparatus 100 then determines, based on the continuity of the collected environmental sounds, whether a predetermined action of the user is included in one session. Specifically, the determination apparatus 100 determines that a series of actions performed under continuous environmental sound constitutes one session (step S07). In the example of FIG. 1, when there is continuity between the environmental sound detected in step S01 and the environmental sound detected in step S05 after step S04, the determination apparatus 100 determines that the voice input “rice...” uttered in step S01 and the voice input “Buy.” uttered in step S05 are included in one session.

When the determination apparatus 100 determines that the voice inputs are included in one session, it recognizes “Buy.” as a word expressing the intent to order, and recognizes “rice...”, uttered in step S01, as the name of the product to be ordered. That is, even when a predetermined time elapses between the voice input in step S01 and the voice input in step S05 and no input of any kind is made to the determination apparatus 100 in the meantime, the determination apparatus 100 maintains the session of the user U1 based on the continuity of the environmental sound.
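The pairing of a delayed order word with an earlier product name can be sketched as follows. This is an illustrative sketch under stated assumptions: the `Utterance` type, the `ORDER_WORDS` set, and the `resolve_orders` function are hypothetical, utterances are assumed to be already transcribed, and the continuity decision is assumed to be supplied as a boolean by some separate check.

```python
from dataclasses import dataclass

# Hypothetical vocabulary of words that express an intent to order.
ORDER_WORDS = {"buy"}


@dataclass
class Utterance:
    text: str
    # True if the environmental sound stayed continuous between the previous
    # utterance and this one (assumed to be decided elsewhere).
    ambient_continuous_since_previous: bool


def resolve_orders(utterances):
    """Return product names ordered within a single session.

    A product name is remembered across a pause only while the
    environmental sound remains continuous; otherwise it is discarded.
    """
    pending_product = None
    orders = []
    for u in utterances:
        if not u.ambient_continuous_since_previous:
            pending_product = None            # session boundary
        word = u.text.strip().lower().rstrip(".")
        if word in ORDER_WORDS:
            if pending_product is not None:
                orders.append(pending_product)
                pending_product = None
        else:
            pending_product = word            # treat as a product name
    return orders


if __name__ == "__main__":
    same_session = [Utterance("rice...", True), Utterance("Buy.", True)]
    print(resolve_orders(same_session))
```

If the second utterance arrives after the environmental sound has changed, the remembered product name is dropped and no order is produced, which corresponds to asking the user to repeat the input.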

The continuity of environmental sound is described in detail later. As an example, the determination apparatus 100 collects, every few seconds, the steadily emitted ambient environmental sound detected by the user terminal 10, and determines that the continuity of the environmental sound is maintained when the sound pressure, frequency, waveform, or the like of the collected environmental sound does not exceed a predetermined threshold. The determination apparatus 100 may also estimate the direction from which a sound is emitted, the distance to the sound source, and the like, and determine that the continuity of the environmental sound is maintained when the estimated information does not exceed a predetermined threshold.
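A threshold-based continuity check of this kind might look like the sketch below. The text does not specify the exact rule, so this sketch assumes one reading: continuity holds when the change in each feature between consecutive samples stays within a threshold. The `AmbientSample` type, the two features chosen, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AmbientSample:
    level_db: float      # sound pressure level of the ambient sound
    dominant_hz: float   # dominant frequency of the ambient sound


def is_continuous(samples, max_level_jump_db=6.0, max_freq_jump_hz=50.0):
    """Return True if no feature changes by more than its threshold
    between consecutive samples (assumed interpretation of continuity)."""
    for prev, cur in zip(samples, samples[1:]):
        if abs(cur.level_db - prev.level_db) > max_level_jump_db:
            return False
        if abs(cur.dominant_hz - prev.dominant_hz) > max_freq_jump_hz:
            return False
    return True


if __name__ == "__main__":
    # Air-conditioner hum sampled every few seconds, then a different room.
    steady = [AmbientSample(42.0, 120.0), AmbientSample(43.5, 118.0)]
    moved = steady + [AmbientSample(60.0, 900.0)]
    print(is_continuous(steady), is_continuous(moved))
```

The same pattern extends to the other quantities the text mentions, such as estimated direction or distance to the sound source, by adding fields and thresholds.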

As described above, the determination apparatus 100 according to the embodiment collects ambient environmental sounds and determines, based on the continuity of the collected environmental sounds, whether a predetermined action of the user U1 is included in one session.

That is, based on the environmental sound, the determination apparatus 100 determines that multiple actions directed at a service using voice recognition are a series of actions performed by the same user, the user U1. Specifically, even when the voice transmitted from the user U1 breaks off partway, if the environmental sound is continuous, the determination apparatus 100 determines that the user U1 is not moving elsewhere or about to do something else, but is merely pausing before further input to the user terminal 10. In addition, when the environmental sound remains continuous after the user U1 logs in, the determination apparatus 100 determines that the user of the user terminal 10 has not changed. In this way, by using the environmental sound, the determination apparatus 100 can keep the session alive even if the user U1 does not utter the intended speech continuously in one go. The user U1 is thereby spared the trouble of logging in again or repeating voice input because the session was cut off. As a result, the determination apparatus 100 can perform session management with high usability when the user U1 uses voice recognition technology over a network.

Furthermore, because the determination apparatus 100 uses for its processing the environmental sound that the user terminal 10 detects as a matter of course alongside the voice input of the user U1, neither the user U1 nor the user terminal 10 needs to perform any special processing to maintain the session. That is, because the determination apparatus 100 works with the environmental sound that is inevitably collected when voice from the user U1 is accepted, it can perform session management with high usability without placing any extra burden on the user U1.

The determination apparatus 100 may also combine environmental sound with existing session management processing. For example, even when the environmental sound is continuous, the determination apparatus 100 may end the session once a relatively long time (for example, 1 hour) has elapsed in the use of the service. It is also conceivable that the user U1 makes voice input to the user terminal 10 while moving. In that case, although the environmental sound is not continuous, the determination apparatus 100 may give priority to the continuity of the voice input of the user U1, if that voice input is continuing, and maintain the session.
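One way to combine these rules is sketched below. The function name `session_continues` and its parameters are hypothetical; only the 1-hour example and the priority of continuing voice input over broken environmental sound come from the text. Whether the time limit should also apply while voice input is continuing is not stated, so applying it first here is an assumption.

```python
def session_continues(ambient_continuous, voice_continuous,
                      elapsed_seconds, max_session_seconds=3600):
    """Decide whether the current session should be kept open.

    Assumed rule order (not specified in the text):
    1. A hard time limit ends the session regardless of sound.
    2. Continuing voice input keeps the session even if the
       environmental sound has changed (user speaking while moving).
    3. Otherwise, continuity of the environmental sound decides.
    """
    if elapsed_seconds >= max_session_seconds:
        return False
    if voice_continuous:
        return True
    return ambient_continuous


if __name__ == "__main__":
    # User walking between rooms while still speaking, 10 minutes in.
    print(session_continues(ambient_continuous=False,
                            voice_continuous=True,
                            elapsed_seconds=600))
```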

The determination apparatus 100 may also use the continuity of environmental sound to measure the effect of content. For example, the user U1 may view content (for example, video advertising content or a television commercial) via the user terminal 10 or the television 70. The determination apparatus 100 then collects information indicating that, after viewing the content and while the environmental sound remained continuous, the user U1 took some action relating to that content. For example, the user U1 may murmur “That product looks good” about the product advertised by the content. The determination apparatus 100 collects such favorable remarks as voice input, together with the environmental sound detected by the user terminal 10. The determination apparatus 100 then associates the content with the favorable remark of the user U1, using, for example, the product name as a key. In this case, the determination apparatus 100 determines that the content was effective for the user U1.
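The association by product name can be sketched as a simple join. Everything here is a hypothetical illustration: the `Content` and `Reaction` types, the assumption that each remark has already been tagged with a product name and a favorable/unfavorable label, and the function `effective_content_ids` are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Content:
    content_id: str
    product_name: str


@dataclass
class Reaction:
    product_name: str                       # product the remark refers to
    favorable: bool                         # True for a favorable remark
    ambient_continuous_since_viewing: bool  # continuity since the content was viewed


def effective_content_ids(contents, reactions):
    """Return IDs of contents that a favorable remark can be tied to,
    using the product name as the key and requiring that the
    environmental sound stayed continuous since viewing."""
    favorable_products = {
        r.product_name
        for r in reactions
        if r.favorable and r.ambient_continuous_since_viewing
    }
    return [c.content_id for c in contents
            if c.product_name in favorable_products]


if __name__ == "__main__":
    ads = [Content("ad-1", "rice"), Content("ad-2", "tea")]
    remarks = [Reaction("rice", True, True), Reaction("tea", True, False)]
    print(effective_content_ids(ads, remarks))
```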

In general, because it is difficult to prove a causal relationship between content and a user's behavior, it is difficult to measure whether content was truly effective for the user U1. Because the determination apparatus 100 collects information indicating that the user took some action regarding the content while the environmental sound remained continuous, it can estimate with high confidence that delivery of the content motivated the user U1 to take that action.

Although the example of FIG. 1 shows the determination apparatus 100 performing the determination process according to the embodiment, more precisely the determination process is carried out by the determination program executed within the determination apparatus 100. The determination apparatus 100 and related components that execute the above determination process according to this determination program are described in detail below.

[2. Configuration of determination processing system]
Next, the configuration of the determination processing system 1, which includes the determination apparatus 100 according to the embodiment, will be described with reference to FIG. 2. FIG. 2 is a diagram illustrating a configuration example of the determination processing system 1 according to the embodiment. As illustrated in FIG. 2, the determination processing system 1 according to the embodiment includes a user terminal 10 and a determination apparatus 100. These devices are communicably connected via a network N by wire or wirelessly. The determination processing system 1 may include a plurality of user terminals 10. That is, a user may own and use not just one user terminal 10 but several.

The user terminal 10 is an information processing terminal such as a desktop PC (Personal Computer), a notebook PC, a tablet terminal, a mobile phone including a smartphone, or a PDA (Personal Digital Assistant). The user terminal 10 also includes wearable devices, which are glasses-type or watch-type information processing terminals. Furthermore, the user terminal 10 may include various smart devices having an information processing function. For example, the user terminal 10 may include smart home appliances such as a TV (television), a refrigerator, or a vacuum cleaner, smart vehicles such as automobiles, drones, and home robots. The user terminal 10 may also include any device that detects voice input from the user and performs predetermined processing. For example, the user terminal 10 may be a speaker or a lighting device that operates in response to detected voice.

The user terminal 10 includes a sound collection device such as a microphone, and detects voice input from the user and environmental sounds around the user terminal 10. Note that the user terminal 10 need not have a built-in microphone; it may instead detect the user's voice or environmental sounds by being connected to a microphone through wired or wireless communication, or by receiving sound information as data.

The determination apparatus 100 is a server apparatus that collects environmental sounds around the user terminal 10 and determines, based on the continuity of the collected environmental sounds, whether a predetermined action of the user is included in one session. The determination apparatus 100 executes the determination process according to the embodiment by running the determination program according to the embodiment internally.

[3. Configuration of the determination apparatus]
Next, the configuration of the determination apparatus 100 according to the embodiment will be described with reference to FIG. 3. FIG. 3 is a diagram illustrating a configuration example of the determination apparatus 100 according to the embodiment. As illustrated in FIG. 3, the determination apparatus 100 includes a communication unit 110, a storage unit 120, and a control unit 130. The determination apparatus 100 may also include an input unit (for example, a keyboard and a mouse) that receives various operations from an administrator or other person who uses the determination apparatus 100, and a display unit (for example, a liquid crystal display) for displaying various types of information.

(About the communication unit 110)
The communication unit 110 is realized by, for example, a NIC (Network Interface Card). The communication unit 110 is connected to the network N by wire or wirelessly, and transmits and receives information to and from the user terminal 10 via the network N.

(About the storage unit 120)
The storage unit 120 is realized by, for example, a semiconductor memory element such as a RAM or a flash memory, or a storage device such as a hard disk or an optical disk. As illustrated in FIG. 3, the storage unit 120 includes a session storage unit 121 and an effect measurement storage unit 122. Each storage unit included in the storage unit 120 is described in order below. Descriptions of overlapping items are omitted as appropriate.

(About the session storage unit 121)
The session storage unit 121 stores information related to sessions. FIG. 4 is a diagram illustrating an example of the session storage unit 121 according to the embodiment. In the example illustrated in FIG. 4, the session storage unit 121 has the items "session ID", "user ID", "environmental sound information", and "behavior information". The behavior information has the sub-items "input means", "collected data", "content", and "date and time".

"Session ID" is identification information for identifying a session. "User ID" is identification information for identifying a user. In the embodiment, identification information such as that shown in FIG. 4 may be used as a reference symbol. For example, the user identified by the user ID "U1" may be referred to as "user U1".

"Environmental sound information" indicates information on the collected environmental sound. "Behavior information" indicates a predetermined action of the user in the session. "Input means" indicates the means the user used to input information to the user terminal 10 (voice input, an operation on the touch panel, and so on). "Collected data" indicates the specific data collected as information indicating the user's action. "Content" indicates the content of the user's action. "Date and time" indicates the date and time at which the user's action was performed.

In the example shown in FIG. 4, the information stored in each item is shown conceptually, as in "environmental sound data #1", "voice data #1", and "time #1". In practice, an audio file in an arbitrary format, a value indicating a date and time, and the like are stored.

That is, the example data illustrated in FIG. 4 indicates that the session SE01 identified by the session ID "SE01" contains information on a series of actions performed by the user U1 identified by the user ID "U1". It also indicates that, in session SE01, when "environmental sound data #1" was collected as environmental sound information, the user U1 performed a "login" using "voice" as the input means, that the data collected at that time was "voice data #1", and that the login took place at "time #1".

The user U1 subsequently made a "request" using "voice", and the data collected at that time was "voice data #2". As shown in FIG. 1, for example, this action is stored as a request to the service (that is, to the determination apparatus 100) because the user U1 uttered "Rice...". The user U1 then performed "web browsing" using the "touch panel", and the data collected at that time was "operation data #1". As shown in FIG. 1, for example, this action is stored as an action of the same session SE01 when it is performed after the user U1 has made a request and while the environmental sound remains continuous. The user U1 then made another "request" using "voice", and the data collected at that time was "voice data #3". As shown in FIG. 1, for example, when this action is performed while the environmental sound remains continuous, it is stored as a request to the service because the user U1 uttered "Buy it." This action is also stored as an action of the same session SE01.
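
The session SE01 described above can be pictured as a simple record structure. The following is only a minimal sketch: the class and field names (`ActionRecord`, `SessionRecord`, and so on) are hypothetical, the values are the conceptual placeholders of FIG. 4, and placeholder numbers that the text does not state (for example the environmental sound data and times after #1) are assumed for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ActionRecord:
    """One row of the "behavior information" columns (hypothetical layout)."""
    environmental_sound: str  # reference to the environmental sound collected with the action
    input_means: str          # "voice" or "touch panel"
    collected_data: str       # reference to the voice or operation data
    content: str              # "login", "request", "web browsing", ...
    date_time: str            # placeholder; a real store would hold a timestamp


@dataclass
class SessionRecord:
    """All actions judged to belong to one session."""
    session_id: str
    user_id: str
    actions: List[ActionRecord] = field(default_factory=list)


# Session SE01 of user U1. Placeholder numbers after #1 for the
# environmental sound and the time are assumed, not taken from FIG. 4.
session = SessionRecord(session_id="SE01", user_id="U1")
session.actions.extend([
    ActionRecord("environmental sound data #1", "voice", "voice data #1", "login", "time #1"),
    ActionRecord("environmental sound data #2", "voice", "voice data #2", "request", "time #2"),
    ActionRecord("environmental sound data #3", "touch panel", "operation data #1", "web browsing", "time #3"),
    ActionRecord("environmental sound data #4", "voice", "voice data #3", "request", "time #4"),
])
```

Keeping the actions as a list under one session record mirrors the point of the example: a touch-panel action and voice requests end up in the same session as long as the environmental sound is judged continuous.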

For the sake of explanation, the example of FIG. 4 shows environmental sound information being stored each time a predetermined action of the user is performed, but the determination apparatus 100 may instead collect environmental sound information at predetermined intervals (for example, every 3 seconds). The determination apparatus 100 may then keep one session going for as long as it determines that the continuity of the environmental sound is maintained, even if no action of the user is stored during that time.

(About the effect measurement storage unit 122)
The effect measurement storage unit 122 stores information related to measuring the effect of content. The effect measurement storage unit 122 has a behavior table 123 and an effect table 124 as data tables.

(About the behavior table 123)
The behavior table 123 stores information on content distribution and on user behavior after the content is distributed. FIG. 5 is a diagram illustrating an example of the behavior table 123 according to the embodiment. In the example illustrated in FIG. 5, the behavior table 123 has the items "session ID", "user ID", "environmental sound information", "distribution information", and "behavior information". The distribution information item has the sub-items "content ID", "distribution date and time", and "media". The behavior information item has the sub-items "input means", "collected data", "related content ID", and "content".

"Distribution information" indicates information on the content distributed to the user. "Content ID" indicates identification information for identifying content. "Distribution date and time" indicates the date and time at which the content was distributed to the user. "Media" indicates the medium through which the content was distributed. The medium may be, for example, a web page displayed on the user terminal 10, or a television, radio, or the like other than the user terminal 10. For example, when collecting the user's behavior history transmitted from the user terminal 10, the determination apparatus 100 collects information on the distributed content and on the date and time at which it was distributed. When the medium is a television or radio, the determination apparatus 100 may acquire in advance, from an external server, distribution information on the content to be distributed on the television or radio, or may identify the content based on the collected sound information.

"Behavior information" indicates information on the user's behavior in response to the content. "Related content ID" indicates identification information of the content presumed to be related to the user's action. The related content ID and the content ID store common identification information. "Content" indicates the content of the user's action.

The determination apparatus 100 uses various methods to determine whether a user's action is related to distributed content. For example, when the user murmurs a product name aloud, the determination apparatus 100 recognizes the product name and identifies the content corresponding to it (for example, advertising content that promotes the product). The determination apparatus 100 then determines that the user's tweet (murmur) is related to that content. Likewise, when the user orders a product and content related to that product (for example, a television commercial) was distributed to the user in the same session, the determination apparatus 100 determines that the user's ordering action is related to that content.
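
The second rule above (an order is related to content about the same product that was distributed in the same session) can be sketched as follows. This is an illustrative sketch only: the `Delivery` and `Order` types, the `product` field, the numeric timestamps, and the choice of the most recent matching delivery are all assumptions that the description does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Delivery:
    session_id: str
    content_id: str
    product: str         # product promoted by the content (hypothetical field)
    delivered_at: float  # seconds since some reference point (assumed)


@dataclass
class Order:
    session_id: str
    product: str
    ordered_at: float


def related_content(order: Order, deliveries: List[Delivery]) -> Optional[str]:
    """Return the ID of content for the ordered product that was delivered
    earlier in the same session, or None if there is no such content."""
    candidates = [
        d for d in deliveries
        if d.session_id == order.session_id
        and d.product == order.product
        and d.delivered_at <= order.ordered_at
    ]
    if not candidates:
        return None
    # Assumption: when several deliveries match, take the most recent one.
    return max(candidates, key=lambda d: d.delivered_at).content_id


deliveries = [
    Delivery("SE02", "C01", "rice", 10.0),
    Delivery("SE02", "C02", "tea", 20.0),
]
same_session = related_content(Order("SE02", "tea", 30.0), deliveries)
other_session = related_content(Order("SE03", "tea", 30.0), deliveries)
```

Here `same_session` is `"C02"`, while `other_session` is `None` because the order falls outside the session in which the content was delivered.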

That is, the example data shown in FIG. 5 indicates that, in session SE02, the user U1 received the content C01 identified by the content ID "C01" via "TV" at time #21, when environmental sound data #21 was collected. It also indicates that the user U1 received the content C02 via the television at time #22, when environmental sound data #22 was collected.

The example data shown in FIG. 5 further indicates that, in session SE02, the user U1 made a tweet recorded as voice data #21, and that the content related to the tweet is the content C01. It also indicates that the user U1 made a request (for example, a product purchase) to the determination apparatus 100, recorded as voice data #22, and that the content related to the request is the content C02.

In the above example, tweets and requests are shown as user actions, but user actions are not limited to these. For example, the determination apparatus 100 may collect various other user actions, such as accessing the web page of a product advertised by the content, posting about the product on an SNS (Social Networking Service), selecting (touching or clicking) the content, or adding the product to a wish list.

(About the effect table 124)
The effect table 124 stores information on the effect of content. FIG. 6 is a diagram illustrating an example of the effect table 124 according to the embodiment. In the example illustrated in FIG. 6, the effect table 124 has the items "content ID", "tweet rate", and "CVR (Conversion Rate)".

"Tweet rate" indicates the proportion of content distributed to users in response to which the user made a tweet. "CVR" indicates the proportion of content distributed to users that brought some benefit to the content provider. A conversion is, for example, an action in which the user purchases a product advertised by the content, submits an application, requests information materials, or accesses the content provider's web page.

That is, the example data illustrated in FIG. 6 indicates that the tweet rate of the content C01 is "tweet rate #1" and that its CVR is "CVR #1".
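
One way the two rates could be derived from behavior-table rows is sketched below. The description does not give formulas, so the definitions used here are assumptions: each rate is taken as the number of deliveries of a content item that were followed by a related tweet (or conversion) divided by the total number of deliveries of that item. The row format and the function name `effect_table` are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def effect_table(rows: Iterable[Tuple[str, Optional[str]]]) -> Dict[str, Dict[str, float]]:
    """Build a per-content effect table.

    Each row is (content_id, reaction), where reaction is "tweet",
    "conversion", or None when the delivery drew no related action.
    """
    delivered = defaultdict(int)
    tweets = defaultdict(int)
    conversions = defaultdict(int)
    for content_id, reaction in rows:
        delivered[content_id] += 1
        if reaction == "tweet":
            tweets[content_id] += 1
        elif reaction == "conversion":
            conversions[content_id] += 1
    return {
        cid: {"tweet_rate": tweets[cid] / n, "cvr": conversions[cid] / n}
        for cid, n in delivered.items()
    }


rows = [
    ("C01", "tweet"), ("C01", None), ("C01", "conversion"), ("C01", None),
    ("C02", "conversion"), ("C02", None),
]
table = effect_table(rows)
```

With these made-up rows, content C01 has a tweet rate of 0.25 and a CVR of 0.25, and content C02 has a tweet rate of 0.0 and a CVR of 0.5.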

(About the control unit 130)
Returning to FIG. 3, the description continues. The control unit 130 is a controller and is realized by, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit) executing various programs (for example, the determination program) stored in a storage device inside the determination apparatus 100, using a RAM (Random Access Memory) as a work area. Alternatively, the control unit 130 is a controller realized by, for example, an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

As illustrated in FIG. 3, the control unit 130 includes a collection unit 131, an extraction unit 132, a determination unit 133, and a measurement unit 134, and realizes or executes the information processing functions and operations described below. The internal configuration of the control unit 130 is not limited to the configuration illustrated in FIG. 3, and may be any other configuration that performs the information processing described later. The connection relationship between the processing units of the control unit 130 is likewise not limited to that illustrated in FIG. 3. Each processing unit of the control unit 130 corresponds to a procedure executed by the determination program according to the present application. For example, the process executed by the collection unit 131 corresponds to the collection procedure that the determination program causes the determination apparatus 100 to execute. Similarly, the process executed by the extraction unit 132 corresponds to the extraction procedure, the process executed by the determination unit 133 corresponds to the determination procedure, and the process executed by the measurement unit 134 corresponds to the measurement procedure.

(About the collection unit 131)
The collection unit 131 collects various kinds of information. For example, the collection unit 131 collects environmental sounds around the user and the user terminal 10.

Specifically, the collection unit 131 collects, via the network N, the environmental sounds detected by the user terminal 10. The collection unit 131 may collect sound information transmitted by the user terminal 10 and obtain the environmental sounds contained in it, or it may crawl the user terminal 10 to collect the sound information held in the user terminal 10 and obtain the environmental sounds contained in that information.

The collection unit 131 collects ambient environmental sounds that are emitted steadily. For example, the collection unit 131 collects ambient environmental sounds for which at least one of the following is steady: the direction from which the sound comes, the distance to the sound source, the waveform of the collected sound, and the volume (sound pressure) of the collected sound. More specifically, the collection unit 131 collects, as environmental sounds, the operating sound of the air conditioner 60, the sound output by the television 70, and the like in the user's home or a similar place.

For example, the collection unit 131 continuously collects environmental sounds at predetermined intervals. For example, the collection unit 131 keeps collecting environmental sounds at an interval set by the administrator of the determination apparatus 100 (for example, every 3 seconds or every 5 seconds). In this case, the collected environmental sound need not be the sound information of a single instant; it may be sound information lasting 3 seconds or 5 seconds.
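
Collecting sound in fixed-length units, as described above, can be sketched as cutting a stream of audio samples into consecutive windows. This is a minimal sketch under stated assumptions: the function name `split_into_windows`, the representation of audio as a plain sequence of samples, and the decision to drop a trailing partial window are all illustrative choices that are not taken from the description.

```python
from typing import List, Sequence


def split_into_windows(samples: Sequence[float], sample_rate: int,
                       window_seconds: float) -> List[Sequence[float]]:
    """Cut a sample sequence into consecutive windows of window_seconds each.

    A trailing window shorter than window_seconds is dropped (assumption).
    """
    window_len = int(sample_rate * window_seconds)
    if window_len <= 0:
        raise ValueError("window must contain at least one sample")
    return [
        samples[start:start + window_len]
        for start in range(0, len(samples) - window_len + 1, window_len)
    ]


# Ten samples at 1 sample per second, cut into 3-second windows.
windows = split_into_windows(list(range(10)), sample_rate=1, window_seconds=3)
```

Each resulting window stands for one unit of collected environmental sound, which can then be compared with the previous unit when judging continuity.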

As sound information including the environmental sound, the collection unit 131 collects the sound pressure level, the frequency, the estimated number of sound sources (such as the number of devices estimated to make up the environmental sound), the period of the sound pressure or waveform, and the like. To collect this sound information, the collection unit 131 may use known analysis techniques as appropriate.

The collection unit 131 may also collect voice information uttered by the user in addition to the environmental sound. Specifically, the collection unit 131 collects the voice information that the user utters in order to use the service. The collection unit 131 may also collect the user's utterances intermittently. In this case, the determination unit 133 described later may determine that the intermittently collected utterances together form voice information expressing a single intention.

Together with the sound information, the collection unit 131 may collect various kinds of information gathered by sensors of the user terminal 10. For example, the collection unit 131 collects environmental information detected by the user terminal 10. The collection unit 131 may also collect device information on the user terminal 10 itself, information on external devices that communicate with the user terminal 10, and the like.

Specifically, as information detected by the user terminal 10, the collection unit 131 collects position information indicating where the user terminal 10 (or the user who uses the user terminal 10) is located, temperature and humidity information for the surroundings of the user terminal 10, light information indicating the intensity of ambient light, and the like. The collection unit 131 may also collect environmental information on the surroundings of the user terminal 10 based on photographs or video taken by a camera of the user terminal 10. For example, the collection unit 131 collects such environmental information based on image information captured by the camera, position information included in the image information, the date and time of capture, and the like.

As device information on the user terminal 10 itself, the collection unit 131 collects information on the CPU, OS (Operating System), memory, and the like of the user terminal 10, its network functions such as antennas, the installed software, the browser software in use, and the input means of the user terminal 10 (for example, a microphone, a touch panel, or a fingerprint reader capable of collecting fingerprint data).

The collection unit 131 may also collect the operating status of the user terminal 10. For example, the collection unit 131 collects information such as whether the user terminal 10 is powered on and, if so, whether the screen is on or off and whether the user terminal 10 is moving or stationary. Such information is collected by, for example, an application with a predetermined sensing function installed on the user terminal 10, and is held inside the user terminal 10. The collection unit 131 may also collect, as the operating status of the user terminal 10, the acceleration and the like observed by the user terminal 10.

As information on external devices that communicate with the user terminal 10, the collection unit 131 collects information identifying the external devices that are in mutual communication with the user terminal 10, the type of the established communication, its frequency band, and the like.

The collection unit 131 may also collect information on the user's behavior. For example, the collection unit 131 may collect information on service pages that the user browsed using the user terminal 10, information on queries sent to a search service, and the like.

The collection unit 131 may also collect the user's personal information. For example, the collection unit 131 accepts the user's personal information as registration information for using a predetermined service. The collection unit 131 may also collect the user's personal information from the service side, such as a web server.

The collection unit 131 may also collect information on content distribution. For example, the collection unit 131 collects, from an external server, information such as the medium through which content is distributed (television, radio, and so on) and the date and time of distribution.

The collection unit 131 stores the collected information in the respective storage units in the storage unit 120. The collection unit 131 may also retrieve, as appropriate, information already stored in the storage unit 120.

(About the extraction unit 132)
The extraction unit 132 extracts, from the sound information collected by the collection unit 131, the environmental sound or the sound information indicating a predetermined action of the user.

Based on known voice analysis processing, the extraction unit 132 extracts, from the sound information collected from the user terminal 10, the voice information indicating a predetermined action of the user. Specifically, when speech corresponding to a word registered in advance as a word indicating a user action is detected, the extraction unit 132 extracts that speech as voice information indicating a predetermined action of the user.

For example, when the sound information collected from the user terminal 10 contains a word that the user inputs in order to use the service provided by the determination apparatus 100 (for example, the name of the service, or a call such as "Hello"), the extraction unit 132 extracts that word and the speech before and after it as voice information in which the user requests login to the service. Alternatively, when the sound information collected from the user terminal 10 contains a word registered in the service as expressing an intention to order a product (for example, "Buy it.", "I want to buy it.", or "I want it."), the extraction unit 132 extracts that word and the voice information before and after it as corresponding to an action in which the user tries to order some product.
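
The keyword-based extraction above can be sketched on text that has already been transcribed from speech. The sketch below is illustrative only: speech recognition itself is out of scope, the function name `classify_utterance` is hypothetical, and the registered words are stand-ins ("example service" is an invented service name, and the English phrases only approximate the Japanese examples). Plain substring matching is used for brevity and would also match longer words containing a registered word.

```python
from typing import Dict, Optional, Tuple

# Registered words per action (hypothetical registrations).
REGISTERED_WORDS: Dict[str, Tuple[str, ...]] = {
    "login": ("hello", "example service"),
    "order": ("buy", "i want"),
}


def classify_utterance(transcript: str) -> Optional[str]:
    """Return the action whose registered word appears in the transcript,
    or None when no registered word is found."""
    text = transcript.lower()
    for action, words in REGISTERED_WORDS.items():
        if any(word in text for word in words):
            return action
    return None


login_action = classify_utterance("Hello")
order_action = classify_utterance("Buy it.")
no_action = classify_utterance("Nice weather today")
```

When an action is returned, the caller would keep the whole transcript (the registered word together with the speech around it) as the voice information for that action; utterances for which `None` is returned can be treated as candidates for environmental sound.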

The extraction unit 132 may determine, based on a voiceprint of the user registered in advance, whether a given voice was uttered by the user in person. The extraction unit 132 may then extract the voice being processed as voice information corresponding to a predetermined action of the user only when the speaker is determined to be the user. This prevents the extraction unit 132 from mistakenly extracting speech uttered by someone other than the user as voice information corresponding to a predetermined action of the user.

The extraction unit 132 extracts, as environmental sound, the sound information other than the voice information extracted as indicating a predetermined action of the user. When the voice information indicating a predetermined action of the user and the environmental sound occur at the same time (for example, when the environmental sound is detected as background noise behind the user's speech), the extraction unit 132 may recognize the sound information corresponding to the user's action and the sound information corresponding to the environmental sound separately and extract each of them.

When analyzing sound information to extract the voice information indicating a predetermined action of the user or the environmental sound, the extraction unit 132 may generate a predetermined learning model using known techniques. For example, the extraction unit 132 may input sound information into the generated model to extract the voice information indicating a predetermined action of the user contained in that sound information, or to extract the environmental sound. The model may be, for example, one that has learned the user's speech habits in advance, or one whose training continues through various learning techniques such as deep learning.

(About the determination unit 133)
The determination unit 133 determines, based on the continuity of the environmental sound, whether the user's predetermined action is included in one session.

For example, the determination unit 133 determines that a predetermined action performed by the user while steadily emitted ambient environmental sound is being collected is included in one session. More specifically, the determination unit 133 determines that a predetermined action observed after the steadily emitted ambient environmental sound has changed by more than a predetermined threshold is not included in that session. In other words, the determination unit 133 determines that a predetermined action observed while the steadily emitted ambient environmental sound stays within the predetermined threshold is included in one session.

The predetermined threshold in this case may cover various types of information. For example, the determination unit 133 may set a predetermined threshold for each of sound pressure, frequency, the period of loudness fluctuation, waveform, and the like, and determine that the environmental sound is no longer continuous when at least one of them is observed to exceed its threshold. The determination unit 133 may also treat the environmental sound as a waveform that includes sound pressure, frequency, and the like, and determine that the environmental sound is no longer continuous when the waveform observed at some earlier point and the current waveform are dissimilar. The criterion for similar versus dissimilar may be set arbitrarily based on known audio analysis techniques.
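
The per-feature threshold test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `SoundFeatures` fields, the values in `THRESHOLDS`, and the function name `is_continuous` are hypothetical and do not appear in the embodiment, which leaves the concrete features and threshold values open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoundFeatures:
    """Hypothetical summary of one environmental-sound sample."""
    pressure_db: float        # sound pressure level
    dominant_hz: float        # dominant frequency
    loudness_period_s: float  # period of loudness fluctuation


# Assumed per-feature limits; the embodiment gives no concrete values.
THRESHOLDS = {
    "pressure_db": 10.0,
    "dominant_hz": 200.0,
    "loudness_period_s": 0.5,
}


def is_continuous(reference: SoundFeatures, current: SoundFeatures,
                  thresholds: dict = THRESHOLDS) -> bool:
    """Return False as soon as any one feature changed by more than its limit."""
    for name, limit in thresholds.items():
        change = abs(getattr(current, name) - getattr(reference, name))
        if change > limit:
            return False
    return True
```

With a reference sample at 40 dB, a current sample at 55 dB exceeds the assumed 10 dB limit and would be treated as a break in continuity, even if the other features are unchanged.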

The determination unit 133 may also determine, based on the continuity of the environmental sound collected by the collection unit 131, whether intermittent speech from the user constitutes a single voice input. That is, even when the collected speech is interrupted, the determination unit 133 may treat the fragments as one continuous voice input if the environmental sound observed along with them is continuous. In the example shown in FIG. 1, the determination unit 133 determines that the utterance "rice..." made by the user U1 in step S01 and the utterance "buy." made by the user U1 in step S05 form a single voice input.
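
A minimal sketch of this grouping is given below. It assumes that each voice fragment arrives with a flag indicating whether the environmental sound stayed continuous since the previous fragment. The flag, the function name `merge_utterances`, and the space used to join fragments are illustrative assumptions, not part of the embodiment.

```python
def merge_utterances(fragments):
    """Join fragments into voice inputs.

    fragments: list of (text, ambient_continuous_since_previous) pairs.
    A fragment is appended to the previous input only when the
    environmental sound was continuous between the two.
    """
    inputs = []
    for text, continuous in fragments:
        if inputs and continuous:
            inputs[-1] = inputs[-1] + " " + text
        else:
            inputs.append(text)
    return inputs
```

Applied to the FIG. 1 example, the fragments "rice..." and "buy." would be joined into one input as long as the flag for the second fragment is true.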

When the environmental sound continues for longer than a predetermined time (for example, 30 minutes or 1 hour), the determination unit 133 may determine that the user's predetermined actions performed within the predetermined time are included in one session, and that predetermined actions performed after the predetermined time has elapsed are not. That is, the determination unit 133 may combine time-based session management with the continuity of the environmental sound.

Even when the environmental sound becomes discontinuous (that is, when it changes by more than the predetermined threshold), the determination unit 133 may determine that the user's voice input is included in one session for as long as that voice input continues.
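
The two optional rules above (a time cap, and an exception while voice input is still in progress) can be combined in a short sketch. The embodiment does not say how the rules interact, so the ordering here is an assumption: the time cap is checked first, and ongoing voice input only overrides a break in environmental sound. The function name and the default of 30 minutes are likewise illustrative.

```python
def belongs_to_session(ambient_continuous: bool,
                       elapsed_s: float,
                       voice_input_ongoing: bool,
                       max_session_s: float = 30 * 60) -> bool:
    """Decide whether an action still belongs to the current session."""
    # Assumed precedence: the time cap always ends the session.
    if elapsed_s > max_session_s:
        return False
    # Otherwise the session holds if the environmental sound is continuous,
    # or if the user's voice input is still in progress.
    return ambient_continuous or voice_input_ongoing
```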

The determination unit 133 may also determine whether the sound information collected from the user terminal 10 contains information related to content delivery. The determination unit 133 may determine whether content has been delivered by determining whether the environmental sound contains sound output by the content. Alternatively, the determination unit 133 may determine that content has been delivered to the user based not only on sound information but also on, for example, information indicating that content was delivered to the user (for example, information indicating that promotional video content was delivered in a service provided by the determination apparatus 100).

(About the measurement unit 134)
The measurement unit 134 measures the effect of content provided to the user when the determination unit 133 determines that the user's predetermined action and the provision of the content to the user are included in one session.

For example, the measurement unit 134 measures the effect of content provided to the user based on whether the user's predetermined action is one from which viewing of the content can be inferred. Specifically, the measurement unit 134 measures the effect of the content based on the user having uttered something in response to the delivered content. More specifically, when the user, after the content is delivered, murmurs the name of the advertised product, words praising the product, words showing interest in the product, or the like, the measurement unit 134 estimates that the content was viewed by the user. The measurement unit 134 then regards the content as having been effective because it was viewed, and measures the effect of the content accordingly.
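
As a rough illustration of this inference, the sketch below checks whether a recognized utterance mentions the product name, a praise word, or an interest word. The word lists, the function name `suggests_viewing`, and the use of plain substring matching on already-transcribed text are all assumptions made for illustration. The embodiment does not specify how such remarks are recognized.

```python
def suggests_viewing(utterance: str, product_name: str,
                     praise_words=("great", "nice"),
                     interest_words=("want", "how much")) -> bool:
    """Return True if the utterance mentions the product, praise, or interest."""
    text = utterance.lower()
    terms = (product_name.lower(), *praise_words, *interest_words)
    # Simple substring match; a real system would need proper language analysis.
    return any(term in text for term in terms)
```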

The measurement unit 134 may also measure the effect of content provided to the user based on whether the user's predetermined action can count as a conversion for the content. A conversion is, for example, an action in which the user purchases a product advertised by the content, submits an application, requests informational materials, or accesses the content provider's web page.

The measurement unit 134 stores in the storage unit 120 the identification information of the content delivered to the user, the number of times the content was delivered, the proportion of deliveries that drew a remark from the user (tweet rate), the proportion of deliveries that led to a conversion (CVR), and the like. During the measurement process, the measurement unit 134 may also acquire information on user attributes such as gender and age. This allows the measurement unit 134 to measure, for example, which age groups or genders the content is particularly effective for.
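
The stored metrics can be pictured with a small bookkeeping sketch. It assumes that both rates use the number of deliveries as the denominator and that a delivery with no recorded data yields a rate of 0.0. The class name `ContentStats` and its fields are hypothetical and are not the layout of the storage unit 120.

```python
from dataclasses import dataclass


@dataclass
class ContentStats:
    """Hypothetical per-content counters for effect measurement."""
    content_id: str
    deliveries: int = 0
    remarks: int = 0      # deliveries that drew a remark from the user
    conversions: int = 0  # deliveries that led to a conversion

    def record(self, remarked: bool, converted: bool) -> None:
        """Record one delivery that fell within a single session."""
        self.deliveries += 1
        self.remarks += int(remarked)
        self.conversions += int(converted)

    @property
    def tweet_rate(self) -> float:
        return self.remarks / self.deliveries if self.deliveries else 0.0

    @property
    def cvr(self) -> float:
        return self.conversions / self.deliveries if self.deliveries else 0.0
```

For four deliveries of which two drew a remark and one led to a purchase, this gives a tweet rate of 0.5 and a CVR of 0.25.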

The measurement unit 134 may also transmit information on the measured effect to the content provider. By referring to this information, the content provider can confirm whether the delivered content was effective for the user.

[4. Configuration of user terminal]
Next, the configuration of the user terminal 10 according to the embodiment will be described with reference to FIG. 7. FIG. 7 is a diagram illustrating a configuration example of the user terminal 10 according to the embodiment. As illustrated in FIG. 7, the user terminal 10 includes a communication unit 11, an input unit 12, a display unit 13, a detection unit 14, a storage unit 15, and a control unit 16.

The communication unit 11 is connected to the network N by wire or wirelessly, and transmits and receives information to and from the determination apparatus 100. The communication unit 11 is realized by, for example, a NIC or the like.

The input unit 12 is an input device that receives various operations from the user. The input unit 12 is realized by, for example, operation keys provided on the user terminal 10. The display unit 13 is a display device for displaying various information. The display unit 13 is realized by, for example, a liquid crystal display. When the user terminal 10 uses a touch panel, part of the input unit 12 and the display unit 13 are integrated.

The detection unit 14 detects various information related to the user terminal 10. Specifically, the detection unit 14 detects the voice uttered by the user and the environmental sound around the user terminal 10. For example, the detection unit 14 is a sound collecting means such as a microphone and, when sound is input, acquires that sound as sound information.

The detection unit 14 may also detect user operations on the user terminal 10, position information on where the user terminal 10 is located, information on devices connected to the user terminal 10, the environment of the user terminal 10, and the like.

For example, the detection unit 14 detects a user operation based on information input to the input unit 12. That is, the detection unit 14 detects that an operation of touching the screen has been input to the input unit 12, that voice has been input, and so on. The detection unit 14 may also detect that the user has launched a predetermined application. If that application operates the imaging device in the user terminal 10, the detection unit 14 detects that the user is using the imaging function. The detection unit 14 may also detect an operation in which the user terminal 10 itself is moved, based on data detected by an acceleration sensor, a gyro sensor, or the like provided in the user terminal 10.

The detection unit 14 may also detect the current position of the user terminal 10. Specifically, the detection unit 14 receives radio waves transmitted from GPS (Global Positioning System) satellites and, based on the received radio waves, acquires position information (for example, latitude and longitude) indicating the current position of the user terminal 10. The position information may also be acquired by an optical sensor, an infrared sensor, a magnetic sensor, or the like provided in the user terminal 10.

The detection unit 14 may also detect an external device connected to the user terminal 10. For example, the detection unit 14 detects the external device based on the exchange of communication packets with it, and recognizes the detected external device as a terminal connected to the user terminal 10. The detection unit 14 may also detect the type of connection with the external device, for example whether the external device is connected by wire or by wireless communication, and may detect the communication method used for the wireless communication. The detection unit 14 may also detect the external device based on information acquired by a radio wave sensor that detects radio waves emitted by the external device, an electromagnetic wave sensor that detects electromagnetic waves, or the like.

When the user terminal 10 is connected to an external device, the user's voice may be detected by the external device. The external device is, for example, a home appliance with a voice assist function, that is, a smart device capable of communicating with the user terminal 10 and the determination apparatus 100.

The detection unit 14 may also detect the environment around the user terminal 10. The detection unit 14 detects information on the environment using various sensors and functions provided in the user terminal 10. For example, the detection unit 14 uses an illuminance sensor that detects the illuminance around the user terminal 10, an acceleration sensor (or a gyro sensor or the like) that detects physical movement of the user terminal 10, a humidity sensor that detects the humidity around the user terminal 10, a geomagnetic sensor that detects the magnetic field at the location of the user terminal 10, and the like. Using these sensors, the detection unit 14 detects various kinds of information, for example the noise level around the user terminal 10 or whether the illuminance around the user terminal 10 is suitable for imaging. The detection unit 14 may further detect information on the surrounding environment based on photographs or video captured by a camera.

The storage unit 15 stores various information. The storage unit 15 is realized by, for example, a semiconductor memory element such as a RAM or flash memory, or a storage device such as a hard disk or an optical disk. For example, the storage unit 15 stores the sound information detected by the detection unit 14 in association with the date and time at which the sound was detected.

The control unit 16 is a controller and is realized, for example, by a CPU, an MPU, or the like executing various programs stored in a storage device inside the user terminal 10, using the RAM as a work area. The control unit 16 may also be realized by an integrated circuit such as an ASIC or an FPGA.

As illustrated in FIG. 7, the control unit 16 includes an acquisition unit 161 and a transmission unit 162, and realizes or executes the information processing functions and operations described below. The internal configuration of the control unit 16 is not limited to the configuration illustrated in FIG. 7 and may be any other configuration that performs the information processing described below.

The acquisition unit 161 acquires various information. For example, the acquisition unit 161 controls the detection unit 14 to acquire the various information that the detection unit 14 detects. Specifically, the acquisition unit 161 acquires sound information that includes the environmental sound around the user or the user terminal 10, the voice uttered by the user, and the like.

The acquisition unit 161 may acquire sound information at predetermined intervals. For example, the acquisition unit 161 acquires sound information by controlling the detection unit 14 described above. When there is no voice input from the user, the acquisition unit 161 acquires sound information (environmental sound) in segments of a predetermined length (for example, every 3 seconds, every 5 seconds, or every 10 seconds). When the detection unit 14 detects voice input from the user, the acquisition unit 161 instead acquires sound information whose length corresponds to the user's utterance. The length of the sound information acquired by the acquisition unit 161 may be set by the determination apparatus 100.

The transmission unit 162 transmits various information. For example, in response to a request from the determination apparatus 100, the transmission unit 162 transmits the sound information acquired by the acquisition unit 161 to the determination apparatus 100.

[5. Processing procedure]
Next, the procedure of processing performed by the determination apparatus 100 according to the embodiment will be described with reference to FIG. 8. FIG. 8 is a flowchart illustrating the processing procedure according to the embodiment.

As illustrated in FIG. 8, the determination apparatus 100 collects sound information, including the ambient environmental sound, from the user terminal 10 (step S101). The determination apparatus 100 then extracts voice information (that is, information indicating the user's predetermined action) from the sound information (step S102).

The determination apparatus 100 then determines whether the user's predetermined action corresponding to the voice information was performed under continuous environmental sound (step S103). When it determines that the action was performed under continuous environmental sound (step S103; Yes), the determination apparatus 100 determines that the action is included in one session (step S104). The determination apparatus 100 then repeats the process of collecting sound information (environmental sound).

On the other hand, when it determines that the user's predetermined action was not performed under continuous environmental sound (step S103; No), the determination apparatus 100 determines that the action belongs to a different session (step S105). In this case, the determination apparatus 100, for example, ends the immediately preceding session, starts a new session, and determines that the action is included in the new session.
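
The flow of steps S101 to S105 can be sketched as a loop that assigns each extracted action to a session. The sketch assumes that extraction (S102) and the continuity judgment (S103) have already been reduced to a pair of an action and a boolean flag. The function name `assign_sessions` and the integer session numbering are illustrative and are not taken from FIG. 8.

```python
def assign_sessions(observations):
    """Assign a session number to each action.

    observations: iterable of (action, ambient_continuous) pairs, where
    ambient_continuous is the outcome of the S103 check for that action.
    Returns a list of (action, session_number) pairs.
    """
    session = 0
    assigned = []
    for action, continuous in observations:    # S101/S102 already applied
        if session == 0 or not continuous:     # S103; No (or first action)
            session += 1                       # S105: end old, start new session
        assigned.append((action, session))     # S104, or placed in new session
    return assigned
```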

[6. Modifications]
The processing according to the embodiment described above may be carried out in various forms other than that embodiment. Other embodiments (modifications) of the determination apparatus 100 or the determination processing system 1 are described below.

[6-1. Variations of session management using environmental sound]
In the embodiment above, the determination apparatus 100 manages sessions by determining whether the user's predetermined action was performed under continuous environmental sound. The determination apparatus 100 may also maintain a session even without an explicit action from the user (even while the user is silent) by estimating that the user is continuing to use the service. This point will be described with reference to FIGS. 9 to 11. In the following, the determination apparatus 100 having the configuration of the modification is written as the determination apparatus 100A for distinction, but is referred to collectively as the determination apparatus 100 where no distinction is needed.

For example, the determination apparatus 100A determines whether the subject of an arbitrary action is the user by collating sound information for recognizing the user against the environmental sound. When it determines that the subject of the arbitrary action is the user, the determination apparatus 100A determines that the arbitrary action is included in one session. Here, an arbitrary action includes not only an explicit action such as a spoken request but also, for example, an action such as remaining logged in to the service while idle.

The sound information for recognizing the user is, for example, sound information registered in advance by the user for recognizing the user in person. For example, when using the service provided by the determination apparatus 100A, the user registers sound information for recognizing himself or herself with the service.

Specifically, the user registers in advance the environmental sound of the place (base) where the user mainly operates the user terminal 10. Alternatively, the user registers in advance a sound that the user produces and that can become environmental sound. For example, the user registers his or her own footsteps as one of the environmental sounds, or registers the voices of family members, the sounds of pets, and the like as environmental sounds. Such sound information need not be registered explicitly by the user. For example, the determination apparatus 100A, having collected environmental sounds, may register it at its own discretion. In other words, the determination apparatus 100A continually collects environmental sounds for recognizing the user and registers them in advance as information for estimating that the user in person is using the service whenever such environmental sounds are observed.

The determination apparatus 100A then determines whether the subject of an arbitrary action is the user by collating the collected environmental sound against at least one of the following pre-registered sounds: the user's footsteps, the voice of a person other than the user (for example, a family member or friend of the user), and the environmental sound of the place where the user is based. When the collected environmental sound contains pre-registered sound information, the determination apparatus 100A maintains the session even if the user makes no voice input for a while, on the grounds that the person using the user terminal 10 is highly likely to be the user. That is, when the determination apparatus 100 determines that the subject of an arbitrary action is the user, it determines that the arbitrary action is included in the one ongoing session. As a result, the user can keep the session alive in the logged-in service without explicit voice input for a while, as long as environmental sound estimated to indicate the user continues to be observed.
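
One way to picture this collation is to compare a feature vector of the collected environmental sound with feature vectors of the registered sounds. The embodiment does not specify the matching technique. The sketch below substitutes cosine similarity over hypothetical fixed-length feature vectors, and the threshold of 0.9 and all names are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matches_registered(ambient, registered, threshold=0.9):
    """Return True if the ambient features match at least one registered sound.

    registered: dict mapping a label (e.g. "footsteps") to a feature vector.
    """
    return any(cosine_similarity(ambient, ref) >= threshold
               for ref in registered.values())
```

A session would be kept alive while `matches_registered` returns true for the most recent environmental sound, even if no voice input has arrived.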

As described above, the determination apparatus 100A according to the modification maintains the session by verifying the user's identity through collation of pre-registered sound information indicating the user against the environmental sound. To perform the processing according to the modification, the determination apparatus 100A has the configuration shown in FIG. 9. FIG. 9 is a diagram illustrating a configuration example of the determination apparatus 100A according to the modification. As illustrated in FIG. 9, the determination apparatus 100A further includes a collation information storage unit 125.

(About the collation information storage unit 125)
The collation information storage unit 125 stores information related to collation of the user. The collation information storage unit 125 has, as data tables, a registration table 126 and a collation table 127.

(About the registration table 126)
The registration table 126 stores information on the registration data used to recognize the user. FIG. 10 is a diagram illustrating an example of the registration table 126 according to the modification. In the example illustrated in FIG. 10, the registration table 126 has the items "user ID", "registration information", and "content".

"Registration information" indicates the sound information registered for recognizing the user. "Content" indicates what that sound information represents. In the example of FIG. 10, the registration information is shown conceptually, as in "registration data #1", but in practice the registration data is stored as an audio file or the like in an arbitrary format.

That is, the example data shown in FIG. 10 indicates that "registration data #1" is registered as registration information for recognizing the user U1 and that its content is "friend's voice". "Registration data #2", "registration data #3", and "registration data #4" are also registered as registration information for recognizing the user U1, and their contents are "housemate's voice", "user's own footsteps", and "environmental sound of the room", respectively.

(About the collation table 127)
The collation table 127 stores information related to collation of the user. FIG. 11 is a diagram illustrating an example of the collation table 127 according to the modification. In the example illustrated in FIG. 11, the collation table 127 has the items "session ID", "user ID", "environmental sound information", and "collation result".

"Collation information" indicates whether the environmental sound matched any of the registration data registered as sound information for recognizing the user. For example, when "○" is stored as the collation information, the environmental sound matched one of the registration data registered as sound information for recognizing the user. In this case, an arbitrary action (for example, the action of a user who has logged in to the service and is waiting in order to use it) is presumed to be performed by the user U1 in person. The determination apparatus 100A therefore maintains the session SE03 and does not terminate it.

That is, the example data shown in FIG. 11 indicates that the session SE03 is a session of the user U1 and that, because the collation result between the collected environmental sound information #31 and one of the registration data registered as sound information for recognizing the user is "○", the session SE03 is maintained. It also indicates that the session SE03 is maintained from environmental sound information #32 through environmental sound information #34, because the collation result against the registration data is likewise "○" for each of them.
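
The two tables can be mirrored in a small data-structure sketch populated with the example values described for FIG. 10 and FIG. 11. The class and field names are hypothetical, the "○" mark is represented as a boolean, and the rule that the most recent row decides whether the session is kept is an assumption. The text only states that a "○" result keeps the session alive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistrationRow:
    user_id: str
    registration_info: str   # in practice an audio file in some format
    content: str


@dataclass(frozen=True)
class CollationRow:
    session_id: str
    user_id: str
    environmental_sound_info: str
    matched: bool            # True corresponds to the "○" mark


REGISTRATION_TABLE = [
    RegistrationRow("U1", "registration data #1", "friend's voice"),
    RegistrationRow("U1", "registration data #2", "housemate's voice"),
    RegistrationRow("U1", "registration data #3", "user's own footsteps"),
    RegistrationRow("U1", "registration data #4", "environmental sound of the room"),
]

COLLATION_TABLE = [
    CollationRow("SE03", "U1", f"environmental sound information #{n}", True)
    for n in range(31, 35)
]


def session_maintained(rows, session_id):
    """Assumed rule: the session is kept while its latest collation matched."""
    relevant = [row for row in rows if row.session_id == session_id]
    return bool(relevant) and relevant[-1].matched
```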

As described above, the determination apparatus 100A collates sound information for recognizing the user with the environmental sound to determine whether the subject of an arbitrary action is the user, and, when it determines that the subject is the user, determines that the arbitrary action is included in one session.

Specifically, the determination apparatus 100A determines whether the subject of an arbitrary action is the user by collating the collected environmental sound with at least one of the following registered in advance: the user's footsteps, the voice of a person other than the user, and the environmental sound of a place where the user is based.

In this way, the determination apparatus 100A may perform session management by collating the environmental sound with pre-registered sound information presumed to indicate the user's identity. Even without receiving conscious voice input from the user, the determination apparatus 100A can then infer that the user in person continues to use the service in the same environment, and can maintain the session rather than disconnect it. The determination apparatus 100A can therefore perform session management with high usability.

(Processing procedure according to the modification)
Next, the procedure of processing performed by the determination apparatus 100A according to the modification will be described with reference to FIG. 12. FIG. 12 is a flowchart illustrating the processing procedure according to the modification.

As illustrated in FIG. 12, the determination apparatus 100A registers sound information for user determination (step S201). The determination apparatus 100A then collects sound information, including the ambient environmental sound, from the user terminal 10 (step S202), and extracts voice information (that is, information indicating an arbitrary action of the user) from the sound information (step S203).

Next, the determination apparatus 100A determines whether the environmental sound matches the sound information registered in advance (step S204). When it determines that they match (step S204; Yes), the determination apparatus 100A determines that the arbitrary action is an action of the user in person (step S205), and then determines that the arbitrary action is included in one session (step S206). In other words, the determination apparatus 100A maintains the session without disconnecting it. The determination apparatus 100A then repeats the process of collecting sound information (environmental sound).

On the other hand, when it determines that the environmental sound does not match the sound information registered in advance (step S204; No), the determination apparatus 100A determines that the arbitrary action is not an action of the user in person (step S207), and then determines that the arbitrary action is included in a different session (step S208). In other words, the determination apparatus 100A does not maintain the session; it starts a new session and determines that the arbitrary action is included in the new session.
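The flow of steps S201 to S208 can be sketched roughly as follows. This is a minimal illustration, not the implementation described in the specification: the class name `CollationSessionManager`, the use of string labels to stand in for registered and observed sounds, and the set-intersection match are all assumptions made for the example. The specification does not state how the collation in step S204 is computed.

```python
from dataclasses import dataclass, field


@dataclass
class CollationSessionManager:
    """Hypothetical sketch of the FIG. 12 flow (steps S201 to S208)."""

    # S201: sound information registered for user determination.
    # Labels such as "footsteps_U1" are placeholders for real audio features.
    registered: set[str] = field(default_factory=set)
    session_id: int = 1
    actions: list[tuple[int, str]] = field(default_factory=list)

    def register(self, label: str) -> None:
        self.registered.add(label)

    def handle(self, ambient: set[str], action: str) -> int:
        """Process one collected sample (S202) and its extracted action (S203)."""
        # S204: does the environmental sound match any registered data?
        matched = bool(self.registered & ambient)
        if not matched:
            # S207, S208: not the user in person, so start a new session.
            self.session_id += 1
        # S205, S206 (or S208): record the action under the chosen session.
        self.actions.append((self.session_id, action))
        return self.session_id
```

Calling `handle` repeatedly corresponds to the loop in which the apparatus keeps collecting sound information while the session is maintained.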

[6-2. Determination program]
In the above embodiment, the determination program according to the present application is executed inside the determination apparatus 100. However, the determination program may instead be executed inside the user terminal 10. In this case, the user terminal 10 includes the processing units of the determination apparatus 100 described in the above embodiment, and a storage unit 15 that stores the information stored in the storage unit 120.

[6-3. Number of user terminals]
The above embodiment shows an example in which the user terminal 10 is a single device, but the number of user terminals 10 is not limited to one. For example, a user may own multiple terminal devices capable of communication. In this case, the determination apparatus 100 may collect the environmental sound around the user and the user terminals 10 from the multiple user terminals 10 that the user uses.

Note that, in identifying the multiple user terminals 10, the determination apparatus 100 does not necessarily need to acquire a global identifier that is also common to other devices. That is, the determination apparatus 100 only needs to acquire an identifier that can uniquely identify each user terminal 10 within the processing executed in the embodiment, and does not need to acquire a permanently fixed identifier.

[6-4. Configuration of user terminal]
In the above embodiment, a configuration example of the user terminal 10 was described with reference to FIG. 7. However, the user terminal 10 does not necessarily need to include all the processing units illustrated in FIG. 7. For example, the user terminal 10 need not include the display unit 13. The configuration illustrated in FIG. 7 may also be realized by separating the user terminal 10 into two or more devices. For example, the user terminal 10 may be realized by two or more devices, separated into a sound detection device that has at least the detection unit 14 and a communication device that has at least the communication unit 11.

[6-5. Determination of voice input]
The above embodiment shows an example in which the determination apparatus 100 manages the session based on the continuity of the environmental sound and thereby determines that intermittent speech uttered by the user is a series of voice inputs. The determination apparatus 100 may also perform various other kinds of speech recognition processing besides determining that speech is a series of voice inputs.

For example, the determination apparatus 100 may identify the referent of a demonstrative uttered by the user based on the continuity of the environmental sound. Specifically, the determination apparatus 100 determines that a demonstrative uttered while the environmental sound is continuous refers to another word uttered in the same session. Suppose, for example, that the determination apparatus 100 collects the user's utterance “Rice...” and then collects the utterance “Buy that thing from earlier.” In this case, the determination apparatus 100 recognizes the demonstrative “that thing from earlier” as “rice”, the product name the user uttered immediately before. The determination apparatus 100 then performs the process corresponding to the recognized speech (in this example, ordering rice).
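The demonstrative handling described above can be illustrated with a small sketch. The function name `resolve_demonstrative`, the fixed phrase in `DEMONSTRATIVE`, and the idea of passing in a list of item names already uttered in the same session are assumptions for illustration only. The specification does not describe how demonstratives are detected or how candidate referents are stored.

```python
from typing import Optional

# Hypothetical demonstrative phrase; a real system would need language-specific detection.
DEMONSTRATIVE = "that thing from earlier"


def resolve_demonstrative(utterance: str, same_session_items: list[str]) -> Optional[str]:
    """Replace a demonstrative with the most recent item uttered in the same session.

    `same_session_items` is assumed to contain only items uttered while the
    environmental sound stayed continuous, oldest first.
    """
    if DEMONSTRATIVE not in utterance:
        return utterance  # nothing to resolve
    if not same_session_items:
        return None  # no referent available in this session
    return utterance.replace(DEMONSTRATIVE, same_session_items[-1])
```

Returning `None` when the session holds no candidate leaves it to the caller to decide whether to ask the user again, which is one possible design choice among several.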

In this way, by managing the session based on the continuity of the environmental sound, the determination apparatus 100 can treat the speech before and after an interruption as one continuous utterance, or as one continuous conversation, even when the user's speech breaks off. The determination apparatus 100 can thereby improve convenience for the user of the service.

[6-6. Environmental sound]
The above embodiment shows an example in which the determination apparatus 100 collects, as the environmental sound, the operating sounds of devices observed at the user's home or the like. However, the environmental sound is not limited to these examples, and the determination apparatus 100 may collect various sounds as the environmental sound.

For example, when the user is using the user terminal 10 in an automobile, the determination apparatus 100 may collect the engine sound of the automobile or the like as the environmental sound. When collecting the environmental sound, the determination apparatus 100 may collect the distance, direction, sound pressure, frequency, and the like of the sound source that emits the sound, and collect as the environmental sound the sound information that is consistently observed among them. Accordingly, even when a sudden different sound (for example, speech uttered by a person other than the user) is mixed into the sound information, the determination apparatus 100 may determine that the environmental sound is continuous as long as the consistently observed sound information described above can still be observed. In this way, the determination apparatus 100 can realize the processing described in the above embodiment in a variety of situations.
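One way to picture the idea of consistently observed sound information is the following sketch. It assumes, purely for illustration, that each collected frame has already been reduced to a set of source labels such as "engine"; the function names `steady_components` and `is_continuous` are hypothetical. The specification itself refers to distance, direction, sound pressure, and frequency and does not define a concrete representation.

```python
def steady_components(frames: list[set[str]]) -> set[str]:
    """Return the sources observed in every frame (the consistently observed part)."""
    if not frames:
        return set()
    return set.intersection(*frames)


def is_continuous(baseline: set[str], frame: set[str]) -> bool:
    """Treat the environment as continuous while every baseline source is still present.

    Extra transient sources in `frame` (e.g. another person's voice) are ignored.
    """
    return bool(baseline) and baseline <= frame
```

With this representation, a passenger's voice appearing in a single frame does not break continuity, whereas the disappearance of the engine sound does.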

[7. Hardware configuration]
The determination apparatus 100 and the user terminal 10 according to the embodiment described above are realized by, for example, a computer 1000 configured as shown in FIG. 13. The determination apparatus 100 is described below as an example. FIG. 13 is a hardware configuration diagram illustrating an example of the computer 1000 that implements the functions of the determination apparatus 100. The computer 1000 includes a CPU 1100, a RAM 1200, a ROM 1300, an HDD 1400, a communication interface (I/F) 1500, an input/output interface (I/F) 1600, and a media interface (I/F) 1700.

The CPU 1100 operates based on programs stored in the ROM 1300 or the HDD 1400 and controls each unit. The ROM 1300 stores a boot program executed by the CPU 1100 when the computer 1000 starts up, programs that depend on the hardware of the computer 1000, and the like.

The HDD 1400 stores programs executed by the CPU 1100, data used by those programs, and the like. The communication interface 1500 receives data from other devices via the communication network 500 (corresponding to the network N shown in FIG. 2) and sends it to the CPU 1100, and transmits data generated by the CPU 1100 to other devices via the communication network 500.

The CPU 1100 controls output devices such as a display and a printer, and input devices such as a keyboard and a mouse, via the input/output interface 1600. The CPU 1100 acquires data from the input devices via the input/output interface 1600, and outputs generated data to the output devices via the input/output interface 1600.

The media interface 1700 reads a program or data stored in the recording medium 1800 and provides it to the CPU 1100 via the RAM 1200. The CPU 1100 loads the program from the recording medium 1800 onto the RAM 1200 via the media interface 1700 and executes the loaded program. The recording medium 1800 is, for example, an optical recording medium such as a DVD (Digital Versatile Disc) or a PD (Phase change rewritable Disk), a magneto-optical recording medium such as an MO (Magneto-Optical disk), a tape medium, a magnetic recording medium, or a semiconductor memory.

For example, when the computer 1000 functions as the determination apparatus 100 according to the embodiment, the CPU 1100 of the computer 1000 implements the functions of the control unit 130 by executing a program loaded onto the RAM 1200 (for example, the determination program according to the embodiment). The HDD 1400 stores the data held in the storage unit 120. The CPU 1100 of the computer 1000 reads these programs from the recording medium 1800 and executes them; as another example, it may acquire the programs from another device via the communication network 500.

[8. Others]
Of the processes described in the above embodiment, all or part of the processes described as being performed automatically may be performed manually, and all or part of the processes described as being performed manually may be performed automatically by a known method. In addition, the processing procedures, specific names, and information including various data and parameters shown in the above description and drawings may be changed arbitrarily unless otherwise specified. For example, the various kinds of information shown in each drawing are not limited to the illustrated information.

Each component of each illustrated apparatus is a functional concept and does not necessarily need to be physically configured as illustrated. That is, the specific form of distribution and integration of each apparatus is not limited to the illustrated one, and all or part of each apparatus may be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. For example, the extraction unit 132 and the determination unit 133 illustrated in FIG. 3 may be integrated. As another example, the information stored in the storage unit 120 may be stored in an external storage device via the network N.

For example, the above embodiment shows an example in which the determination apparatus 100 performs a collection process of collecting sound information from the user, a determination process of determining a session, and a measurement process of measuring the effect of content. However, the determination apparatus 100 described above may be separated into a collection apparatus that performs the collection process, a determination apparatus that performs the determination process, and a measurement apparatus that performs the measurement process. In this case, the processing by the determination apparatus 100 according to the embodiment is realized by, for example, the determination processing system 1 that includes the collection apparatus, the determination apparatus, and the measurement apparatus.

The embodiments and modifications described above may be combined as appropriate to the extent that their processing contents do not conflict.

[9. Effects]
As described above, the determination program according to the embodiment causes a computer (for example, the determination apparatus 100 according to the embodiment) to execute a collection procedure of collecting the ambient environmental sound, and a determination procedure of determining, based on the continuity of the environmental sound collected by the collection procedure, whether a predetermined action of the user is included in one session.

In this way, when a service is used through speech recognition technology, the determination program according to the embodiment determines whether a predetermined action of the user is included in one session based on the ambient environmental sound, which is distinct from the user's voice. The determination program can thus determine the continuity of actions even in a service using speech recognition technology, where identification information indicating the user's identity, such as cookie information, is not available. The determination program can therefore maintain the session without asking the user to log in again or to repeat a voice input. As a result, the determination program can perform session management with high usability without placing an extra burden on the user.

The collection procedure collects the ambient environmental sound that is regularly emitted. The determination procedure determines that a predetermined action of the user performed while the regularly emitted ambient environmental sound is being collected is included in one session.

In this way, the determination program according to the embodiment collects regularly observed sounds, such as the operating sounds of the air conditioner 60 and the television 70, as the ambient environmental sound. The determination program can therefore appropriately determine whether the environmental sound has changed, and can perform highly accurate session management.

The determination procedure determines that a predetermined action of the user observed after the regularly emitted ambient environmental sound has changed beyond a predetermined threshold is not included in the one session.

In this way, the determination program according to the embodiment determines that the user's action is not included in the one session when the environmental sound has changed. The determination program can thus detect the end of the user's series of actions through a natural cue, namely a change in the environmental sound, and can perform session management that does not feel unnatural to the user.
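The threshold-based determination can be sketched as follows. The example assumes that the environmental sound has been summarized as a single numeric level per sample (for instance a volume in decibels) and that a change is measured against the immediately preceding sample. Both are simplifying assumptions, as is the function name `assign_sessions`; the specification mentions direction, distance, waveform, and volume but does not fix a metric or a threshold value.

```python
from typing import Optional


def assign_sessions(
    samples: list[tuple[float, Optional[str]]], threshold: float
) -> list[tuple[int, str]]:
    """Assign each observed action to a session based on ambient-level continuity.

    `samples` is a time-ordered list of (ambient_level, action), where `action`
    is None when no user action was observed in that sample.
    """
    session = 1
    previous_level: Optional[float] = None
    assigned: list[tuple[int, str]] = []
    for level, action in samples:
        # A change beyond the threshold ends the current session.
        if previous_level is not None and abs(level - previous_level) > threshold:
            session += 1
        previous_level = level
        if action is not None:
            assigned.append((session, action))
    return assigned
```

Because each sample is compared with the one just before it, slow drift in the ambient level is tolerated; comparing against a fixed baseline taken at the start of the session would be a stricter alternative.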

The collection procedure collects the ambient environmental sound for which at least one of the direction from which the sound is emitted, the distance to the sound source, the waveform of the collected sound, and the volume of the collected sound is steady.

In this way, the determination program according to the embodiment collects the environmental sound including various sound features. The determination program can thereby capture changes in the environmental sound accurately, and can perform more appropriate session management.

The collection procedure intermittently collects speech uttered by the user. The determination procedure determines, based on the continuity of the environmental sound collected by the collection procedure, whether the intermittent speech uttered by the user is a series of voice inputs.

In this way, the determination program according to the embodiment can perform flexible processing, for example treating interrupted speech as one continuous utterance when the environmental sound is continuous. The determination program thus reduces the occasions on which the user is asked to repeat a voice input, and can improve usability.

When the environmental sound continues beyond a predetermined time, the determination procedure determines that a predetermined action of the user performed within the predetermined time is included in one session, and that a predetermined action of the user performed after the predetermined time has elapsed is not included in the one session.

In this way, the determination program according to the embodiment may combine existing session management that uses duration or the like with session management based on the environmental sound. The determination program can thereby perform more secure session management.
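Combining a duration limit with the environmental-sound check amounts to requiring both conditions at once, as in the sketch below. The function name `in_same_session`, the use of seconds as the time unit, and the boolean `ambient_continuous` input are assumptions for illustration; how continuity is computed is outside this example.

```python
def in_same_session(
    session_start: float,
    action_time: float,
    ambient_continuous: bool,
    max_duration: float,
) -> bool:
    """Return True when the action belongs to the session that began at `session_start`.

    The action must occur while the environmental sound is continuous AND
    within `max_duration` seconds of the session start.
    """
    elapsed = action_time - session_start
    return ambient_continuous and elapsed <= max_duration
```

Under this rule a session expires after the time limit even if the air conditioner keeps running, which reflects the safety motivation stated above.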

The determination procedure collates sound information for recognizing the user with the environmental sound to determine whether the subject of an arbitrary action is the user, and, when it determines that the subject is the user, determines that the arbitrary action is included in one session.

In this way, even without receiving conscious voice input from the user, the determination program according to the embodiment may infer that the user in person continues to use the service in the same environment and maintain the session rather than disconnect it. Through such processing as well, the determination program can perform session management with high usability.

The determination procedure determines whether the subject of an arbitrary action is the user by collating the environmental sound collected by the collection procedure with at least one of the following registered in advance: the user's footsteps, the voice of a person other than the user, and the environmental sound of a place where the user is based.

In this way, the determination program according to the embodiment may register sound information of various kinds. The determination program can thereby determine the user's identity with high accuracy.

The determination program according to the embodiment further causes the computer to execute a measurement procedure of measuring the effect of content provided to the user when the determination procedure determines that a predetermined action of the user and the provision of the content to the user are included in one session.

In this way, the determination program according to the embodiment may measure the effect of content using the continuity of the environmental sound. The reason is that a user who takes an action related to delivered content while the environmental sound remains continuous can be judged highly likely to have been motivated to act by the delivery of that content. In general, it is difficult to prove a relationship between advertising content such as a commercial and something the user murmurs to the user terminal 10. By contrast, the determination program according to the embodiment can collect user actions (murmured remarks, orders placed by voice input) performed under an environmental sound that has continued since the content was delivered, and can therefore accurately measure whether the content was truly effective.
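As a rough illustration of session-based effect measurement, the sketch below counts how many content deliveries were followed by a related user action in the same session. The `Event` record, its field names, and the ratio computed by `attributed_action_rate` are assumptions introduced for this example; the specification does not give a formula for the effect of content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    session: int      # session ID assigned from environmental-sound continuity
    kind: str         # "delivery" (content provided) or "action" (related user action)
    content_id: str   # hypothetical identifier of the content concerned


def attributed_action_rate(events: list[Event]) -> float:
    """Fraction of deliveries with a related action in the same session."""
    delivered = {(e.session, e.content_id) for e in events if e.kind == "delivery"}
    acted = {(e.session, e.content_id) for e in events if e.kind == "action"}
    if not delivered:
        return 0.0
    return len(delivered & acted) / len(delivered)
```

An action that occurs in a different session from the delivery is not attributed to the content, which mirrors the reasoning that only actions under a continuous environmental sound are treated as motivated by the delivery.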

The measurement procedure measures the effect of the content provided to the user based on whether the predetermined action of the user is an action from which it is presumed that the user viewed the content.

In this way, the determination program according to the embodiment may estimate whether the user viewed the content based on an action such as a murmured remark by the user, and measure the effect of the content based on the estimate. The determination program can thus accurately capture whether a delivered commercial or advertising content influenced the user, and can measure the effect of the content more accurately than index values such as television audience ratings.

The measurement procedure measures the effect of the content provided to the user based on whether the predetermined action of the user can constitute a conversion related to the content.

In this way, the determination program according to the embodiment may use conversions as an element of effect measurement. By using an element such as a conversion, whose outcome is easy to express numerically, the determination program can measure the effect of the content more accurately.

The embodiments of the present application have been described in detail above with reference to the drawings, but they are examples, and the present invention can be implemented in other forms with various modifications and improvements based on the knowledge of those skilled in the art, including the aspects described in the disclosure of the invention.

The “unit” (section, module, unit) described above can be read as “means”, “circuit”, or the like. For example, the collection unit can be read as collection means or a collection circuit.

DESCRIPTION OF SYMBOLS
1 Determination processing system
10 User terminal
100 Determination apparatus
110 Communication unit
120 Storage unit
121 Session storage unit
122 Effect measurement storage unit
123 Action table
124 Effect table
125 Collation information storage unit
126 Registration table
127 Collation table
130 Control unit
131 Collection unit
132 Extraction unit
133 Determination unit
134 Measurement unit

Claims

A collection procedure for collecting an environmental sound that is a sound around a user terminal, which is a terminal device used by a user, and that is regularly emitted; and
A determination procedure for determining the continuity of the environmental sound based on whether the regularly emitted environmental sound collected by the collection procedure changes beyond a predetermined threshold, and for determining, based on the continuity of the environmental sound, whether a predetermined action of the user is included in one session,
A determination program characterized by causing a computer to execute the above procedures.

The determination procedure is as follows:
It is determined that the predetermined action of the user performed while the regularly emitted environmental sound does not change beyond the predetermined threshold is included in the one session,
The determination program according to claim 1, wherein:

The determination procedure is as follows:
It is determined that the predetermined action of the user observed after the regularly emitted environmental sound changes beyond the predetermined threshold is not included in the one session,
The determination program according to claim 2, wherein:
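
The threshold-based determination of claims 1 to 3 can be illustrated with a minimal Python sketch. It is not the implementation described in this publication. It assumes that the environmental sound has already been reduced to one feature value per time step (here a hypothetical loudness level) and that each user action carries the index of the time step in which it occurred. The names assign_sessions, sound_levels, actions and threshold are hypothetical.

```python
from typing import List, Tuple


def assign_sessions(
    sound_levels: List[float],
    actions: List[Tuple[int, str]],
    threshold: float,
) -> List[Tuple[str, int]]:
    """Label each action with a session id based on sound continuity.

    sound_levels: one feature value per time step (assumed representation).
    actions: (time_step, action_name) pairs, in any order.
    threshold: hypothetical limit on the change between consecutive samples.
    """
    # session_at[t] is the session id in effect at time step t.
    session_at: List[int] = []
    session_id = 0
    for t, level in enumerate(sound_levels):
        # A change larger than the threshold breaks continuity, so a new
        # session starts from this time step onward.
        if t > 0 and abs(level - sound_levels[t - 1]) > threshold:
            session_id += 1
        session_at.append(session_id)

    # Report actions in time order together with their session id.
    return [(name, session_at[t]) for t, name in sorted(actions)]


levels = [40.0, 41.0, 40.5, 62.0, 61.5, 61.0]
events = [(0, "search"), (2, "view_item"), (4, "purchase")]
labeled = assign_sessions(levels, events, threshold=10.0)
print(labeled)  # [('search', 0), ('view_item', 0), ('purchase', 1)]
```

In this example the sound level jumps by 21.5 at the fourth time step, so the action observed after the jump is placed in a different session from the two earlier actions.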

The determination program according to any one of claims 1 to 3, wherein
the collection procedure collects ambient environmental sound for which at least one of the direction from which the sound is emitted, the distance to the sound source, the waveform of the collected sound, and the volume of the collected sound is stationary.

The determination program according to any one of claims 1 to 4, wherein
the collection procedure collects voice uttered intermittently by the user, and
the determination procedure determines, based on the continuity of the environmental sound collected by the collection procedure, whether or not the voice uttered intermittently by the user is a series of voice inputs.

The determination program according to any one of claims 1 to 5, wherein
the determination procedure, when the environmental sound continues beyond a predetermined time, determines that the predetermined action of the user performed during the predetermined time is included in one session, and determines that the predetermined action of the user performed after the predetermined time has been exceeded is not included in the one session.
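
The time limit in the preceding claim can be sketched as follows. This is only an illustration under stated assumptions: times are in seconds, the environmental sound is taken to be continuous from continuous_start onward, and an action falling exactly on the limit is counted as inside the session. The function name and all parameter names are hypothetical.

```python
from typing import List, Tuple


def split_by_time_limit(
    continuous_start: float,
    action_times: List[float],
    max_duration: float,
) -> Tuple[List[float], List[float]]:
    """Separate actions inside the time limit from those after it.

    continuous_start: time at which the continuous sound began (seconds).
    action_times: times of user actions while the sound continued.
    max_duration: hypothetical "predetermined time" in seconds.
    """
    deadline = continuous_start + max_duration
    # Actions up to and including the deadline belong to the session
    # (treating the boundary as inclusive is an assumption).
    in_session = [t for t in action_times if continuous_start <= t <= deadline]
    # Actions after the deadline are excluded even though the sound continued.
    after_limit = [t for t in action_times if t > deadline]
    return in_session, after_limit


in_session, after_limit = split_by_time_limit(0.0, [10.0, 1700.0, 4000.0], 1800.0)
print(in_session, after_limit)  # [10.0, 1700.0] [4000.0]
```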

The determination program according to any one of claims 1 to 6, wherein
the determination procedure determines whether or not the subject of an arbitrary action is the user by collating sound information for recognizing the user with the environmental sound, and, when it determines that the subject is the user, determines that the arbitrary action is included in one session.

The determination program according to claim 7, wherein
the determination procedure determines whether or not the subject of the arbitrary action is the user by collating at least one of the user's footsteps registered in advance, the voice of a person different from the user, and the environmental sound of a place where the user is based, with the environmental sound collected by the collection procedure.
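
The collation in the two preceding claims can be sketched as a comparison between registered sound features and the collected environmental sound. The claims do not specify a matching method. The sketch below swaps in cosine similarity over fixed-length feature vectors purely as an assumption, and the names identify_user, registered, min_similarity and the user IDs are hypothetical.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_user(
    collected: List[float],
    registered: Dict[str, List[List[float]]],
    min_similarity: float = 0.9,
) -> Optional[str]:
    """Return the user whose registered sound best matches, or None.

    registered maps a user id to feature vectors registered in advance,
    for example footsteps or the sound of the user's usual location.
    """
    best_user: Optional[str] = None
    best_score = min_similarity
    for user_id, features in registered.items():
        for feature in features:
            score = cosine_similarity(collected, feature)
            # Keep the best match at or above the similarity floor.
            if score >= best_score:
                best_user, best_score = user_id, score
    return best_user


registered = {
    "U01": [[1.0, 0.0, 0.2], [0.1, 0.9, 0.0]],
    "U02": [[0.0, 0.1, 1.0]],
}
matched = identify_user([0.9, 0.05, 0.2], registered)
unmatched = identify_user([0.5, 0.5, 0.5], registered)
print(matched, unmatched)  # U01 None
```

An action observed while identify_user returns the same user can then be treated as that user's action and assigned to the session, as in claim 7.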

The determination program according to claim 1, further causing the computer to execute
a measurement procedure for measuring the effect of content provided to the user when the determination procedure determines that the predetermined action of the user and the provision of the content to the user are included in one session.

The determination program according to claim 9, wherein
the measurement procedure measures the effect of the content provided to the user based on whether or not the predetermined action of the user is an action from which the user is presumed to have viewed the content.

The determination program according to claim 9 or 10, wherein
the measurement procedure measures the effect of the content provided to the user based on whether or not the predetermined action of the user can constitute a conversion for the content.
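
The measurement procedure of claims 9 to 11 can be sketched as counting, per session, whether a presumed viewing action or a conversion occurred in the same session as the provision of the content. The event layout, the type labels "provide", "view" and "conversion", and the rate definitions below are assumptions made for illustration and are not taken from this publication.

```python
from typing import Dict, List


def measure_effect(events: List[Dict], content_id: str) -> Dict[str, float]:
    """Summarize views and conversions that share a session with a provision.

    Each event is a dict with a "session" id and a "type"; "provide" events
    also carry a "content" id (hypothetical layout).
    """
    # Sessions in which the content was provided to the user.
    provided = {e["session"] for e in events
                if e["type"] == "provide" and e["content"] == content_id}
    # Sessions in which the user is presumed to have viewed the content.
    viewed = {e["session"] for e in events
              if e["type"] == "view" and e["session"] in provided}
    # Sessions in which the user's action counts as a conversion.
    converted = {e["session"] for e in events
                 if e["type"] == "conversion" and e["session"] in provided}
    n = len(provided)
    return {
        "provided_sessions": n,
        "view_rate": len(viewed) / n if n else 0.0,
        "conversion_rate": len(converted) / n if n else 0.0,
    }


log = [
    {"session": 0, "type": "provide", "content": "A"},
    {"session": 0, "type": "view"},
    {"session": 0, "type": "conversion"},
    {"session": 1, "type": "provide", "content": "A"},
    {"session": 1, "type": "view"},
    {"session": 2, "type": "conversion"},  # no provision in this session
]
result = measure_effect(log, "A")
print(result)
```

The conversion in session 2 is ignored because the content was not provided in that session, which mirrors the condition that the action and the provision be included in one session.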

A determination apparatus comprising:
a collection unit that collects environmental sound that is ambient sound around a user terminal, which is a terminal device used by a user, and that is steadily emitted; and
a determination unit that determines the continuity of the environmental sound based on whether or not the steadily emitted environmental sound collected by the collection unit changes beyond a predetermined threshold, and that determines, based on the continuity of the environmental sound, whether or not a predetermined action of the user is included in one session.

A determination method executed by a computer, the method comprising:
a collection step of collecting environmental sound that is ambient sound around a user terminal, which is a terminal device used by a user, and that is steadily emitted; and
a determination step of determining the continuity of the environmental sound based on whether or not the steadily emitted environmental sound collected in the collection step changes beyond a predetermined threshold, and of determining, based on the continuity of the environmental sound, whether or not a predetermined action of the user is included in one session.