JP2018022956A - Reaction estimation device, reaction estimation method, and program

Info

Publication number: JP2018022956A
Application number: JP2016151460A
Authority: JP
Inventors: Takafumi Okuyama; Atsuko Kurashima; Kazuhisa Yamagishi
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2016-08-01
Filing date: 2016-08-01
Publication date: 2018-02-08

Abstract

PROBLEM TO BE SOLVED: To enable estimation of a user's reaction to quality degradation in voice communication. SOLUTION: A reaction estimation device includes: a calculation unit that calculates a sound interruption rate, i.e., the ratio of the period during which voice is interrupted to the communication period of voice via a network; and an estimation unit that estimates a user's reaction by applying the sound interruption rate calculated by the calculation unit to a relational expression indicating the correspondence between the sound interruption rate and users' reactions to voice interruption. SELECTED DRAWING: Figure 2

Description

The present invention relates to a reaction estimation device, a reaction estimation method, and a program.

To maintain and improve a voice call service, it is important for a telecommunications carrier to understand the quality of the service and to examine how that quality affects users. In particular, it is useful for service improvement to estimate the circumstances under which a user perceives a specific quality degradation event (for example, whether the user finds the conversation hard to hear; hereinafter, the "perception status") and how the user intends to act as a result of perceiving the event (for example, whether the user wants to hang up and call again; hereinafter, the "action intention").

Non-Patent Document 1: Okuyama Takafumi, Kurashima Atsuko, Masuda Yuki, "A Study on Estimation Method of Listening Quality in VoLTE," IEICE General Conference, B-11-21, p. 438, March 2015

To understand users' perception status and action intention, it is possible to ask users directly while they use the service, but this raises issues such as reduced usability and the cost and time needed to set up an aggregation system. It is therefore desirable to estimate the perception status and action intention from data acquired in the network or at a terminal, without asking users directly.

Non-Patent Document 1 proposes a method for estimating subjective evaluation results of quality degradation from packets. Non-Patent Document 1 estimates the listening MOS (Mean Opinion Score) while taking the characteristics of packet loss into account. However, it only estimates the listening MOS and cannot be applied to estimating reactions such as perception status and action intention.

The present invention has been made in view of the above, and its object is to enable estimation of a user's reaction to quality degradation in voice communication.

To solve the above problem, a reaction estimation device includes: a calculation unit that calculates a sound interruption rate, which is the ratio of the period during which voice is interrupted to the communication period of voice via a network; and an estimation unit that estimates a user's reaction by applying the sound interruption rate calculated by the calculation unit to a relational expression indicating the correspondence between the sound interruption rate and a user's reaction to voice interruption.

A user's reaction to quality degradation in voice communication can be estimated.

FIG. 1 is a diagram showing an example hardware configuration of the reaction estimation device in the first embodiment.
FIG. 2 is a diagram showing an example functional configuration of the reaction estimation device in the first embodiment.
FIG. 3 is a flowchart for explaining an example of the processing procedure executed by the reaction estimation device in the first embodiment.
FIG. 4 is a diagram showing an example functional configuration of the reaction estimation device in the second embodiment.
FIG. 5 is a flowchart for explaining an example of the processing procedure executed by the reaction estimation device in the second embodiment.

Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a diagram showing an example hardware configuration of the reaction estimation device in the first embodiment. The reaction estimation device 10 in FIG. 1 includes a drive device 100, an auxiliary storage device 102, a memory device 103, a CPU 104, an interface device 105, and the like, which are connected to one another by a bus B.

A program that implements the processing of the reaction estimation device 10 is provided on a recording medium 101 such as a CD-ROM. When the recording medium 101 storing the program is set in the drive device 100, the program is installed from the recording medium 101 into the auxiliary storage device 102 via the drive device 100. The program does not have to be installed from the recording medium 101, however; it may instead be downloaded from another computer via a network. The auxiliary storage device 102 stores the installed program as well as necessary files, data, and the like.

When an instruction to start the program is given, the memory device 103 reads the program from the auxiliary storage device 102 and stores it. The CPU 104 executes the functions of the reaction estimation device 10 in accordance with the program stored in the memory device 103. The interface device 105 is used as an interface for connecting to a network.

FIG. 2 is a diagram showing an example functional configuration of the reaction estimation device in the first embodiment. In FIG. 2, the reaction estimation device 10 includes a sound interruption rate calculation unit 11, a perception rate estimation unit 12, and the like. Each of these units is implemented by processing that one or more programs installed in the reaction estimation device 10 cause the CPU 104 to execute.

In the first embodiment, capture data of call voice packets is the input to the reaction estimation device 10. The packet flow uses RTP (Real-time Transport Protocol), with which loss points can be identified and the duration of a loss interval can be calculated from sequence numbers and transmission timestamps. Capturing packets in real time is not essential; it suffices that the packets can be acquired in some way. The network is, for example, a mobile network, but the present embodiment is not limited to mobile networks and is applicable to estimating perception and behavior for voice calls in various networks.

The sound interruption rate calculation unit 11 calculates the sound interruption rate, which is the ratio of the period during which voice is interrupted to one call period (communication period). Specifically, the sound interruption rate calculation unit 11 calculates the sum (total sound interruption time) of the durations (sound interruption times) of the sound interruption intervals identified from the sequence numbers and transmission timestamps in the packet headers. The sound interruption rate calculation unit 11 then calculates the sound interruption rate (for example, 1%, 3%) based on the ratio between the total duration of one call (for example, 1 minute, 60 minutes) and the total sound interruption time (for example, 10 seconds, 50 seconds).

The perception rate estimation unit 12 holds a correspondence expression (mapping function) indicating the correspondence between the sound interruption rate and the perception rate. The perception rate is the proportion of users who perceive quality degradation in voice communication (for example, how many users out of 100 perceive the degradation). The perception rate estimation unit 12 derives an estimate of the perception rate (hereinafter, the "perception rate estimate") by substituting the sound interruption rate calculated by the sound interruption rate calculation unit 11 into the correspondence expression. A detailed example of the expression is given later in the description of the processing flow. In the first embodiment, the perception status of quality degradation in voice communication is an example of a user's reaction.

The processing procedure executed by the reaction estimation device 10 is described below. FIG. 3 is a flowchart for explaining an example of the processing procedure executed by the reaction estimation device in the first embodiment.

For example, assume that a call is taking place between terminals in a voice call service such as VoLTE (Voice over Long Term Evolution) connected via a network. The sender's speech is encoded, stored in packets, and delivered via the network to the receiving terminal, where it is decoded and output as audio to the listener. Capture data of the call packets acquired in the network or at the receiving terminal (hereinafter simply "packets") is input to the sound interruption rate calculation unit 11 of the reaction estimation device 10 (S101).

When the packets are input, the sound interruption rate calculation unit 11 identifies packet loss points from the sequence numbers contained in the packets of one call (S102). Specifically, a point where the sequence numbers are not consecutive is identified as a loss point.
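As a rough illustration of step S102, the following Python sketch finds loss points as gaps in the sequence numbers. It assumes the packets are already in sending order with no reordering or duplicates, and that sequence numbers are 16-bit values that wrap around from 65535 to 0, as in RTP. The function name `find_loss_points` is hypothetical and does not appear in the original.

```python
SEQ_MOD = 1 << 16  # RTP sequence numbers are 16-bit and wrap around


def find_loss_points(seq_numbers):
    """Return (index, missing_count) for every gap in the sequence numbers.

    `index` is the position of the packet received immediately before the gap.
    Assumes packets are in sending order without reordering or duplicates.
    """
    loss_points = []
    for i in range(len(seq_numbers) - 1):
        # Modular difference, so that 65535 followed by 0 counts as consecutive.
        step = (seq_numbers[i + 1] - seq_numbers[i]) % SEQ_MOD
        if step != 1:
            loss_points.append((i, step - 1))
    return loss_points
```

For instance, the sequence 10, 11, 14, 15 has one loss point after the packet at index 1, with two packets missing.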

Next, the sound interruption rate calculation unit 11 obtains, from the transmission timestamps in the packet headers, the total duration of one call including the packet loss points (S103). Specifically, the total duration can be obtained by subtracting the timestamp of the first packet of the call from the timestamp of the last packet. Strictly speaking, the speech time of the last packet may be added to this difference, because the timestamp of each packet indicates the start time of that packet's speech time.

Next, for each packet loss point identified in step S102, the sound interruption rate calculation unit 11 calculates the sound interruption time of the loss point from the transmission timestamps contained in the packets before and after it (S104). Specifically, the sound interruption time of a loss point can be calculated by subtracting the timestamp of the packet immediately before the loss point from the timestamp of the packet immediately after it, and then subtracting the speech time of the immediately preceding packet from the result.

Next, the sound interruption rate calculation unit 11 sums the sound interruption times of the loss points to calculate the total sound interruption time (S105). The sound interruption rate calculation unit 11 then calculates the sound interruption rate from the ratio between the total duration and the total sound interruption time (S106). That is, the sound interruption rate is calculated as "total sound interruption time ÷ total duration of the call".
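Steps S102 to S106 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: timestamps are assumed to be already converted to seconds, packets are assumed to be in sending order, every packet is assumed to carry the same speech time, and the default of 0.02 seconds per packet is an illustrative value that does not come from the original. The function name is hypothetical.

```python
def sound_interruption_rate(packets, packet_duration=0.02):
    """Compute the sound interruption rate for one call.

    packets: list of (sequence_number, send_time_seconds) in sending order.
    packet_duration: speech time carried by one packet, in seconds
                     (0.02 is an assumed value, not from the original).
    """
    if len(packets) < 2:
        return 0.0
    # S103: last timestamp minus first timestamp, plus the last packet's
    # own speech time, because a timestamp marks the start of a packet.
    total_length = packets[-1][1] - packets[0][1] + packet_duration

    total_interruption = 0.0
    for (seq_a, t_a), (seq_b, t_b) in zip(packets, packets[1:]):
        # S102: a loss point is where sequence numbers are not consecutive
        # (modulo 65536 to allow for 16-bit wraparound).
        if (seq_b - seq_a) % 65536 != 1:
            # S104: time between the packets around the gap, minus the
            # speech time of the packet just before the gap.
            # S105: accumulate over all loss points.
            total_interruption += (t_b - t_a) - packet_duration
    # S106: total sound interruption time / total duration of the call.
    return total_interruption / total_length
```

With packets `(1, 0.00), (2, 0.02), (5, 0.08), (6, 0.10)`, two 20 ms packets are missing, so the interruption time is 0.04 s out of a 0.12 s call, giving a rate of about one third.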

Next, the perception rate estimation unit 12 calculates the perception rate estimate by substituting the sound interruption rate calculated by the sound interruption rate calculation unit 11 into a correspondence expression, obtained in advance, that indicates the correspondence between the sound interruption rate and the perception rate of quality degradation (S107). The perception rate estimation unit 12 then outputs the calculated perception rate estimate (S108).

An example of the correspondence expression (perception estimation model) indicating the correspondence between the sound interruption rate and the perception rate of quality degradation is shown below.

$$\text{perception rate} = A \times (\text{sound interruption rate})^2 - B \times (\text{sound interruption rate}) + C$$
The above correspondence expression is a function that takes the sound interruption rate as input and is fitted to the relationship between the sound interruption rate and the perception rate obtained in advance through a subjective evaluation experiment. From the relationship between the sound interruption rate and the perception rate obtained in the experiment, a function such as the one above is chosen as the mapping function for perception rate estimation, and the coefficients A, B, and C are derived from the perception rates and the function, using the least squares method or the like, so that the function shape fits the data well.
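One way to derive the coefficients by least squares is sketched below in plain Python. It solves the normal equations of a quadratic fit and returns the coefficients in the sign convention of the expression above, in which the linear term is written as −B. The function name `fit_quadratic` is hypothetical, and the data in the usage note are synthetic values generated from known coefficients, not experimental results from the original.

```python
def fit_quadratic(rates, perceived):
    """Least-squares fit of perceived = A*r**2 - B*r + C; returns (A, B, C)."""
    # Normal equations for the polynomial a2*r^2 + a1*r + a0.
    s = [sum(r ** k for r in rates) for k in range(5)]  # power sums of r
    t = [sum(p * r ** k for r, p in zip(rates, perceived)) for k in range(3)]
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 system.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    a2, a1, a0 = (m[i][3] / m[i][i] for i in range(3))
    return a2, -a1, a0  # the model writes the linear term as -B*r
```

As a check on synthetic data, points generated from A = 2, B = 1, C = 3 at rates 0 to 4 (values 3, 4, 9, 18, 31) are fitted back to those same coefficients up to rounding error. At least three distinct rates are needed for the system to be solvable.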

The above mapping function for perception rate estimation is only an example. For example, an exponential function that takes the sound interruption rate as input may be used as the mapping function for perception rate estimation.

As described above, according to the first embodiment, a user's reaction to quality degradation in voice communication (the perception status of the degradation) can be estimated from packet capture data. Furthermore, according to the present embodiment, the user's reaction to quality degradation in each call can be estimated without requiring a subjective evaluation for each call.

Next, a second embodiment is described. The description of the second embodiment covers the points that differ from the first embodiment. Points not specifically mentioned in the second embodiment may be the same as in the first embodiment.

FIG. 4 is a diagram showing an example functional configuration of the reaction estimation device in the second embodiment. In FIG. 4, parts corresponding to those in FIG. 2 are denoted by the same reference signs. In FIG. 4, the reaction estimation device 10a includes a sound interruption rate calculation unit 11, an action rate estimation unit 13, and the like. Each of these units is implemented by processing that one or more programs installed in the reaction estimation device 10 cause the CPU 104 to execute.

In the second embodiment, the call voice signal is the input to the reaction estimation device 10a. Obtaining the voice signal in real time is not essential; it suffices that the voice signal can be acquired in some way. The network is, for example, a mobile network, but the present embodiment is not limited to mobile networks and is applicable to estimating perception and behavior for voice calls in various networks.

The sound interruption rate calculation unit 11 calculates the sum (total sound interruption time) of the durations (sound interruption times) of the sound interruption intervals identified from the signal level. The sound interruption rate calculation unit 11 then calculates the sound interruption rate (for example, 1%, 3%) based on the ratio between the total duration of the call (for example, 1 minute, 60 minutes) and the total sound interruption time (for example, 10 seconds, 50 seconds).

The action rate estimation unit 13 holds a correspondence expression (mapping function) indicating the correspondence between the sound interruption rate and the action rate in response to quality degradation. The action rate is the proportion of users who indicate a specific or arbitrary action intention in response to quality degradation (for example, how many users out of 100 indicate a specific action intention). The action rate estimation unit 13 derives an estimate of the action rate (the action rate estimate) by substituting the sound interruption rate calculated by the sound interruption rate calculation unit 11 into the correspondence expression. In the second embodiment, the action intention in response to quality degradation in voice communication is an example of a user's reaction.

FIG. 5 is a flowchart for explaining an example of the processing procedure executed by the reaction estimation device in the second embodiment.

For example, assume that a call is taking place between terminals in a voice call service such as VoLTE connected via a network. The sender's speech is encoded, stored in packets, and delivered via the network to the receiving terminal, where it is decoded and output as audio to the listener. A voice signal decoded from call packets acquired in the network or at the receiving terminal, or a voice signal acquired directly at the receiving terminal, is input to the sound interruption rate calculation unit 11 of the reaction estimation device 10a (S201).

When the voice signal is input, the sound interruption rate calculation unit 11 identifies sound interruption points from the level of the voice signal (S202). Silent intervals in the conversation and sound interruptions are distinguished by their difference in level. Alternatively, a voice signal that is never silent may be input from the sending side, and sound interruption points may be identified by the difference in level.

Next, the sound interruption rate calculation unit 11 obtains the total duration of the call, including the sound interruption points, from the length of the voice signal (S203). For each sound interruption point identified in step S202, the sound interruption rate calculation unit 11 then calculates the sound interruption time of that point by comparing the levels before and after it (S204). Next, the sound interruption rate calculation unit 11 sums the sound interruption times of the sound interruption points to calculate the total sound interruption time (S205). The sound interruption rate calculation unit 11 then calculates the sound interruption rate from the ratio between the total duration and the total sound interruption time (S206). That is, the sound interruption rate is calculated as "total sound interruption time ÷ total duration of the call".
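The signal-level variant (S202 to S206) might be sketched in Python as follows. The original does not say how levels are compared, so this sketch makes its own assumptions: an interruption is a run of samples whose absolute level stays below a threshold for at least a minimum duration, and the threshold is taken to be lower than the background level of ordinary conversational silence. The function name and the parameters `level_threshold` and `min_gap_seconds` are hypothetical.

```python
def interruption_rate_from_signal(samples, sample_rate, level_threshold,
                                  min_gap_seconds):
    """Estimate the sound interruption rate from decoded audio samples.

    A run of samples whose absolute level stays below `level_threshold`
    for at least `min_gap_seconds` is treated as one interruption.
    Both parameters are illustrative assumptions, not from the original.
    """
    if not samples:
        return 0.0
    min_run = int(min_gap_seconds * sample_rate)
    interrupted = 0  # total samples inside interruptions (S205)
    run = 0          # length of the current low-level run (S202, S204)
    for value in samples:
        if abs(value) < level_threshold:
            run += 1
        else:
            if run >= min_run:
                interrupted += run
            run = 0
    if run >= min_run:  # a run that reaches the end of the signal
        interrupted += run
    # S203, S206: the total duration is the signal length, so the sample
    # rate cancels out of the ratio.
    return interrupted / len(samples)
```

The minimum duration keeps isolated near-zero samples, such as zero crossings of the waveform, from being counted as interruptions.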

Next, the action rate estimation unit 13 calculates the action rate estimate by substituting the sound interruption rate calculated by the sound interruption rate calculation unit 11 into a correspondence expression, obtained in advance, that indicates the correspondence between the sound interruption rate and the proportion of users who indicate an action intention in response to quality degradation (the action rate) (S207). The action rate estimation unit 13 then outputs the calculated action rate estimate (S208).

An example of the correspondence expression (action rate estimation model) indicating the correspondence between the sound interruption rate and the action rate in response to quality degradation is shown below.

$$\text{action rate} = D \times (\text{sound interruption rate})^2 - E \times (\text{sound interruption rate}) + F$$
The above correspondence expression is a function that takes the sound interruption rate as input and is fitted to the relationship between the sound interruption rate and the action rate obtained in advance through a subjective evaluation experiment. From the relationship between the sound interruption rate and the action rate obtained in the experiment, a function such as the one above is chosen as the mapping function for action rate estimation, and the coefficients D, E, and F are derived from the action rates and the function, using the least squares method or the like, so that the function shape fits the data well.

The above mapping function for action rate estimation is only an example. For example, an exponential function that takes the sound interruption rate as input may be used as the mapping function for action rate estimation.

As described above, according to the second embodiment, a user's action intention in response to quality degradation in voice communication can be estimated from the voice signal. According to the present embodiment, the user's reaction to quality degradation in each call can be estimated without requiring a subjective evaluation for each call.

The method of calculating the sound interruption rate in the second embodiment may be applied to the first embodiment, and the method of calculating the sound interruption rate in the first embodiment may be applied to the second embodiment.

In the present embodiments, the sound interruption rate calculation unit 11 is an example of the calculation unit. The perception rate estimation unit 12 and the action rate estimation unit 13 are each an example of the estimation unit.
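The division into a calculation unit and an estimation unit can be pictured as a small composition in Python, in which any rate calculation (packet-based or signal-based) can be paired with any relational expression (perception rate or action rate). The class `ReactionEstimator`, the helper `quadratic_model`, and their parameters are hypothetical names introduced only for this sketch, and the coefficients would have to come from a subjective evaluation experiment.

```python
class ReactionEstimator:
    """Pairs a calculation unit with an estimation unit (hypothetical API)."""

    def __init__(self, calculate_rate, relational_expression):
        # calculate_rate: call data -> sound interruption rate
        self.calculate_rate = calculate_rate
        # relational_expression: sound interruption rate -> reaction estimate
        self.relational_expression = relational_expression

    def estimate(self, call_data):
        rate = self.calculate_rate(call_data)
        return self.relational_expression(rate)


def quadratic_model(a, b, c):
    """Return the mapping rate -> a*rate**2 - b*rate + c."""
    def model(rate):
        return a * rate ** 2 - b * rate + c
    return model
```

Keeping the two parts separate mirrors the remark that the rate calculation of either embodiment may be combined with the estimation of the other.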

Although embodiments of the present invention have been described in detail above, the present invention is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention described in the claims.

10, 10a Reaction estimation device
11 Sound interruption rate calculation unit
12 Perception rate estimation unit
13 Action rate estimation unit
100 Drive device
101 Recording medium
102 Auxiliary storage device
103 Memory device
104 CPU
105 Interface device
B Bus

Claims

1. A reaction estimation device comprising:
a calculation unit that calculates a sound interruption rate, which is the ratio of the period during which voice is interrupted to the communication period of voice via a network; and
an estimation unit that estimates a user's reaction by applying the sound interruption rate calculated by the calculation unit to a relational expression indicating the correspondence between the sound interruption rate and a user's reaction to voice interruption.

2. The reaction estimation device according to claim 1, wherein
the estimation unit calculates an estimate of the proportion of users who perceive the sound interruption by applying the sound interruption rate calculated by the calculation unit to a relational expression indicating the correspondence between the sound interruption rate and the proportion of users who perceive the sound interruption.

3. The reaction estimation device according to claim 1, wherein
the estimation unit calculates an estimate of the proportion of users who indicate an intention to take an action when voice is interrupted by applying the sound interruption rate calculated by the calculation unit to a relational expression indicating the correspondence between the sound interruption rate and the proportion of users who indicate an intention to take an action when voice is interrupted.

4. The reaction estimation device according to any one of claims 1 to 3, wherein
the calculation unit calculates the sound interruption rate using the total duration of the voice communication and the total duration of sound interruption that are acquired from capture data of packets of the voice communication.

5. The reaction estimation device according to any one of claims 1 to 3, wherein
the calculation unit calculates the sound interruption rate using the total duration of the voice communication and the total duration of sound interruption that are acquired from the voice signal of the voice communication.

6. A reaction estimation method executed by a computer, the method comprising:
a calculation procedure of calculating a sound interruption rate, which is the ratio of the period during which voice is interrupted to the communication period of voice via a network; and
an estimation procedure of estimating a user's reaction by applying the sound interruption rate calculated in the calculation procedure to a relational expression indicating the correspondence between the sound interruption rate and a user's reaction to voice interruption.

7. A program for causing a computer to function as each unit of the reaction estimation device according to any one of claims 1 to 5.