JP2014241498A - Program recommendation device

Info

Publication number: JP2014241498A
Application number: JP2013123042A
Authority: JP
Inventors: 秀治倉本; Hideji Kuramoto
Original assignee: Samsung Electronics Co Ltd
Current assignee: Samsung Electronics Co Ltd
Priority date: 2013-06-11
Filing date: 2013-06-11
Publication date: 2014-12-25

Abstract

PROBLEM TO BE SOLVED: To implement program recommendation that sufficiently reflects user preference. SOLUTION: A program recommendation device (10) comprises: a topic presentation part (11) which generates a topic including a predetermined keyword and presents it to a user; a voice analyzing part (12) which performs voice analysis on a voice of the user and extracts prosodic features of the voice; a preference degree evaluation part (15) which evaluates a preference degree of the user with respect to the keyword included in the topic on the basis of the prosodic features extracted by the voice analyzing part from the voice of the user responding to the topic; a preference DB construction part (16) which constructs a preference information database (103) in which each keyword is associated with the preference degree of the user, on the basis of the evaluation results of the preference degree evaluation part for various keywords; and a program recommendation part (17) which finds and recommends a program meeting the preference of the user from program guide information while referring to the preference information database.

Description

The present invention relates to a program recommendation device used in a digital broadcast receiver such as a digital television or a digital video recorder.

One of the functions of a digital broadcast receiver is program recommendation. In a typical program recommendation function, the user registers preferred keywords and genres in the device in advance, and the device finds programs that match those keywords or genres in program guide information such as an EPG (Electronic Program Guide) and recommends them to the user (for example, by automatically scheduling a recording). However, registering keywords and genres and keeping them up to date is a cumbersome task for the user.

Several techniques have been proposed for learning a user's viewing preferences without such keyword registration. For example, one technique records the user's device operation history and learns the user's preferences based on which operations the user performed for each program (see, for example, Patent Document 1). Another detects the orientation of the user's face during content viewing, calculates a section gaze degree, and calculates a program preference degree from that section gaze degree (see, for example, Patent Document 2).

Patent Document 1: JP 2004-194108 A
Patent Document 2: JP 2011-71795 A

Genre-based program recommendation can reflect the user's preferences only through the coarse classification of genres, whereas keyword-based program recommendation can reflect them in finer detail. Preference evaluation based on device operation history or section gaze degree, as in the prior art, can evaluate the user's preference for the program being watched, that is, for the genre of that program, but it is difficult to evaluate the user's preference for individual keywords. For this reason, conventional program recommendation may fail to reflect the user's preferences sufficiently.

In view of the above problem, an object of the present invention is to provide a program recommendation device that learns a user's viewing preferences without burdening the user with tasks such as keyword entry, and thereby realizes program recommendation that sufficiently reflects those preferences.

A program recommendation device according to one aspect of the present invention includes: a topic presentation unit that generates a topic including a predetermined keyword and presents it to a user; a voice analysis unit that analyzes the user's voice and extracts prosodic features of the voice; a preference level evaluation unit that evaluates the user's preference level for the keyword included in the topic, based on the prosodic features extracted by the voice analysis unit from the voice with which the user responds to the topic; a preference DB construction unit that constructs a preference information database in which keywords are associated with the user's preference levels, based on the evaluation results of the preference level evaluation unit for various keywords; and a program recommendation unit that refers to the preference information database to find and recommend, from program guide information, a program that matches the user's preferences.

With this configuration, the user's preference level for the keyword included in a topic is evaluated from the prosodic features of the voice with which the user responds to the topic presented by the topic presentation unit; the preference information database is constructed by associating the keyword with the user's preference level; and a program that matches the user's preferences is recommended by referring to the constructed database. The user's viewing preferences can therefore be learned through interaction with the user, without burdening the user with tasks such as keyword entry.

The program recommendation device may include a voice recognition unit that performs speech recognition on the user's voice and extracts keywords, and the preference level evaluation unit may evaluate the user's preference level for a keyword extracted by the voice recognition unit, based on the prosodic features extracted by the voice analysis unit from speech that the user utters freely.

The program recommendation device may include an image analysis unit that analyzes a captured image of the user and recognizes the user's gesture, and the preference level evaluation unit may evaluate the user's preference level for the keyword included in the topic while also taking into account the user's gesture in response to the topic, as recognized by the image analysis unit.

Note that the topic generation unit can present the generated topic to the user as voice information and/or text information.

A program recommendation device according to another aspect of the present invention includes: a topic presentation unit that generates a topic including a predetermined keyword and presents it to a user; an image analysis unit that analyzes a captured image of the user's gesture in response to the topic and extracts features of the gesture; a preference level evaluation unit that evaluates the user's preference level for the keyword included in the topic based on the features of the gesture; a preference DB construction unit that constructs a preference information database in which keywords are associated with the user's preference levels, based on the evaluation results of the preference level evaluation unit for various keywords; and a program recommendation unit that refers to the preference information database to find and recommend, from program guide information, a program that matches the user's preferences.

With this configuration, the user's preference level for the keyword included in a topic is evaluated from the features of the user's gesture in response to the topic presented by the topic presentation unit; the preference information database is constructed by associating the keyword with the user's preference level; and a program that matches the user's preferences is recommended by referring to the constructed database. The user's viewing preferences can therefore be learned through interaction with the user, without burdening the user with tasks such as keyword entry.

According to the present invention, a user's viewing preferences can be learned without burdening the user with tasks such as keyword entry, and program recommendations that sufficiently reflect those preferences can be made.

FIG. 1: Block diagram of the main part of a program recommendation device according to an embodiment of the present invention.
FIG. 2: Block diagram of a voice analysis unit according to an example.
FIG. 3: Block diagram of a voice recognition unit according to an example.
FIG. 4: Block diagram of an image analysis unit according to an example.

Embodiments for carrying out the present invention are described below with reference to the drawings. The present invention is not limited to the following embodiments.

A program recommendation device according to an embodiment of the present invention learns a user's viewing preferences through interaction with the user and recommends content (programs) that matches those preferences. The device is installed in a TV set, video recorder, terminal device, or similar equipment that receives programs delivered by broadcasters, content distributors, and the like via broadcast waves such as terrestrial or satellite digital broadcasting and/or via a network such as the Internet, and provides that equipment with a program recommendation function.

FIG. 1 is a block diagram of the main part of a program recommendation device 10 according to an embodiment of the present invention. The program recommendation device 10 according to the present embodiment includes a topic presentation unit 11, a voice analysis unit 12, a voice recognition unit 13, an image analysis unit 14, a preference level evaluation unit 15, a preference DB construction unit 16, and a program recommendation unit 17.

The topic presentation unit 11 generates various topics including predetermined keywords and offers them to the user, thereby realizing interaction with the user. The topic presentation unit 11 can convert a generated topic into an audio signal by speech synthesis and present it to the user as voice information through a speaker (not shown). It can also convert a generated topic into text, generate a video signal for an OSD (On Screen Display) or the like, and present the topic to the user as text information on a display device (not shown).

The topic presentation unit 11 can refer to a keyword database 101 to obtain keywords suitable for generating topics. The keyword database 101 stores keywords suited to probing the user's viewing preferences. These include, for example, words denoting program genres such as “drama”, “news”, and “sports”, facility names such as “Tokyo Skytree”, place names, and names of people.

The program recommendation device 10 can extract keywords from an electronic program guide or the like and add them to the keyword database 101. As described later, the electronic program guide is stored in a program guide information database 104. If the program recommendation device 10 is connected to the Internet, it can also add the latest keywords to the keyword database 101 through a push-type service and/or by obtaining currently trending words from a specific web site. In addition, the program recommendation device 10 can delete old keywords that are no longer used from the keyword database 101 as appropriate.

The voice analysis unit 12 receives the user's voice picked up by a microphone (not shown), analyzes it, and extracts prosodic features. Prosodic features include intonation, tone, stress, duration, rhythm, and speech rate. The voice analysis unit 12 can quantify such prosodic features.

FIG. 2 is a block diagram of the voice analysis unit 12 according to an example. The voice analysis unit 12 includes, for example, a voice feature extraction unit 121 and a prosody parameter generation unit 122.

The voice feature extraction unit 121 receives A/D-converted voice data. It analyzes the input voice data to extract a speech section and calculates the fundamental frequency F0 for each frame, with one frame being, for example, 10 ms. The voice feature extraction unit 121 also performs phoneme alignment to identify the vowel section of the first mora.

Based on the analysis results of the voice feature extraction unit 121, the prosody parameter generation unit 122 generates at least three parameters as prosodic features: 1) the slope of the fundamental frequency F0 in the vowel section of the first mora, 2) the range of the fundamental frequency F0 over the entire speech section, and 3) the duration of the final phoneme of the speech section. The prosody parameter generation unit 122 can further add to the prosodic features: 4) the second-order coefficient of a second-order regression over the entire speech section, and, for the fundamental frequency F0 normalized by the per-user average, 5) its mean over the entire speech section, 6) its maximum, and 7) its minimum.
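The three mandatory parameters can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes that F0 has already been estimated per 10 ms frame, that unvoiced frames are marked with 0, and that phoneme alignment yields frame index ranges. The names `f0_slope` and `prosody_parameters` are hypothetical.

```python
from typing import List, Sequence, Tuple

FRAME_SEC = 0.010  # 10 ms per frame, as in the example in the text


def f0_slope(f0: Sequence[float], step: float = FRAME_SEC) -> float:
    """Least-squares slope (Hz per second) of F0 values sampled every `step` seconds."""
    n = len(f0)
    if n < 2:
        return 0.0
    t_mean = (n - 1) * step / 2.0
    f_mean = sum(f0) / n
    num = sum((i * step - t_mean) * (f - f_mean) for i, f in enumerate(f0))
    den = sum((i * step - t_mean) ** 2 for i in range(n))
    return num / den


def prosody_parameters(
    f0: List[float],
    first_vowel: Tuple[int, int],   # [start, end) frames of the first mora's vowel
    last_phoneme: Tuple[int, int],  # [start, end) frames of the final phoneme
) -> Tuple[float, float, float]:
    """Return (F0 slope in first vowel, F0 range over the section, final phoneme duration)."""
    voiced = [f for f in f0 if f > 0.0]  # assumption: 0 marks an unvoiced frame
    if not voiced:
        raise ValueError("no voiced frames in the speech section")
    slope = f0_slope(f0[first_vowel[0]:first_vowel[1]])
    f0_range = max(voiced) - min(voiced)
    duration = (last_phoneme[1] - last_phoneme[0]) * FRAME_SEC
    return slope, f0_range, duration
```

A least-squares fit is used for the slope here only because it is robust to frame-level jitter; the source does not say how the slope is computed.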

Returning to FIG. 1, the voice recognition unit 13 receives the user's voice picked up by a microphone (not shown), performs speech recognition on it, and extracts keywords.

FIG. 3 is a block diagram of the voice recognition unit 13 according to an example. The voice recognition unit 13 includes, for example, a voice processing unit 131, a morphological analysis unit 132, and a keyword extraction unit 133.

The voice processing unit 131 receives A/D-converted voice data. It analyzes the input voice data to extract a speech section and calculates mel-frequency cepstral coefficients (MFCC) for each frame, with one frame being, for example, 10 ms. The voice processing unit 131 then converts the voice data into text using a finite state transducer (FST).

The morphological analysis unit 132 performs morphological analysis on the text data output from the voice processing unit 131 and breaks the text down into a group of words.

The keyword extraction unit 133 extracts words of a specific part of speech, in particular nouns, from the output of the morphological analysis unit 132. For example, the keyword extraction unit 133 may refer to the keyword database 101 and extract the keywords that are registered there.
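The filtering step can be illustrated with a short sketch. It assumes the morphological analyzer returns `(surface, part_of_speech)` pairs with the tag `"noun"` for nouns; that output format, the tag name, and the function name `extract_keywords` are assumptions for illustration, not details given in the source.

```python
from typing import Iterable, List, Optional, Set, Tuple

Morpheme = Tuple[str, str]  # (surface form, part-of-speech tag); assumed format


def extract_keywords(
    morphemes: Iterable[Morpheme],
    keyword_db: Optional[Set[str]] = None,
) -> List[str]:
    """Keep nouns; if a keyword database is given, keep only nouns registered in it."""
    nouns = [surface for surface, pos in morphemes if pos == "noun"]
    if keyword_db is None:
        return nouns
    return [word for word in nouns if word in keyword_db]
```

Passing the keyword database restricts the result to registered keywords, which corresponds to the optional lookup described above.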

Returning to FIG. 1, the image analysis unit 14 receives an image of the user captured by a camera (not shown), analyzes it, and recognizes the user's gesture. The gestures recognized by the image analysis unit 14 are head movements from which a positive or negative reaction of the user can be inferred, such as nodding, tilting the head, or shaking the head.

FIG. 4 is a block diagram of the image analysis unit 14 according to an example. The image analysis unit 14 includes, for example, a region extraction unit 141, an optical flow extraction unit 142, a region average processing unit 143, and a gesture recognition unit 144.

The region extraction unit 141 extracts the user's head region from the input video data. The head region can be separated from the background image by using, for example, skin and hair color.

The optical flow extraction unit 142 extracts the optical flow of the user's head region extracted by the region extraction unit 141. Specifically, the optical flow extraction unit 142 can extract the optical flow by a gradient method.

The region average processing unit 143 divides the user's head region into four parts, vertically and horizontally, about the centroid of the head region, and calculates the average optical flow in each of the resulting first to fourth quadrants. Each average represents the overall flow of that quadrant.
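The quadrant averaging can be sketched as below. The sketch assumes the optical flow is available as a list of `(x, y, dx, dy)` samples inside the head region, uses the mean sample position as the centroid, and numbers the quadrants in the mathematical convention with y increasing upward. All of these, and the name `quadrant_means`, are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

FlowSample = Tuple[float, float, float, float]  # (x, y, dx, dy); assumed format


def quadrant_means(samples: Sequence[FlowSample]) -> List[Tuple[float, float]]:
    """Mean (dx, dy) for quadrants 1..4 around the centroid of the sample positions."""
    if not samples:
        raise ValueError("no flow samples")
    cx = sum(s[0] for s in samples) / len(samples)
    cy = sum(s[1] for s in samples) / len(samples)
    sums = [[0.0, 0.0, 0] for _ in range(4)]  # per quadrant: sum dx, sum dy, count
    for x, y, dx, dy in samples:
        right, upper = x >= cx, y >= cy
        if upper:
            idx = 0 if right else 1   # Q1: upper right, Q2: upper left
        else:
            idx = 3 if right else 2   # Q3: lower left, Q4: lower right
        sums[idx][0] += dx
        sums[idx][1] += dy
        sums[idx][2] += 1
    # A quadrant with no samples is reported as zero flow.
    return [(sx / n, sy / n) if n else (0.0, 0.0) for sx, sy, n in sums]
```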

The gesture recognition unit 144 identifies the movement of the user's head region based on the average optical flow of each quadrant output from the region average processing unit 143, and thereby recognizes the user's gesture. For example, when all four quadrants move up and down in the same way, the gesture recognition unit 144 can recognize that the user is nodding. When all four quadrants move left and right in the same way, it can recognize that the user is shaking their head. When the first and third quadrants, or the second and fourth quadrants, move symmetrically about the origin, it can recognize that the user is tilting their head.
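The three rules above can be sketched as a small classifier over the four per-quadrant mean flows. This is only an illustration under stated assumptions: it looks at a single frame pair, whereas a practical recognizer would track the motion over time, and the threshold `min_mag`, the symmetry test, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Flow = Tuple[float, float]  # mean (dx, dy) of one quadrant


def _opposite(a: Flow, b: Flow, min_mag: float) -> bool:
    """True if a and b are both significant and roughly cancel (point-symmetric motion)."""
    mag_a, mag_b = math.hypot(*a), math.hypot(*b)
    if mag_a < min_mag or mag_b < min_mag:
        return False
    return math.hypot(a[0] + b[0], a[1] + b[1]) < 0.5 * min(mag_a, mag_b)


def classify_head_motion(quadrants: Sequence[Flow], min_mag: float = 0.5) -> str:
    """Classify quadrant flows (Q1..Q4) as 'nod', 'shake', 'tilt', or 'none'."""
    q1, q2, q3, q4 = quadrants
    xs = [q[0] for q in quadrants]
    ys = [q[1] for q in quadrants]

    def same_direction(values: Sequence[float]) -> bool:
        return all(v > min_mag for v in values) or all(v < -min_mag for v in values)

    if all(abs(y) > abs(x) for x, y in quadrants) and same_direction(ys):
        return "nod"    # all quadrants move vertically together
    if all(abs(x) > abs(y) for x, y in quadrants) and same_direction(xs):
        return "shake"  # all quadrants move horizontally together
    if _opposite(q1, q3, min_mag) or _opposite(q2, q4, min_mag):
        return "tilt"   # diagonal quadrants move symmetrically about the origin
    return "none"
```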

Returning to FIG. 1, the preference level evaluation unit 15 evaluates the user's preference level for the keyword included in a topic according to the user's voice and gesture in response to the topic presented by the topic presentation unit 11. Specifically, the preference level evaluation unit 15 evaluates the user's preference level for the keyword included in the topic based on the prosodic features extracted by the voice analysis unit 12 and/or the user's gesture recognized by the image analysis unit 14.

When a topic is presented, the user can be expected to react to it by voice in some positive or negative way. For example, when the topic “How about XX?” (where XX is a predetermined keyword) is presented, a user who is interested in the keyword may say something like “Nice”, while a user with little interest may say something like “XX, huh”. Regardless of the specific content of the utterance, positive utterances tend to show a relatively large slope of the fundamental frequency F0 in the vowel section of the first mora, a relatively wide range of F0 over the entire speech section, and a relatively short duration of the final phoneme of the speech section. Negative utterances tend to show the opposite. The preference level evaluation unit 15 can therefore evaluate whether the user's reaction is positive or negative based on the prosodic features of the user's voice.

Specifically, the preference level evaluation unit 15 uses a support vector machine (SVM) or the like to evaluate whether the user's reaction is positive or negative. To make use of the support vector machine, the prosodic features of a variety of user voice responses are modeled in advance as an affirmative/negative model 102.

When a topic is presented, the user may also react to it with a positive or negative gesture. For example, when the topic “How about XX?” (where XX is a predetermined keyword) is presented, a user who is interested in the keyword may nod, while a user with little interest may tilt or shake their head. The preference level evaluation unit 15 can therefore evaluate whether the user's reaction is positive or negative based on the user's gesture.

When the user's response includes both voice and gesture, the preference level evaluation unit 15 can evaluate them together. For example, when the prosodic features are positive and the user's gesture is a nod, the preference level evaluation unit 15 can conclude that the user feels strongly positive about the keyword included in the topic. Conversely, when the prosodic features are negative and the user's gesture is a head shake, it can conclude that the user feels strongly negative about the keyword.

The user's preference level evaluated by the preference level evaluation unit 15 may be expressed in multiple levels, such as two levels (affirmation/denial) or four levels (strong affirmation/affirmation/denial/strong denial), or as a probability value in the range from 0 to 1.
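As a sketch of the four-level variant, the combination of a prosody-based decision and a recognized gesture might look like the following. The source only specifies the two "strong" cases; the fallback to the prosody decision in all other cases, the gesture labels, and the numeric scores in `LEVEL_SCORE` are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical mapping from the four levels to scores in [0, 1].
LEVEL_SCORE = {
    "strong denial": 0.0,
    "denial": 0.25,
    "affirmation": 0.75,
    "strong affirmation": 1.0,
}


def evaluate_preference(prosody_positive: bool, gesture: Optional[str] = None) -> str:
    """Combine the prosody decision with an optional gesture ('nod', 'shake', 'tilt')."""
    if prosody_positive and gesture == "nod":
        return "strong affirmation"  # voice and gesture both positive
    if not prosody_positive and gesture == "shake":
        return "strong denial"       # voice and gesture both negative
    # Assumption: otherwise the prosody decision alone determines the level.
    return "affirmation" if prosody_positive else "denial"
```

The score table shows one way the discrete levels could be stored alongside probability-style values; the source does not prescribe a particular mapping.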

The preference DB construction unit 16 constructs a preference information database 103 based on the evaluation results of the preference level evaluation unit 15. In the preference information database 103, keywords are associated with the user's preference levels. The preference DB construction unit 16 receives from the topic presentation unit 11 the keyword included in the topic presented to the user, receives from the preference level evaluation unit 15 the evaluation result of the user's preference level for that keyword, and registers the two in association with each other in the preference information database 103.
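A minimal in-memory sketch of such a database follows. It assumes preference levels are stored as scores in the range 0 to 1 and that repeated evaluations of the same keyword are averaged; the averaging policy and the class name `PreferenceDB` are assumptions, since the source only states that keyword and preference level are registered together.

```python
from typing import Dict, Tuple


class PreferenceDB:
    """Keyword -> preference score store (illustrative only)."""

    def __init__(self) -> None:
        self._stats: Dict[str, Tuple[float, int]] = {}  # keyword -> (score sum, count)

    def register(self, keyword: str, score: float) -> None:
        """Record one evaluation result for a keyword."""
        total, count = self._stats.get(keyword, (0.0, 0))
        self._stats[keyword] = (total + score, count + 1)

    def preference(self, keyword: str) -> float:
        """Average score over all evaluations of the keyword."""
        total, count = self._stats[keyword]
        return total / count

    def as_dict(self) -> Dict[str, float]:
        return {k: total / count for k, (total, count) in self._stats.items()}
```

Keeping one `PreferenceDB` instance per user would be one straightforward way to support several users sharing a device.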

The preference DB construction unit 16 can also register in the preference information database 103 the user's preference level for keywords contained in speech that the user utters freely. Specifically, the preference DB construction unit 16 receives from the voice recognition unit 13 a keyword contained in the user's free speech, receives from the preference level evaluation unit 15 the evaluation result of the user's preference level for that keyword, and registers the two in association with each other in the preference information database 103.

By registering the user's likes and dislikes for various keywords in the preference information database 103 in this way, the program recommendation device 10 can learn the user's viewing preferences.

The program recommendation unit 17 refers to the preference information database 103 and recommends programs that match the user's preferences. For example, the program recommendation unit 17 searches the program guide information database 104, which stores program guide information such as an EPG, and finds information on programs that contain many keywords for which the user's preference level is high. The program recommendation unit 17 then makes the recommendation by notifying the user of the broadcast schedule of the programs it found or by automatically scheduling a recording.
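The search for programs containing many highly preferred keywords can be sketched as follows. The program record layout (`title` and `description` fields), the threshold of 0.5 for a "high" preference, simple substring matching, and the function name `recommend` are all assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence


def recommend(
    programs: Sequence[Dict[str, str]],
    preferences: Dict[str, float],
    threshold: float = 0.5,
    top_n: int = 3,
) -> List[str]:
    """Return titles of the programs that mention the most highly preferred keywords."""
    liked = [kw for kw, score in preferences.items() if score > threshold]
    scored = []
    for program in programs:
        text = program["title"] + " " + program["description"]
        hits = sum(1 for kw in liked if kw in text)
        if hits:
            scored.append((hits, program["title"]))
    # Stable sort: programs with equal hit counts keep their guide order.
    scored.sort(key=lambda item: -item[0])
    return [title for _, title in scored[:top_n]]
```

Counting matches follows the "contains many keywords" wording above; weighting each match by its preference score would be an equally plausible variant.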

As described above, according to the present embodiment, the user's preference level for various keywords is learned through interaction with the user, and programs that match the user's preferences can be recommended based on what has been learned. This frees the user from the tedious task of registering every keyword and genre of interest one by one.

The program recommendation device 10 according to the present embodiment is not limited to a single specific user; multiple users can share one program recommendation device 10. In that case, the program recommendation device 10 can manage preference information for each user and recommend programs that match each user's preferences.

Furthermore, learning a user's preference information through interaction between a device and the user, as described above, is not limited to program recommendation devices and can be applied to other home appliances that use the user's preference information for their services.

DESCRIPTION OF SYMBOLS
10 Program recommendation device
11 Topic presentation unit
12 Voice analysis unit
13 Voice recognition unit
14 Image analysis unit
15 Preference level evaluation unit
16 Preference DB construction unit
17 Program recommendation unit
103 Preference information database

Claims

A program recommendation device comprising:
a topic presenting unit that generates a topic including a predetermined keyword and presents it to a user;
a voice analysis unit that analyzes the user's voice and extracts prosodic features of the voice;
a preference level evaluation unit that evaluates the user's preference level for the keyword included in the topic, based on the prosodic features extracted by the voice analysis unit from the voice with which the user responds to the topic;
a preference DB construction unit that constructs a preference information database in which keywords and user preference levels are associated with each other, based on evaluation results by the preference level evaluation unit for various keywords; and
a program recommendation unit that refers to the preference information database and finds and recommends, from program guide information, a program that matches the user's preference.

The program recommendation device described, further comprising a voice recognition unit that performs speech recognition on the user's voice and extracts keywords,
wherein the preference level evaluation unit evaluates the user's preference level for a keyword extracted by the voice recognition unit, based on prosodic features extracted by the voice analysis unit from speech freely uttered by the user.

The program recommendation device described in 1, further comprising an image analysis unit that analyzes a captured image of the user and recognizes the user's gesture,
wherein the preference level evaluation unit evaluates the user's preference level for the keyword included in the topic, taking into account the user's gesture in response to the topic as recognized by the image analysis unit.

The program recommendation device according to claim 1, wherein the topic generation unit presents the generated topic to the user as voice information and/or text information.

A program recommendation device comprising:
a topic presenting unit that generates a topic including a predetermined keyword and presents it to a user;
an image analysis unit that analyzes a captured image of the user's gesture in response to the topic and extracts features of the gesture;
a preference level evaluation unit that evaluates the user's preference level for the keyword included in the topic based on the features of the gesture;
a preference DB construction unit that constructs a preference information database in which keywords and user preference levels are associated with each other, based on evaluation results by the preference level evaluation unit for various keywords; and
a program recommendation unit that refers to the preference information database and finds and recommends, from program guide information, a program that matches the user's preference.