JPWO2020090790A1

JPWO2020090790A1 - Information processing equipment

Info

Publication number: JPWO2020090790A1
Application number: JP2020553924A
Authority: JP
Inventors: 彰田中; 翔七尾; 広樹石塚; 昇悟池田; 充弘小形; 誠村▲崎▼
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2018-10-30
Filing date: 2019-10-29
Publication date: 2021-12-23
Also published as: WO2020090790A1

Abstract

A user device includes: a first keyword generation unit that generates a first keyword based on a user's voice; a second keyword generation unit that generates a plurality of second keywords in one-to-one correspondence with a plurality of object images extracted from an image indicated by an image signal; a specifying unit that, based on the degree of relevance between each of the plurality of second keywords and the first keyword, specifies a target keyword to be commented on from among the plurality of second keywords generated by the second keyword generation unit; and a comment generation unit that generates a comment related to the target keyword.

Description

The present invention relates to an information processing device.

Patent Document 1 discloses a technique in which, when a user uses a pointing device to designate an object image within an image displayed on a display device, a recommendation concerning that object image is displayed. Patent Document 2 discloses a technique that responds to a user's voice by recommending information about an object image that the user has designated by voice.

Japanese Unexamined Patent Publication No. 2017-228177
Japanese Unexamined Patent Publication No. 2013-88906

However, the conventional techniques require the user to designate an object image. That is, when the user does not designate an object image, a comment such as a recommendation cannot be generated in response to an ambiguous remark by the user.

To solve the above problem, an information processing device according to a preferred aspect of the present invention includes: a first keyword generation unit that generates a first keyword based on a user's voice; a second keyword generation unit that generates a plurality of second keywords in one-to-one correspondence with a plurality of object images extracted from an image indicated by an image signal; a specifying unit that, based on the degree of relevance between each of the plurality of second keywords and the first keyword, specifies a target keyword to be commented on from among the plurality of second keywords; and a comment generation unit that generates a comment related to the target keyword.

According to the information processing device of the present invention, a comment can be generated in response to an ambiguous remark by the user, without the user designating the object image to be commented on.

A block diagram showing the overall configuration of the service system according to the first embodiment of the present invention.
A block diagram illustrating the hardware configuration of the user device used in the first embodiment.
An explanatory diagram showing the data structure of the keyword table used in the first embodiment.
A functional block diagram showing the functions of the user device used in the first embodiment.
An explanatory diagram showing an example of object images in the first embodiment.
A flowchart showing the operation of the user device used in the first embodiment.
A functional block diagram showing the functions of the user device used in the second embodiment.
A flowchart showing the operation of the user device used in the second embodiment.
An explanatory diagram for explaining the evaluation results of object images in the second embodiment.
A functional block diagram showing the functions of the user device used in the third embodiment.
A flowchart showing the operation of the user device used in the third embodiment.
A functional block diagram showing the functions of the user device used in the fourth embodiment.
An explanatory diagram showing the stored contents of the evaluation table used in the fourth embodiment.
A flowchart showing the operation of the user device used in the fourth embodiment.

[1. First Embodiment]
[1.1. Service system configuration]
FIG. 1 is a block diagram showing the overall configuration of a service system according to the first embodiment of the present invention. The service system 1 shown in FIG. 1 provides a video distribution service. The video distribution service provides, for example, movies or terrestrial digital broadcast content.

As illustrated in FIG. 1, the service system 1 includes user devices 20_1 to 20_m (m is an integer of 1 or more) managed by users U_1 to U_m, a network NW, and a video distribution server 10. In the following description, when elements of the same type are not distinguished from one another, only the common part of the reference sign is used, as in "user device 20" or "user U".

The user device 20 is an information processing device that processes various kinds of information. The user device 20 is, for example, a portable information processing device such as a smartphone or a tablet terminal. However, any information processing device may be adopted as the user device 20; for example, it may be a terminal-type information device such as a personal computer.

The user device 20 can receive the image signal Sg transmitted from the video distribution server 10 and display an image, or can transmit the image signal Sg to the television receiver 30 and cause the television receiver 30 to display the image.
The user U may speak while watching a video. For example, the user U may state impressions of the video or mutter something about it. In this case, although the remark of the user U relates to the video, the remark is often too ambiguous to uniquely identify which object in the video image it concerns. The user device 20 has a function of generating a comment, such as a recommendation, in response to such an ambiguous remark by the user U.

[1.2. User device configuration]
FIG. 2 is a block diagram illustrating the hardware configuration of the user device 20. The user device 20 is realized by a computer system including a processing device 21, a storage device 22, a communication device 23, an output device 24, an input device 25, a short-range wireless communication device 26, a GPS (Global Positioning System) device 27, and a bus 28. The processing device 21, the storage device 22, the communication device 23, the output device 24, the input device 25, the short-range wireless communication device 26, and the GPS device 27 are connected to one another by the bus 28 for communicating information. The bus 28 may be a single bus, or different buses may be used between different devices. Each element of the user device 20 is composed of one or more devices, and some elements of the user device 20 may be omitted.

The processing device 21 is a processor that controls the entire user device 20 and is composed of, for example, one or more chips. The processing device 21 is, for example, a central processing unit (CPU) that includes an interface with peripheral devices, an arithmetic unit, registers, and the like. Some or all of the functions of the processing device 21 may be realized by hardware such as a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The processing device 21 executes various processes in parallel or sequentially.

The storage device 22 is a recording medium readable by the processing device 21. It stores a plurality of programs including the control program PRa executed by the processing device 21, the keyword table TBLa, the comment table TBLb, and various data used by the processing device 21. The storage device 22 is composed of, for example, one or more types of storage circuits such as a ROM (Read Only Memory), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), and a RAM (Random Access Memory).

The keyword table TBLa stores a plurality of words, which are broadly divided into nouns and adjectives. The noun words correspond to keywords. The first keyword KW1 and the second keyword KW2, described later, are among the noun words stored in the keyword table TBLa. Each adjective word is stored in association with a noun word. Adjectives modify nouns, and the association between an adjective word and a noun word is determined according to this modification relationship. For example, the adjective "delicious" is associated with the noun "food and drink".

FIG. 3 is an explanatory diagram showing the data structure of the noun words stored in the keyword table TBLa. As shown in the figure, the noun words are organized in a tree structure in which the words are layered according to their meanings. In this example, the words are classified into a first layer through a fourth layer. The number of layers may be four or more.
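The layered tree of noun words can be pictured with a small parent-link table. The following is only a minimal sketch: the words, their placement in layers, and the helper names `PARENT`, `ancestors`, and `layer` are assumptions made for illustration and are not taken from FIG. 3.

```python
# Hypothetical parent links; the actual words of FIG. 3 are not reproduced here.
PARENT = {
    "food and drink": None,                  # first layer (assumed root)
    "alcoholic beverage": "food and drink",  # second layer
    "meal": "food and drink",                # second layer
    "wine": "alcoholic beverage",            # third layer
    "Japanese sake": "alcoholic beverage",   # third layer
    "Western food": "meal",                  # third layer
}


def ancestors(word):
    """Return the chain of broader words, from the parent up to the root."""
    chain = []
    parent = PARENT[word]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def layer(word):
    """Return the 1-based layer of a word (the root is layer 1)."""
    return len(ancestors(word)) + 1
```

Storing only the parent of each word keeps the table flat while still allowing the layer of any word to be recovered by walking upward.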

The keyword table TBLa also stores degrees of relevance, each indicating how strongly one noun word is related to another. The degree of relevance is determined in consideration of not only the relationship between superordinate and subordinate concepts but also the use and function of the object that a word denotes. For example, "Japanese sake" and "wine" are both subordinate concepts of "alcoholic beverage". In contrast, "ochoko" (a small sake cup) is not a subordinate concept of "alcoholic beverage", but an ochoko is used for drinking Japanese sake. For this reason, the degree of relevance between "Japanese sake" and "ochoko" is higher than that between "Japanese sake" and "wine".
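One way to hold such pairwise degrees of relevance is a table keyed by unordered word pairs. This is a sketch under assumptions: the numeric scale, the concrete scores, the symmetric lookup, and the names `RELEVANCE` and `relevance` are all hypothetical; the document states only that the sake cup ("ochoko") is more relevant to Japanese sake than wine is.

```python
# Hypothetical scores on a 0-to-1 scale; the document gives only their ordering.
RELEVANCE = {
    frozenset({"Japanese sake", "ochoko"}): 0.9,
    frozenset({"Japanese sake", "wine"}): 0.6,
}


def relevance(word_a, word_b, default=0.0):
    """Look up the degree of relevance of a noun pair, in either order."""
    return RELEVANCE.get(frozenset({word_a, word_b}), default)
```

Using a `frozenset` as the key means the pair can be queried in either order; whether the real table is symmetric is not stated in the document.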

Returning to FIG. 2, the communication device 23 is a device that communicates with other devices via a network NW such as a mobile communication network or the Internet. The communication device 23 is also referred to as, for example, a network device, a network controller, a network card, or a communication module. The communication device 23 can communicate with the video distribution server 10 via the network NW.

The output device 24 notifies the user U of various kinds of information under the control of the processing device 21. The output device 24 includes a display device 241 and a speaker 242. The display device 241 displays images. Various display panels, such as a liquid crystal display panel or an organic EL (Electro Luminescence) display panel, are suitably used as the display device 241.
Sound data is supplied to the speaker 242 from the processing device 21. The speaker 242 includes a DA converter. The DA converter converts the sound data into an analog signal, and the analog signal drives the speaker 242.

The input device 25 is a device with which the user U inputs information for using the user device 20. The input device 25 accepts input operations by the user U. The input device 25 in this example includes a microphone 251 and a touch panel 252. The touch panel 252 detects contact by the user U with the display surface of the display device 241. Based on the contact position, the touch panel 252 accepts operations for inputting symbols such as numbers and characters and operations for selecting an icon displayed by the display device 241. The microphone 251 converts the voice of the user U into an analog electric signal and outputs that electric signal as the audio signal Sa. The audio signal Sa is converted into a digital signal by an AD conversion unit (not shown) and supplied to the processing device 21 via the bus 28.

The short-range wireless communication device 26 is a device that communicates with other devices by short-range wireless communication. Examples of short-range wireless communication include Bluetooth (registered trademark), ZigBee (registered trademark), and Wi-Fi (registered trademark). The television receiver 30 is one example of such another device.
The GPS device 27 receives radio waves from a plurality of satellites and generates position information from the received radio waves. The position information indicates the position of the user device 20 and may be in any format as long as the position can be identified. The position information indicates, for example, the latitude and longitude of the user device 20. In the present embodiment the position information is obtained from the GPS device 27, but the user device 20 may acquire position information by any other method. For example, the user device 20 may acquire position information using the cell ID assigned to the base station with which the user device 20 communicates. Alternatively, when the user device 20 communicates with a wireless LAN (Local Area Network) access point using the short-range wireless communication device 26, the user device 20 may acquire position information by referring to a database that associates the network identification address (MAC (Media Access Control) address) assigned to the access point with an actual address (location). Alternatively, the user device 20 may use the short-range wireless communication device 26 to receive ID information contained in an advertisement packet conforming to the BLE (Bluetooth Low Energy) standard and acquire position information based on that ID information.

[1.3. Functions of user device 20]
FIG. 4 is a functional block diagram showing the functions of the user device 20. By reading the control program PRa from the storage device 22 and executing it, the processing device 21 functions as a first keyword generation unit 210, a second keyword generation unit 220A, a specifying unit 230A, and a comment generation unit 240.

The first keyword generation unit 210 generates the first keyword KW1 based on the voice of the user U indicated by the audio signal Sa.
Specifically, the first keyword generation unit 210 analyzes the voice of the user U and extracts nouns and adjectives from the analysis result. When the voice of the user U contains both a noun and an adjective, the first keyword generation unit 210 identifies the noun as the word of interest. For example, when the user U says "the red car is suspicious", the noun "car" is identified as the word of interest. When the voice of the user U contains an adjective but no noun, the first keyword generation unit 210 identifies the adjective as the word of interest. For example, when the user U says "that looks delicious", the adjective "delicious" is identified as the word of interest.

The first keyword generation unit 210 also determines whether the word of interest is included in the keyword table TBLa. When the determination result is negative, the first keyword generation unit 210 does not generate the first keyword KW1. The first keyword KW1 is therefore limited to keywords included in the keyword table TBLa. When the determination result is affirmative and the word of interest is a noun, the first keyword generation unit 210 generates the word of interest as the first keyword KW1. When the determination result is affirmative and the word of interest is an adjective, the first keyword generation unit 210 refers to the keyword table TBLa and generates the noun word associated with the word of interest as the first keyword KW1. For example, when the word of interest is "delicious", the first keyword generation unit 210 generates "food and drink" as the first keyword KW1.
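The selection rule described above (prefer a noun, otherwise map an adjective to its associated noun, and produce nothing for words outside the table) can be sketched as follows. Speech recognition and part-of-speech tagging are assumed to have already produced `(word, part_of_speech)` pairs; the table contents, the choice of the first noun when several are present, and the name `first_keyword` are illustrative assumptions.

```python
# Hypothetical excerpts of the keyword table TBLa.
NOUNS = {"car", "food and drink", "wine"}
ADJECTIVE_TO_NOUN = {"delicious": "food and drink"}


def first_keyword(tokens):
    """Return the first keyword for tagged tokens, or None if none can be generated."""
    nouns = [word for word, pos in tokens if pos == "noun"]
    adjectives = [word for word, pos in tokens if pos == "adjective"]
    if nouns:
        # A noun takes priority as the word of interest (assumption: the first one).
        word = nouns[0]
        return word if word in NOUNS else None
    if adjectives:
        # No noun was spoken: map the adjective to its associated noun, if any.
        return ADJECTIVE_TO_NOUN.get(adjectives[0])
    return None
```

For the two utterances used as examples in the text, this sketch yields "car" for "the red car is suspicious" and "food and drink" for "that looks delicious".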

In this way, the first keyword generation unit 210 generates a first keyword KW1 related to the remark of the user U even when that remark is ambiguous.

Next, the second keyword generation unit 220A generates a second keyword KW2 for each of the object images extracted from the image indicated by the image signal Sg. The second keyword generation unit 220A has an extraction unit 221 and a conversion unit 222.

The extraction unit 221 extracts a plurality of object images from the image indicated by the image signal Sg. A single screen image contains many object images.

When the image indicated by the image signal Sg is the image shown in FIG. 5, the extraction unit 221 extracts, for example, object images OB1 to OB5.

The conversion unit 222 converts each of the object images OB1 to OB5 extracted by the extraction unit 221 into a second keyword KW2. The conversion unit 222 performs this conversion using, for example, an image recognition model trained by machine learning. Each second keyword KW2 is, however, one of the keywords stored in the keyword table TBLa. For example, in FIG. 5, object image OB1 is converted into "wine", object image OB2 into "wine glass", object image OB3 into "clock", object image OB4 into "candle", and object image OB5 into "Western food".
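The document does not say how the conversion result is kept within the keyword table TBLa. One possible approach, shown here purely as an assumption, is to take the label scores that a recognizer produces for one object image and keep the best-scoring label that appears in the table. The function name `to_second_keyword`, the score dictionary, and the vocabulary are hypothetical.

```python
# Hypothetical noun vocabulary of the keyword table TBLa.
VOCABULARY = {"wine", "wine glass", "clock", "candle", "Western food"}


def to_second_keyword(label_scores, vocabulary=VOCABULARY):
    """Pick the highest-scoring recognizer label that the keyword table contains."""
    allowed = {label: score for label, score in label_scores.items() if label in vocabulary}
    if not allowed:
        # No label of this object image is in the table, so no keyword is produced.
        return None
    return max(allowed, key=allowed.get)
```

With this filter, a recognizer output such as `{"bottle": 0.7, "wine": 0.6}` would yield "wine", because "bottle" is not in the assumed vocabulary.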

Based on the degree of relevance, which indicates how strongly each second keyword KW2 is related to the first keyword KW1, the specifying unit 230A specifies the target keyword Wx from among the plurality of second keywords KW2 generated by the second keyword generation unit 220A. More specifically, the specifying unit 230A refers to the keyword table TBLa and acquires the degree of relevance for each pair of a second keyword KW2 and the first keyword KW1. The specifying unit 230A then specifies, as the target keyword Wx, the second keyword KW2 in the pair with the highest degree of relevance. As described later, the comment generation unit 240 generates a comment related to the target keyword Wx specified by the specifying unit 230A.

For example, suppose that the user U looks at the image shown in FIG. 5 and says "that looks delicious". Suppose also that "food and drink" is generated as the first keyword KW1 and that "wine", "wine glass", "clock", "candle", and "Western food" are generated as the second keywords KW2. In this case, the specifying unit 230A refers to the keyword table TBLa to acquire the degree of relevance between "food and drink" and each of "wine", "wine glass", "clock", "candle", and "Western food". The specifying unit 230A compares the acquired degrees of relevance and specifies the second keyword KW2 with the highest degree of relevance as the target keyword Wx.
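The comparison in this example amounts to taking the maximum over the acquired degrees of relevance. The sketch below assumes made-up scores, so which keyword wins here is a consequence of those numbers and not a result stated in the document; the names `RELEVANCE` and `specify_target` are likewise hypothetical.

```python
# Hypothetical degrees of relevance, keyed by (first keyword, second keyword).
RELEVANCE = {
    ("food and drink", "wine"): 0.8,
    ("food and drink", "wine glass"): 0.5,
    ("food and drink", "clock"): 0.0,
    ("food and drink", "candle"): 0.1,
    ("food and drink", "Western food"): 0.9,
}


def specify_target(first_keyword, second_keywords):
    """Return the second keyword most relevant to the first keyword."""
    return max(
        second_keywords,
        key=lambda keyword: RELEVANCE.get((first_keyword, keyword), 0.0),
    )


target = specify_target(
    "food and drink",
    ["wine", "wine glass", "clock", "candle", "Western food"],
)
```

Note that Python's `max` returns the first maximal element, so ties are resolved by the order of the second keywords; the document does not specify a tie-breaking rule.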

The comment generation unit 240 generates a comment related to the target keyword Wx. A comment here means an explanation or exposition of the target keyword Wx, and the concept also covers recommendations. A comment may therefore include information about a product that the user U is encouraged to purchase in connection with the target keyword Wx and about stores that handle that product. The comment generation unit 240 generates a comment by reading, from the comment table TBLb, the comment stored in association with the target keyword Wx. The comment generation unit 240 may also access a search site connected to the network NW, acquire information related to the target keyword Wx from that site, and generate the acquired information as the comment. For example, when the target keyword Wx is "ramen", the comment generation unit 240 may search for ramen shops near the position indicated by the position information generated by the GPS device 27 or the like and output the search results as the comment.
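A comment lookup of this kind can be sketched as a table read with an optional search step. The comment text, the `search` callable, and the choice to use the search only when the table has no entry are assumptions for illustration; the document says only that either source may be used.

```python
# Hypothetical contents of the comment table TBLb.
COMMENT_TABLE = {
    "wine": "This wine pairs well with cheese.",
}


def generate_comment(target_keyword, search=None):
    """Read the stored comment; optionally fall back to a search callable."""
    comment = COMMENT_TABLE.get(target_keyword)
    if comment is None and search is not None:
        # Assumption: the search site is consulted only when the table has no entry.
        comment = search(target_keyword)
    return comment


def fake_search(keyword):
    """Stand-in for a search-site query; no real service is contacted."""
    return f"Search results for {keyword} near the current position"
```

Passing the search step in as a callable keeps the sketch runnable offline and leaves the actual search site unspecified, as it is in the document.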

[1.4. Operation of user device 20]
Next, the operation of the user device 20 will be described. FIG. 6 is a flowchart showing the operation of the user device 20.

First, the processing device 21 identifies the word of interest based on the voice of the user U (step S1). The processing device 21 extracts the word of interest by executing a speech recognition process that converts the voice of the user U into text and an identification process that identifies nouns and adjectives in the converted text. The word of interest is a noun or, when no noun is identified, an adjective.

Next, the processing device 21 determines whether the word of interest is included in the keyword table TBLa (step S2). When the word of interest is not included in the keyword table TBLa, the processing device 21 returns the process to step S1 and repeats steps S1 and S2 until a word of interest included in the keyword table TBLa is identified (that is, until the determination result in step S2 becomes affirmative).

When the determination result in step S2 is affirmative, the processing device 21 determines whether the word of interest is a noun (step S3). When the word of interest is a noun, the processing device 21 generates the word of interest as the first keyword KW1 (step S4). Because the processing device 21 extracts either a noun or an adjective as the word of interest in step S1, a negative determination result in step S3 means that the word of interest is an adjective. In this case, the processing device 21 refers to the keyword table TBLa and generates the noun word associated with the word of interest as the first keyword KW1 (step S5).

Next, the processing device 21 extracts object images from the image indicated by the image signal Sg (step S6). A single frame image usually contains a plurality of object images, so the processing device 21 extracts a plurality of object images in step S6. The processing device 21 then converts each of the extracted object images into a second keyword KW2 (step S7).

Next, based on the degree of relevance between each second keyword KW2 and the first keyword KW1, the processing device 21 specifies the target keyword Wx from among the plurality of second keywords KW2 generated in step S7 (step S8).

Next, the processing device 21 generates a comment related to the target keyword Wx (step S9). In step S9, the processing device 21 generates the comment by reading, from the comment table TBLb, the comment stored in association with the target keyword Wx. The processing device 21 outputs the generated comment by any of the following methods. (a) The processing device 21 causes the display device 241 to display the video represented by video data on which an image of the generated comment is superimposed. (b) The processing device 21 uses the short-range wireless communication device 26 to transmit, to the television receiver 30, video data on which an image of the generated comment is superimposed. (c) The processing device 21 converts the generated comment into sound data, mixes the sound data representing the comment with the sound data of the video, and causes the speaker 242 to emit the mixed result. (d) The processing device 21 uses the short-range wireless communication device 26 to transmit, to the television receiver 30, the result of mixing the sound data representing the comment with the sound data of the video. Methods (a) to (d) may be combined in any way.

The processing device 21 functions as the first keyword generation unit 210 in steps S1 to S5, as the extraction unit 221 in step S6, and as the conversion unit 222 in step S7. The processing device 21 further functions as the specifying unit 230A in step S8 and as the comment generation unit 240 in step S9.

As described above, the information processing device, which is an example of the user device 20, includes: a first keyword generation unit 210 that generates a first keyword KW1 based on the voice of the user U; a second keyword generation unit 220A that generates a second keyword KW2 for each of a plurality of object images extracted from the image indicated by the image signal Sg; a specifying unit 230A that specifies, from among the plurality of second keywords KW2 generated by the second keyword generation unit 220A, a target keyword Wx to be the subject of a comment, based on the degree of relevance between each second keyword KW2 and the first keyword KW1; and a comment generation unit 240 that generates a comment related to the target keyword Wx.

According to this aspect, a comment can be generated in response to an ambiguous remark by the user U, without the user U having to designate the object image that is to be the subject of the comment.

The degree of relevance of a second keyword KW2 that matches the first keyword KW1 is higher than that of a second keyword KW2 that does not match the first keyword KW1. Therefore, when any of the plurality of second keywords KW2 matches the first keyword KW1, the specifying unit 230A specifies the matching second keyword KW2 as the target keyword Wx. In this case, the specifying unit 230A can determine whether each of the plurality of second keywords KW2 matches the first keyword KW1 and, if the determination is affirmative for any of them, specify the second keyword KW2 that matches the first keyword KW1 as the target keyword Wx. This makes it unnecessary to refer to the keyword table TBLa to acquire the degree of relevance, so the processing load can be reduced.
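The exact-match shortcut described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary standing in for the keyword table TBLa, the function name `specify_target_keyword`, and the fallback rule (pick the second keyword with the highest looked-up relevance, defaulting to 0.0 when no entry exists) are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for keyword table TBLa:
# (first keyword, second keyword) -> degree of relevance.
RelevanceTable = Dict[Tuple[str, str], float]


def specify_target_keyword(
    kw1: str, kw2_list: List[str], table: RelevanceTable
) -> Optional[str]:
    """Return the target keyword Wx, or None if there are no candidates."""
    # Fast path: an exact match is treated as the most relevant keyword,
    # so the table lookup is skipped entirely.
    for kw2 in kw2_list:
        if kw2 == kw1:
            return kw2

    # Assumed fallback: look up each relevance and keep the highest one.
    best: Optional[str] = None
    best_score = float("-inf")
    for kw2 in kw2_list:
        score = table.get((kw1, kw2), 0.0)
        if score > best_score:
            best, best_score = kw2, score
    return best
```

With an empty table, `specify_target_keyword("wine", ["car", "wine"], {})` still returns `"wine"`, which shows that the matching case never touches the table.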

[2. Second Embodiment]
The service system 1 of the second embodiment is the same as the service system 1 of the first embodiment except for the functions of the processing device 21 in the user device 20. FIG. 7 is a functional block diagram showing the functions of the processing device 21 of the second embodiment. The processing device 21 of the second embodiment differs from the processing device 21 of the first embodiment in that it includes a second keyword generation unit 220B in place of the second keyword generation unit 220A.

As shown in FIG. 7, the second keyword generation unit 220B includes an extraction unit 221, a conversion unit 222, and an analysis unit 223. The image signal Sg is supplied to the analysis unit 223. The analysis unit 223 analyzes the image signal Sg of the moving image and outputs the analysis result to the extraction unit 221.

For example, the analysis unit 223 evaluates each of a plurality of object images included in a given frame of the image signal Sg using the first to fourth evaluation items, and outputs the total of the evaluation values to the extraction unit 221 as the analysis result. In this case, the extraction unit 221 extracts the object images whose total evaluation value exceeds a predetermined value.

The first evaluation item is the ratio of the area of the object image to the area of one screen; the larger the ratio, the higher the evaluation value of the object image. The second evaluation item is the perspective of the object image as seen by the user U; the closer the object image is to the foreground, the higher its evaluation value. The third evaluation item is the brightness of the object image; the brighter the object image, the higher its evaluation value. The fourth evaluation item is the position of the object image; the closer the object image is to the center of the screen, the higher its evaluation value. Each of the first to fourth evaluation items is a factor that attracts the interest of the user U within a one-screen image. By evaluating object images using a plurality of evaluation items, object images in which the user U is highly interested can be extracted.

Next, the operation of the user device 20 in the second embodiment will be described. FIG. 8 is a flowchart showing the operation of the user device 20 according to the second embodiment. The flowchart shown in FIG. 8 is the same as the flowchart of the first embodiment shown in FIG. 6, except that steps S6_1 and S6_2 are executed instead of step S6. The differences are described below.

In step S6_1, the processing device 21 functions as the analysis unit 223, acquires a plurality of evaluation values by evaluating each of the plurality of object images included in a given frame for each evaluation item, and calculates the total of these evaluation values. For example, the analysis results for the object images OB1 to OB5 shown in FIG. 5 are as shown in FIG. 9. In this example, the total evaluation value for each of the object images OB1 to OB5 is in the range of "11" to "16".

In step S6_2, the processing device 21 functions as the extraction unit 221 and extracts the object images whose total evaluation value exceeds a predetermined value. For example, assume that the predetermined value is "13" and that the total evaluation values shown in FIG. 9 are obtained for the object images. In this case, the processing device 21 extracts the object images OB2 and OB5. The processing from step S7 onward is the same as the processing described in the first embodiment with reference to FIG. 6, so its description is omitted.
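Steps S6_1 and S6_2 can be sketched as a sum-and-threshold filter. FIG. 9 is not reproduced in this text, so the per-item values below are hypothetical; they were chosen only so that the totals fall in the stated range of 11 to 16 and so that OB2 and OB5 are the only object images above the predetermined value 13. The class and function names are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ItemScores:
    """Evaluation values of one object image for evaluation items 1 to 4."""
    area_ratio: int   # item 1: share of the screen area
    nearness: int     # item 2: how close to the foreground
    brightness: int   # item 3: how bright
    centrality: int   # item 4: how close to the screen center

    def total(self) -> int:
        # Step S6_1: total of the per-item evaluation values.
        return self.area_ratio + self.nearness + self.brightness + self.centrality


def extract_objects(scores: Dict[str, ItemScores], threshold: int) -> List[str]:
    # Step S6_2: keep object images whose total strictly exceeds the threshold.
    return [name for name, s in scores.items() if s.total() > threshold]


# Hypothetical values (FIG. 9 is not available here).
HYPOTHETICAL_SCORES = {
    "OB1": ItemScores(3, 3, 3, 2),  # total 11
    "OB2": ItemScores(4, 4, 4, 4),  # total 16
    "OB3": ItemScores(3, 3, 3, 3),  # total 12
    "OB4": ItemScores(4, 3, 3, 3),  # total 13, does not exceed 13
    "OB5": ItemScores(4, 4, 3, 3),  # total 14
}

extracted = extract_objects(HYPOTHETICAL_SCORES, threshold=13)
```

Because the comparison is strict, an object image whose total equals the predetermined value (OB4 in these hypothetical values) is not extracted.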

As described above, according to the second embodiment, the second keyword generation unit 220B includes an analysis unit 223 that analyzes the image signal Sg, an extraction unit 221 that extracts a plurality of object images from the image indicated by the image signal Sg based on the analysis result of the analysis unit 223, and a conversion unit 222 that converts each of the plurality of object images into a second keyword KW2.

According to this aspect, the plurality of object images are extracted based on the analysis result of the image signal Sg, so the number of extracted object images can be reduced compared with the case where object images are extracted without using the analysis result. The processing load of the conversion unit 222 can therefore be reduced.

The analysis unit 223 may also analyze the image signal Sg over a plurality of frames to generate the analysis result. In this case, in addition to the first to fourth evaluation items, the analysis unit 223 may adopt a fifth evaluation item relating to the movement of the object image. One example of the fifth evaluation item is the number of frames corresponding to the length of time for which a moving object image is present on the screen; under this evaluation item, the larger the number of frames (that is, the longer the object image is present on the screen), the higher the evaluation value of the object image. For example, when the main character of a movie moves, the movie is often shot so as to follow that movement. Therefore, when the moving image represented by the image signal Sg is a movie, the evaluation value of the object image representing the main character and the evaluation value of the object image representing the main character's belongings become high. As a result, the extraction unit 221 is more likely to extract the object images on which the user U focuses and, conversely, less likely to extract the object images on which the user U does not focus.
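The frame count behind the fifth evaluation item can be sketched as below. The sketch assumes that some upstream detector has already produced, for each frame, the identifiers of the moving object images present in it; that input format, the function name, and the sample identifiers are hypothetical. How a frame count is mapped to an evaluation value is not specified in the text, so the sketch stops at the count itself.

```python
from collections import Counter
from typing import Iterable, Sequence


def presence_frame_counts(frames: Sequence[Iterable[str]]) -> Counter:
    """Count, per object image, the number of frames in which it appears."""
    counts: Counter = Counter()
    for objects_in_frame in frames:
        # set() ensures an object is counted at most once per frame.
        counts.update(set(objects_in_frame))
    return counts


# Hypothetical detections for three consecutive frames.
frames = [["hero", "car"], ["hero"], ["hero", "tree"]]
counts = presence_frame_counts(frames)
```

Here `counts["hero"]` is 3 while `counts["car"]` is 1, so an evaluation value that grows with the frame count would favor the object image that stays on screen.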

[3. Third Embodiment]
The service system 1 of the third embodiment is the same as the service system 1 of the first embodiment except for the functions of the processing device 21 and the stored contents of the storage device 22 in the user device 20. FIG. 10 is a functional block diagram showing the functions of the processing device 21 of the third embodiment. The processing device 21 of the third embodiment differs from the processing device 21 of the first embodiment in that it includes a specifying unit 230B in place of the specifying unit 230A.

The storage device 22 of the user device 20 of the third embodiment stores an action history table TBLc. The action history table TBLc stores the action history of the user U. The action history includes the Internet search history of the user U, the purchase history of products and services, SNS (Social Networking Service) activity, Web browser bookmarks, and the like.

The specifying unit 230B specifies the target keyword Wx from among the plurality of second keywords KW2 generated by the second keyword generation unit 220A, based on the degree of relevance between each second keyword KW2 and the first keyword KW1 and on the action history of the user U.

First, the specifying unit 230B selects, from among the plurality of second keywords KW2 generated by the second keyword generation unit 220A, the second keywords KW2 whose degree of relevance to the first keyword KW1 is equal to or greater than a predetermined value. The selected second keywords KW2 are candidates for the target keyword Wx. Next, the specifying unit 230B refers to the action history stored in the action history table TBLc and specifies the target keyword Wx from among the selected second keywords KW2. For example, suppose that the second keywords KW2 selected based on the degree of relevance are "wine" and "Western food", and that a wine purchase is recorded in the action history table TBLc. In this case, when the specifying unit 230B refers to the action history table TBLc and detects that the user U has a purchase history for wine, it specifies "wine", out of "wine" and "Western food", as the target keyword Wx. As a result, the comment generation unit 240 can generate a comment about "wine".

Next, the operation of the user device 20 in the third embodiment will be described. FIG. 11 is a flowchart showing the operation of the user device 20 according to the third embodiment. The flowchart shown in FIG. 11 is the same as the flowchart of the first embodiment shown in FIG. 6, except that steps S8_1 and S8_2 are executed instead of step S8. The differences are described below.

In step S8_1, the processing device 21 functions as the specifying unit 230B and selects, from among the plurality of second keywords KW2 generated in step S7, the second keywords KW2 whose degree of relevance to the first keyword KW1 is equal to or greater than a predetermined value.

In step S8_2, the processing device 21 functions as the specifying unit 230B and, based on the action history, specifies as the target keyword Wx the second keyword KW2 that is related to the action history from among the second keywords KW2 selected in step S8_1.
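Steps S8_1 and S8_2 can be sketched as a threshold filter followed by a history check. The sketch reduces the action history table TBLc to a plain set of keywords, which is an assumption; the relevance values and threshold are hypothetical. The text does not say what happens when no candidate, or more than one candidate, is related to the action history, so the choice here (first match in candidate order, otherwise `None`) is also an assumption.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical stand-in for keyword table TBLa.
RelevanceTable = Dict[Tuple[str, str], float]


def select_candidates(
    kw1: str, kw2_list: List[str], table: RelevanceTable, threshold: float
) -> List[str]:
    # Step S8_1: keep second keywords whose relevance is at or above the threshold.
    return [kw2 for kw2 in kw2_list if table.get((kw1, kw2), 0.0) >= threshold]


def specify_by_history(
    candidates: List[str], history_keywords: Set[str]
) -> Optional[str]:
    # Step S8_2: pick a candidate that is related to the action history.
    for kw2 in candidates:
        if kw2 in history_keywords:
            return kw2
    return None  # assumed behavior when nothing in the history matches


# Hypothetical relevance values; the wine purchase follows the example in the text.
table = {
    ("drink", "wine"): 0.7,
    ("drink", "Western food"): 0.6,
    ("drink", "car"): 0.1,
}
candidates = select_candidates("drink", ["wine", "Western food", "car"], table, 0.5)
target = specify_by_history(candidates, history_keywords={"wine"})
```

With these values the candidates are "wine" and "Western food", and the recorded wine purchase resolves the choice to "wine".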

According to the third embodiment, the specifying unit 230B specifies the target keyword Wx from among the plurality of second keywords KW2 based on the degree of relevance and the action history of the user U. Because the target keyword Wx is specified in consideration of the action history of the user U, a comment of greater interest to the user U can be provided than when the action history is not considered.

In the operation of the user device 20 described with reference to FIG. 11, the specifying unit 230B first uses the degree of relevance to select, from among the plurality of second keywords KW2 generated by the second keyword generation unit 220A, the second keywords KW2 that are candidates for the target keyword Wx (step S8_1), and then specifies the target keyword Wx based on the action history (step S8_2). This order may be reversed. That is, the specifying unit 230B may first select the candidate second keywords KW2 based on the action history and then specify the target keyword Wx using the degree of relevance. Alternatively, the specifying unit 230B may use the action history and the degree of relevance at the same time to specify the target keyword Wx from among the plurality of second keywords KW2. For example, the specifying unit 230B may add a predetermined value to the degree of relevance of each second keyword KW2 that is related to the action history, and specify the target keyword Wx by comparing the resulting degrees of relevance among the plurality of second keywords KW2.
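The simultaneous use of action history and relevance can be sketched as a single scoring pass. As before, the dictionary standing in for the keyword table TBLa, the keyword set standing in for the action history, the numeric values, and the function name are assumptions for illustration; the text only states that a predetermined value is added to the relevance of history-related keywords before comparison.

```python
from typing import Dict, List, Optional, Set, Tuple

RelevanceTable = Dict[Tuple[str, str], float]


def specify_with_history_bonus(
    kw1: str,
    kw2_list: List[str],
    table: RelevanceTable,
    history_keywords: Set[str],
    bonus: float,
) -> Optional[str]:
    """Add `bonus` to the relevance of history-related keywords, then take the max."""
    def adjusted(kw2: str) -> float:
        base = table.get((kw1, kw2), 0.0)
        return base + bonus if kw2 in history_keywords else base

    return max(kw2_list, key=adjusted) if kw2_list else None


# Hypothetical values: "Western food" is slightly more relevant on its own.
table = {("drink", "wine"): 0.5, ("drink", "Western food"): 0.6}
keywords = ["wine", "Western food"]

with_bonus = specify_with_history_bonus("drink", keywords, table, {"wine"}, 0.2)
without_bonus = specify_with_history_bonus("drink", keywords, table, set(), 0.2)
```

In this example the bonus lifts "wine" above "Western food", whereas without any history the more relevant "Western food" wins. Unlike the two-step flow, no candidate is discarded outright, so a highly relevant keyword can still be chosen even if it is absent from the history.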

[4. Fourth Embodiment]
The service system 1 of the fourth embodiment is the same as the service system 1 of the first embodiment except for the functions of the processing device 21 and the stored contents of the storage device 22 in the user device 20. FIG. 12 is a functional block diagram showing the functions of the processing device 21 of the fourth embodiment. The processing device 21 of the fourth embodiment differs from the processing device 21 of the first embodiment in that it includes a specifying unit 230C in place of the specifying unit 230A.

The storage device 22 of the user device 20 of the fourth embodiment stores profile data DP and an evaluation table TBLd. The profile data DP indicates the profile of the user U. The profile means the attributes of the user U and includes items such as age and gender.

The evaluation table TBLd stores an evaluation value for each profile item in association with a keyword. The evaluation value indicates the degree of interest of the user U in the keyword. FIG. 13 shows an example of the stored contents of the evaluation table TBLd. For example, for the keyword "car", the evaluation value for the gender "male" is "7", whereas the evaluation value for the gender "female" is "4". This indicates that men are more interested in cars than women are.

The specifying unit 230C specifies the target keyword Wx from among the plurality of second keywords KW2 generated by the second keyword generation unit 220A, based on the degree of relevance between each second keyword KW2 and the first keyword KW1 and on the profile of the user U.

First, the specifying unit 230C selects, from among the plurality of second keywords KW2 generated by the second keyword generation unit 220A, the second keywords KW2 whose degree of relevance to the first keyword KW1 is equal to or greater than a predetermined value. Next, the specifying unit 230C uses the profile data DP and the evaluation table TBLd to specify the target keyword Wx from among the selected second keywords KW2. Specifically, for each of the selected second keywords KW2, the specifying unit 230C calculates a total evaluation value by summing the evaluation values corresponding to the respective items of the profile of the user U, and specifies the second keyword KW2 with the highest total evaluation value as the target keyword Wx.

Next, the operation of the user device 20 in the fourth embodiment will be described. FIG. 14 is a flowchart showing the operation of the user device 20 according to the fourth embodiment. The flowchart shown in FIG. 14 is the same as the flowchart of the third embodiment shown in FIG. 11, except that step S8_3 is executed instead of step S8_2. The differences are described below.

In step S8_3, the processing device 21 functions as the specifying unit 230C and, based on the profile, specifies as the target keyword Wx the second keyword KW2 with the highest total evaluation value for the profile of the user U from among the second keywords KW2 selected in step S8_1.
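Step S8_3 can be sketched as a lookup-and-sum over the profile items. The nested dictionary standing in for the evaluation table TBLd and the function names are assumptions. Only the "car" gender values (7 for male, 4 for female) come from the text; every other number, the age bands, and the "cosmetics" keyword are hypothetical, since FIG. 13 is not reproduced here.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for evaluation table TBLd:
# keyword -> profile item -> item value -> evaluation value.
EvaluationTable = Dict[str, Dict[str, Dict[str, int]]]


def total_evaluation(
    keyword: str, profile: Dict[str, str], table: EvaluationTable
) -> int:
    """Sum the evaluation values that match each item of the user's profile."""
    items = table.get(keyword, {})
    return sum(items.get(item, {}).get(value, 0) for item, value in profile.items())


def specify_by_profile(
    candidates: List[str], profile: Dict[str, str], table: EvaluationTable
) -> Optional[str]:
    # Step S8_3: the candidate with the highest total evaluation value wins.
    if not candidates:
        return None
    return max(candidates, key=lambda kw: total_evaluation(kw, profile, table))


tbl_d: EvaluationTable = {
    "car": {
        "gender": {"male": 7, "female": 4},  # values given in the text
        "age": {"20s": 5, "50s": 6},         # hypothetical
    },
    "cosmetics": {                           # hypothetical keyword and values
        "gender": {"male": 3, "female": 8},
        "age": {"20s": 7, "50s": 4},
    },
}
profile_dp = {"gender": "male", "age": "20s"}  # hypothetical profile data DP
target = specify_by_profile(["car", "cosmetics"], profile_dp, tbl_d)
```

For this hypothetical profile the totals are 12 for "car" and 10 for "cosmetics", so "car" becomes the target keyword. Missing items simply contribute 0, which is one possible policy and not something the text prescribes.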

According to the fourth embodiment, the specifying unit 230C specifies the target keyword Wx from among the plurality of second keywords KW2 based on the degree of relevance and the profile of the user U. Because the target keyword Wx is specified in consideration of the profile of the user U, a comment of greater interest to the user U can be provided than when the profile is not considered.

In the operation of the user device 20 described with reference to FIG. 14, the specifying unit 230C first uses the degree of relevance to select, from among the plurality of second keywords KW2 generated by the second keyword generation unit 220A, the second keywords KW2 that are candidates for the target keyword Wx (step S8_1), and then specifies the target keyword Wx based on the profile (step S8_3). This order may be reversed. That is, the specifying unit 230C may first select the candidate second keywords KW2 based on the profile and then specify the target keyword Wx using the degree of relevance. Alternatively, the specifying unit 230C may use the profile and the degree of relevance at the same time to specify the target keyword Wx from among the plurality of second keywords KW2. For example, the specifying unit 230C may add the profile-based total evaluation value to the degree of relevance and specify the target keyword Wx by comparing the resulting sums for the plurality of second keywords KW2.

[5. Modifications]
The present invention is not limited to the embodiments exemplified above. Specific modifications are exemplified below. Two or more aspects arbitrarily selected from the following examples may be combined.

(1) In each of the above-described embodiments, the frame from which the extraction unit 221 extracts object images from the image indicated by the image signal Sg may be any of the following.
First, the extraction unit 221 may extract object images in a frame with a high audience rating. In this case, the extraction unit 221 may acquire the audience rating from an external device in real time. Specifically, the extraction unit 221 extracts object images in a frame for which the acquired audience rating exceeds a predetermined audience rating. A frame with a high audience rating is presumed to attract more of the user U's interest than other frames. Because object images are thus extracted from the image of a frame in which the user U is highly interested, a comment useful to the user U can be generated.
Second, the extraction unit 221 may extract object images in a frame at which the user U cheered, as determined from the voice signal Sa of the user U.
Third, the extraction unit 221 may extract object images in a frame that represents the main subject of the program, based on program information. For example, the extraction unit 221 may use the analysis unit 223 described in the second embodiment to analyze the image signal Sg and identify the frame that represents the main subject of the program. In this case, the analysis unit 223 may acquire the program information from an external device via the network NW.
Further, although the image signal Sg has been described in each of the above-described embodiments as a signal indicating a moving image, the image signal Sg may be a signal indicating a still image.

(2) In each of the above-described embodiments, the degree of relevance between the second keyword KW2 and the first keyword KW1 is stored in the keyword table TBLa, but the invention is not limited to this.
For example, the specifying units 230A, 230B, and 230C may acquire a degree of relevance that depends on a number of nodes determined from a keyword table TBLa (an example of keyword data) having a tree structure in which a plurality of words are arranged hierarchically by meaning. Specifically, the first keyword generation unit 210 generates a word included in the keyword table TBLa as the first keyword KW1. Likewise, the second keyword generation units 220A and 220B generate words included in the keyword table TBLa as the second keywords KW2. The specifying units 230A, 230B, and 230C acquire, as the degree of relevance, the number of nodes on the path from the first keyword KW1 to the second keyword KW2 in the tree structure of the keyword table TBLa.
More specifically, assume that the data structure of the keyword table TBLa is the tree structure shown in FIG. 3. For example, when the first keyword KW1 is "liquor" and the second keyword KW2 is "french fries", the path from "liquor" to "french fries" is node "liquor" → node "drink" → node "food and drink" → node "food" → node "Western food" → node "french fries". The number of nodes on the path from the first keyword KW1 "liquor" to the second keyword KW2 "french fries", not counting the starting node, is therefore "5". When the first keyword KW1 is "food and drink" and the second keyword KW2 is "french fries", the path from "food and drink" to "french fries" is node "food and drink" → node "food" → node "Western food" → node "french fries". The number of nodes on the path from the first keyword KW1 "food and drink" to the second keyword KW2 "french fries", counted in the same way, is therefore "3". The smaller the number of nodes on the path connecting the first keyword KW1 and the second keyword KW2, the higher the degree of relevance. In the above example, therefore, the degree of relevance between the first keyword KW1 "food and drink" and the second keyword KW2 "french fries" is higher than the degree of relevance between the first keyword KW1 "liquor" and the second keyword KW2 "french fries".
By determining the degree of relevance from the number of nodes, the storage capacity required for the keyword table TBLa in the user device 20 can be reduced.
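The node count in the worked example can be reproduced with a small sketch. The tree is encoded as a child-to-parent map, which is an assumed representation; only the nodes named in the text are included, since the full tree of FIG. 3 is not reproduced here. The function names are hypothetical. The count follows the convention of the worked example, in which the starting node is not counted (equivalently, the number of edges on the path).

```python
from typing import Dict, List

# Child -> parent map containing only the nodes named in the text.
PARENT: Dict[str, str] = {
    "drink": "food and drink",
    "food": "food and drink",
    "liquor": "drink",
    "Western food": "food",
    "french fries": "Western food",
}


def ancestors(node: str, parent: Dict[str, str]) -> List[str]:
    """Return the chain from `node` up to the root, starting with `node` itself."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def node_count(kw1: str, kw2: str, parent: Dict[str, str]) -> int:
    """Number of nodes on the path from kw1 to kw2, excluding kw1."""
    steps_up_from_kw1 = {n: i for i, n in enumerate(ancestors(kw1, parent))}
    for steps_up_from_kw2, n in enumerate(ancestors(kw2, parent)):
        if n in steps_up_from_kw1:
            # n is the lowest common ancestor of kw1 and kw2.
            return steps_up_from_kw1[n] + steps_up_from_kw2
    raise ValueError("keywords are not in the same tree")


liquor_to_fries = node_count("liquor", "french fries", PARENT)
root_to_fries = node_count("food and drink", "french fries", PARENT)
```

The two calls give 5 and 3, matching the counts in the text. Because a smaller count means a higher degree of relevance, a caller comparing candidates would pick the second keyword with the minimum `node_count`.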

(3) In each of the above-described embodiments, the extraction unit 221 extracts object images without considering the action history of the user U, but it may extract object images from the image indicated by the image signal Sg based on the action history. In this case, the extraction unit 221 may refer to the action history table TBLc described in the third embodiment, identify, for example, a color that the user U likes from the product purchase history or the like, and extract object images of the identified color. According to this modification, the object images can be narrowed down, so the processing load of the conversion unit 222 can be reduced.

(4) The above-described embodiments may be combined as appropriate. For example, the second keyword generation unit 220B of the second embodiment may be used in place of the second keyword generation unit 220A in the third and fourth embodiments.

(5) The block diagrams used in the description of each of the above-described embodiments show blocks of functional units. These functional blocks (components) are realized by any combination of hardware and/or software. The means for realizing each functional block is not particularly limited. That is, each functional block may be realized by one physically and/or logically coupled device, or by two or more physically and/or logically separated devices that are connected directly and/or indirectly (for example, by wire and/or wirelessly). For example, the function of the conversion unit 222 may be provided by a server device connected via the network NW. Similarly, the keyword table TBLa may be provided in the server device.
Further, the word "device" (or "apparatus") used in the description of each of the above-described embodiments may be read as another term such as circuit, device, or unit.

(6) The order of the processing procedures, sequences, flowcharts, and the like in each of the above-described embodiments may be changed as long as no contradiction arises. For example, the methods described herein present the elements of the various steps in an exemplary order and are not limited to the specific order presented.

(7) In each of the above-described embodiments, input and output information and the like may be stored in a specific location (for example, a memory) or managed in a management table. Input and output information and the like may be overwritten, updated, or appended. Output information and the like may be deleted. Input information and the like may be transmitted to another device.

(8) In each of the above-described embodiments, a determination may be made using a value represented by one bit (0 or 1), a Boolean value (true or false), or a numerical comparison (for example, a comparison with a predetermined value).
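
The three forms of determination listed in (8) can be sketched as follows. The function names are hypothetical, and the use of "greater than or equal to" for the numerical comparison is an assumption made for illustration; the embodiments do not fix the comparison operator.

```python
def decide_by_bit(flag_bit: int) -> bool:
    """Determination from a 1-bit value: 1 means affirmative, 0 means negative."""
    return (flag_bit & 1) == 1


def decide_by_boolean(flag: bool) -> bool:
    """Determination from a Boolean value used as-is."""
    return flag


def decide_by_threshold(value: float, threshold: float) -> bool:
    """Determination by comparing a numerical value with a predetermined value."""
    return value >= threshold
```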

(9) In each of the above-described embodiments, the storage device 22 is a recording medium readable by the processing device 21, and a ROM and a RAM were given as examples. However, the storage device 22 may be a flexible disk, a magneto-optical disk (for example, a compact disc, a digital versatile disc, or a Blu-ray (registered trademark) disc), a smart card, a flash memory device (for example, a card, a stick, or a key drive), a CD-ROM (Compact Disc-ROM), a register, a removable disk, a hard disk, a floppy (registered trademark) disk, a magnetic strip, a database, a server, or any other suitable storage medium. The program may also be transmitted from the network NW. The program may also be transmitted from a communication network via a telecommunication line.

(10) Each of the above-described embodiments may be applied to systems that use LTE (Long Term Evolution), LTE-A (LTE-Advanced), SUPER 3G, IMT-Advanced, 4G, 5G, FRA (Future Radio Access), W-CDMA (registered trademark), GSM (registered trademark), CDMA2000, UMB (Ultra Mobile Broadband), IEEE 802.11 (Wi-Fi), IEEE 802.16 (WiMAX), IEEE 802.20, UWB (Ultra-WideBand), Bluetooth (registered trademark), or other suitable systems, and/or to next-generation systems extended based on them.

(11) In each of the above-described embodiments, the information, signals, and the like described may be represented using any of a variety of different technologies. For example, data, instructions, commands, information, signals, bits, symbols, chips, and the like that may be referred to throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination thereof.
The terms described herein and/or the terms necessary for understanding this specification may be replaced with terms having the same or similar meanings.

(12) Each of the functions illustrated in FIGS. 4, 7, 10, and 12 is realized by any combination of hardware and software. Each function may be realized by a single device or by two or more devices configured separately from each other.

(13) The programs exemplified in each of the above-described embodiments, whether referred to as software, firmware, middleware, microcode, hardware description language, or by another name, should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, threads of execution, procedures, functions, and the like.
Further, software, instructions, and the like may be transmitted and received via a transmission medium. For example, when software is transmitted from a website, server, or other remote source using wired technologies such as coaxial cable, fiber optic cable, twisted pair, and digital subscriber line (DSL) and/or wireless technologies such as infrared, radio, and microwave, these wired and/or wireless technologies are included within the definition of transmission medium.

(14) In each of the above-described embodiments, the terms "system" and "network" are used interchangeably.

(15) In each of the above-described embodiments, information, parameters, and the like may be represented by absolute values, by values relative to a predetermined value, or by other corresponding information.

(16) In each of the above-described embodiments, the user device 20 may be a mobile station. A mobile station may also be referred to by those skilled in the art as a subscriber station, mobile unit, subscriber unit, wireless unit, remote unit, mobile device, wireless device, wireless communication device, remote device, mobile subscriber station, access terminal, mobile terminal, wireless terminal, remote terminal, handset, user agent, mobile client, client, or some other suitable term.

(17) In each of the above-described embodiments, the term "connected", or any variation thereof, means any direct or indirect connection or coupling between two or more elements, and can include the presence of one or more intermediate elements between two elements that are "connected" to each other. The connection between elements may be physical, logical, or a combination thereof. As used herein, two elements can be considered to be "connected" to each other by using one or more wires, cables, and/or printed electrical connections, and, as some non-limiting and non-exhaustive examples, by using electromagnetic energy such as electromagnetic energy having wavelengths in the radio frequency region, the microwave region, and the light (both visible and invisible) region.

(18) In each of the above-described embodiments, the phrase "based on" does not mean "based only on" unless otherwise specified. In other words, the phrase "based on" means both "based only on" and "based at least on".

(19) Any reference to elements using designations such as "first" and "second" as used herein does not generally limit the quantity or order of those elements. These designations may be used herein as a convenient way to distinguish between two or more elements. Thus, references to first and second elements do not mean that only two elements can be employed there, or that the first element must somehow precede the second element.

(20) To the extent that the terms "including", "comprising", and variations thereof are used in the above-described embodiments, in this specification, or in the claims, these terms are intended to be inclusive, like the term "provided with". Furthermore, the term "or" as used in this specification or in the claims is intended not to be an exclusive OR.

(21) Throughout this application, where articles such as a, an, and the in English are added by translation, these articles include the plural unless the context clearly indicates otherwise.

(22) It will be apparent to those skilled in the art that the present invention is not limited to the embodiments described herein. The present invention can be implemented in modified and altered forms without departing from the spirit and scope of the present invention as defined by the claims. Accordingly, the description herein is for illustrative purposes and has no restrictive meaning with respect to the present invention. A plurality of aspects selected from the aspects exemplified herein may also be combined.

1 ... service system, 10 ... video distribution server, 11 ... processing device, 20 ... user device, 21 ... processing device, 22 ... storage device, 210 ... first keyword generation unit, 220A, 220B ... second keyword generation unit, 221 ... extraction unit, 222 ... conversion unit, 223 ... analysis unit, 230A, 230B, 230C ... specific unit, 240 ... comment generation unit, KW1 ... first keyword, KW2 ... second keyword, TBLa ... keyword table, TBLb ... comment table, TBLc ... action history table, Wx ... target keyword.

Claims

An information processing device comprising:
a first keyword generation unit that generates a first keyword based on a user's voice;
a second keyword generation unit that generates a plurality of second keywords corresponding one-to-one to a plurality of object images extracted from an image indicated by an image signal;
a specific unit that specifies, from among the plurality of second keywords, a target keyword to be commented on, based on the degree of relevance between each of the plurality of second keywords and the first keyword; and
a comment generation unit that generates a comment related to the target keyword.

The information processing device according to claim 1, wherein, among the plurality of second keywords, the degree of relevance when one second keyword matches the first keyword is higher than the degree of relevance when another second keyword does not match the first keyword, and
the specific unit specifies, as the target keyword, a second keyword that matches the first keyword among the plurality of second keywords.

The information processing device according to claim 1 or 2, wherein the second keyword generation unit comprises:
an analysis unit that analyzes the image signal;
an extraction unit that extracts the plurality of object images from the image indicated by the image signal based on the analysis result of the analysis unit; and
a conversion unit that converts each of the plurality of object images into the corresponding second keyword among the plurality of second keywords.

The information processing device according to any one of claims 1 to 3, wherein the specific unit specifies the target keyword from among the plurality of second keywords based on the degree of relevance between each second keyword and the first keyword and on the action history of the user.

The information processing device according to any one of claims 1 to 3, wherein the specific unit specifies the target keyword from among the plurality of second keywords based on the degree of relevance between each second keyword and the first keyword and on the profile of the user.