JP7257010B2

JP7257010B2 - SEARCH SUPPORT SERVER, SEARCH SUPPORT METHOD, AND COMPUTER PROGRAM

Info

Publication number: JP7257010B2
Application number: JP2021039267A
Authority: JP
Inventors: 清幸鈴木; 克利大川; 正規中村
Original assignee: Advanced Media Inc
Current assignee: Advanced Media Inc
Priority date: 2021-03-11
Filing date: 2021-03-11
Publication date: 2023-04-13
Anticipated expiration: 2041-03-11
Also published as: JP2022139052A

Description

The present invention relates to a search support server, a search support method, and a computer program that allow a user, as the searching party, to quickly and reliably display desired information or content from among the large amount of information or content that a content provider offers on a website, by narrowing it down through speech.

A wide variety of information exists on websites. At present, the address of a web page thought to contain what the user wants, or the information or content itself, is extracted and displayed on a display device visible to the user, either when the user clicks an icon or a clickable text description on the website's top page, or when the user uses a search box backed by a search engine such as Google (R) or Yahoo (R).

When a search key is entered by the user's speech, the speech is recognized using a speech recognition engine provided by Google (R), Microsoft (R), or the like. These commercial engines are offered as general-purpose speech recognition engines, built by collecting speech data broadly and training on it with machine learning or deep learning.

Consequently, when the user's speech is recognized, the search key that the content provider intends to offer to the user may not be recognized correctly. To address this, Patent Literature 1, for example, discloses a speech understanding device that improves speech recognition accuracy by using a plurality of language models.

Patent Literature 1: Japanese Unexamined Patent Application Publication No. 2010-170137 (JP 2010-170137)

In Patent Literature 1, however, a plurality of speech recognition engines run in parallel, and the results of a language understanding engine based on the resulting recognition results are integrated. Integrating the multiple recognition results and language understanding results takes considerable time, and the accuracy of the recognition result after integration cannot be guaranteed, so the approach is not practical. There is therefore a problem that the user's speech is not guaranteed to yield the correct search key in a short time.

The present invention has been made in view of the above circumstances. Its object is to provide a search support server, a search support method, and a computer program that use a signage window, which selectively displays guide data prompting utterances for selecting tag information indicating a specific portion of content, and a speech window, in which all of the guide data can be narrowed down by speech and displayed. As long as the user speaks or selects in accordance with the guide data narrowed down for his or her own purpose, the user can be guided quickly and reliably to the specific portion of the content that the content provider wants to offer.

To achieve the above object, a search support server according to the present invention is a search support server that supports searches for content published on a website. The server causes a terminal device, connected to it so as to be capable of data communication, to display a signage window that selectively displays guide data prompting utterances for selecting tag information indicating a specific portion of content, and a speech window in which the guide data to be displayed can be narrowed down from among the guide data by the user's speech. The server accepts the user's utterance of the tag information displayed in the signage window and/or the speech window, narrows down the displayed tag information, accepts a selection of the narrowed-down tag information, and thereby causes the terminal device to display the specific portion of the content corresponding to the selected tag information.

In the search support server according to the present invention, the tag information preferably comprises at least a tag name identifying the specific portion of the content, text data consisting of a simple sentence or word group describing what the tag name refers to, and attribute information indicating attributes of the tag name.
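The three-part structure of the tag information can be illustrated with a small record type. This is only a sketch: the class name `TagInfo`, its field names, and the sample values are hypothetical and do not come from the specification, which states only that tag information has a tag name, descriptive text, and attribute information.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class TagInfo:
    """Hypothetical container for one piece of tag information."""
    tag_name: str                 # identifies the specific portion of the content
    description: str              # simple sentence or word group explaining the tag name
    attributes: FrozenSet[str]    # attribute words associated with the tag name


# Illustrative sample values (assumed, not taken from the specification).
hours = TagInfo(
    tag_name="opening hours",
    description="when the store is open",
    attributes=frozenset({"store", "time"}),
)
```

Keeping the attributes as a set makes the later narrowing step, a logical AND over spoken attributes, a plain subset test.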

The search support server according to the present invention preferably comprises search data generation means for generating search data on which the guide data displayed in the signage window and/or the speech window is based. The search data generation means preferably comprises: extraction/display means for extracting, based on the substance of the content, simple sentences or word groups indicating that substance and, when a selection of an extracted simple sentence or word group is accepted, displaying the corresponding content; tag input accepting means for accepting, based on the displayed content, input of the tag name and of a simple sentence or word group indicating what the tag name refers to; pointer search means for searching, based on the displayed content, for pointer information indicating a specific portion of the content; pointer assignment accepting means for accepting, for each tag name whose input has been accepted, an assignment of the pointer information found; and attribute information accepting means for accepting, for each tag name whose input has been accepted, input of the attribute information that can be uttered at search time.

In the search support server according to the present invention, the extraction/display means preferably comprises: content transcription means for examining the content, extracting the character strings it contains as simple sentences or word groups, and outputting them as text data; and content display means for displaying, when a selection of the output text data is accepted, the content corresponding to the selected text data.
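As a rough illustration of the content transcription step, the sketch below pulls the visible character strings out of a page and returns them as a list of text fragments. It assumes the content is HTML and uses Python's standard `html.parser`; the specification does not say how the extraction is performed, and the names `TextExtractor` and `transcribe` are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text fragments, skipping script and style bodies."""

    def __init__(self):
        super().__init__()
        self.fragments = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and not self._skip_depth:
            self.fragments.append(text)


def transcribe(html):
    """Return the text fragments found in an HTML string, in document order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.fragments
```

Each returned fragment could then be offered to the content provider as a selectable line that opens the corresponding content.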

In the search support server according to the present invention, the guide data is preferably generated by associating the tag information included in the search data with search data identification information that identifies the search data.
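One way to picture this association is a function that copies each piece of tag information out of a search data record and stamps it with that record's identifier. The dictionary layout, the key names, the identifier `SD-001`, and the `example.com` pointers below are all assumptions made for illustration; the specification does not define a storage format.

```python
# Hypothetical search data record; every key and value here is illustrative.
SEARCH_DATA = {
    "search_data_id": "SD-001",
    "tags": [
        {
            "tag_name": "opening hours",
            "description": "when the store is open",
            "attributes": ["store", "time"],
            "pointer": "https://example.com/info#hours",
        },
        {
            "tag_name": "store map",
            "description": "how to reach the store",
            "attributes": ["store", "access"],
            "pointer": "https://example.com/info#map",
        },
    ],
}


def build_guide_data(search_data):
    """Return one guide data row per tag, each tied to its search data ID."""
    search_data_id = search_data["search_data_id"]
    return [{"search_data_id": search_data_id, **tag} for tag in search_data["tags"]]


guide_data = build_guide_data(SEARCH_DATA)
```

Carrying the identifier on every row means a selected row can always be traced back to the search data it was generated from.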

The search support server according to the present invention preferably comprises: utterance accepting means for accepting input of voice data uttered by the user in order to narrow down the tag names that are displayed in the speech window and can be selected by the user; metatag speech recognition means for converting the accepted voice data into text data, calculating the degree of matching between the converted text data and the tag names of the guide data and the simple sentences or word groups indicating what the tag names refer to, identifying the tag name with the highest calculated degree of matching, and displaying the specific portion of the content corresponding to that tag name when the degree of matching exceeds a predetermined value; and narrowing speech recognition means for extracting, based on the accepted voice data, the tag information of the guide data that matches the attribute information, narrowing down the guide data by the logical AND of the one or more matching pieces of attribute information, and outputting the tag names and attribute information of the narrowed-down guide data. The metatag speech recognition means and the narrowing speech recognition means are preferably executed in parallel.
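The two recognizers described above can be sketched as two functions submitted to a thread pool. Several things here are assumptions rather than the specification's method: the utterance is taken as already converted to text, the degree of matching is approximated with `difflib.SequenceMatcher`, the predetermined value is set to 0.8, and the guide data entries are invented samples.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher

# Illustrative guide data (assumed values).
GUIDE_DATA = [
    {"tag_name": "opening hours", "description": "when the store is open",
     "attributes": {"store", "time"}},
    {"tag_name": "store map", "description": "how to reach the store",
     "attributes": {"store", "access"}},
    {"tag_name": "delivery time", "description": "how long shipping takes",
     "attributes": {"shipping", "time"}},
]
THRESHOLD = 0.8  # assumed stand-in for the "predetermined value"


def metatag_match(text):
    """Find the tag name whose name or description best matches the text."""
    best_name, best_score = None, 0.0
    for entry in GUIDE_DATA:
        for candidate in (entry["tag_name"], entry["description"]):
            score = SequenceMatcher(None, text, candidate).ratio()
            if score > best_score:
                best_name, best_score = entry["tag_name"], score
    # The third element says whether the content portion should be displayed.
    return best_name, best_score, best_score > THRESHOLD


def narrow_by_attributes(text):
    """Keep entries that carry every attribute word found in the text (AND)."""
    vocabulary = set().union(*(e["attributes"] for e in GUIDE_DATA))
    spoken = {word for word in text.split() if word in vocabulary}
    return [e for e in GUIDE_DATA if spoken <= e["attributes"]]


def recognize(text):
    """Run both recognizers in parallel and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        match_future = pool.submit(metatag_match, text)
        narrow_future = pool.submit(narrow_by_attributes, text)
        return match_future.result(), narrow_future.result()
```

With this sample data, saying an exact tag name yields a match above the threshold, while saying only attribute words such as "store time" leaves the match below certainty but shrinks the candidate list.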

In the search support server according to the present invention, preferably the metatag speech recognition means calculates the degree of matching between the converted text data and a first data set formed of all the tag names and the simple sentences or word groups indicating what the tag names refer to, and the narrowing speech recognition means narrows down the guide data by the logical AND of the attribute information extracted as the recognition result obtained using, as a speech recognition filter, a second data set formed of the attribute information associated with all the tag names.

In the search support server according to the present invention, the narrowing speech recognition means preferably updates the second data set with the attribute information of the narrowed-down guide data.
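The update of the second data set can be pictured as rebuilding the attribute vocabulary from whatever guide data survives each narrowing step. The sketch below assumes the utterance has already been split into recognized words; the function names and the sample entries are hypothetical.

```python
def attribute_vocabulary(guide_data):
    """Second data set: every attribute word attached to the given entries."""
    return set().union(*(entry["attributes"] for entry in guide_data))


def narrow(guide_data, spoken_words):
    """Narrow by the AND of recognized attributes, then rebuild the vocabulary."""
    vocabulary = attribute_vocabulary(guide_data)
    recognized = {word for word in spoken_words if word in vocabulary}
    narrowed = [e for e in guide_data if recognized <= e["attributes"]]
    return narrowed, attribute_vocabulary(narrowed)


# Illustrative entries (assumed values).
entries = [
    {"tag_name": "opening hours", "attributes": {"store", "time"}},
    {"tag_name": "store map", "attributes": {"store", "access"}},
    {"tag_name": "delivery time", "attributes": {"shipping", "time"}},
]

first_pass, vocab_after_first = narrow(entries, ["time"])
second_pass, vocab_after_second = narrow(first_pass, ["shipping"])
```

After the first utterance the word "access" drops out of the vocabulary because no remaining entry carries it, so a later utterance is filtered only against attributes that can still narrow the list further.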

Next, to achieve the above object, a search support method according to the present invention is a search support method executable by a search support server that supports searches for content published on a website. In the method, the search support server executes: a step of causing a terminal device, connected to it so as to be capable of data communication, to display a signage window that selectively displays guide data prompting utterances for selecting tag information indicating a specific portion of content, and a speech window in which the guide data to be displayed can be narrowed down from among the guide data by the user's speech; and a step of accepting the user's utterance of the tag information displayed in the signage window and/or the speech window, narrowing down the displayed tag information, accepting a selection of the narrowed-down tag information, and thereby causing the terminal device to display the specific portion of the content corresponding to the selected tag information.

In the search support method according to the present invention, the tag information preferably comprises at least a tag name identifying the specific portion of the content, text data consisting of a simple sentence or word group describing what the tag name refers to, and attribute information indicating attributes of the tag name.

In the search support method according to the present invention, the search support server preferably generates search data on which the guide data displayed in the signage window and/or the speech window is based, by executing: a step of extracting, based on the substance of the content, simple sentences or word groups indicating that substance and, when a selection of an extracted simple sentence or word group is accepted, displaying the corresponding content; a step of accepting, based on the displayed content, input of the tag name and of a simple sentence or word group indicating what the tag name refers to; a step of searching, based on the displayed content, for pointer information indicating a specific portion of the content; a step of accepting, for each tag name whose input has been accepted, an assignment of the pointer information found; and a step of accepting, for each tag name whose input has been accepted, input of the attribute information that can be uttered at search time.

In the search support method according to the present invention, the search support server preferably executes: a step of examining the content, extracting the character strings it contains as simple sentences or word groups, and outputting them as text data; and a step of displaying, when a selection of the output text data is accepted, the content corresponding to the selected text data.

In the search support method according to the present invention, the guide data is preferably generated by associating the tag information included in the search data with search data identification information that identifies the search data.

In the search support method according to the present invention, the search support server preferably executes a step of accepting input of voice data uttered by the user in order to select the tag names that are displayed in the speech window and can be selected by the user. The search support server then preferably executes the following two steps in parallel: a step of converting the accepted voice data into text data, calculating the degree of matching between the converted text data and the tag names of the guide data and the simple sentences or word groups indicating what the tag names refer to, identifying the tag name with the highest calculated degree of matching, and displaying the specific portion of the content corresponding to that tag name when the degree of matching exceeds a predetermined value; and a step of extracting, based on the accepted voice data, the tag information of the guide data that matches the attribute information, narrowing down the guide data by the logical AND of the one or more matching pieces of attribute information, and outputting the tag names and attribute information of the narrowed-down guide data.

In the search support method according to the present invention, the search support server preferably executes: a step of calculating the degree of matching between the converted text data and a first data set formed of all the tag names and the simple sentences or word groups indicating what the tag names refer to; and a step of narrowing down the guide data by the logical AND of the attribute information extracted as the recognition result obtained using, as a speech recognition filter, a second data set formed of the attribute information associated with all the tag names.

In the search support method according to the present invention, the search support server preferably executes a step of updating the second data set with the attribute information of the narrowed-down guide data.

Next, to achieve the above object, a computer program according to the present invention is a computer program executable by a search support server that supports searches for content published on a website. The program causes the search support server to function as: means for causing a terminal device, connected to the server so as to be capable of data communication, to display a signage window that selectively displays guide data prompting utterances for selecting tag information indicating a specific portion of content, and a speech window in which the guide data to be displayed can be narrowed down from among the guide data by the user's speech; and means for accepting the user's utterance of the tag information displayed in the signage window and/or the speech window, narrowing down the displayed tag information, accepting a selection of the narrowed-down tag information, and thereby causing the terminal device to display the specific portion of the content corresponding to the selected tag information.

In the computer program according to the present invention, the tag information preferably comprises at least a tag name identifying the specific portion of the content, text data consisting of a simple sentence or word group describing what the tag name refers to, and attribute information indicating attributes of the tag name.

The computer program according to the present invention preferably causes the search support server to function as search data generation means for generating search data on which the guide data displayed in the signage window and/or the speech window is based. The program preferably causes the search data generation means to function as: extraction/display means for extracting, based on the substance of the content, simple sentences or word groups indicating that substance and, when a selection of an extracted simple sentence or word group is accepted, displaying the corresponding content; tag input accepting means for accepting, based on the displayed content, input of the tag name and of a simple sentence or word group indicating what the tag name refers to; pointer search means for searching, based on the displayed content, for pointer information indicating a specific portion of the content; pointer assignment accepting means for accepting, for each tag name whose input has been accepted, an assignment of the pointer information found; and attribute information accepting means for accepting, for each tag name whose input has been accepted, input of the attribute information that can be uttered at search time.

The computer program according to the present invention preferably causes the extraction/display means to function as: content transcription means for examining the content, extracting the character strings it contains as simple sentences or word groups, and outputting them as text data; and content display means for displaying, when a selection of the output text data is accepted, the content corresponding to the selected text data.

In the computer program according to the present invention, the guide data is preferably generated by associating the tag information included in the search data with search data identification information that identifies the search data.

The computer program according to the present invention preferably causes the search support server to function as utterance accepting means for accepting input of voice data uttered by the user in order to narrow down the tag names that are displayed in the speech window and can be selected by the user. The program also preferably causes the search support server to function, in parallel, as: metatag speech recognition means for converting the accepted voice data into text data, calculating the degree of matching between the converted text data and the tag names of the guide data and the simple sentences or word groups indicating what the tag names refer to, identifying the tag name with the highest calculated degree of matching, and displaying the specific portion of the content corresponding to that tag name when the degree of matching exceeds a predetermined value; and narrowing speech recognition means for extracting, based on the accepted voice data, the tag information of the guide data that matches the attribute information, narrowing down the guide data by the logical AND of the one or more matching pieces of attribute information, and outputting the tag names and attribute information of the narrowed-down guide data.

The computer program according to the present invention preferably causes the metatag speech recognition means to function as means for calculating the degree of matching between the converted text data and a first data set formed of all the tag names and the simple sentences or word groups indicating what the tag names refer to, and causes the narrowing speech recognition means to function as means for narrowing down the guide data by the logical AND of the attribute information extracted as the recognition result obtained using, as a speech recognition filter, a second data set formed of the attribute information associated with all the tag names.

The computer program according to the present invention preferably causes the narrowing speech recognition means to function as means for updating the second data set with the attribute information of the narrowed-down guide data.

According to the present invention, a content provider offering content among the large amount of content published on a website can display, in the speech window and the signage window, guide data that readily leads users to its own content, and the user can narrow down the selectable guide data at his or her own will. Each user therefore only needs to speak or key in exactly what the displayed guide data shows in order to have the content offered by the content provider displayed simply, quickly, and in line with the user's intention.

A block diagram schematically showing the configuration of a voice search system according to an embodiment of the present invention.
A block diagram schematically showing the configuration of a search support server according to the embodiment of the present invention.
A block diagram schematically showing the configuration of a terminal device according to the embodiment of the present invention.
A functional block diagram of the search data generation processing of the search support server according to the embodiment of the present invention.
An exemplary diagram of an input acceptance screen that the search support server according to the embodiment of the present invention displays on the terminal device used by a content provider.
An exemplary diagram of search data and guide data generated by the search support server according to the embodiment of the present invention.
An exemplary diagram of search data and guide data of the search support server according to the embodiment of the present invention.
Another exemplary diagram of search data and guide data of the search support server according to the embodiment of the present invention.
An exemplary diagram of a speech window that the search support server according to the embodiment of the present invention displays on the terminal device used by a user.
A flowchart showing the content provider setting procedure executed by the CPU of the search support server according to the embodiment of the present invention.
A functional block diagram of the speech recognition processing that the search support server according to the embodiment of the present invention applies to a user's utterance.
A flowchart showing the speech recognition procedure executed by the CPU of the search support server according to the embodiment of the present invention.

A search support server according to an embodiment of the present invention is described below with reference to the drawings. The following embodiment does not limit the invention set out in the claims, and it goes without saying that not every combination of the features described in the embodiment is essential to the solution.

The present invention can be implemented in many different forms and should not be construed as limited to what is described in the embodiment. The same reference numerals denote the same elements throughout the embodiment.

The following embodiment is described as a voice search system in which a computer program is installed in a computer system. As will be apparent to those skilled in the art, however, part of the present invention can be implemented as a computer-executable computer program. The present invention can therefore take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. The hardware embodiment is a search support server that uses a signage window, which selectively displays guide data prompting utterances for selecting tag information indicating a specific portion of content, and a speech window, in which all of the guide data can be narrowed down by speech and displayed, so that, as long as the user speaks or selects in accordance with the guide data narrowed down for his or her own purpose, the user is reliably guided to the specific portion of the content that the content provider wants to offer. The computer program can be recorded on any computer-readable recording medium, such as a hard disk, DVD, CD, optical storage device, or magnetic storage device.

According to the embodiment of the present invention, a content provider offering content among the large amount of content published on a website can display, in the speech window and the signage window, guide data that readily leads users to its own content, and the user can narrow down the selectable guide data at his or her own will. Each user therefore only needs to speak or key in exactly what the displayed guide data shows in order to have the content offered by the content provider displayed simply, quickly, and in line with the user's intention.

FIG. 1 is a block diagram schematically showing the configuration of a voice search system according to the embodiment of the present invention. The voice search system according to this embodiment comprises a terminal device 1a used by a content provider, a terminal device 1b used by a user who searches for content or video content, and a search support server 3 connected to the terminal devices 1a and 1b via a network 2 such as the Internet so as to be capable of data communication with them. The terminal devices 1a and 1b are not limited to desktop PCs with a microphone and speaker attached; they may be portable terminals such as smartphones or tablets with a built-in microphone and speaker.

FIG. 2 is a block diagram schematically showing the configuration of the search support server 3 according to the embodiment of the present invention. The search support server 3 according to this embodiment comprises at least a CPU (central processing unit) 31, a memory 32, a storage device 33, an I/O interface 34, a video interface 35, a portable memory drive 36, a communication interface 37, and an internal bus 38 that connects these hardware components.

The CPU 31 is connected via the internal bus 38 to the hardware components of the search support server 3 described above, controls their operation, and executes various software functions in accordance with the computer program 100 stored in the storage device 33. The memory 32 is a volatile memory such as SRAM or SDRAM; a load module is expanded into it when the computer program 100 is executed, and it stores temporary data and the like generated during execution.

The storage device 33 comprises a built-in fixed storage device (hard disk), a ROM, and the like. The computer program 100 stored in the storage device 33 is downloaded by the portable memory drive 36 from a portable recording medium 90, such as a DVD, CD-ROM, USB memory, or SD card, on which information such as programs and data is recorded; at run time it is expanded from the storage device 33 into the memory 32 and executed. The computer program may of course instead be downloaded from an external computer connected via the communication interface 37.

The storage device 33 includes a search data storage unit 331 and a guide data storage unit 332. The search data storage unit 331 stores search data, through which users can access the content that a content provider offers, in association with pointer information (plus time stamp information in the case of video content). The search data consists of a tag name that identifies a specific part of the content, text data consisting of a simple sentence or word group that explains the tag name, and attribute information indicating the attributes of the tag name. Pointer information here broadly means information that indicates where accessible content is located. When the content is a web page, the URL is the pointer information; when it is video content, the pointer information includes not only the playable URL but also the time stamp information at which playback starts and the time stamp information at which playback ends.
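Purely as an illustration, the search data described above could be modeled as follows. This is a minimal sketch and not part of the disclosed embodiment: the class names `PointerInfo`, `TagInfo`, and `SearchData`, the field names, and the use of seconds for time stamps are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class PointerInfo:
    """Location of the content; time stamps apply only to video content."""
    url: str
    start_seconds: Optional[float] = None  # hypothetical unit: seconds
    end_seconds: Optional[float] = None


@dataclass(frozen=True)
class TagInfo:
    """Tag name, explanatory text, and attribute information."""
    tag_name: str
    text: str
    attributes: tuple = ()


@dataclass
class SearchData:
    """Tag-to-pointer associations registered under one search data ID."""
    search_data_id: str
    entries: list = field(default_factory=list)

    def add(self, tag: TagInfo, pointer: PointerInfo) -> int:
        """Register one entry and return its 1-based number."""
        self.entries.append((tag, pointer))
        return len(self.entries)
```

A web page entry would carry only a URL, while a video entry would also set `start_seconds` and `end_seconds`.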

The guide data storage unit 332 stores guide data, which is generated from the search data to prompt the user to speak, or to perform a selection operation such as a click or touch, in order to search. Guide data is generated by extracting from the search data the tag name that identifies a specific part of the content, the text data consisting of a simple sentence or word group that explains the tag name, and the attribute information indicating the attributes of the tag name, and by associating them with the identification information of the corresponding search data. While the guide data is listed in the speech window, the displayed guide data is narrowed down by the user's utterances. When the user selects one of the tag names from the narrowed-down guide data, the content (including video content) identified by the search data associated with the selected tag name can be displayed (or played).

The communication interface 37 is connected to the internal bus 38 and, by being connected to an external network 2 such as the Internet, a LAN, or a WAN, can exchange data with external computers and the like.

The I/O interface 34 is connected to input devices, namely a keyboard 41 and a mouse 42, and accepts data input. In this embodiment, voice is actually input at the terminal devices 1a and 1b (smartphones, tablets, etc.) used by the content provider or the user, and the search support server 3 receives the input voice data via the communication interface 37. A microphone, speaker, and the like may of course be connected directly to the search support server 3.

The video interface 35 is connected to a display device 43 such as a CRT display or a liquid crystal display. In this embodiment, images are actually output and displayed on the terminal devices 1a and 1b (smartphones, tablets, etc.) used by the content provider or the user, and the search support server 3 transmits image data and the like (including audio data) to the terminal devices 1a and 1b via the communication interface 37.

FIG. 3 is a block diagram schematically showing the configuration of the terminal device 1 (common to 1a and 1b) according to the embodiment of the present invention. The terminal device 1 according to this embodiment comprises at least a CPU (central processing unit) 11, a memory 12, a storage device 13, an I/O interface 14, a video interface 15, a portable memory drive 16, a communication interface 17, and an internal bus 18 that connects these hardware components.

The CPU 11 is connected via the internal bus 18 to the hardware components of the terminal device 1 described above, controls their operation, and executes various software functions in accordance with the computer program 101 stored in the storage device 13. The memory 12 is a volatile memory such as SRAM or SDRAM; a load module is expanded into it when the computer program 101 is executed, and it stores temporary data and the like generated during execution.

The storage device 13 comprises a built-in fixed storage device (hard disk), a ROM, and the like. The computer program 101 stored in the storage device 13 is downloaded from an external computer connected via the communication interface 17; at run time it is expanded from the storage device 13 into the memory 12 and executed. The computer program may of course instead be downloaded via the portable memory drive 16 from a portable recording medium 91, such as an SD card, on which information such as programs and data is recorded.

The communication interface 17 is connected to the internal bus 18 and, by being connected to an external network 2 such as the Internet, a LAN, or a WAN, can exchange data with external computers and the like.

The I/O interface 14 is connected to input devices such as a keyboard 203 and a mouse 204, to a voice input device such as a microphone 201, and to a voice output device such as a speaker 202, and handles data input and output. A smartphone or the like may be connected via the communication interface 17 to serve in place of the voice input and output devices.

The video interface 15 is connected to a display device 205 and displays the input/output images transmitted from the search support server 3 in a browser or the like. Retrieved content and video content may be displayed on the display device 205, or on an external computer separately connected via a network so that data communication is possible.

The operation of the search support server 3 configured as described above is explained below.

FIG. 4 is a functional block diagram of the search data generation process of the search support server 3 according to the embodiment of the present invention. With reference to FIG. 4, the procedure by which a content provider generates search data, generates guide data, and displays it in the signage window and/or the speech window is described.

In FIG. 4, the search data generation unit 401 generates the search data on which the guide data displayed in the signage window and/or the speech window is based. The search data generation unit 401 includes an extraction/display unit 402, a tag input reception unit 403, a pointer search unit 404, a pointer allocation reception unit 405, and an attribute information reception unit 406.

The extraction/display unit 402 extracts simple sentences or word groups that indicate what the content is about, based on the content itself, and displays the corresponding content when the selection of an extracted simple sentence or word group is accepted. More specifically, the extraction/display unit 402 includes a content transcription unit 4021 and a content display unit 4022.

The content transcription unit 4021 scans the target content, extracts the character strings it contains as simple sentences or word groups, and outputs them as text data. The output text data can serve as a reference for creating tag names based on the character strings contained in the website. The URL of the website is associated with each extracted simple sentence or word group.
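One way such an extraction could be approximated for an HTML page is sketched below. This is an assumption-laden illustration, not the method of the embodiment: the description does not say how character strings are extracted, and `TextExtractor` and `transcribe_page` are hypothetical names. The sketch uses the standard-library `html.parser` module, skips `script` and `style` bodies, and pairs every extracted fragment with the URL of its page.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text fragments, skipping script and style bodies."""

    def __init__(self) -> None:
        super().__init__()
        self.fragments = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse runs of whitespace and drop empty fragments.
        text = " ".join(data.split())
        if text and self._skip_depth == 0:
            self.fragments.append(text)


def transcribe_page(url: str, html: str) -> list:
    """Return (text fragment, page URL) pairs for one page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return [(fragment, url) for fragment in parser.fragments]
```

Each returned pair corresponds to one candidate line that a content provider could consult when choosing a tag name.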

When the content is video content, a voice transcription unit (not shown) is provided to extract the audio portion of the video content and output it as text data associated with time stamp information for each suitable phrase. The output text data can serve as a reference for creating tag names based on the audio contained in the video content.

When the selection of output text data is accepted, the content display unit 4022 displays the specific part of the content that corresponds to the selected text data. When the selection of a tag name candidate is accepted, this makes it possible to check whether the correct content is associated with it.

Next, input of the tag information to be associated with a specific part of the content is accepted. In this embodiment, tag information consists of at least a tag name that identifies a specific part of the content, text data consisting of a simple sentence or word group that explains the tag name, and attribute information indicating the attributes of the tag name.

The tag input reception unit 403 accepts input of a tag name and of a simple sentence or word group indicating what the tag name covers, based on the displayed text data or, in the case of video content, on the specific part of the video content being played.

The pointer search unit 404 searches for pointer information indicating a specific part of the content, based on the displayed specific part. When the content is a website, the search returns the URL that is already associated with it. When the content is video content, the search covers not only the already associated URL but also the time stamp information for the playback start and end timings.

The pointer allocation reception unit 405 accepts, for each tag name whose input has been accepted, the allocation of the pointer information that was found. As a result, when the selection of a tag name is accepted, the specific part of the content can be displayed according to the allocated pointer information.

The attribute information reception unit 406 accepts, for each tag name whose input has been accepted, input of attribute information that can be spoken during a search. As described later, the user can narrow down the tag names by speaking the attribute information.

The attribute information is not particularly limited as long as it helps narrow down the content and is easy to say, for example "new", "video", "how to use", "price", or "medical".

In this way, input of the tag name, the simple sentence or word group indicating what the tag name covers, and the attribute information is accepted, and search data is generated by associating them, as tag information, with the pointer information. The guide data generation unit 407 generates guide data by associating the tag information contained in the generated search data with search data identification information that identifies the search data.

FIG. 5 shows examples of the input reception screen that the search support server 3 according to the embodiment of the present invention displays on the terminal device 1a used by the content provider. FIG. 5(a) shows an example of the initial screen of the input reception screen, FIG. 5(b) shows an example of the tag information input reception screen, and FIG. 5(c) shows an example of the signage window display.

The initial screen 50 shown in FIG. 5(a) pops up on the terminal device 1a used by the content provider together with the top page of the website. The text data output by the content transcription unit 4021 is displayed in the shared window 60 of the initial screen 50.

Based on the text data displayed in the shared window 60, the content provider looks for the web page that carries the information to be offered to users; selecting text data displayed in the shared window 60 moves the browser to the web page associated with the selected text data. The content provider then selects the "Tag Input" button 51 on the pop-up initial screen 50.

When the content provider's selection of the "Tag Input" button 51 is accepted, the tag information input reception screen 52 shown in FIG. 5(b) is displayed in a separate window. The content provider enters a tag name that users can easily select in the tag name input area 53, and a simple sentence or word group indicating the content in the meta tag input area 54. The content provider selects attribute information with the attribute information selection buttons 55. An area for free-form entry of attribute information may of course be provided so that it can be keyed in.

When the content provider selects the "Mark" button 56, the search support server 3 looks up, as pointer information, the URL of the website displayed in the browser of the terminal device 1a, associates it with the entered tag name, and stores the result as search data in the search data storage unit 331.

Guide data is generated by associating the tag information portion of the search data with identification information that identifies the search data. FIG. 6 shows examples of the search data and guide data generated by the search support server 3 according to the embodiment of the present invention. FIG. 6(a) shows an example of generated guide data, and FIG. 6(b) shows an example of the search data on which the generated guide data is based.

As shown in FIG. 6(a), the guide data is linked to the search data by a "number" indicating the position of the tag name within the search data, and carries a common search data ID (search data ID = YYYYYY in FIG. 6), which is identification information that identifies the website. The generated guide data is stored in the guide data storage unit 332.

Only the search data stores a URL as "pointer information" in association with a tag name. Accordingly, selecting the "Guide Data" button in FIG. 5(a) copies the tag information of the search data and generates guide data in which that tag information is associated with the search data ID (YYYYYY) and with the "number" indicating the position of the tag name within the search data. A blank "search data ID" column in the guide data shown in FIG. 6(a) means that the guide data was generated from a single set of search data only; for guide data generated from other search data, the search data ID of that other search data is entered.
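The copy step described above can be sketched as follows. This is an illustrative sketch only, with hypothetical names (`SearchEntry`, `GuideEntry`, `generate_guide_data`) and with the blank "search data ID" column represented, by assumption, as an empty string. The point it illustrates is that tag information and the number are copied while the URL stays in the search data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchEntry:
    number: int        # position of the tag name within the search data
    tag_name: str
    text: str
    attributes: tuple
    url: str           # pointer information, held only by search data


@dataclass(frozen=True)
class GuideEntry:
    number: int
    tag_name: str
    text: str
    attributes: tuple
    search_data_id: str  # "" stands for the blank column (own search data)


def generate_guide_data(entries, own_id, source_id=None):
    """Copy tag information, but not the URL, out of search entries.

    own_id is the common search data ID of the guide data being built;
    source_id is the ID of the search data the entries come from.
    """
    marker = "" if source_id in (None, own_id) else source_id
    return [
        GuideEntry(e.number, e.tag_name, e.text, e.attributes, marker)
        for e in entries
    ]
```

Guide data built from the provider's own search data leaves the column blank, while entries imported from other search data carry that other ID.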

The signage window 58 in FIG. 5(c) preferably displays the tag names of the guide data that the content provider has selected, from the generated guide data, as data to show to users. In this case, when selection of the "Guide Data" button is accepted, an "Edit" button and an "Extend" button (not shown) are displayed. When selection of the "Edit" button is accepted, the tag names to be displayed in the signage window 58 can be set by, for example, keying them into the shared window 60. When selection of the "Extend" button is accepted, guide data generated from other search data can be added.

That is, in this embodiment, search data may be generated for each content provider, or for each of several websites launched by the same content provider. For example, even within a single company, websites are increasingly being set up per business division or per product or service, and the amount of content grows daily. Making this content quickly and reliably searchable is also important for customer service.

In this embodiment, all guide data is generated from underlying search data. Therefore, simply adding to one set of guide data the guide data generated from other search data makes it possible to display or play, quickly and reliably, the content (including video content) associated with that other search data, that is, content on other websites.

FIG. 7 shows examples of the search data and guide data of the search support server 3 according to the embodiment of the present invention. FIG. 7(a) shows an example of guide data based on one set of search data, and FIG. 7(b) shows an example of the other search data on which the added guide data is based.

The difference from the search data and guide data shown in FIG. 6 is that the guide data includes entries that have a different search data ID rather than the common search data ID. When search data and guide data are first generated, they share the same search data ID. In this embodiment, the two-tier structure of search data and guide data allows guide data with a different search data ID to be set. The search data underlying that guide data can then be read out, so that a specific part of content belonging to search data with a different search data ID, that is, content provided on a different website, can be displayed.

For example, let the common search data ID be "YYYYYY". Guide data based on other search data IDs can be added at any time to the guide data generated from the search data whose common search data ID is "YYYYYY". In the example of FIG. 7(a), guide data whose search data ID is "PPPPPP" has been added. The search data underlying the guide data whose search data ID is "PPPPPP" can therefore be read out, and the content associated with that search data can be displayed.

In other words, the search data whose search data ID is "PPPPPP", shown in FIG. 7(b), can be referenced, and the pointer information of the desired content can be obtained. Simply by adding guide data, the desired content can therefore be displayed no matter which website it belongs to. Note that in FIG. 7(b), a blank in the "search data ID" column of the guide data means that the entry is not added guide data but guide data whose search data ID is the common search data ID "YYYYYY".
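The lookup this two-tier structure enables can be sketched as a short function. The sketch below is illustrative only: the dictionary layout, the key names, and the function name `resolve_pointer` are assumptions, and a blank ID is represented as an empty string. A guide entry with a blank search data ID falls back to the common ID, and the entry's number then selects the pointer information inside the chosen search data.

```python
def resolve_pointer(guide_entry, common_id, search_store):
    """Return the pointer information (here, a URL) for one guide entry.

    guide_entry:  dict with "number" and "search_data_id" ("" if blank)
    common_id:    search data ID shared by entries with a blank column
    search_store: {search data ID: {number: URL}}
    """
    search_data_id = guide_entry["search_data_id"] or common_id
    return search_store[search_data_id][guide_entry["number"]]


# Hypothetical data mirroring the IDs used in the description.
SEARCH_STORE = {
    "YYYYYY": {1: "https://example.com/pricing"},
    "PPPPPP": {1: "https://example.org/support"},
}
GUIDE_DATA = [
    {"number": 1, "tag_name": "Pricing", "search_data_id": ""},
    {"number": 1, "tag_name": "Support", "search_data_id": "PPPPPP"},
]
```

Because the guide entries never hold URLs themselves, adding an entry for another website requires no change to that website's search data.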

FIG. 8 shows further examples of the search data and guide data of the search support server 3 according to the embodiment of the present invention. FIG. 8(a) shows an example of guide data based on one set of search data, and FIG. 8(b) shows an example of the other search data on which the added guide data is based.

The difference from the search data and guide data shown in FIG. 7 is that the pointer information of the search data contains not only the URL of the video content but also a start time stamp indicating the playback start point and an end time stamp indicating the playback end point (time stamp information). In other words, only the scene of the video content that the content provider wants to show the user can be played.
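How the start and end time stamps are handed to a player is not specified in the description. One possibility, offered here only as an assumption, is to encode them as a temporal fragment in the style of the W3C Media Fragments URI syntax (`#t=start,end`, in seconds). The function name `playback_url` is hypothetical, and the sketch assumes the stored URL has no fragment of its own.

```python
def _format_seconds(value: float) -> str:
    """Render whole seconds without a trailing '.0'."""
    number = float(value)
    return str(int(number)) if number.is_integer() else str(number)


def playback_url(url: str, start_seconds: float, end_seconds: float) -> str:
    """Append a temporal fragment so playback covers only one scene."""
    if start_seconds < 0 or end_seconds <= start_seconds:
        raise ValueError("end must be later than a non-negative start")
    start = _format_seconds(start_seconds)
    end = _format_seconds(end_seconds)
    return f"{url}#t={start},{end}"
```

Whether a given player honors the end point of such a fragment depends on the player, so a real system might instead pass the two time stamps to the player's own API.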

For example, let the common search data ID be "YYYYYY". Guide data for video content based on other search data IDs can be added at any time to the guide data generated from the search data whose search data ID is "YYYYYY". In FIG. 8(a), guide data whose search data ID is "TTTTTT" has been added. The search data underlying the guide data whose search data ID is "TTTTTT" can therefore be read out, and the video content associated with that search data can be played.

In other words, the search data whose search data ID is "TTTTTT", shown in FIG. 8(b), can be referenced, and the pointer information and time stamp information of the desired content can be obtained. Simply by adding guide data, only the desired scene of the desired video content can therefore be played, no matter which website the video content belongs to. Note that in FIG. 8(b), a blank in the "search data ID" column of the guide data means that the entry is not added guide data but guide data whose search data ID is "YYYYYY".

Returning to FIG. 4, the signage window display unit 408 displays the tag names and other items of the generated guide data, as selected by the content provider, in a signage window on the terminal device 1b used by the user. This prompts the user to make an utterance for selecting the tag information indicating a specific part of the content. The signage window is preferably displayed by the search support server on the terminal device 1a used by the content provider in a form in which the character string scrolls across the window, as shown in FIG. 5(c).

Specifically, the tag names of the selected guide data are displayed in the guidance display area 58 of the signage window 57 shown in FIG. 5(c). The arrow in FIG. 5(c) indicates that the character string is displayed while moving in that direction. A user who sees the signage window can tell what kind of information the content provider intends to offer and can word an utterance accordingly. By displaying in the signage window 57 a tag name or character string indicating the content to be offered, the content provider can guide the user to that content.

The speech window display unit 409 displays all tag names of the generated guide data in a speech window on the terminal device 1b used by the user. The user can narrow down the displayed tag names by speaking. Once narrowed down, the tag names can be shown in the speech window without scrolling or similar operations, and selecting a tag name by a click, touch, or other selection operation displays the associated specific part of the content.
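The narrowing step can be illustrated with a simple filter over the guide data. This is a sketch under stated assumptions, not the matching method of the embodiment, which is not detailed at this point: the speech recognizer is assumed to have already produced text, that text is split on whitespace (which would not suit unsegmented Japanese), and an entry is kept only if every spoken word occurs in its tag name, explanatory text, or attribute information. The function name `narrow` and the dictionary keys are hypothetical.

```python
def narrow(guide_entries, recognized_text):
    """Keep the guide entries that match every word of the utterance."""
    words = recognized_text.lower().split()
    if not words:
        return list(guide_entries)  # nothing spoken yet: show everything

    def matches(entry):
        haystack = " ".join(
            [entry["tag_name"], entry["text"], *entry["attributes"]]
        ).lower()
        return all(word in haystack for word in words)

    return [entry for entry in guide_entries if matches(entry)]


# Hypothetical guide data for illustration.
GUIDE_ENTRIES = [
    {"tag_name": "Pricing plans", "text": "monthly fees", "attributes": ["price"]},
    {"tag_name": "Setup guide", "text": "first steps", "attributes": ["how to use", "video"]},
    {"tag_name": "Release notes", "text": "latest changes", "attributes": ["new"]},
]
```

Calling `narrow` again on its own result with a further utterance narrows the list step by step, which mirrors how repeated utterances shrink the list shown in the speech window.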

FIG. 9 shows an example of the speech window that the search support server 3 according to the embodiment of the present invention displays on the terminal device 1b used by the user. FIG. 9(a) shows an example of that speech window, and FIG. 9(b) shows an example of the terminal device 1b on which the search support server 3 according to this embodiment displays a specific part of the content. As shown in FIG. 9(a), all tag names of the generated guide data are displayed in a selectable state in the guide data display area 902 of the speech window 901.

Even when the tag names do not all fit in the guide data display area 902, the tag names of all guide data can be browsed by operating the scroll button 903. The selection of one tag name from among the tag names of the guide data displayed in the guide data display area 902 is accepted. This identifies the search data underlying the guide data of the selected tag name, so the specific portion of the content associated with that search data can be displayed.

In the example of FIG. 9, when the selection of a tag name displayed in the speech window 901 is accepted, the corresponding specific portion of the content is displayed on the terminal device 1b so that it can be checked. The terminal device 1b used by the user may be a desktop type as shown in FIG. 9, or a mobile terminal such as a smartphone or tablet.

FIG. 10 is a flowchart showing the procedure of the content provider setting process performed by the CPU 31 of the search support server 3 according to the embodiment of the present invention. In FIG. 10, the CPU 31 of the search support server 3 extracts, based on the substance of the content, simple sentences or word groups indicating that substance (step S1001) and, when the selection of an extracted simple sentence or word group is accepted, displays the corresponding content (step S1002).

Based on the displayed text data or, in the case of video content, the specific portion of the video content being played back, the CPU 31 accepts input of a tag name and of a simple sentence or word group indicating the content of the tag name (step S1003).

Based on the displayed specific portion of the content, the CPU 31 searches for pointer information indicating that specific portion (step S1004). The CPU 31 accepts the assignment of the retrieved pointer information to each tag name whose input was accepted (step S1005). As a result, when the selection of a tag name is accepted, the specific portion of the content can be displayed according to the assigned pointer information.

For each tag name whose input was accepted, the CPU 31 accepts input of attribute information that can be uttered at search time (step S1006). As described later, the user can narrow down the tag names by uttering attribute information.

The CPU 31 generates search data by associating the accepted tag name, the simple sentence or word group indicating the content of the tag name, and the attribute information with the pointer information (step S1007). The CPU 31 generates guide data by associating the tag information included in the generated search data with search data identification information that identifies the search data (step S1008).
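As an illustration only, the relationship between the search data and guide data built in steps S1007 and S1008 could be sketched in Python as follows. The class names, field names, and the string form of the pointer are assumptions made for this sketch; the description does not prescribe any particular data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchData:
    search_id: int         # search data identification information
    tag_name: str
    description: str       # simple sentence or word group for the tag name
    attributes: frozenset  # attribute information the user may utter
    pointer: str           # hypothetical pointer format (here a URL fragment)


@dataclass(frozen=True)
class GuideData:
    search_id: int         # links the guide data back to its search data
    tag_name: str
    description: str
    attributes: frozenset


def make_guide_data(search: SearchData) -> GuideData:
    """Step S1008: copy the tag information and keep the search data ID."""
    return GuideData(search.search_id, search.tag_name,
                     search.description, search.attributes)


def resolve_pointer(guide: GuideData, store: dict) -> str:
    """Selecting a tag name leads back to the pointer held in the search data."""
    return store[guide.search_id].pointer


# Illustrative values only.
search = SearchData(1, "Opening hours", "when the store is open",
                    frozenset({"store", "hours"}), "/store.html#hours")
store = {search.search_id: search}
guide = make_guide_data(search)
print(resolve_pointer(guide, store))
```

In this sketch the guide data carries no pointer of its own; the pointer is reached through the search data identification information, mirroring the statement that selecting a tag name identifies the underlying search data.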

The CPU 31 displays the tag names and the like of the generated guide data that the content provider has selected in the signage window on the terminal device 1b used by the user (step S1009), and displays all tag names of the generated guide data in the speech window on the terminal device 1b (step S1010).

The procedure by which a user executes a search is described below. A feature of the present embodiment is that the content provider displays a signage window and a speech window that guide the user's search. Beyond that, the major difference from conventional search systems is that, as a means of retrieving and displaying the content the user wants quickly and reliably, the user can narrow down the displayed tag names of the guide data by speaking.

FIG. 11 is a functional block diagram of the speech recognition processing that the search support server 3 according to the embodiment of the present invention applies to the user's utterances. FIG. 11 is used to explain the procedure for accepting the user's utterance as input and recognizing it correctly while the signage window shown in FIG. 5(c) and the speech window shown in FIG. 9(a) are displayed on the terminal device 1b used by the user.

As shown in FIG. 11, the utterance accepting unit 1101 accepts input of voice data uttered by the user. Specifically, it acquires the voice data by receiving the voice data that the user uttered at the terminal device 1b.

The preprocessing unit 1102 performs noise removal, utterance section detection, and the like on the accepted voice data. The preprocessed voice data is passed to the metatag speech recognition unit 1103 and the narrowing-down speech recognition unit 1108, which are executed in parallel.
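One possible way to run the two recognizers in parallel on the same preprocessed voice data is sketched below in Python. The function names and the use of a thread pool are assumptions for illustration; the description only states that the two units are executed in parallel, not how.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_in_parallel(audio, metatag_recognizer, narrowing_recognizer):
    """Submit the same preprocessed audio to both recognizers at once."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        meta_future = pool.submit(metatag_recognizer, audio)
        narrow_future = pool.submit(narrowing_recognizer, audio)
        # result() blocks until each recognizer has finished.
        return meta_future.result(), narrow_future.result()


# Stub recognizers standing in for units 1103 and 1108 (hypothetical).
def fake_metatag(audio):
    return ("Opening hours", 0.9)


def fake_narrowing(audio):
    return {"store", "hours"}


meta, narrowed = recognize_in_parallel(b"pcm-bytes", fake_metatag, fake_narrowing)
```

A thread pool is chosen here only because it keeps the sketch short; separate processes or asynchronous calls to a recognition service would serve the same purpose.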

The metatag speech recognition unit 1103 recognizes the preprocessed voice data and converts it into text data, calculates the degree of matching of the converted text data against the tag names of the guide data and the simple sentences or word groups indicating the content of the tag names, identifies the tag name with the highest calculated degree of matching, and, when the degree of matching exceeds a predetermined value, displays the specific portion of the content corresponding to that tag name. In other words, when the metatag speech recognition unit 1103 recognizes the utterance correctly, the recognition result can be output immediately without extra processing such as integration with other speech recognition processing.

That is, the metatag speech recognition unit 1103 includes a text data conversion unit 1104, a matching degree calculation unit 1105, a tag name identification unit 1106, and a content display/playback unit 1107. The text data conversion unit 1104 recognizes the preprocessed voice data based on a so-called dictation grammar and converts it into text data.

The matching degree calculation unit 1105 refers to a first data set formed of all tag names and the simple sentences or word groups indicating the content of the tag names, and calculates the degree of matching with the recognition result of the accepted voice data.

The tag name identification unit 1106 identifies the tag name with the highest calculated degree of matching, because that tag name is the most probable recognition result. However, when the degree of matching is equal to or less than a predetermined threshold, the likelihood of misrecognition is also high.

The content display/playback unit 1107 determines whether the calculated degree of matching is greater than the predetermined threshold. When it determines that the degree of matching is equal to or less than the threshold, recognition accuracy is deemed insufficient and the unit waits for another utterance. When it determines that the degree of matching is greater than the threshold, it displays or plays back the specific portion of the content associated with the identified tag name, which makes it possible to judge whether the recognition result is correct.
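The scoring, selection, and threshold test performed by units 1105 to 1107 can be sketched as follows. The description does not define how the degree of matching is computed, so this sketch substitutes the similarity ratio from Python's standard `difflib` module as a stand-in; the function names, the sample data, and the threshold value of 0.6 are likewise assumptions.

```python
from difflib import SequenceMatcher


def match_degree(recognized: str, candidate: str) -> float:
    """Stand-in similarity in [0, 1]; not the metric used in the patent."""
    return SequenceMatcher(None, recognized.lower(), candidate.lower()).ratio()


def best_tag(recognized: str, first_data_set: dict, threshold: float = 0.6):
    """first_data_set maps each tag name to its descriptive sentence.

    Returns (tag_name, score) when the best score exceeds the threshold,
    otherwise (None, score), meaning the system waits for another utterance.
    """
    scored = [
        (max(match_degree(recognized, tag), match_degree(recognized, desc)), tag)
        for tag, desc in first_data_set.items()
    ]
    if not scored:
        return None, 0.0
    score, tag = max(scored)
    return (tag, score) if score > threshold else (None, score)


first_data_set = {
    "Company profile": "overview of the company and its history",
    "Product list": "speech recognition products and prices",
}
hit = best_tag("company profile", first_data_set)
miss = best_tag("zzzz", first_data_set)
```

Here `hit` names the tag whose content would be displayed or played back, while `miss` carries `None` in place of a tag name because no candidate clears the threshold.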

Based on the preprocessed voice data, the narrowing-down speech recognition unit 1108 uses a so-called rule grammar to extract, from the tag information of the guide data, the attribute information that matches, narrows down the guide data by the logical product (AND) of the one or more matching pieces of attribute information, and outputs the tag names and attribute information of the narrowed-down guide data. As a result, even when the metatag speech recognition unit 1103 does not obtain a valid recognition result, what the user utters next can be narrowed down further, which raises the success rate of speech recognition and, in turn, yields a correct recognition result in a short time. Specifically, the narrowing-down speech recognition unit 1108 includes an attribute extraction unit 1109 and a narrowing unit 1110.

The attribute extraction unit 1109 extracts attribute information as the recognition result obtained by using, as a speech recognition filter, a second data set formed of the attribute information associated with all tag information. The narrowing unit 1110 narrows down the tag names (guide data) by the logical product of the extracted attribute information.
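The logical-product narrowing can be illustrated with a short Python sketch. For simplicity it filters already-recognized text against the attribute vocabulary, whereas the description applies the second data set as a filter during speech recognition itself; the function names and sample data are assumptions.

```python
def extract_attributes(recognized_text: str, second_data_set: set) -> set:
    """Keep only the uttered words that appear in the attribute vocabulary."""
    words = set(recognized_text.lower().split())
    return {attr for attr in second_data_set if attr.lower() in words}


def narrow_guides(guides: dict, uttered: set) -> dict:
    """guides maps tag name -> set of attributes.

    A tag survives only if it has ALL uttered attributes (logical AND).
    """
    if not uttered:
        return dict(guides)
    return {tag: attrs for tag, attrs in guides.items() if uttered <= attrs}


guides = {
    "Tokyo office hours": {"tokyo", "office"},
    "Osaka office hours": {"osaka", "office"},
    "Tokyo store map": {"tokyo", "store"},
}
vocabulary = set().union(*guides.values())
uttered = extract_attributes("tokyo office please", vocabulary)
narrowed = narrow_guides(guides, uttered)
```

With the sample data, the word "please" is ignored because it is not an attribute, and only the tag that carries both "tokyo" and "office" remains.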

The narrowing-down speech recognition unit 1108 preferably includes an update unit 1111 that updates the second data set, which serves as the speech recognition filter, with the attribute information of the narrowed-down guide data. Because the second data set is updated each time the guide data is narrowed down by attribute information, the user's next utterance is constrained further, which increases confidence in the recognition result and allows a correct result to be obtained more quickly.
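A minimal sketch of the update performed by the update unit 1111, assuming the second data set is held as a plain set of attribute strings (the representation and the function name are hypothetical):

```python
def update_second_data_set(narrowed_guides: dict) -> set:
    """Rebuild the filter vocabulary from the guide data that survived."""
    vocabulary = set()
    for attrs in narrowed_guides.values():
        vocabulary |= set(attrs)
    return vocabulary


# Guide data remaining after the user said "tokyo" (illustrative values).
remaining = {
    "Tokyo office hours": {"tokyo", "office"},
    "Tokyo store map": {"tokyo", "store"},
}
second_data_set = update_second_data_set(remaining)
```

After the update, an attribute such as "osaka" that belongs only to discarded guide data is no longer in the vocabulary, so the next utterance is matched against a smaller set of candidates.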

FIG. 12 is a flowchart showing the procedure of the speech recognition processing performed by the CPU 31 of the search support server 3 according to the embodiment of the present invention. The CPU 31 of the search support server 3 accepts input of voice data uttered by the user (step S1201). Specifically, it acquires the voice data by receiving the voice data that the user uttered at the terminal device 1b.

The CPU 31 performs noise removal, utterance section detection, and the like on the accepted voice data (step S1202). Based on the preprocessed voice data, the following two processes (metatag speech recognition and narrowing-down speech recognition) are executed in parallel.

First, the CPU 31 recognizes the preprocessed voice data based on a so-called dictation grammar and converts it into text data (step S1203). The CPU 31 refers to the first data set formed of all tag names and the simple sentences or word groups indicating the content of the tag names, and calculates the degree of matching with the accepted voice data (step S1204). The CPU 31 identifies the tag name with the highest calculated degree of matching (step S1205).

The CPU 31 determines whether the calculated degree of matching is greater than a predetermined threshold (step S1206). When the CPU 31 determines that it is equal to or less than the predetermined threshold (step S1206: NO), it returns the process to step S1201 and waits for another utterance.

When the CPU 31 determines that the degree of matching is greater than the predetermined threshold (step S1206: YES), it displays or plays back the specific portion of the content associated with the identified tag name (step S1207).

Meanwhile, based on the preprocessed voice data, the CPU 31 uses as a speech recognition filter the second data set formed of the attribute information associated with all tag names, and extracts the attribute information in the recognition result (step S1208). The CPU 31 narrows down the tag names (guide data) by the logical product of the extracted attribute information (step S1209). The CPU 31 updates the second data set, which serves as the speech recognition filter, with the tag names and attribute information of the narrowed-down guide data (step S1210), returns the process to step S1201, and waits for another utterance.

As described above, according to the present embodiment, a content provider offering content among the large amount of content published on a website can display, in the speech window and the signage window, guide data that readily leads users to its own content, and users can narrow down the guide data to be selected of their own accord. By speaking or selecting in line with the displayed guide data, each user can therefore have the content offered by the content provider displayed quickly and reliably.

In addition, according to the present embodiment, executing the metatag speech recognition process and the narrowing-down speech recognition process in parallel means that, when the metatag speech recognition is correct, its recognition result can be output as is, whereas when the recognition result is not sufficiently certain, tag names based on the further narrowed-down guide data can be displayed in the speech window. The user can therefore display or play back the desired content in a relatively short time.

The present invention is not limited to the above embodiment, and various modifications and improvements are possible within the scope of its gist. For example, the method of inputting tag information including attribute information is not limited to the method described above; any method may be used as long as a character string that can narrow down the tag names can be input.

Likewise, when recognizing the user's utterance, the recognition process executed in parallel with the metatag speech recognition unit 1103 is not limited to the narrowing-down speech recognition unit 1108; any process that can constrain the user's next utterance may be used.

The dictation grammar and rule grammar used for speech recognition are desirably generated by learning from text data, guide data, and externally acquired text data as training data. The learning method is not particularly limited: it may use AI such as so-called machine learning or deep learning, or it may be a conventional method such as expanding a correspondence table.

Reference Signs List
1, 1a, 1b terminal device
2 network
3 search support server
11, 31 CPU
12, 32 memory
13, 33 storage device
14, 34 I/O interface
15, 35 video interface
16, 36 portable disk drive
17, 37 communication interface
18, 38 internal bus
90, 91 storage medium
100, 101 computer program
331 search data storage unit
332 guide data storage unit

Claims

1. A search support server that supports searches for content published on a website, wherein the search support server
causes a terminal device connected so as to be capable of data communication to display a signage window that selectively displays guide data for guiding utterances for selecting tag information indicating a specific portion of content, and a speech window in which the guide data to be displayed can be narrowed down from among the guide data by a user's utterance, and
accepts the user's utterance of the tag information displayed in the signage window and/or the speech window, narrows down the displayed tag information, and, upon accepting selection of the narrowed-down tag information, causes the terminal device to display the specific portion of the content corresponding to the selected tag information.

2. The search support server according to claim 1, wherein the tag information is composed of at least a tag name identifying a specific portion of the content, text data consisting of a simple sentence or word group describing the content of the tag name, and attribute information indicating attributes of the tag name.

3. The search support server according to claim 2, further comprising search data generation means for generating search data that is the basis of the guide data to be displayed in the signage window and/or the speech window, wherein the search data generation means comprises:
extraction/display means for extracting, based on the substance of the content, a simple sentence or word group indicating that substance, and displaying the corresponding content when selection of the extracted simple sentence or word group is accepted;
tag input receiving means for receiving, based on the displayed content, an input of the tag name and a simple sentence or word group indicating the content of the tag name;
pointer searching means for searching, based on the displayed content, for pointer information indicating a specific portion of the content;
pointer allocation receiving means for receiving allocation of the retrieved pointer information for each tag name whose input has been received; and
attribute information receiving means for receiving, for each tag name whose input has been received, an input of the attribute information that can be uttered at the time of searching.

4. The search support server according to claim 3, wherein the extraction/display means comprises:
content transcription means for examining the content, extracting the character strings it contains as simple sentences or word groups, and outputting them as text data; and
content display means for, upon receiving a selection of the output text data, displaying the content corresponding to the selected text data.

5. The search support server according to claim 3, wherein the guide data is generated by associating the tag information included in the search data with search data identification information for identifying the search data.

6. The search support server according to any one of claims 3 to 5, comprising, in order to narrow down the tag names that are displayed in the speech window and can be selected by a user:
utterance receiving means for receiving an input of voice data uttered by the user;
metatag speech recognition means for converting the received voice data into text data, calculating the degree of matching of the converted text data against the tag names of the guide data and the simple sentences or word groups indicating the content of the tag names, identifying the tag name with the highest calculated degree of matching, and, when the degree of matching is greater than a predetermined value, displaying the specific portion of the content corresponding to that tag name; and
narrowing-down speech recognition means for extracting, based on the received voice data, the attribute information that matches from the tag information of the guide data, narrowing down the guide data by the logical product of the one or more matching pieces of attribute information, and outputting the tag names and the attribute information of the narrowed-down guide data,
wherein the metatag speech recognition means and the narrowing-down speech recognition means are executed in parallel.

7. The search support server according to claim 6, wherein
the metatag speech recognition means calculates the degree of matching between the converted text data and a first data set formed of all the tag names and the simple sentences or word groups indicating the content of the tag names, and
the narrowing-down speech recognition means narrows down the guide data by the logical product of the attribute information extracted as a recognition result using, as a speech recognition filter, a second data set formed of the attribute information associated with all tag names.

8. The search support server according to claim 7, wherein the narrowing-down speech recognition means updates the second data set with the attribute information of the narrowed-down guide data.

9. A search support method executable by a search support server that supports searches for content published on a website, the method comprising, by the search support server:
a step of causing a terminal device connected so as to be capable of data communication to display a signage window that selectively displays guide data for guiding utterances for selecting tag information indicating a specific portion of content, and a speech window in which the guide data to be displayed can be narrowed down from among the guide data by a user's utterance; and
a step of accepting the user's utterance of the tag information displayed in the signage window and/or the speech window, narrowing down the displayed tag information, and, upon accepting selection of the narrowed-down tag information, causing the terminal device to display the specific portion of the content corresponding to the selected tag information.

10. A computer program executable by a search support server that supports searches for content published on a website, the computer program causing the search support server to function as:
means for causing a terminal device connected so as to be capable of data communication to display a signage window that selectively displays guide data for guiding utterances for selecting tag information indicating a specific portion of content, and a speech window in which the guide data to be displayed can be narrowed down from among the guide data by a user's utterance; and
means for accepting the user's utterance of the tag information displayed in the signage window and/or the speech window, narrowing down the displayed tag information, and, upon accepting selection of the narrowed-down tag information, causing the terminal device to display the specific portion of the content corresponding to the selected tag information.