JP2020190836A

JP2020190836A - Video signal processing apparatus and video signal processing method

Info

Publication number: JP2020190836A
Application number: JP2019094658A
Authority: JP
Inventors: 大石丸; Masaru Ishimaru
Original assignee: Toshiba Visual Solutions Corp
Current assignee: Toshiba Visual Solutions Corp
Priority date: 2019-05-20
Filing date: 2019-05-20
Publication date: 2020-11-26
Anticipated expiration: 2039-05-20
Also published as: JP7242423B2

Abstract

To provide a video signal processing apparatus that, by combining a voice control technique with a parental control function, is easier to use than conventional apparatus.

SOLUTION: The video signal processing apparatus of one embodiment comprises: a voice signal input unit which receives a voice instruction; a speaker specification unit which specifies the speaker who is the source of the voice instruction received by the voice signal input unit, and his or her age; a determination unit which determines, on the basis of the specified age and the voice instruction, whether or not the voice instruction is acceptable with respect to preliminarily set restriction information; a control execution unit which executes the voice instruction when the determination unit determines that the voice instruction is acceptable; and a warning output unit which outputs a warning when the determination unit determines that the voice instruction should be rejected.

SELECTED DRAWING: Figure 1

Description

The present embodiment relates to a video signal processing device and a video signal processing method.

In recent years, improvements in voice recognition technology have increased the number of devices that can be controlled by voice, and video signal processing devices are no exception. For example, a user can now turn the power on or off or change the channel simply by saying something like "turn on the power" or "change the channel", without operating a remote controller.

Video signal processing devices also have a parental control (viewing restriction) function. Program content with extreme material is assigned a restricted age as part of its program information. Using that program information, viewing can be restricted for children who have not reached the permitted age.

Japanese Unexamined Patent Publication No. 2005-223846

The parental control mechanism in conventional video signal processing devices has mainly worked as follows: age information for unrestricted viewing is set in the device in advance, and if the age information carried by a program falls outside the age range set in the device, the program cannot be viewed until an unlock code is entered.

For example, if the parental control age in the video signal processing device is set to 14 or older, content with a target age of 14 or younger can be viewed without restriction regardless of the viewer's age. However, to view content labeled with a target age of 19, for example, the parental lock must be released regardless of the viewer's age, even if the viewer is 20 or older.

Thus, on the video signal processing device, a person who is actually permitted to watch must enter an unlock code whenever a program cannot be viewed because of the restriction attached to it. In addition, with the parental lock scheme in the above example, if a child whose viewing should be restricted learns the unlock code, the child can freely lift the restriction and watch the program until the code is changed.

Therefore, an object of this embodiment is to provide a video signal processing device and a video signal processing method that, by combining voice control technology with the parental control function, are easier to use than before and make the parental lock function more reliable.

An object of another embodiment is to provide a video signal processing device and a video signal processing method that, when executing a process based on a voice command, identify the person who issued the command in advance and apply or release the parental control restriction based on that person's age information, without requiring an unlock code to be entered.

According to one embodiment, a video signal processing device can be provided that comprises:
a voice signal input unit to which a voice command is input;
a speaker identification unit that identifies the speaker who is the source of the voice command input from the voice signal input unit, and the age of that speaker;
a determination unit that determines, based on the identified age and the voice command, whether the voice command should be allowed or denied with respect to preset restriction information;
a control execution unit that executes the voice command when the determination unit determines that the voice command should be allowed; and
a warning output unit that outputs a warning when the determination unit determines that the voice command should be denied.

In the path that feeds the voice command to the determination unit, a voice recognition unit that converts voice-band audio data into text and a natural language understanding unit that converts the text data into machine language (machine commands) are used.

FIG. 1 is a configuration diagram showing the overall configuration of a video signal processing device according to an embodiment of the present invention.
FIG. 2 is a partial configuration diagram showing the blocks of the video signal processing device of FIG. 1 that function when user information is preset.
FIG. 3 is an explanatory diagram of the video signal processing device of FIG. 1 when a user who is not subject to parental restrictions selects a channel.
FIG. 4 is an explanatory diagram of the video signal processing device of FIG. 1 when a user who is subject to parental restrictions selects a channel.
FIG. 5 is a flowchart illustrating one operation example of the video signal processing device of FIG. 1.
FIG. 6 is a flowchart illustrating another operation example of the video signal processing device of FIG. 1.

Embodiments are described below with reference to the drawings. FIG. 1 shows one embodiment, applied here to a broadcast receiving device 100 as an example. The basic configuration 50 of the receiving system in the broadcast receiving device 100 includes a tuner device 51, a video/audio data processing device 53, a video signal output unit 54, an audio signal output unit 55, a recording/playback medium connection unit 56, and the like. A network connection 52 is also provided, allowing communication with an external server or the like. For example, the external server has a video-on-demand distribution function, so the viewer can watch distributed video. Viewing logs can also be uploaded to the external server. The external server can analyze viewing logs from many broadcast receiving devices and provide viewers with information such as currently popular recommended programs and commercials.

The video signal processing device of this embodiment includes a microphone (voice signal input unit) 11. Data acquired by the microphone 11 is input to a voice recognition unit 12 and a feature amount detection unit 14. The voice recognition unit 12 converts voice-band audio data into text and passes the text data to a natural language understanding unit 13, which converts it into machine language (machine commands). In other words, the voice recognition unit 12 and the natural language understanding unit 13 interpret the spoken content using dictionary data and the like, and output a command. The command derived from the utterance understood by the natural language understanding unit 13 is sent to the parental control determination unit 17. The voice recognition unit 12 and/or the natural language understanding unit 13 may be provided on an external server reached via the Internet.

Meanwhile, the feature amount detection unit 14 analyzes, for example, the speaker's voiceprint and outputs voiceprint analysis data, which is input to the speaker identification unit 15. Voiceprint analysis data (feature amounts) for each individual speaker is registered in the database 16 in advance. In the example in the figure, the feature amounts of user A (age 43), user B (age 41), user C (age 15), and user D (age 10) are registered in the database 16.

The speaker identification unit 15 compares the newly input voiceprint analysis data with each of the registered voiceprint analysis data sets in the database 16 in turn, and identifies the speaker corresponding to the new voiceprint analysis data.
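The comparison against registered voiceprints can be illustrated with a minimal sketch. The patent does not state how feature amounts are represented or compared, so the fixed-length feature vectors, the cosine similarity measure, the 0.9 threshold, and the names `RegisteredUser`, `REGISTRY`, and `identify_speaker` are all assumptions made for illustration. Only the user names and ages come from the example in the text.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class RegisteredUser:
    name: str
    age: int
    features: Tuple[float, ...]  # hypothetical voiceprint feature vector


# Users and ages follow the example in the text; the vectors are made up.
REGISTRY = [
    RegisteredUser("A", 43, (1.0, 0.0, 0.0, 0.0)),
    RegisteredUser("B", 41, (0.0, 1.0, 0.0, 0.0)),
    RegisteredUser("C", 15, (0.0, 0.0, 1.0, 0.0)),
    RegisteredUser("D", 10, (0.0, 0.0, 0.0, 1.0)),
]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_speaker(
    features: Sequence[float],
    registry: Sequence[RegisteredUser] = REGISTRY,
    threshold: float = 0.9,  # assumed minimum similarity for a match
) -> Optional[RegisteredUser]:
    """Compare against each registered user in turn and return the best match."""
    best: Optional[RegisteredUser] = None
    best_score = threshold
    for user in registry:
        score = cosine_similarity(features, user.features)
        if score >= best_score:
            best, best_score = user, score
    return best
```

Because each record carries the age alongside the features, a successful match yields the speaker's age at the same time. A return value of `None` models an utterance that matches no registered user.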

The age of the identified speaker is also registered in the database, so the age of the speaker (user) who has just spoken becomes known.

The parental control determination unit 17 receives the identified speaker, the speaker's age, and the program information of programs that should be subject to parental control (lock).

A program that should be subject to parental control (lock) has a restricted age specified in its program information. That is, the program information is stored in the database 18 with the channel and a mark (identification data) indicating that it should be restricted. The database 18 stores pairs of the age limit and the channel to be restricted. In the example in the figure, the Ych channel is restricted for ages 12 and under, and the Zch channel is restricted for ages 18 and under. The program information may also include the program information of programs recorded on the video playback device 22.

The parental control determination unit 17 receives the identified speaker, the speaker's age, and the information on programs that should be subject to parental control (lock), and makes the following determination.

For example, suppose the new speaker is user D, who says "Switch to the Ych channel". User D is 12 or younger, and the Ych channel is restricted for ages 12 and under, so the parental control determination unit 17 determines that the channel cannot be switched and notifies the control execution unit 21 of the result. The control execution unit 21 then outputs a warning such as "This program cannot be viewed" through the warning output unit (display and/or audio) 23. It also outputs an output (or playback) stop signal to the recording/playback device 23.
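The determination described above can be sketched as a lookup followed by an age comparison. This is a minimal illustration, not the patent's implementation: the names `RESTRICTIONS`, `Decision`, and `judge_channel_change` are hypothetical, and the table simply mirrors the Ych (12 and under) and Zch (18 and under) example in the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Channel -> age at or below which viewing is restricted (example values from the text).
RESTRICTIONS: Dict[str, int] = {"Y": 12, "Z": 18}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    warning: Optional[str] = None  # message for the warning output when denied


def judge_channel_change(
    speaker_age: int,
    channel: str,
    restrictions: Dict[str, int] = RESTRICTIONS,
) -> Decision:
    """Allow the switch unless the channel is restricted for the speaker's age."""
    limit = restrictions.get(channel)
    if limit is not None and speaker_age <= limit:
        return Decision(False, "This program cannot be viewed")
    return Decision(True)
```

With this table, user D (age 10) asking for the Ych channel is denied and receives the warning text, while user A (age 43) asking for the same channel is allowed.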

The operating sequence of the above blocks is controlled by the system control unit 30.

As described above, this system can identify individuals by voiceprint or similar voice characteristics. As a result, "release" and "restriction" are reliably applied to parental-controlled programs for each individual and for each receiving device.

FIG. 2 shows a configuration with which, for example, users A, B, C, and D in a household each build their voice features and age information into a management table. Parts common to FIG. 1 are given the same reference numerals as in FIG. 1.

The user switches the device to user registration mode by operating, for example, a remote controller (not shown) or specific operation keys on the broadcast receiving device 100. The key input is preferably a specific personal identification number known only to the father (for example, user A) or mother (for example, user B) who manages the broadcast receiving device 100. When the device enters registration mode, the speaker age information setting unit 51 starts up and the device enters voice input mode for the speaker to be registered (for example, user C or D).

In this case, it is preferable that the administrator (for example, user A) first registers his or her own voice feature amount as administrator. This is because, when the voice feature amounts of the other registrants (for example, user C or D) are registered afterwards, the administrator may give spoken instructions to the person being registered. In this registration mode, even if the administrator's voice is detected, the speaker identification unit 15 ignores it, recognizes the newly detected speaker's feature amount as a new user, and registers it in the database 16. It then waits for the user's age information to be entered.

The age information is detected by the speaker age information setting unit 51 and registered in the database 17. Age information can be entered, for example, with the remote controller or by voice. With voice input, the device detects the age spoken by the user whose voiceprint was just detected; for example, it interprets an utterance such as "10 years old" or "6 years old" to determine the age. As a result, the user, the user's age, and the user's voice feature amount data are registered in the database 16 in association with one another.
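The registration step associates a user, an age, and voice feature data in one record. The sketch below shows one possible shape for that record and for reading an age out of recognized text. The names `UserRecord`, `UserDatabase`, and `parse_spoken_age` are hypothetical, and the assumption that the recognized text contains the age as digits (as in "10 years old") is ours, not the patent's.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence, Tuple


@dataclass
class UserRecord:
    name: str
    age: int
    features: Tuple[float, ...]  # hypothetical voice feature amount data


@dataclass
class UserDatabase:
    records: Dict[str, UserRecord] = field(default_factory=dict)

    def register(self, name: str, age: int, features: Sequence[float]) -> UserRecord:
        """Store user, age, and features together; re-registering a name overwrites it."""
        record = UserRecord(name, age, tuple(features))
        self.records[name] = record
        return record


def parse_spoken_age(text: str) -> Optional[int]:
    """Return the first run of digits in the recognized text, or None if there is none."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None
```

For example, `parse_spoken_age("10 years old")` gives `10`, which can then be passed to `UserDatabase.register` together with the features captured for that speaker.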

During the above registration, operation guidance is output as guide audio and/or on-screen text under the sequence control of the system control unit 30.

FIG. 3 shows an example in which user C (age 15) requests a switch to channel Xch by voice, for example by saying "Set the channel to Xch". The utterance is converted into text by the voice recognition unit 12, converted into a command word (for example, Chang ch: X) by the natural language understanding unit 13, and input to the parental control determination unit 17. In this case, the speaker is identified in the database 16 as being 15 years old, and the natural language understanding unit 13 detects that the speaker has requested a switch to the Xch channel.
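The conversion from recognized text to a command can be sketched with a simple pattern match. A real natural language understanding unit would be far more capable; this only shows the shape of the output handed to the determination unit. The `Command` type, the `understand` function, the `change_channel` action name, and the assumption that a channel is spoken as one capital letter or up to three digits followed by "ch" are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed naming scheme: one capital letter or up to three digits, then "ch".
_CHANNEL_PATTERN = re.compile(r"\b([A-Z]|\d{1,3})ch\b")


@dataclass(frozen=True)
class Command:
    action: str
    channel: str


def understand(text: str) -> Optional[Command]:
    """Map recognized text to a channel-change command, or None if none is found."""
    match = _CHANNEL_PATTERN.search(text)
    if match is None:
        return None
    return Command(action="change_channel", channel=match.group(1))
```

Here `understand("Set the channel to Xch")` produces a `Command` whose `channel` is `"X"`, and an utterance with no channel name produces `None`.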

The parental control determination unit 17 refers to the database 18 and determines whether the Xch channel has a restriction based on program information. Because the database 18 contains no viewing restriction for the Xch channel, the parental control determination unit 17 instructs the control execution unit 21 to switch to the Xch channel. Likewise, if user D requests the Xch channel, the switch to the Xch channel is executed.

FIG. 4 shows an example in which user D (age 10) requests a switch to the Zch channel by voice. In this case, the speaker is identified in the database 16 as being 10 years old, and the natural language understanding unit 13 detects that the speaker has requested a switch to the Zch channel.

The parental control determination unit 17 refers to the database 18 and determines whether the Zch channel has a restriction based on program information. In the database 18, the Zch channel is restricted for people aged 18 and under. The parental control determination unit 17 therefore notifies the control execution unit 21 that the switch to the Zch channel is to be refused. In this case, the warning output unit 23 outputs a warning such as "This program cannot be viewed". In FIGS. 3 and 4, parts identical to those in FIG. 1 are given the same reference numerals and their description is omitted.

The control execution unit 21 can carry out program viewing restriction in various ways. For example, it can block reception of the restricted program's channel itself, receive the channel but not demodulate the program, demodulate the program but stop the output, or replace the entire output with an all-black or all-white image.

This system is not limited to the above embodiment. The voice-command device operation function and the speaker identification function need not be provided inside the broadcast receiving device 100 and may instead be provided in an external device reached over a network. The broadcast receiving device 100 may therefore contain, for example, the microphone 11, the parental control determination unit 17, the control execution unit 21, the warning output unit 23, the system control unit 30, and the basic configuration 50, while the voice recognition unit 12, the natural language understanding unit 13, the feature amount detection unit 14, the speaker identification unit 15, the databases 16 and 18, and the like are provided externally. The parental control determination unit 17 may also be provided externally.

In the description above, the speaker identification unit 15 stores (learns) in advance the voices of the users who operate the device and judges a speaker to be the same person based on similarity to a stored voice. However, it is also possible to skip pre-learning of user voices and instead estimate the age group of a voice from the feature amounts (frequency components and the like) of the voice command data. In this case it is difficult to estimate an exact age, but control is still feasible in which a clearly child-like voice is subject to viewing restriction while a clearly adult voice is not.
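The age-group variant can be sketched as a three-way classification with an explicit "uncertain" band between the clear cases. The patent mentions only "frequency components and the like", so using a single fundamental-frequency value as the feature is an assumption, and the thresholds of 180 Hz and 300 Hz are illustrative placeholders rather than validated values. The function names and the `restrict_when_uncertain` switch are likewise hypothetical; the text does not say how an ambiguous voice should be handled.

```python
def classify_age_group(
    f0_hz: float,
    adult_max_hz: float = 180.0,  # placeholder: at or below this counts as clearly adult
    child_min_hz: float = 300.0,  # placeholder: at or above this counts as clearly child
) -> str:
    """Return "child", "adult", or "uncertain" from a fundamental frequency in Hz."""
    if f0_hz >= child_min_hz:
        return "child"
    if f0_hz <= adult_max_hz:
        return "adult"
    return "uncertain"


def should_restrict(group: str, restrict_when_uncertain: bool = False) -> bool:
    """Restrict clearly child voices, never clearly adult ones; the rest is policy."""
    if group == "child":
        return True
    if group == "adult":
        return False
    return restrict_when_uncertain
```

Keeping the uncertain band separate makes the policy choice visible: a household could fall back to an unlock code or to registered-speaker identification for voices that are neither clearly child nor clearly adult.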

In the description above, the speaker's age information is set in advance. A function may also be added that, given a date of birth entered in advance, automatically updates the age information when a birthday arrives. The viewing content is assumed to be programs provided by broadcasters, such as terrestrial digital and BS broadcasts, but the system is applicable to any content that carries restricted-age information, including network streaming content such as YouTube (registered trademark) and Netflix (registered trademark).
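The automatic age update amounts to deriving the age from the stored date of birth each time it is needed, instead of storing the age itself. A minimal sketch follows; the function name `age_on` is hypothetical, and treating a February 29 birthday as falling on March 1 in non-leap years is one common convention assumed here.

```python
from datetime import date


def age_on(birth_date: date, today: date) -> int:
    """Age in whole years on the given day."""
    # Tuple comparison checks whether this year's birthday has already arrived.
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)
```

Calling `age_on(birth_date, date.today())` at determination time means the age used for the restriction check changes on the birthday without any manual update.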

When a voice command such as "Set the channel to ..." is actually caught by a viewing restriction, various behaviors can be added: not changing the channel, changing the channel but showing a black screen, changing to the first viewable channel after the specified one, announcing by voice that the channel change failed, or displaying on screen that the channel change failed.
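Of the fallback behaviors listed above, "the first viewable channel after the specified one" is the only one that needs a search. The sketch below assumes an ordered channel list and wraps around to the start of the list when the end is reached; the wrap-around and the name `next_viewable_channel` are assumptions, since the text does not define channel order or what happens past the last channel.

```python
from typing import Dict, Optional, Sequence


def next_viewable_channel(
    requested: str,
    channel_order: Sequence[str],
    speaker_age: int,
    restrictions: Dict[str, int],
) -> Optional[str]:
    """Return the first channel after `requested` that the speaker may view."""
    start = channel_order.index(requested)
    count = len(channel_order)
    # Visit every other channel once, starting just after the requested one.
    for offset in range(1, count):
        candidate = channel_order[(start + offset) % count]
        limit = restrictions.get(candidate)
        if limit is None or speaker_age > limit:
            return candidate
    return None  # no other channel is viewable for this speaker
```

With the channel order X, Y, Z and the example restrictions (Ych for ages 12 and under, Zch for ages 18 and under), a 10-year-old asking for Ych would land on Xch, because Zch is also restricted and the search wraps around.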

As described above, because this system can recognize both the user's voice and the user's age, the following operations are also possible.

FIG. 5 shows a case in which, for example, a child (age 10) says "Switch to the Ych channel". In this case, the system refuses to show the Ych channel to the 10-year-old child and issues a warning (steps As1, As2). However, if the father is in the same room and judges that the child may watch the current program on Ych, the father can say "Switch to the Ych channel" himself and have the switch to the Ych channel carried out (steps As3, As4).

FIG. 6 shows another operation example. Suppose the father (age 43) is watching a program on the Ych channel (step Bs1). Then, for example, a child (age 10) enters the room and the system of this embodiment recognizes the child's voice (step Bs2). The voice in this case is not limited to a voice command addressed to the broadcast receiving device 100.

The system then performs one of the following: it automatically switches to the Xch channel (a program with no restriction) (step Bs31), outputs a warning caption or sound (step Bs32), or hides the image (all black or all white) (step Bs33). It then waits for the next operation (step Bs34) and ends the process when one occurs. The user (administrator) can select and set in advance which of steps Bs31, Bs32, and Bs33 is executed. Alternatively, one of them may be set when the broadcast receiving device 100 is shipped.
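The choice among steps Bs31, Bs32, and Bs33 can be sketched as a preset policy applied whenever a restricted speaker's voice is detected during viewing. The enum `OnRestrictedVoice`, the function `react_to_detected_voice`, the returned action strings, and the `fallback_channel` parameter are hypothetical names for illustration; only the three options and their step labels come from the text.

```python
from enum import Enum
from typing import Dict, Tuple


class OnRestrictedVoice(Enum):
    SWITCH_CHANNEL = "Bs31"  # switch to an unrestricted channel
    WARN = "Bs32"            # output a warning caption or sound
    BLANK_SCREEN = "Bs33"    # hide the image (all black or all white)


def react_to_detected_voice(
    current_channel: str,
    detected_age: int,
    restrictions: Dict[str, int],
    policy: OnRestrictedVoice,
    fallback_channel: str = "X",  # assumed unrestricted channel, as in the example
) -> Tuple[str, str]:
    """Return (action, channel) for a voice detected while a program is showing."""
    limit = restrictions.get(current_channel)
    if limit is None or detected_age > limit:
        return ("continue", current_channel)
    if policy is OnRestrictedVoice.SWITCH_CHANNEL:
        return ("switch", fallback_channel)
    if policy is OnRestrictedVoice.WARN:
        return ("warn", current_channel)
    return ("blank", current_channel)
```

Storing the policy as a single setting matches the description that the administrator selects one option in advance or that one is set at shipment.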

Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. Modifications of these embodiments are included in the scope and gist of the invention, and in the invention described in the claims and its equivalents. Furthermore, for each constituent element of the claims, expressing the element in divided form, expressing several elements together, or expressing a combination of these also falls within the scope of the present invention. A plurality of embodiments may also be combined, and examples formed by such combinations are likewise within the scope of the invention.

The device of the present invention also applies when the claims are expressed as control logic, as a program including instructions to be executed by a computer, or as a computer-readable recording medium on which those instructions are recorded. The names and terms used are not limiting either; other expressions are included in the present invention as long as they have substantially the same content and purpose.

11 ... microphone, 12 ... voice recognition unit, 13 ... natural language understanding unit, 14 ... feature amount detection unit, 15 ... speaker identification unit, 17 ... parental control determination unit, 21 ... control execution unit, 20 ... basic configuration, 100 ... broadcast receiving device.

Claims

A video signal processing device comprising:
a voice signal input unit to which a voice command is input;
a speaker identification unit that identifies the speaker who is the source of the voice command input from the voice signal input unit, and the age of that speaker;
a determination unit that determines, based on the identified age and the voice command, whether the voice command should be allowed or denied with respect to preset restriction information;
a control execution unit that executes the voice command when the determination unit determines that the voice command should be allowed; and
a warning output unit that outputs a warning when the determination unit determines that the voice command should be denied.

The video signal processing device according to claim 1, wherein the voice command is input to the determination unit after its language has been recognized by a language understanding unit.

The video signal processing device according to claim 1, wherein the execution in response to the voice command is any one of display processing or reception processing of a program on a broadcast channel, or playback processing from a recording/playback device.

The video signal processing device according to claim 2 or 3, wherein at least one of the speaker identification unit, the determination unit, and the language understanding unit is arranged externally and connected via a network.

A video signal processing method comprising:
inputting a voice command to a voice signal input unit;
identifying, by a speaker identification unit, the speaker who is the source of the voice command input from the voice signal input unit and the age of that speaker;
recognizing, by a language understanding unit, the language of the voice command;
determining, by a determination unit, based on the identified age, the voice command, and the recognized language, whether the voice command should be allowed or denied with respect to preset restriction information;
executing, by a control execution unit, the voice command when the determination unit determines that the voice command should be allowed; and
outputting, by a warning output unit, a warning when the determination unit determines that the voice command should be denied.

The video signal processing method according to claim 5, wherein, even when the warning has been output by the warning output unit, if a second voice command from a different speaker is input as the voice command, the determination unit determines that the second voice command is allowed.

The video signal processing method according to claim 5, wherein, while the voice command is being executed by the control execution unit, if the determination unit determines from the age associated with a newly input voice signal that the current execution should be denied, a warning or a channel switch is decided upon.