JP2008309959A - Audio signal recording device and electronic file

Info

Publication number: JP2008309959A
Application number: JP2007156628A
Authority: JP
Inventors: Tomoki Oku; 智岐奥
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 2007-06-13
Filing date: 2007-06-13
Publication date: 2008-12-25

Abstract

PROBLEM TO BE SOLVED: To introduce a privacy protection mechanism into a recording system for audio signals.

SOLUTION: The audio signal recording device detects, from the whole section of an input original audio signal, a speech section that includes a signal component of human speech, and encrypts the signal in that speech section, thereby generating an encrypted audio signal from the original audio signal. The encrypted audio signal, decoding information for decoding the encrypted audio signal, and rights management information for switching between permitting and prohibiting both the decoding of the encrypted audio signal according to the decoding information and the playback output of the original audio signal are then stored in the electronic file in association with each other.

COPYRIGHT: (C)2009,JPO&INPIT

Description

本発明は、オーディオ信号を記録するオーディオ信号記録装置、オーディオ信号の記録及び再生を行うオーディオ信号記録再生装置、並びに、デジタルビデオカメラ等の撮像装置に関する。また本発明は、オーディオ信号を格納した電子ファイルに関する。また本発明は、上記の各装置又は電子ファイルと関連する、情報提供装置、端末装置、オーディオ信号再生装置及び電子ファイルの記録方式に関する。 The present invention relates to an audio signal recording apparatus for recording audio signals, an audio signal recording / reproducing apparatus for recording and reproducing audio signals, and an imaging apparatus such as a digital video camera. The present invention also relates to an electronic file storing an audio signal. The present invention also relates to an information providing device, a terminal device, an audio signal reproducing device, and an electronic file recording method related to each of the above devices or electronic files.

近年、デジタルビデオカメラやボイスレコーダで記録したオーディオ信号のデータを、インターネットを介して、ソーシャルネットワーキングサービス（以下、「ＳＮＳ」と略記する）や所謂ブログ等のウェブサイト上で公開することが多くなっている。これまでは、家族や知人の間だけで閲覧（視聴）していた記録データも、インターネットを介して公開することで不特定多数の人間が閲覧することが可能となる。このような背景の中、公開データに関するプライバシーを保護する技術の必要性も指摘されている。オーディオ信号に関してプライバシーの保護に重要なのは発話者の音声を含む部分であり、その部分のデータが適切に扱われれば、発話者のプライバシーが保護される。 In recent years, audio signal data recorded with digital video cameras and voice recorders has increasingly been published over the Internet on websites such as social networking services (hereinafter abbreviated as “SNS”) and so-called blogs. Recorded data that until now was viewed (or listened to) only among family members and acquaintances becomes accessible to an unspecified number of people once it is published over the Internet. Against this background, the need for technology that protects the privacy of published data has been pointed out. For an audio signal, the part that matters for privacy protection is the part containing the speaker's voice; if the data in that part is handled appropriately, the speaker's privacy is protected.

尚、オーディオ信号に関する従来技術として下記特許文献１及び２に記載された技術がある。 Note that there are techniques described in Patent Documents 1 and 2 below as conventional techniques related to audio signals.

特許文献１の技術は、著作権保護対象となる音楽情報を保護するための技術であり、人が会話しているシーンなどに対して提供可能な技術ではない。 The technique of Patent Document 1 is a technique for protecting music information that is subject to copyright protection, and is not a technique that can be provided for a scene where a person is talking.

特許文献２の技術では、提供するデータを複数のブロックに分割して各ブロックにセキュリティレベルを設定する。そして、再生時に、設定されたセキュリティレベルに応じて各ブロックのデータの提供を許可するか否かを判断する。この技術では、データ提供時にブロックを分割して各ブロックにセキュリティレベルを設定するという作業が必要となり、手間がかかる。 In the technique of Patent Document 2, the data to be provided is divided into a plurality of blocks and a security level is set for each block. At playback time, whether to permit provision of each block's data is determined according to the set security level. With this technique, the data must be divided into blocks and a security level set for each block whenever data is provided, which is laborious.

Patent Document 1: 特開２００１−７６４３１号公報 (JP 2001-76431 A)
Patent Document 2: 特開２００４−２８７５２５号公報 (JP 2004-287525 A)

そこで本発明は、オーディオ信号に関するプライバシー保護に寄与する、オーディオ信号記録装置、オーディオ信号記録再生装置、撮像装置、電子ファイル、情報提供装置、端末装置、オーディオ信号再生装置及び電子ファイルの記録方式を提供することを目的とする。 Accordingly, an object of the present invention is to provide an audio signal recording device, an audio signal recording/reproducing device, an imaging device, an electronic file, an information providing device, a terminal device, an audio signal reproducing device, and an electronic file recording method that contribute to privacy protection for audio signals.

上記目的を達成するために本発明に係るオーディオ信号記録装置は、入力された元オーディオ信号の全区間から人の音声の信号成分が含まれている音声区間を検出する音声区間検出手段と、前記元オーディオ信号の内の、前記音声区間における信号に対して暗号化処理を施すことにより、前記元オーディオ信号から暗号化オーディオ信号を生成する暗号化手段と、前記暗号化オーディオ信号と、前記暗号化オーディオ信号を復号するための復号用情報と、を互いに関連付けて格納した電子ファイルを記録手段に記録する記録制御手段と、を備えたことを特徴とする。 To achieve the above object, an audio signal recording apparatus according to the present invention comprises: speech section detecting means for detecting, from the whole section of an input original audio signal, a speech section that contains a signal component of human speech; encryption means for generating an encrypted audio signal from the original audio signal by encrypting the signal of the original audio signal in the speech section; and recording control means for recording, in recording means, an electronic file that stores the encrypted audio signal and decryption information for decrypting the encrypted audio signal in association with each other.

オーディオ信号記録装置上で電子ファイルを保存する時点で暗号化処理が実施され、電子ファイル内の音声区間におけるオーディオ信号は暗号化されるため、仮に、電子ファイルが不正に流出した場合でも、音声に関するプライバシーが保護される。また、プライバシー保護にとって重要な音声区間が自動的に検出され、音声区間に対して暗号化処理が自動的に施されるため、プライバシー保護を図るためのユーザ負担が極めて少ない。 Because the encryption process is performed at the moment the electronic file is saved on the audio signal recording apparatus, the audio signal in the speech section of the electronic file is encrypted; even if the electronic file is leaked illegitimately, privacy regarding the voice is protected. In addition, the speech section that matters for privacy protection is detected automatically and encrypted automatically, so the burden placed on the user to protect privacy is extremely small.

そして例えば、前記記録制御手段は、前記復号用情報に従って前記暗号化オーディオ信号を復号して該復号によって得られた前記元オーディオ信号を再生出力することに対する許可／禁止を切替制御するための権限管理情報を、更に、前記暗号化オーディオ信号及び前記復号用情報に関連付けて前記電子ファイルに格納する。 For example, the recording control means further stores, in the electronic file in association with the encrypted audio signal and the decryption information, authority management information for switching between permitting and prohibiting the decryption of the encrypted audio signal according to the decryption information and the reproduction output of the original audio signal obtained by that decryption.

権限管理情報を適切に設定することにより、ユーザの意図に沿った、元オーディオ信号の再生制御が可能となる。 By appropriately setting the authority management information, it is possible to control the reproduction of the original audio signal in accordance with the user's intention.

具体的には例えば、前記復号用情報は、前記全区間中の何れの区間が前記音声区間であるかを表す音声区間情報を含む。 Specifically, for example, the decoding information includes speech section information indicating which section of the entire section is the speech section.

また具体的には例えば、前記暗号化手段は、前記元オーディオ信号の内の、前記音声区間以外の区間における信号を、前記暗号化処理の対象から除外する。 More specifically, for example, the encryption unit excludes a signal in a section other than the speech section in the original audio signal from the target of the encryption process.

また例えば、前記音声区間は、互いに異なる複数の要素区間から成り、当該オーディオ信号記録装置は、前記音声の発話者と予め登録された登録話者との一致又は不一致を要素区間ごとに判別する話者認識手段を更に備え、前記復号用情報は、前記全区間中の何れの区間が前記音声区間であるかを表すとともに各要素区間に対する前記話者認識手段の判別結果をも表す音声区間情報を含み、前記記録制御手段は、前記復号用情報に従って各要素区間の前記暗号化オーディオ信号を復号して該復号によって得られた各要素区間の前記元オーディオ信号を再生出力することに対する許可／禁止を切替制御するための権限管理情報を、更に、前記暗号化オーディオ信号及び前記復号用情報に関連付けて前記電子ファイルに格納し、前記権限管理情報は、前記登録話者の音声の信号成分を含む要素区間に対する第１の権限管理情報と、それ以外の要素区間に対する第２の権限管理情報と、を個別に含む。 Also, for example, the speech section consists of a plurality of mutually different element sections, and the audio signal recording apparatus further comprises speaker recognition means for determining, for each element section, whether the speaker of the speech matches a registered speaker registered in advance. The decryption information includes speech section information that indicates which sections of the whole section are the speech section and also indicates the determination result of the speaker recognition means for each element section. The recording control means further stores, in the electronic file in association with the encrypted audio signal and the decryption information, authority management information for switching between permitting and prohibiting the decryption of the encrypted audio signal of each element section according to the decryption information and the reproduction output of the original audio signal of each element section obtained by that decryption. The authority management information separately includes first authority management information for element sections containing a signal component of the registered speaker's voice and second authority management information for the other element sections.
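The per-section arrangement described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `ElementSection` record, its field names, and the two boolean flags standing in for the first and second authority management information are hypothetical and do not come from the patent, which does not fix a concrete data format here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ElementSection:
    """One element section of the speech section (hypothetical layout)."""
    start: int                 # first sample index of the section
    end: int                   # one past the last sample index
    registered_speaker: bool   # speaker recognition result for this section


def sections_permitted(sections: List[ElementSection],
                       allow_registered: bool,
                       allow_others: bool) -> List[ElementSection]:
    """Return the element sections whose decryption and playback are permitted.

    allow_registered stands in for the first authority management information
    (sections containing the registered speaker's voice); allow_others stands
    in for the second (all other element sections).
    """
    permitted = []
    for section in sections:
        allowed = allow_registered if section.registered_speaker else allow_others
        if allowed:
            permitted.append(section)
    return permitted
```

With one flag per class of section, a user could, for instance, keep the registered speaker's own voice protected while releasing the other element sections, or the reverse.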

また、本発明に係るオーディオ信号記録再生装置は、上記のオーディオ信号記録装置を備えている。そして、前記権限管理情報は、認証コードを含み、当該オーディオ信号記録再生装置と他のオーディオ信号記録再生装置との間で互いに異なる固有コードが当該オーディオ信号記録再生装置に予め与えられており、当該オーディオ信号記録再生装置は、オーディオ信号を再生出力する再生出力手段と、前記復号用情報に基づいて前記暗号化オーディオ信号を復号する復号処理手段と、前記認証コードと当該オーディオ信号記録再生装置に対する固有コードとを照合する照合手段と、前記照合手段による照合結果に基づいて、前記復号処理手段の復号によって得られた前記元オーディオ信号の前記再生出力手段での再生出力を許可するか否かを判別する判別手段と、を備え、前記判別手段の判別結果に応じて前記復号処理手段及び前記再生出力手段を制御することを特徴とする。 An audio signal recording/reproducing apparatus according to the present invention includes the above audio signal recording apparatus. The authority management information includes an authentication code, and the audio signal recording/reproducing apparatus is given in advance a unique code that differs from those of other audio signal recording/reproducing apparatuses. The audio signal recording/reproducing apparatus comprises: reproduction output means for reproducing and outputting an audio signal; decryption processing means for decrypting the encrypted audio signal based on the decryption information; collation means for collating the authentication code with the unique code of the audio signal recording/reproducing apparatus; and determination means for determining, based on the collation result of the collation means, whether to permit the reproduction output means to reproduce and output the original audio signal obtained by the decryption of the decryption processing means. The decryption processing means and the reproduction output means are controlled according to the determination result of the determination means.
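The collation and determination steps can be pictured with the short sketch below. It assumes, purely for illustration, that both codes are strings and that a simple equality match grants permission; the function names are hypothetical, and returning `None` when playback is refused is an assumption, since this passage does not say what the apparatus outputs in that case.

```python
import hmac
from typing import Callable, Optional


def codes_match(auth_code: str, unique_code: str) -> bool:
    """Collate the file's authentication code with the device's unique code."""
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(auth_code.encode("utf-8"),
                               unique_code.encode("utf-8"))


def reproduce(encrypted: bytes,
              auth_code: str,
              unique_code: str,
              decrypt: Callable[[bytes], bytes]) -> Optional[bytes]:
    """Decrypt for playback only when the collation succeeds.

    Returns the restored original signal, or None when playback of the
    original signal is not permitted (an assumed convention).
    """
    if codes_match(auth_code, unique_code):
        return decrypt(encrypted)
    return None
```

Because the unique code differs from device to device, a file copied to another apparatus fails the collation there, which is consistent with the protection against a lost recording medium described next.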

これにより、電子ファイルを記録した記録媒体の紛失等があった場合でも、音声に関するプライバシーが保護される。 Thereby, even when the recording medium on which the electronic file is recorded is lost, the privacy regarding the voice is protected.

また、本発明に係る撮像装置は、被写体に応じた画像を取得する。そして、上記のオーディオ信号記録装置又はオーディオ信号記録再生装置を備えている。 In addition, the imaging apparatus according to the present invention acquires an image corresponding to a subject. The audio signal recording apparatus or the audio signal recording / reproducing apparatus is provided.

また、本発明に係る電子ファイルは、元オーディオ信号の内の、人の音声の信号成分が含まれている音声区間における信号に対して暗号化処理を施すことによって得られた暗号化オーディオ信号のデータと、前記暗号化オーディオ信号を復号するための復号用情報のデータと、を互いに関連付けて格納したことを特徴とする。 In addition, the electronic file according to the present invention is an encrypted audio signal obtained by performing encryption processing on a signal in a voice section including a signal component of human voice in the original audio signal. Data and decryption information data for decrypting the encrypted audio signal are stored in association with each other.

そして例えば、前記復号用情報に従って前記暗号化オーディオ信号を復号して該復号によって得られた前記元オーディオ信号を再生出力することに対する許可／禁止を切替制御するための権限管理情報のデータを、更に、前記暗号化オーディオ信号のデータ及び前記復号用情報のデータに関連付けて電子ファイルに格納するとよい。 For example, data of authority management information for switching between permitting and prohibiting the decryption of the encrypted audio signal according to the decryption information and the reproduction output of the original audio signal obtained by that decryption may further be stored in the electronic file in association with the data of the encrypted audio signal and the data of the decryption information.

また、本発明に係る情報提供装置は、上記の電子ファイルを提供元装置から受け取り、前記提供元装置と所定の関係を有し且つオーディオ信号を再生出力する再生出力手段を備えた端末装置からの送信要求に従って通信網を介して前記電子ファイルに基づく情報を前記端末装置に送信する情報提供装置であって、前記電子ファイル内の復号用情報に基づいて前記電子ファイル内の前記暗号化オーディオ信号を復号する復号処理手段を備え、前記送信要求があった際、前記電子ファイル内の権限管理情報と前記関係に基づいて、前記復号処理手段の復号によって得られた元オーディオ信号の前記端末装置に対する送信を許可するか否かを判別し、その判別結果に応じて前記端末装置への送信内容を制御することを特徴とする。 An information providing apparatus according to the present invention receives the above electronic file from a providing-source apparatus and, in response to a transmission request from a terminal device that has a predetermined relationship with the providing-source apparatus and includes reproduction output means for reproducing and outputting an audio signal, transmits information based on the electronic file to the terminal device via a communication network. The information providing apparatus comprises decryption processing means for decrypting the encrypted audio signal in the electronic file based on the decryption information in the electronic file. When the transmission request is received, the apparatus determines, based on the authority management information in the electronic file and on the relationship, whether to permit transmission to the terminal device of the original audio signal obtained by the decryption of the decryption processing means, and controls the content transmitted to the terminal device according to the determination result.

これにより、ＳＮＳ等において、オーディオ信号に関するプライバシーを保護する仕組みを導入することが可能である。 Thereby, it is possible to introduce a mechanism for protecting privacy related to audio signals in SNS and the like.

具体的には例えば、当該情報提供装置は、当該情報提供装置は、前記関係を事前に認識しており、前記権限管理情報と前記関係に応じて、前記端末装置に、第１及び第２の権限を含む複数段階の権限の内の何れかを与え、前記第１の権限を前記端末装置に与えているときにおいて、前記送信要求があった際、前記復号処理手段の復号によって得られた元オーディオ信号を前記端末装置に対して送信する一方、前記第２の権限を前記端末装置に与えているときにおいて、前記送信要求があった際、前記復号処理手段の復号によって得られた元オーディオ信号の前記端末装置に対する送信を禁止する。 Specifically, for example, the information providing apparatus recognizes the relationship in advance and, according to the authority management information and the relationship, grants the terminal device one of a plurality of authority levels including a first authority and a second authority. When the transmission request is received while the terminal device holds the first authority, the apparatus transmits to the terminal device the original audio signal obtained by the decryption of the decryption processing means. When the transmission request is received while the terminal device holds the second authority, the apparatus prohibits transmission of that original audio signal to the terminal device.

或いは例えば、当該情報提供装置は、当該情報提供装置は、前記関係を事前に認識しており、前記権限管理情報と前記関係に応じて、前記端末装置に、第１、第２及び第３の権限を含む複数段階の権限の内の何れかを与え、前記第１の権限を前記端末装置に与えているときにおいて、前記送信要求があった際、前記復号処理手段の復号によって得られた元オーディオ信号を前記端末装置に対して送信し、前記第２の権限を前記端末装置に与えているときにおいて、前記送信要求があった際、前記元オーディオ信号から第１加工オーディオ信号を生成して該第１加工オーディオ信号を前記端末装置に対して送信し、前記第３の権限を前記端末装置に与えているときにおいて、前記送信要求があった際、前記元オーディオ信号から第２加工オーディオ信号を生成して該第２加工オーディオ信号を前記端末装置に対して送信し、前記第１加工オーディオ信号は、前記元オーディオ信号の前記音声区間における音声の特徴を変化させることによって生成され、前記第２加工オーディオ信号は、前記元オーディオ信号の前記音声区間から音声の信号成分を排除することによって生成される。 Alternatively, for example, the information providing apparatus recognizes the relationship in advance and, according to the authority management information and the relationship, grants the terminal device one of a plurality of authority levels including a first, a second, and a third authority. When the transmission request is received while the terminal device holds the first authority, the apparatus transmits to the terminal device the original audio signal obtained by the decryption of the decryption processing means. When the transmission request is received while the terminal device holds the second authority, the apparatus generates a first processed audio signal from the original audio signal and transmits it to the terminal device. When the transmission request is received while the terminal device holds the third authority, the apparatus generates a second processed audio signal from the original audio signal and transmits it to the terminal device. The first processed audio signal is generated by changing the characteristics of the voice in the speech section of the original audio signal, and the second processed audio signal is generated by removing the voice signal component from the speech section of the original audio signal.
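The three-level behaviour can be summarised with the following sketch. The `Authority` enum, the function names, and the sample-list representation are hypothetical. The voice-altering step is passed in as a callable because this passage does not specify it, and zeroing the speech section is only a crude stand-in for removing the voice signal component (a real implementation would presumably try to keep the background sound).

```python
from enum import Enum
from typing import Callable, List, Sequence, Tuple

Section = Tuple[int, int]  # [start, end) sample indices of a speech section


class Authority(Enum):
    FIRST = 1   # receives the original audio signal
    SECOND = 2  # receives the first processed audio signal (voice altered)
    THIRD = 3   # receives the second processed audio signal (voice removed)


def mute_sections(signal: Sequence[float], sections: List[Section]) -> List[float]:
    """Crude stand-in for voice removal: silence every speech section."""
    out = list(signal)
    for start, end in sections:
        out[start:end] = [0.0] * (end - start)
    return out


def payload_for(level: Authority,
                original: Sequence[float],
                sections: List[Section],
                alter_voice: Callable[[Sequence[float], List[Section]], List[float]],
                ) -> List[float]:
    """Choose what to transmit to the terminal for the granted authority."""
    if level is Authority.FIRST:
        return list(original)
    if level is Authority.SECOND:
        return alter_voice(original, sections)
    return mute_sections(original, sections)
```

The two-level variant in the preceding paragraph corresponds to keeping only `FIRST` and a level that refuses to send the original signal at all.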

また、本発明に係る端末装置は、オーディオ信号を再生出力する再生出力手段を備え、上記の情報提供装置から通信網を介して前記電子ファイルに基づく情報を受け取って、受け取った情報に基づくオーディオ信号を前記再生出力手段にて再生出力することを特徴とする。 Further, the terminal device according to the present invention includes reproduction output means for reproducing and outputting an audio signal, receives information based on the electronic file from the information providing device via the communication network, and receives an audio signal based on the received information. Is reproduced and output by the reproduction output means.

また、本発明に係るオーディオ信号再生装置は、上記の電子ファイルを受け取るファイル入力手段と、オーディオ信号を再生出力する再生出力手段と、を備えている。そして、前記電子ファイル内の権限管理情報は、認証コードを含み、当該オーディオ信号再生装置は、前記電子ファイル内の復号用情報に基づいて前記電子ファイル内の暗号化オーディオ信号を復号する復号処理手段と、前記認証コードと当該オーディオ信号再生装置に登録されたコードとを照合する照合手段と、前記照合手段による照合結果に基づいて、前記復号処理手段の復号によって得られた前記元オーディオ信号の前記再生出力手段での再生出力を許可するか否かを判別する判別手段と、を備え、前記判別手段の判別結果に応じて前記復号処理手段及び前記再生出力手段を制御することを特徴とする。 An audio signal reproducing apparatus according to the present invention includes file input means for receiving the electronic file and reproduction output means for reproducing and outputting the audio signal. The authority management information in the electronic file includes an authentication code, and the audio signal reproduction device decrypts the encrypted audio signal in the electronic file based on the decoding information in the electronic file. A verification unit for verifying the authentication code and a code registered in the audio signal reproduction device, and based on a verification result by the verification unit, the original audio signal obtained by decoding by the decoding processing unit Discriminating means for discriminating whether or not reproduction output by the reproduction output means is permitted, and controlling the decoding processing means and the reproduction output means in accordance with the discrimination result of the discrimination means.

これにより、電子ファイルが不正に流出した場合でも、音声に関するプライバシーが保護される。 As a result, even when an electronic file is illegally leaked, privacy related to voice is protected.

また、本発明に係る電子ファイルの記録方式は、元オーディオ信号の内の、人の音声の信号成分が含まれている音声区間における信号に対して暗号化処理を施すことによって得られた暗号化オーディオ信号のデータと、前記暗号化オーディオ信号を復号するための復号用情報のデータと、を互いに関連付けて記録することを特徴とする。 Also, the electronic file recording method according to the present invention is an encryption method obtained by performing encryption processing on a signal in a voice section including a signal component of a human voice in an original audio signal. Audio signal data and decryption information data for decrypting the encrypted audio signal are recorded in association with each other.

本発明によれば、オーディオ信号に関するプライバシー保護に寄与するオーディオ信号記録装置等を提供することができる。 According to the present invention, it is possible to provide an audio signal recording apparatus and the like that contribute to privacy protection for audio signals.

本発明の意義ないし効果は、以下に示す実施の形態の説明により更に明らかとなろう。ただし、以下の実施の形態は、あくまでも本発明の一つの実施形態であって、本発明ないし各構成要件の用語の意義は、以下の実施の形態に記載されたものに制限されるものではない。 The significance and effects of the present invention will become clearer from the following description of embodiments. However, the following embodiments are merely examples of the present invention, and the meanings of the terms used for the present invention and for each constituent element are not limited to those described in the following embodiments.

以下、本発明の実施の形態につき、図面を参照して具体的に説明する。参照される各図において、同一の部分には同一の符号を付し、同一の部分に関する重複する説明を原則として省略する。以下に、第１〜第８実施例を説明するが、或る実施例に記載した事項は、矛盾なき限り他の実施例にも適用される。 Hereinafter, embodiments of the present invention will be specifically described with reference to the drawings. In each of the drawings to be referred to, the same part is denoted by the same reference numeral, and redundant description regarding the same part is omitted in principle. The first to eighth embodiments will be described below, but the matters described in a certain embodiment can be applied to other embodiments as long as there is no contradiction.

＜＜第１実施例＞＞ << First Example >>
まず、本発明の第１実施例について説明する。図１は、第１実施例に係るオーディオ信号記録装置１（以下、「記録装置１」と略記する）の内部ブロック図である。記録装置１は、符号１１〜１５にて参照される各部位を備える。 First, a first embodiment of the present invention will be described. FIG. 1 is an internal block diagram of an audio signal recording apparatus 1 (hereinafter abbreviated as “recording apparatus 1”) according to the first embodiment. The recording apparatus 1 includes the parts referred to by reference numerals 11 to 15.

マイク部１１は、単数又は複数のマイクロホンから成り、記録装置１の周辺音を集音して該周辺音をデジタルの電気信号に変換する。このデジタルの電気信号は、記録装置１の周辺音を表すオーディオ信号としてオーディオ信号処理部１２に与えられる。以下の説明では、或る特定の区間におけるオーディオ信号を考え、その特定の区間の全体を全区間と捉える。その特定の区間（即ち、全区間）の開始時点及び終了時点は、操作部１５に対する操作によって指定される。 The microphone unit 11 includes one or a plurality of microphones, collects ambient sounds of the recording apparatus 1 and converts the ambient sounds into digital electrical signals. This digital electrical signal is given to the audio signal processing unit 12 as an audio signal representing the peripheral sound of the recording apparatus 1. In the following description, an audio signal in a specific section is considered, and the entire specific section is regarded as all sections. The start time and end time of the specific section (that is, all sections) are designated by an operation on the operation unit 15.

オーディオ信号処理部１２は、マイク部１１からのオーディオ信号に基づいてオーディオ信号の全区間内から人の音声の信号成分を含む区間を検出する音声検出部（図１において不図示）を備える。検出された、人の音声の信号成分を含む区間を、以下、「音声区間」と呼ぶ。また、オーディオ信号の全区間の内、音声区間以外の区間を「非音声区間」と呼ぶ。また、オーディオ信号に含まれる音声を発した人を発話者（又は話者）とも呼ぶ。 The audio signal processing unit 12 includes a voice detection unit (not shown in FIG. 1) that detects a section including a signal component of a human voice from all sections of the audio signal based on the audio signal from the microphone unit 11. The detected section including the signal component of the human voice is hereinafter referred to as “voice section”. Further, of all the sections of the audio signal, sections other than the voice section are referred to as “non-voice sections”. A person who utters the voice included in the audio signal is also called a speaker (or speaker).
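The text does not say how the voice detection unit decides that a section contains human voice. As one common stand-in, the sketch below marks frames whose mean energy exceeds a threshold and merges consecutive marked frames into sections. The function name, the frame length, and the threshold are assumptions, and energy alone cannot tell human speech from other loud sounds, so this only illustrates the shape of the output (a list of voice sections within all sections).

```python
from typing import List, Sequence, Tuple


def detect_voice_sections(samples: Sequence[float],
                          frame_len: int = 4,
                          threshold: float = 0.1) -> List[Tuple[int, int]]:
    """Return [start, end) sample ranges whose mean frame energy exceeds threshold."""
    sections: List[Tuple[int, int]] = []
    start = None  # start index of the voice section currently being built
    for pos in range(0, len(samples), frame_len):
        frame = samples[pos:pos + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy > threshold:
            if start is None:
                start = pos          # a new voice section begins here
        elif start is not None:
            sections.append((start, pos))  # the section ended at this frame
            start = None
    if start is not None:
        sections.append((start, len(samples)))  # section runs to the end
    return sections
```

Everything outside the returned ranges corresponds to the non-voice sections.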

オーディオ信号処理部１２は、マイク部１１から与えられたオーディオ信号の全区間の内、音声区間における信号に対して所定の暗号化方式に従った暗号化処理を施すことにより暗号化オーディオ信号を生成する。この際、非音声区間における信号に対しては暗号化処理は施されない。暗号化オーディオ信号との区別を明確化するため、暗号化される前のオーディオ信号を、以下、元オーディオ信号と呼ぶことにする。 The audio signal processing unit 12 generates an encrypted audio signal by applying an encryption process, in accordance with a predetermined encryption method, to the signal in the voice section among all sections of the audio signal given from the microphone unit 11. The signal in the non-voice sections is not encrypted. To distinguish it clearly from the encrypted audio signal, the audio signal before encryption is hereinafter called the original audio signal.

元オーディオ信号における音声区間内の信号に上記暗号化処理を施した信号と、元オーディオ信号における非音声区間内の信号（暗号化されていない信号）と、の合成信号が、暗号化オーディオ信号となる。但し、元オーディオ信号の全体に対して、暗号化以外の所定の処理（符号化処理等）は実施されうる。 The encrypted audio signal is the combination of the signal obtained by applying the above encryption process to the signal within the voice section of the original audio signal and the (unencrypted) signal within the non-voice sections of the original audio signal. Note, however, that predetermined processing other than encryption (encoding and the like) may be applied to the whole of the original audio signal.
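The composition of encrypted voice sections with untouched non-voice sections can be sketched as below. The byte-level representation, the `apply_to_sections` helper, and the XOR transform used in the demonstration are assumptions made for illustration; the XOR is a toy stand-in and not the encryption method of the patent.

```python
from typing import Callable, List, Tuple

Section = Tuple[int, int]  # [start, end) byte offsets of a voice section


def apply_to_sections(signal: bytes,
                      sections: List[Section],
                      transform: Callable[[bytes], bytes]) -> bytes:
    """Apply transform to each voice section and copy everything else as-is."""
    out = bytearray(signal)
    for start, end in sections:
        chunk = transform(signal[start:end])
        if len(chunk) != end - start:
            raise ValueError("transform must preserve the section length")
        out[start:end] = chunk
    return bytes(out)


def xor_with(key: int) -> Callable[[bytes], bytes]:
    """Toy length-preserving transform (its own inverse); not a real cipher."""
    return lambda chunk: bytes(b ^ key for b in chunk)
```

For example, `apply_to_sections(signal, [(2, 5)], xor_with(0xFF))` changes only bytes 2 to 4, and applying the inverse transform to the same sections restores the original signal.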

オーディオ信号処理部１２にて実施される暗号化処理の暗号化方式として任意の公知の暗号化方式を利用することが可能である。例えば、音声区間における暗号化前の信号のビット列を暗号鍵に従って所定のアルゴリズムで並べ替え、これによって暗号化後の信号を生成する。 Any known encryption method can be used as the encryption method of the encryption process performed by the audio signal processing unit 12. For example, the bit string of the signal before encryption in the voice section is rearranged by a predetermined algorithm according to the encryption key, thereby generating the signal after encryption.
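The bit-rearrangement example above can be illustrated with the sketch below, which derives a permutation of bit positions from the key and applies it, then inverts it for decryption. The function names and the use of Python's seeded `random.Random` to derive the permutation are assumptions, not the algorithm of the patent. A keyed bit permutation of this kind only shows the idea and should not be regarded as cryptographically secure.

```python
import random
from typing import List


def _permutation(n: int, key: int) -> List[int]:
    """Derive a reproducible ordering of n bit positions from the key."""
    order = list(range(n))
    random.Random(key).shuffle(order)
    return order


def _to_bits(data: bytes) -> List[int]:
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]


def _from_bits(bits: List[int]) -> bytes:
    out = bytearray()
    for pos in range(0, len(bits), 8):
        value = 0
        for bit in bits[pos:pos + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def scramble(data: bytes, key: int) -> bytes:
    """Rearrange the bit string of data according to the key."""
    bits = _to_bits(data)
    order = _permutation(len(bits), key)
    return _from_bits([bits[src] for src in order])


def unscramble(data: bytes, key: int) -> bytes:
    """Undo scramble() by sending each bit back to its original position."""
    bits = _to_bits(data)
    order = _permutation(len(bits), key)
    restored = [0] * len(bits)
    for dst, src in enumerate(order):
        restored[src] = bits[dst]
    return _from_bits(restored)
```

In this sketch the same key serves for both directions, which corresponds to the case where the decryption key equals the encryption key.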

メモリカード１３は、外部記録媒体であり、例えばＳＤ（Secure Digital）メモリカードである。尚、外部記録媒体としてメモリカード１３を例示しているが、外部記録媒体を、１または複数のランダムアクセス可能な記録媒体（半導体メモリ、メモリカード、光ディスク、磁気ディスク等）で構成することができる。 The memory card 13 is an external recording medium, for example an SD (Secure Digital) memory card. Although the memory card 13 is given as an example, the external recording medium may consist of one or more randomly accessible recording media (semiconductor memory, memory card, optical disk, magnetic disk, etc.).

主制御部１４は、オーディオ信号処理部１２にて生成された暗号化オーディオ信号を格納したファイルを作成し、このファイルをメモリカード１３に保存する（記録する）。尚、記録装置１は、暗号化オーディオ信号を格納したファイル以外にも、全く暗号化されていないオーディオ信号を含むファイルをもメモリカード１３に保存することが可能である。但し、記録装置１は、暗号化オーディオ信号を格納したファイルを生成することに特徴点を有するため、暗号化オーディオ信号を格納したファイルを特殊ファイルと呼んで他のファイルと区別し、以下、特殊ファイルに関する説明を行うものとする。 The main control unit 14 creates a file storing the encrypted audio signal generated by the audio signal processing unit 12 and saves (records) the file in the memory card 13. Note that the recording apparatus 1 can store a file including an audio signal that is not encrypted at all in the memory card 13 in addition to the file that stores the encrypted audio signal. However, since the recording apparatus 1 has a feature in generating a file storing an encrypted audio signal, the file storing the encrypted audio signal is called a special file to distinguish it from other files. A description about the file shall be given.

操作部１５は、操作キー等から成り、ユーザによる操作を受け付ける。操作部１５に対する操作内容は主制御部１４に伝達される。操作部１５に対して所定操作を施すことにより、元オーディオ信号の取得並びに特殊ファイルの作成及び保存が実施される。 The operation unit 15 includes operation keys and the like, and accepts user operations. The operation content for the operation unit 15 is transmitted to the main control unit 14. By performing a predetermined operation on the operation unit 15, an original audio signal is acquired and a special file is created and stored.

図２に、メモリカード１３に保存される特殊ファイル３００のデータ構造を示す。特殊ファイル３００は、ヘッダ領域３０１と本体領域３０２から形成される。当然ではあるが、同一の特殊ファイル内に定義されたヘッダ領域と本体領域は互いに関連付けられている。特殊ファイル３００の場合、ヘッダ領域３０１と本体領域３０２は互いに関連付けられ、特殊ファイル３００内に格納された各データは互いに関連付けられている。尚、ヘッダ領域を、ユーザ領域と呼ぶこともできる。 FIG. 2 shows the data structure of the special file 300 stored in the memory card 13. The special file 300 is formed from a header area 301 and a main body area 302. As a matter of course, the header area and the main body area defined in the same special file are associated with each other. In the case of the special file 300, the header area 301 and the main body area 302 are associated with each other, and the data stored in the special file 300 are associated with each other. The header area can also be called a user area.

本体領域３０２には暗号化オーディオ信号を表すデータが格納される。ヘッダ領域３０１には暗号化オーディオ信号の関連情報等が格納される。具体的には、ヘッダ領域３０１には、音声区間情報と復号鍵情報と権限管理情報とが格納される。 The main body area 302 stores data representing the encrypted audio signal. The header area 301 stores information related to the encrypted audio signal. Specifically, in the header area 301, voice section information, decryption key information, and authority management information are stored.

音声区間情報は、元オーディオ信号（又は暗号化オーディオ信号）の全区間中の何れの区間が音声区間であるかを表す情報である。 The voice section information is information indicating which section of all sections of the original audio signal (or encrypted audio signal) is a voice section.

復号鍵情報は、暗号化された信号を復号するための復号処理に用いられる復号鍵を表す情報である。上記の暗号化処理は、記録装置１に予め与えられた或いは記録装置１内で発生した暗号鍵を用いて行われ、この暗号鍵に対応する復号用の鍵が復号鍵である。暗号化オーディオ信号から元オーディオ信号を復元することのできる装置（以下、「復元用装置」という）上において、この復号鍵を用いれば暗号化オーディオ信号を復号することができる（復号に復号鍵は必須であるものとする）。 The decryption key information represents the decryption key used in the decryption process for decrypting the encrypted signal. The encryption process described above uses an encryption key that is given to the recording apparatus 1 in advance or generated within the recording apparatus 1, and the decryption key is the key for decryption that corresponds to this encryption key. On a device capable of restoring the original audio signal from the encrypted audio signal (hereinafter referred to as a “restoration device”), the encrypted audio signal can be decrypted with this decryption key (the decryption key is assumed to be indispensable for decryption).

音声区間情報を参照すれば、暗号化されている音声区間の、全区間内における位置が分かる。このため、音声区間情報と復号鍵情報に基づけば、本体領域３０２内の暗号化オーディオ信号を復号して元オーディオ信号を復元することができる。従って、音声区間情報と復号鍵情報は、暗号化オーディオ信号を復号するための復号用情報を形成する。 By referring to the voice section information, the position of the encrypted voice section in all the sections can be known. Therefore, based on the voice section information and the decryption key information, the encrypted audio signal in the main body area 302 can be decrypted to restore the original audio signal. Therefore, the voice section information and the decryption key information form decryption information for decrypting the encrypted audio signal.
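One possible way to picture the header area and main body area of the special file is the sketch below. The on-disk layout used here (a 4-byte big-endian header length, a JSON header, then the body bytes) and all field names are assumptions chosen for illustration; the patent describes which information is stored, not how it is encoded.

```python
import json
import struct
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Header:
    """Hypothetical header area contents of the special file."""
    voice_sections: List[Tuple[int, int]]    # voice section information
    decryption_key: Optional[str]            # may be omitted (None)
    authority: Dict[str, str] = field(default_factory=dict)  # authority management information


def pack_special_file(header: Header, body: bytes) -> bytes:
    """Serialize the header, then append the encrypted audio data (main body)."""
    raw = json.dumps(asdict(header)).encode("utf-8")
    return struct.pack(">I", len(raw)) + raw + body


def unpack_special_file(blob: bytes) -> Tuple[Header, bytes]:
    """Split a packed file back into its header and main body."""
    (size,) = struct.unpack(">I", blob[:4])
    fields = json.loads(blob[4:4 + size].decode("utf-8"))
    # JSON has no tuple type, so restore the (start, end) pairs explicitly.
    fields["voice_sections"] = [tuple(pair) for pair in fields["voice_sections"]]
    return Header(**fields), blob[4 + size:]
```

A restoration device would read `voice_sections` to locate the encrypted ranges in the body and then apply the decryption key to those ranges only.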

尚、復元用装置がヘッダ領域３０１を参照することなく復号鍵を知っている場合は、特殊ファイル３００内に復号鍵情報を含めておく必要は無い（後述の各実施例においても共通）。即ち例えば、復元用装置に予め復号鍵が与えられている場合、或いは、ユーザが手動操作等によって復元用装置に復号鍵を与える場合は、特殊ファイル３００内に復号鍵情報を含めておく必要は無い。 Note that when the restoration device knows the decryption key without referring to the header area 301, the decryption key information need not be included in the special file 300 (this also applies to each embodiment described later). That is, for example, when the decryption key is given to the restoration device in advance, or when the user gives the decryption key to the restoration device by manual operation or the like, the decryption key information need not be included in the special file 300.

復元用装置において、復号用情報に従って暗号化オーディオ信号を復号し、この復号によって得た元オーディオ信号を再生出力することが可能であるが、この復号及び再生出力を実施することに対する許可又は禁止を切替制御するための情報が、権限管理情報である。この権限管理情報の利用例については他の実施例にて詳説する。尚、記録装置１においてユーザが権限管理情報を設定する場合、その設定のための操作は、操作部１５に対して行われる。 In the restoration device, it is possible to decrypt the encrypted audio signal according to the decryption information, and reproduce and output the original audio signal obtained by this decryption, but permission or prohibition to perform this decryption and reproduction output Information for switching control is authority management information. This usage example of the authority management information will be described in detail in another embodiment. When the user sets authority management information in the recording apparatus 1, an operation for the setting is performed on the operation unit 15.

暗号化オーディオ信号を復号することなく、そのまま再生しても音声区間における人の音声を聞き取ることはできない。暗号化オーディオ信号を元オーディオ信号に戻すためには、上述の暗号化方式のアルゴリズムに従った復号処理の実施が必須である。復元用装置は、オーディオ信号処理部１２での暗号化方式のアルゴリズムに関する情報を事前に認識しており、復号用情報に基づいて、その暗号化方式のアルゴリズムに従った復号処理を実行可能である。その復号処理によって暗号化オーディオ信号から元オーディオ信号が復元される。後述のユーザＰＣ２、サーバ３、オーディオ信号記録再生装置６及び撮像装置７は、暗号化オーディオ信号から元オーディオ信号を復元可能な復元用装置として機能する。 If the encrypted audio signal is reproduced as it is, without decryption, the human voice in the voice section cannot be made out. To return the encrypted audio signal to the original audio signal, a decryption process that follows the algorithm of the above encryption method is indispensable. The restoration device knows in advance the algorithm of the encryption method used by the audio signal processing unit 12 and can execute, based on the decryption information, a decryption process that follows that algorithm. This decryption process restores the original audio signal from the encrypted audio signal. The user PC 2, server 3, audio signal recording/reproducing device 6, and imaging device 7 described later function as restoration devices capable of restoring the original audio signal from the encrypted audio signal.

With respect to an audio signal, the part that matters for privacy protection is the part containing the speaker's voice; if an audio signal containing the speaker's voice circulates among unspecified people, the speaker's privacy may be compromised. However, the encryption processing is performed at the time the special file is saved on the recording device 1, so the audio signal in the special file is already encrypted. Privacy is therefore protected even if the memory card 13 is lost or stolen, or if the special file leaks over a network through a computer virus or the misuse of file-sharing software. In addition, the speech sections relevant to privacy protection are detected automatically and encrypted automatically, so the burden on the user is extremely small (the manual block division of Patent Document 2 is not needed).

Even when the decryption key information is included in the header area, the original audio signal can be correctly restored with the decryption key only on a legitimate device that knows the encryption algorithm. Such a legitimate device manages whether or not to perform the decryption processing according to the authority management information in the header area, in line with the intention of the user of the recording device 1 (details are described later), so privacy is appropriately protected. Of course, the risk of unintended decryption is lower when the decryption key information is not included in the header area.

The data storage format of the header area and of the special file as a whole can be chosen freely, but the data can also be stored in accordance with an existing standard. When the special file is created and saved in accordance with an existing standard, the audio signal in the special file can be reproduced and output on any playback device conforming to that standard; even in this case, privacy is protected because the audio signal in the special file is encrypted.

The matters described in the first embodiment apply to each of the embodiments described later and form the basis for their description.

<< Second Embodiment >>
A second embodiment is described as an example of the internal configuration of the audio signal processing unit 12 of FIG. 1. The second embodiment is implemented in combination with the first embodiment.

FIG. 3 is an internal block diagram of the audio signal processing unit 12 of FIG. 1 according to the second embodiment. The audio signal processing unit 12 in FIG. 3 includes the parts denoted by reference numerals 21 to 23.

The AAC encoder 21 performs encoding processing on the original audio signal. This encoding follows the AAC (Advanced Audio Coding) scheme, an encoding (compression) scheme for audio signals standardized by MPEG (Moving Picture Experts Group).

The original audio signal supplied to the audio signal processing unit 12 is a sequence of digital samples arranged discretely on the time axis (hereinafter, discrete audio signals). In AAC, encoding and recording are performed in units of 1024 samples of the discrete audio signal, so in this embodiment speech and non-speech sections are discriminated for every 1024 samples. This unit is called a frame. The entire original audio signal consists of the discrete audio signals of the first, second, third, ..., (n−1)th, and nth frames, which arrive in sequence, and each frame contains 1024 samples of the discrete audio signal (n is an integer of 2 or more).

The AAC encoder 21 encodes the discrete audio signal frame by frame in accordance with AAC. The signal obtained by this encoding is called an encoded signal, and the encoded signal obtained by encoding the discrete audio signal of the kth frame is called the encoded signal of the kth frame (k is an integer satisfying 1 ≤ k ≤ n). The encoded signal of each frame is sent to the encryption processing unit 23.

The speech/non-speech discrimination unit 22 determines, for each frame, whether the frame is a speech section or a non-speech section, using a technique based on pitch extraction. Pitch here means the fundamental frequency of the audio signal produced by vocal cord vibration. In general, it is difficult to locate the pitch accurately when there is a lot of noise, but this embodiment does not need to find the fundamental frequency accurately; it is sufficient to determine only whether a pitch is present. Methods based on autocorrelation are widely used for pitch extraction, and this embodiment also uses autocorrelation.

The discrimination method of the speech/non-speech discrimination unit 22 is described for a single frame, referred to as the frame of interest. Among the 1024 samples of the discrete audio signal in the frame of interest, the value of the t-th sample is denoted x(t), where t is an integer from 1 to 1024.

Then, as shown in FIG. 4, the autocorrelation is calculated using the block consisting of the 1st to 128th samples as a reference block. That is, an evaluation block of 128 consecutive samples is defined within the frame of interest, and the correlation between the reference block and the evaluation block is obtained while the evaluation block is shifted step by step. More specifically, the autocorrelation value S(p) is calculated according to formula (1). The autocorrelation value S(p) is a function of the variable p, which determines the position of the evaluation block, and p takes each integer satisfying 0 ≤ p ≤ (1024 − 128).
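
Formula (1) itself is not reproduced in this text. As a reading aid only, one form consistent with the definitions above (a 128-sample reference block starting at t = 1 and an evaluation block shifted by p) would be the unnormalized correlation sum below. This is an assumption, not the formula from the original publication, which may include normalization or a different indexing.

```latex
S(p) = \sum_{t=1}^{128} x(t)\, x(t+p), \qquad 0 \le p \le 1024 - 128
```

Under this assumed form, the largest index used is t + p = 128 + (1024 − 128) = 1024, so every evaluation block stays inside the frame of interest.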

FIG. 5 shows how the calculated autocorrelation value S(p) depends on the variable p; the horizontal axis is p. FIG. 5 corresponds to a case in which the frame of interest contains a pitch. When the frame of interest contains a pitch, S(p) periodically takes large values. The speech/non-speech discrimination unit 22 judges the frame of interest to be a speech section when S(p) is judged to exceed a predetermined threshold TH periodically, and to be a non-speech section otherwise. For example, when the values of p satisfying the inequality S(p) > TH occur at constant (or nearly constant) intervals, S(p) is judged to exceed the threshold TH periodically.
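
The per-frame decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the discrimination unit 22: the correlation is taken as a plain sum of products, "periodically exceeds TH" is interpreted as "local maxima above TH are evenly spaced", and the names `autocorrelation`, `is_speech_frame`, `tolerance`, and `min_peaks` are hypothetical. Samples are indexed from 0 here, whereas the text counts from 1.

```python
def autocorrelation(frame, block=128):
    """Return S(p) for p = 0 .. len(frame) - block.

    The first `block` samples form the reference block; the evaluation
    block is the `block` samples starting at offset p.
    (Assumed form: unnormalized sum of products.)
    """
    ref = frame[:block]
    return [
        sum(r * frame[p + i] for i, r in enumerate(ref))
        for p in range(len(frame) - block + 1)
    ]


def is_speech_frame(frame, threshold, block=128, tolerance=2, min_peaks=3):
    """Judge a frame as speech when S(p) exceeds `threshold` periodically.

    Assumption: periodicity is tested on local maxima of S(p) that lie
    above the threshold; the frame counts as speech when at least
    `min_peaks` such maxima exist and their spacing varies by at most
    `tolerance` samples.
    """
    s = autocorrelation(frame, block)
    peaks = [
        p for p in range(1, len(s) - 1)
        if s[p] > threshold and s[p] >= s[p - 1] and s[p] > s[p + 1]
    ]
    if len(peaks) < min_peaks:
        return False
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return max(gaps) - min(gaps) <= tolerance
```

The threshold depends on the signal scale, so in practice it would have to be tuned (or S(p) normalized) for the recording chain in use.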

The speech/non-speech discrimination unit 22 outputs, to the encryption processing unit 23, information indicating whether each frame is a speech section or a non-speech section. This information corresponds to the speech section information described in the first embodiment; in the second embodiment, the output of the speech/non-speech discrimination unit 22 is stored as the speech section information in the header area 301 of FIG. 2.

Based on the output (speech section information) of the speech/non-speech discrimination unit 22, the encryption processing unit 23 encrypts, according to a predetermined encryption method, only the encoded signals of frames classified as speech sections among the encoded signals of the first to nth frames output from the AAC encoder 21. The encoded signals of frames classified as non-speech sections are not encrypted. As described in the first embodiment, any known encryption method can be used for the encryption processing performed by the audio signal processing unit 12 (encryption processing unit 23).

The encryption processing unit 23 outputs, as the encrypted audio signal, the combination of the encrypted encoded signals of the frames classified as speech sections and the unmodified encoded signals of the frames classified as non-speech sections. This encrypted audio signal is stored in the main body area 302 of the special file 300 of FIG. 2.

To restore the original audio signal from the encrypted audio signal stored in the main body area 302, the encrypted encoded signals in the speech sections are first decrypted, based on the speech section information and the decryption key information, to recover the ordinary encoded signals; this yields the encoded signals of the first to nth frames. The encoded signal of each frame is then subjected to decoding processing that reverses the AAC encoding, which restores the original audio signal. This restoration method can be used in every restoration device that restores the original audio signal from the encrypted audio signal (the restoration devices are described later).
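
The selective encryption and the matching restoration step can be illustrated with the sketch below. It is only a sketch under assumptions: the text leaves the cipher open ("any known encryption method"), so a toy XOR keystream derived from SHA-256 stands in for it here and should not be treated as a vetted cipher. The function names and the per-frame keystream derivation are hypothetical, and the encoded frames are represented simply as `bytes` objects.

```python
import hashlib


def _keystream(key: bytes, frame_index: int, length: int) -> bytes:
    """Toy keystream (placeholder for a real cipher): SHA-256 in counter mode."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(
            key + frame_index.to_bytes(4, "big") + counter.to_bytes(4, "big")
        ).digest())
        counter += 1
    return bytes(out[:length])


def _xor_frame(key: bytes, frame_index: int, data: bytes) -> bytes:
    stream = _keystream(key, frame_index, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_speech_frames(frames, speech_flags, key):
    """Encrypt only the frames flagged as speech; pass the others through."""
    return [
        _xor_frame(key, k, frame) if is_speech else frame
        for k, (frame, is_speech) in enumerate(zip(frames, speech_flags))
    ]


def decrypt_speech_frames(frames, speech_flags, key):
    """Undo encrypt_speech_frames using the same flags and key.

    XOR with the same keystream is its own inverse, so the same routine
    is reused; the result would then go to an AAC decoder.
    """
    return encrypt_speech_frames(frames, speech_flags, key)
```

Here `speech_flags` plays the role of the speech section information and `key` the role of the decryption key information; without both, the speech frames cannot be recovered, while the non-speech frames remain playable.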

A method of discriminating speech and non-speech sections suited to the AAC encoding scheme has been given as an example, but any other known method can also be used.

<< Third Embodiment >>
Next, a third embodiment is described. The third embodiment assumes a case in which a special file generated by the recording device 1 is published via a communication network.

FIG. 6 shows the overall configuration of a file management system that performs characteristic operations in cooperation with the recording device 1. In FIG. 6, reference numeral 2 denotes a user's computer (hereinafter, "user PC"), reference numeral 3 denotes a server computer (hereinafter, simply "server"), and reference numeral 4 denotes a communication network in the form of a computer network. Here the communication network 4 is assumed to be the Internet. Reference numerals 101, 102, and 103 each denote a computer that accesses files stored in the server 3 (hereinafter, "viewing PC").

FIG. 7 shows a schematic internal block diagram of the user PC 2, which includes the parts denoted by reference numerals 41 to 48. FIG. 8 shows a schematic internal block diagram of the server 3, which includes the parts denoted by reference numerals 61 to 65. The viewing PCs 101 to 103 have the same configuration as the user PC 2; because their internal block diagrams are the same as FIG. 7, they are not shown separately. The user PC 2 and the server 3 can exchange arbitrary data via the communication network 4, as can the server 3 and each of the viewing PCs 101 to 103. For simplicity of illustration and explanation, FIG. 6 shows only three viewing PCs, but many viewing PCs like the viewing PCs 101, 102, and 103 may be connected to the server 3 via the communication network 4.

The recording device 1 of FIG. 1 creates the special file and records it on the memory card 13. The memory card 13 is detachable from the recording device 1. By removing the memory card 13 from the recording device 1 and inserting it into the card slot 44 of the user PC 2 of FIG. 7, the user PC 2 can read the special file recorded on the memory card 13. Alternatively, the special file on the memory card 13 mounted in the recording device 1 can be read into the user PC 2 over a communication line conforming to a standard such as USB (Universal Serial Bus).

The special file read into the user PC 2 is supplied to the main control unit 41, which includes the audio signal processing unit 42. The display unit 45, consisting of a liquid crystal display or the like, displays images under the control of the main control unit 41. The audio signal reproduction output unit 46 includes a speaker and reproduces and outputs audio signals (outputs them as sound) under the control of the main control unit 41. Hereinafter, "audio signal reproduction output unit" is abbreviated as "reproduction output unit". The PC operation keys 47 accept operations by the user of the user PC 2 and convey them to the main control unit 41. The contents of the memory 48 are referenced by the main control unit 41.

When a predetermined operation is performed on the PC operation keys 47 of FIG. 7, the special file supplied to the main control unit 41 is transmitted (uploaded) to the main control unit 61, which includes the audio signal processing unit 62, of the server 3 of FIG. 8. This transmission takes place via the communication unit 43 of FIG. 7, the communication network 4 of FIG. 6, and the communication unit 63 of FIG. 8. The special file received by the main control unit 61 is stored in the server HDD (hard disk) 64 in association with the ID number assigned to the user PC 2.

The server 3 is a WWW (World Wide Web) server that is assigned a predetermined IP address on the Internet and operates a website such as an SNS (social networking service) or a blog.

The server 3 accepts transmission requests (distribution requests) from each of the viewing PCs 101 to 103 via the communication network 4. When the server 3 receives from the viewing PC 101 a transmission request for a special file stored in the server HDD 64, it determines the data to be transmitted based on the contents of the special file and the contents of the public management memory 65, and transmits the data corresponding to that determination to the viewing PC 101. The same applies to the viewing PCs 102 and 103. The public management memory 65 may also be formed from part of the recording area of the server HDD 64.

The file management system including the server 3 performs characteristic operations by referring to the information in the header area of the special file. This is described concretely below.

For concreteness, assume that the special file created by the recording device 1 and uploaded (transmitted) to the server 3 via the user PC 2 is the special file 300 of FIG. 2.

Assume that, in accordance with the rules of the SNS or website operated by the server 3, mutually different unique ID numbers have been assigned in advance to the user PC 2 and to each of the viewing PCs 101 to 103. The public management memory 65 of the server 3 (see FIG. 8) stores in advance the ID numbers together with public management relation information, which indicates the relationship between the user PC 2 and each of the viewing PCs 101 to 103. The relationship indicated by the public management relation information is hereinafter called a "public relationship".

FIG. 9 shows an example of the contents of the public management relation information. Assume that the public relationships between the user PC 2 and the viewing PCs 101, 102, and 103 are the first, second, and third public relationships, respectively, and that the public management relation information records this. In practice, the server 3 uses the ID numbers to recognize the public relationships shown in FIG. 9.

In an SNS or the like, it is generally possible to restrict the disclosure of each piece of information to be published. The publishing PC (corresponding to the user PC 2) can freely specify, for example, that a piece of information is disclosed to viewing PCs whose ID numbers correspond to "friend" but not to viewing PCs whose ID numbers correspond to "friend of a friend" or "general". "General" means neither "friend" nor "friend of a friend". The first to third public relationships are therefore defined in line with the disclosure restrictions of the SNS or website operated by the server 3. Here, the first, second, and third public relationships are assigned to "friend", "friend of a friend", and "general", respectively.

In this embodiment, the manner of publication is controlled based on the public management relation information and the authority management information in the header area 301 (see FIG. 2), and the authority management information is assumed to include "public level information".

The user can set the public level information freely, either on the recording device 1 through the operation unit 15 of FIG. 1 or on the user PC 2 through the PC operation keys 47 of FIG. 7. If no operation for setting the public level information is performed on the recording device 1, the recording device 1 automatically stores prescribed initial data in the header area 301 as the public level information. The special file 300 is uploaded (transmitted) to the server 3 via the user PC 2 and the communication network 4, and the server 3 stores the special file 300 in the server HDD 64 (see FIG. 8) in association with the ID number of the user PC 2.

Now consider a case in which the public level information classifies the manner of publication into three levels. FIG. 10 shows an example of the contents of the public level information in the header area 301 of the special file 300 stored in the server HDD 64. Suppose that the public level information specifies that viewing PCs having the first, second, and third public relationships with the user PC 2 are to be given the first, second, and third authorities, respectively. In this case, the main control unit 61 of the server 3 of FIG. 8 gives the viewing PCs 101, 102, and 103 the first, second, and third authorities, respectively, as their access authority for the special file 300 in the server HDD 64.
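
The two-step lookup described above (ID numbers to public relationship, then public relationship to authority) might be modeled as in the sketch below. All names are hypothetical, the dictionaries merely stand in for the public management memory 65 and the public level information in the header area 301, and treating an unknown viewer as "general" is an assumption drawn from the definition of "general" as neither friend nor friend of a friend.

```python
from enum import Enum


class Relation(Enum):
    FRIEND = 1            # first public relationship
    FRIEND_OF_FRIEND = 2  # second public relationship
    GENERAL = 3           # third public relationship


class Authority(Enum):
    FIRST = 1   # receives the restored original audio signal
    SECOND = 2  # receives audio with the speaker's voice altered
    THIRD = 3   # receives audio with the utterances removed


def resolve_authority(owner_id, viewer_id, relations, public_levels):
    """Return the authority a viewer gets for a file published by owner_id.

    relations:     {(owner_id, viewer_id): Relation}, standing in for the
                   public management relation information.
    public_levels: {Relation: Authority}, standing in for the public
                   level information stored with the file.
    """
    relation = relations.get((owner_id, viewer_id), Relation.GENERAL)
    return public_levels[relation]
```

Keeping the relationship table on the server and the level table in the file mirrors the split in the text: the site knows who is a friend of whom, while the file's owner decides what each class of viewer may hear.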

Under these assumptions, the operation of the server 3 when each viewing PC attempts to access the special file 300 in the server 3 is described below.

FIGS. 11(a), (b), (c), and (d) show the audio signals transmitted to the viewing PCs. In FIGS. 11(a), (b), and (c), reference numeral 330 denotes the analog waveform of the original audio signal in a speech section of interest. Reference numeral 331 in FIG. 11(a), 332 in FIG. 11(b), and 333 in FIG. 11(c) denote the analog waveforms of the audio signals transmitted to the viewing PCs 101, 102, and 103, respectively, for that speech section. In FIG. 11(d), reference numeral 340 denotes the analog waveform of the original audio signal in a non-speech section of interest, and reference numeral 341 denotes the analog waveform of the audio signal transmitted to each viewing PC (101, 102, and 103) for that non-speech section; the two waveforms (340 and 341) are identical.

When the viewing PC 101 issues a request to the server 3 to transmit the audio signal corresponding to the special file 300 in the server 3, the main control unit 61 of the server 3 determines the authority to be given to the viewing PC 101 based on the public level information of the special file 300 in the server HDD 64 and the public management relation information in the public management memory 65. As described above, the viewing PC 101 is to be given the first authority. In this case, the audio signal processing unit 62 of the server 3 (see FIG. 8) reads the encrypted audio signal and the decryption information in the special file 300 from the server HDD 64 and restores the original audio signal by decrypting the encrypted audio signal based on the decryption information. It then transmits data representing the restored original audio signal to the viewing PC 101. As a result, the original audio signal, including the signals of the analog waveform 331 of FIG. 11(a) and the analog waveform 341 of FIG. 11(d), is reproduced and output from the reproduction output unit (including a speaker) of the viewing PC 101.

When the viewing PC 103 issues a request to the server 3 to transmit the audio signal corresponding to the special file 300 in the server 3, the main control unit 61 of the server 3 determines the authority to be given to the viewing PC 103 based on the public level information of the special file 300 in the server HDD 64 and the public management relation information in the public management memory 65. As described above, the viewing PC 103 is to be given the third authority. In this case, the audio signal processing unit 62 of the server 3 (see FIG. 8) reads the encrypted audio signal in the special file 300 from the server HDD 64 and, based on the encrypted audio signal and the speech section information, creates a second processed audio signal in which the content of the utterances in the speech sections cannot be identified.

For example, a sine-wave audio signal or a silent audio signal, entirely different from the original audio signal, is inserted into the speech sections of the second processed audio signal. Alternatively, when the encrypted audio signal is encoded according to AAC, the signal of the frame preceding the speech section (a non-speech signal) is inserted into the speech section. Either way, the signal component of the speaker's voice is removed from the speech sections. The non-speech sections receive no special processing. Thus, for example, data representing a second processed audio signal that includes the signals of the analog waveform 333 of FIG. 11(c) and the analog waveform 341 of FIG. 11(d) is transmitted to the viewing PC 103, and the second processed audio signal is reproduced and output from the reproduction output unit (including a speaker) of the viewing PC 103.
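
The frame substitution described above can be sketched as follows. This is an illustrative sketch only: frames are modeled as lists of PCM samples rather than AAC-encoded data, the function name `remove_speech` and the `mode` parameter are hypothetical, and falling back to silence when no earlier non-speech frame exists is an assumption. The sine-wave variant mentioned in the text is omitted for brevity.

```python
def remove_speech(frames, speech_flags, mode="silence"):
    """Build a signal whose speech frames carry no voice component.

    mode="silence": each speech frame is replaced by zeros.
    mode="repeat":  each speech frame is replaced by a copy of the most
                    recent non-speech frame (silence if there is none yet).
    Non-speech frames are passed through unchanged.
    """
    out = []
    last_non_speech = None
    for frame, is_speech in zip(frames, speech_flags):
        if not is_speech:
            last_non_speech = frame
            out.append(frame)
        elif mode == "repeat" and last_non_speech is not None:
            out.append(list(last_non_speech))
        else:
            out.append([0] * len(frame))
    return out
```

Because the speech frames are simply discarded, this path needs only the speech section information and never touches the decryption key.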

When the viewing PC 102 issues a request to the server 3 to transmit the audio signal corresponding to the special file 300 in the server 3, the main control unit 61 of the server 3 determines the authority to be given to the viewing PC 102 based on the public level information of the special file 300 in the server HDD 64 and the public management relation information in the public management memory 65. As described above, the viewing PC 102 is to be given the second authority. In this case, the audio signal processing unit 62 of the server 3 (see FIG. 8) reads the encrypted audio signal and the decryption information in the special file 300 from the server HDD 64 and restores the original audio signal by decrypting the encrypted audio signal based on the decryption information. Then, based on the speech section information, it applies predetermined processing to the original audio signal in the speech sections to create a first processed audio signal in which the speaker in the speech sections cannot be identified.

Specifically, processing is applied that makes the characteristics of the speaker's voice (for example, the timbre) in the speech sections of the first processed audio signal differ from those of the original audio signal. The processing must not be so strong, however, that the content of the utterances can no longer be identified. Known methods can be used to change the voice characteristics. For example, a buffer memory is prepared into which the original audio signal of a speech section is written, and the speaker's voice (pitch) is changed by making the read frequency of the buffer memory differ from the write frequency. When the first processed audio signal is created, the non-speech sections receive no special processing. Thus, for example, data representing a first processed audio signal that includes the signals of the analog waveform 332 of FIG. 11(b) and the analog waveform 341 of FIG. 11(d) is transmitted to the viewing PC 102, and the first processed audio signal is reproduced and output from the reproduction output unit (including a speaker) of the viewing PC 102.
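
The buffer-based pitch change mentioned above can be illustrated by reading a sample buffer at a rate different from the rate at which it was written. The sketch below assumes linear interpolation between neighboring samples; the function name `change_pitch` and the `ratio` parameter are hypothetical. Note that this simple approach also changes the duration of the section, which a practical implementation would need to compensate for.

```python
def change_pitch(samples, ratio):
    """Re-read a buffer `ratio` times as fast as it was written.

    ratio > 1 raises the pitch (and shortens the output);
    ratio < 1 lowers the pitch (and lengthens the output).
    Values between stored samples are linearly interpolated.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if not samples:
        return []
    n_out = int((len(samples) - 1) / ratio) + 1
    out = []
    for i in range(n_out):
        pos = i * ratio                       # read position in the buffer
        j = int(pos)                          # sample at or before pos
        frac = pos - j                        # distance past that sample
        nxt = samples[min(j + 1, len(samples) - 1)]
        out.append((1 - frac) * samples[j] + frac * nxt)
    return out
```

For example, `change_pitch(section, 1.2)` would raise the voice in `section` by a factor of 1.2 in frequency while leaving the words intelligible.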

In this way, a mechanism for protecting privacy with respect to audio signals can be introduced into websites such as SNSs and blogs. A user on the publishing side obtains the privacy protection simply by setting the public level information (or simply by using the public level information set automatically by the recording device 1).

In addition to the disclosure control based on the public level information, disclosure control using an authentication code can also be performed. In this case, for example, an authentication code that can be set on the recording device 1 or the user PC 2 is included in the authority management information in the header area 301 of the special file 300. Suppose, for example, that the user of the user PC 2 wishes, as a special case, to provide the original audio signal via the server 3 to the user of the viewing PC 103, which has only the third authority. The user of the user PC 2 then gives the authentication code to the user of the viewing PC 103 as a password. When the user of the viewing PC 103 sends the server 3 a transmission request for the audio signal corresponding to the special file 300 together with the password, the main control unit 61 of the server 3 compares the password sent from the viewing PC 103 with the authentication code stored in the special file 300 in the server HDD 64. If the two match, the server restores the original audio signal from the encrypted audio signal and the decryption information in the special file 300 and transmits data representing the original audio signal to the viewing PC 103 (if they do not match, this transmission is not performed). The original audio signal is thereby reproduced and output from the reproduction output unit, including a speaker, provided in the viewing PC 103.
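The password check described above can be sketched as follows. The function name `may_send_original` and its parameters are hypothetical, and the constant-time comparison via `hmac.compare_digest` is an added precaution assumed for this sketch; the description only requires that the two values be compared for a match.

```python
import hmac
from typing import Optional


def may_send_original(header_auth_code: str,
                      supplied_password: Optional[str]) -> bool:
    """Return True only if the password sent by the viewing PC matches
    the authentication code stored in the special file's header."""
    if supplied_password is None:
        return False
    # compare_digest avoids leaking the match position through timing
    # (an assumption of this sketch, not stated in the description).
    return hmac.compare_digest(header_auth_code.encode("utf-8"),
                               supplied_password.encode("utf-8"))
```

Only when this check succeeds would the server decrypt the encrypted audio signal and transmit the original audio signal; otherwise the transmission is withheld.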

Instead of passing on a password, the ID number of the viewing PC 103 may be linked to the authentication code. In the simplest case, for example, the authentication code is set on the recording device 1 or the user PC 2 so that the authentication code in the header area 301 is identical to the ID number of the viewing PC 103, and the special file 300 containing that authentication code is uploaded to the server 3. When the user of the viewing PC 103 sends the server 3 a transmission request for the audio signal corresponding to the special file 300, the main control unit 61 of the server 3 compares the ID number of the viewing PC 103 with the authentication code in the header area 301. If the two match, the server restores the original audio signal from the encrypted audio signal and the decryption information in the special file 300 and transmits data representing the original audio signal to the viewing PC 103 (if they do not match, this transmission is not performed).

Various other modifications are possible; usable modifications are listed below.

Modification 1: The public level information may be the content shown in FIG. 10 written out in a prescribed description format. Alternatively, for example, the file management system shown in FIG. 6 may define, system-wide, that the content shown in FIG. 10 is denoted by "1". In that case it suffices to write "1" in the header area 301 as the public level information. Naturally, public level information that differs from that shown in FIG. 10 is then assigned a character string other than "1". The public level information can also be expressed in any other description scheme defined within the file management system. The relationship between each public relationship and each authority shown in FIG. 10 is only an example, and the user can change it in various ways.

Modification 2: In the example above, in response to transmission requests from viewing PCs having the first, second, and third authorities, the server transmits, respectively, the original audio signal corresponding to FIG. 11(a), the first processed audio signal corresponding to FIG. 11(b), and the second processed audio signal corresponding to FIG. 11(c). This is only an example and can be modified in various ways. For example, in response to transmission requests from viewing PCs having the first, second, and third authorities, the server may transmit, respectively, the original audio signal, the first processed audio signal, and the first processed audio signal; or, respectively, the original audio signal, the original audio signal, and the first processed audio signal. It is also possible to transmit no audio signal at all in response to a transmission request from a viewing PC having the third authority.

Modification 3: In the example above, the public level information classifies the manner of disclosure into three levels, but the number of levels is arbitrary. Depending on the rules of the SNS or similar service, the manner of disclosure may be classified into four or more levels, or may be limited to at most two levels. Likewise, the number of authority classes may be other than three. As illustrated in Modification 2, when the original audio signal, the first processed audio signal, and the first processed audio signal are transmitted, respectively, in response to transmission requests from viewing PCs having the first, second, and third authorities, the manner of disclosure is in effect classified into two levels.

<<Fourth Embodiment>>
Next, a fourth embodiment will be described. The fourth embodiment is a modification of the third embodiment, and the matters described for the third embodiment also apply to this embodiment (as do the matters described for the first embodiment).

In the fourth embodiment, the recording device 1 of FIG. 1 is assumed to have a speaker recognition function. The speaker recognition function is realized by a speaker recognition unit provided in the audio signal processing unit 12 of FIG. 1. Based on the original audio signal supplied from the microphone unit 11, the speaker recognition unit determines whether the speaker in a voice section matches a registered speaker who has been registered in advance.

For example, text-independent speaker recognition processing, which recognizes the speaker from arbitrary words, is used. Any known technique can be adopted to realize the speaker recognition processing. A speaker recognition technique is briefly described here. FIG. 12 shows a block diagram of a speaker recognition unit for performing known speaker recognition processing.

First, a speaker is registered before speaker recognition is performed. At the time of speaker registration, the audio signal output from the microphone unit 11 is supplied to the feature extraction unit 71. The feature extraction unit 71 extracts features such as the cepstrum and pitch contained in the speech, for example a sentence, uttered by the speaker. At the time of speaker registration, these features are supplied to the speaker model creation unit 72. Based on the extracted features, the speaker model creation unit 72 creates a model of the speaker's voice (hereinafter, a registered speaker model), for example a hidden Markov model. The registered speaker model is recorded in the speaker model recording unit 73.

At the time of speaker recognition, the voice of the speaker to be recognized (the current speaker), contained in the original audio signal from the microphone unit 11, is supplied to the feature extraction unit 71. The feature extraction unit 71 extracts features such as the cepstrum and pitch contained in the speech, for example a sentence, uttered by the speaker to be recognized. At the time of speaker recognition, these features are supplied to the likelihood calculation unit 74. The likelihood calculation unit 74 compares the registered speaker model recorded in the speaker model recording unit 73 with the features of the speaker to be recognized and calculates their likelihood (degree of similarity). If the likelihood is greater than a predetermined threshold, the speaker to be recognized is judged to match the registered speaker; otherwise, the speaker to be recognized is judged to differ from the registered speaker.
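The registration and threshold decision described above can be illustrated with a deliberately simplified sketch. It substitutes a diagonal Gaussian model for the hidden Markov model named in the text, and assumes that features arrive as fixed-length numeric vectors; the names `train_model`, `avg_log_likelihood`, and `is_registered_speaker` are hypothetical.

```python
import math


def train_model(feature_vectors):
    """Registration: fit a per-dimension mean and variance
    (a stand-in for the registered speaker model; not an HMM)."""
    n = len(feature_vectors)
    dim = len(feature_vectors[0])
    means = [sum(v[d] for v in feature_vectors) / n for d in range(dim)]
    variances = [
        max(sum((v[d] - means[d]) ** 2 for v in feature_vectors) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances


def avg_log_likelihood(model, feature_vectors):
    """Average log-likelihood of the observed features under the model."""
    means, variances = model
    total = 0.0
    for v in feature_vectors:
        for x, m, var in zip(v, means, variances):
            total += -0.5 * (math.log(2 * math.pi * var) + (x - m) ** 2 / var)
    return total / len(feature_vectors)


def is_registered_speaker(model, feature_vectors, threshold):
    """Recognition: match if the likelihood exceeds the threshold."""
    return avg_log_likelihood(model, feature_vectors) > threshold
```

The threshold value would have to be tuned for real feature data; the sketch only shows the shape of the decision, namely "likelihood greater than threshold means registered speaker".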

FIG. 13 shows a schematic diagram of the original audio signal assumed in this embodiment. The entire section of this original audio signal includes a voice section and a non-voice section, and the voice section is made up of a plurality of sections separated from one another in time. As shown in FIG. 13, consider the case where the voice section includes three sections separated from one another in time, and call these the first, second, and third element sections. Suppose that, using a speaker recognition unit such as that shown in FIG. 12, the speaker of the voice in the first and second element sections is judged to match the registered speaker, and the speaker of the voice in the third element section is judged to differ from the registered speaker. Hereinafter, a speaker who differs from the registered speaker is called an unregistered speaker.

The special file assumed in this embodiment is called the special file 400, and its data structure is shown in FIG. 14. The body area of the special file 400, which is saved on the memory card 13 of FIG. 1, stores the data of the encrypted audio signal, and the header area of the special file 400 stores voice section information, decryption key information, and authority management information.

The voice section information of the special file 400 also includes information indicating the speaker of each element section. In the present example, the voice section information indicates not only which part of the entire section of the original audio signal (or encrypted audio signal) is the voice section, but also that the speaker of the first and second element sections is the registered speaker and that the speaker of the third element section is an unregistered speaker. As in the third embodiment, public level information is included in the authority management information of the special file 400, and this public level information can be set separately for element sections corresponding to the registered speaker and element sections corresponding to an unregistered speaker.

The public level information for element sections corresponding to the registered speaker is called registered speaker public level information, and the public level information for element sections corresponding to an unregistered speaker is called unregistered speaker public level information. Each of these defines the relationship between each public relationship and each authority, as described with reference to FIG. 10. The user can set the registered speaker public level information and the unregistered speaker public level information individually on the recording device 1 or the user PC 2. In this embodiment, the public management memory 65 of the server 3 (FIG. 8) is assumed to store the public management relation information shown in FIG. 9.

Suppose that the voice of the registered speaker (for example, the owner of the recording device 1) may be disclosed as far as friends of friends, but the conversations of people talking around the registered speaker should not be disclosed to anyone other than friends. In that case, for example, the registered speaker public level information shown in FIG. 15(a) and the unregistered speaker public level information shown in FIG. 15(b) are set.

The registered speaker public level information thus set instructs that viewing PCs having the first, second, and third public relationships with the user PC 2 be given the first, first, and second authorities, respectively. This instruction applies only to the first and second element sections, in which the registered speaker was speaking.
The unregistered speaker public level information thus set instructs that viewing PCs having the first, second, and third public relationships with the user PC 2 be given the first, third, and third authorities, respectively. This instruction applies only to the third element section, in which the unregistered speaker was speaking.

After these settings are made, the special file 400 is uploaded to the server 3 via the user PC 2 and the communication network 4 of FIG. 6 and is stored in the server HDD 64 of FIG. 8.

When a viewing PC (101 to 103) issues a transmission request for the special file 400 in the server 3, the main control unit 61 of the server 3 determines the authority to be granted to that viewing PC from the public level information of the special file 400 in the server HDD 64 and the public management relation information. It then controls what is transmitted to each viewing PC based on the determined authority and the voice section information of the special file 400.

In the present example, the main control unit 61 grants the viewing PC 101 the first authority for all of the first to third element sections. The encrypted audio signal in the first to third element sections is therefore all decrypted correctly, and the decrypted audio signal (that is, the whole of the original audio signal) is transmitted to the viewing PC 101.

The main control unit 61 grants the viewing PC 102 the first authority for the first and second element sections, but the third authority for the third element section. The encrypted audio signal in the first and second element sections is therefore decrypted correctly, and the original audio signal in the first and second element sections is transmitted to the viewing PC 102. For the third element section, however, a signal from which the utterance content cannot be made out is transmitted to the viewing PC 102 in place of the original audio signal (see FIG. 11(c)). Transmission to the viewing PC 103 is controlled in a corresponding manner. For the non-voice section, as in the third embodiment, the same audio signal as at the time of recording is transmitted to each viewing PC (101 to 103).
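The per-section authority determination in this example can be sketched as a pair of lookup tables. The table contents follow the settings described above for FIG. 15(a) and FIG. 15(b); the data layout, the constant names, and the function `authorities_for` are assumptions made for illustration only.

```python
# Public relationship (1, 2, 3) -> authority (1, 2, 3), per speaker class.
REGISTERED_LEVELS = {1: 1, 2: 1, 3: 2}      # settings described for FIG. 15(a)
UNREGISTERED_LEVELS = {1: 1, 2: 3, 3: 3}    # settings described for FIG. 15(b)

# Speaker class of the first, second, and third element sections (FIG. 13 example).
ELEMENT_SECTIONS = ["registered", "registered", "unregistered"]


def authorities_for(public_relationship, sections=ELEMENT_SECTIONS):
    """Return the authority granted for each element section to a viewing PC
    that has the given public relationship with the user PC."""
    result = []
    for speaker_class in sections:
        table = (REGISTERED_LEVELS if speaker_class == "registered"
                 else UNREGISTERED_LEVELS)
        result.append(table[public_relationship])
    return result
```

With these tables, a viewing PC in the second public relationship (the viewing PC 102 in the example) receives the first authority for the first and second element sections and the third authority for the third element section, matching the behaviour described above.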

Processing in this way enables disclosure control that depends on the speaker and thus fine-grained privacy protection.

When there are a plurality of registered speakers, it is preferable that public level information can be set for each registered speaker.

[Distinguishing among multiple unregistered speakers]
When there are a plurality of unregistered speakers, it is likewise preferable that public level information can be set for each unregistered speaker. This is explained further here. Assume that two different unregistered speakers, a first and a second, are present, and disregard the registered speaker. In this case, the speaker recognition unit of this embodiment distinguishes whether the speaker in a voice section is the first or the second unregistered speaker. A known technique can be applied to make this distinction. For example, at the time of speaker recognition, the voice of the speaker to be recognized (the current speaker), contained in the original audio signal from the microphone unit 11, is supplied to the feature extraction unit 71, which extracts features such as the cepstrum and pitch contained in the speech, for example a sentence, uttered by that speaker. Based on these features, the speaker recognition unit can determine whether the speaker in one section is the same as the speaker in another section.

In this example too, the first to third element sections described above are assumed (see FIG. 13). Suppose, for example, that the speaker recognition unit judges that the speaker of the voice in the first and second element sections is the same person, and that this speaker differs from the speaker of the voice in the third element section. The speaker recognition unit then judges that the speaker of the voice in the first and second element sections is the first unregistered speaker and that the speaker of the voice in the third element section is the second unregistered speaker.

Information indicating the speaker of each element section is then included in the voice section information of the special file 400 (see FIG. 14). In the present example, the voice section information indicates not only which part of the entire section of the original audio signal (or encrypted audio signal) is the voice section, but also that the speaker of the first and second element sections is the first unregistered speaker and that the speaker of the third element section is the second unregistered speaker. As in the third embodiment, public level information is included in the authority management information of the special file 400, and this public level information can be set separately for each speaker (in other words, separately for element sections corresponding to the first unregistered speaker and element sections corresponding to the second unregistered speaker).

In the present example, first and second unregistered speaker public level information are provided as the public level information for the first and second unregistered speakers, respectively. The first unregistered speaker public level information is treated as the public level information for the first and second element sections, in which the first unregistered speaker was speaking, and the second unregistered speaker public level information is treated as the public level information for the third element section, in which the second unregistered speaker was speaking. The user can set the first and second unregistered speaker public level information individually on the recording device 1 or the user PC 2. The operation of the server 3 according to these settings is the same as described above.

The technique of setting public level information for each unregistered speaker can be implemented independently of the technique of distinguishing registered speakers from unregistered speakers.

[Sound source separation]
In actual conversation, a plurality of speakers may be present in the same section (several people may speak at the same time). With this in mind, sound source separation processing may be combined with what has been described above. This is explained below.

Applying sound source separation processing to a section in which a plurality of speakers are speaking simultaneously makes it possible to distinguish and recognize each speaker within that section. Sound source separation processing treats different speakers as different sound sources. To perform sound source separation, for example, the microphone unit 11 of FIG. 1 is formed of a plurality of microphones arranged at mutually different positions, and the sound sources are separated based on, for example, the delays between the output signals of the microphones and the signal level differences between those output signals. Since the processing for sound source separation is itself known, a detailed description is omitted. Once the sound sources can be separated, a plurality of different speakers can be separated and recognized.

For example, consider the case where a first speaker and a second speaker were speaking simultaneously in the first element section of FIG. 13. The first and second speakers are taken to be the first and second sound sources, respectively, and their voices are called the first and second voices, respectively. In this case, the speaker recognition unit of this embodiment applies sound source separation processing to the first element section and thereby separates the sound sources. That is, it separates the first voice from the first sound source (the signal component of the first voice) and the second voice from the second sound source (the signal component of the second voice), both of which are contained in the first element section. The speaker recognition unit then performs the speaker recognition described above on each voice obtained by the separation.

The same processing is applied to the second and third element sections of FIG. 13. Suppose that, as a result, it is judged, for example, that the first and second unregistered speakers are both speaking in the first element section, that only the first unregistered speaker is speaking in the second element section, and that only the second unregistered speaker is speaking in the third element section. This result is included in the voice section information of the special file 400 (see FIG. 14). As in the third embodiment, public level information is included in the authority management information of the special file 400, and this public level information can be set separately for each speaker.

In the present example, first and second unregistered speaker public level information are provided as the public level information for the first and second unregistered speakers, respectively. The first unregistered speaker public level information is treated as the public level information for the first and second element sections, in which the first unregistered speaker was speaking, and the second unregistered speaker public level information is treated as the public level information for the first and third element sections, in which the second unregistered speaker was speaking. The user can set the first and second unregistered speaker public level information individually on the recording device 1 or the user PC 2.

The operation of the server 3 according to these settings is the same as described above, except that for the first element section the first and second unregistered speaker public level information conflict, so one of the two is used in preference to the other. Suppose, for example, that the first unregistered speaker public level information instructs that a viewing PC having the third public relationship with the user PC 2 be given the first authority, while the second unregistered speaker public level information instructs that a viewing PC having the third public relationship with the user PC 2 be given the third authority. When the viewing PC 103, which has the third public relationship with the user PC 2, sends the server 3 a transmission request for the special file 400 in the server 3, the server 3 operates as follows.

The server 3 compares the first authority indicated by the first unregistered speaker public level information with the third authority indicated by the second unregistered speaker public level information and selects the one whose authority number is larger. In the present example, the third authority is selected. For the first element section, therefore, the server 3 transmits to the viewing PC 103 a signal from which the utterance content cannot be made out, in place of the original audio signal (see FIG. 11(c)). Processing in this way appropriately protects the privacy of the second unregistered speaker.
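The conflict rule described above, in which the larger authority number (the more restrictive authority) wins, reduces to taking a maximum over the speakers active in a section. The function name `resolve_authority` and the dictionary layout are assumptions made for this sketch.

```python
def resolve_authority(speaker_levels, active_speakers, public_relationship):
    """Pick the authority for one element section.

    speaker_levels: speaker id -> {public relationship -> authority number}
    active_speakers: speakers judged to be talking in the section
    The largest authority number, i.e. the most restrictive one, is used.
    """
    return max(speaker_levels[s][public_relationship] for s in active_speakers)


# Example values from the text: for the third public relationship, the first
# unregistered speaker allows authority 1 and the second allows authority 3.
# Entries for the other public relationships are omitted here.
levels = {
    "unregistered_1": {3: 1},
    "unregistered_2": {3: 3},
}
```

For the first element section, where both unregistered speakers are active, `resolve_authority(levels, ["unregistered_1", "unregistered_2"], 3)` yields the third authority, so the second unregistered speaker's stricter setting governs the section.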

<<Fifth Embodiment>>
Next, a fifth embodiment will be described. An audio signal recording/reproducing device can be formed by adding a reproduction function to the recording device 1 of FIG. 1. FIG. 16 shows an internal block diagram of the audio signal recording/reproducing device 6 (hereinafter abbreviated as "recording/reproducing device 6"). The recording/reproducing device 6 includes the recording device 1 of FIG. 1 and further includes an audio signal processing unit 16 and a reproduction output unit 17 that includes a speaker. The recording device 1 included in the recording/reproducing device 6 is the same as that described in the embodiments above.

During reproduction, the audio signal processing unit 16, under the control of the main control unit 14, reads the encrypted audio signal and the decryption information from the special file stored on the memory card 13 and restores the original audio signal by decrypting the encrypted audio signal based on the decryption information. The restored original audio signal is then supplied to the reproduction output unit 17, which reproduces and outputs it.

<< Sixth Embodiment >>
Next, a sixth embodiment will be described. Adding an image capturing function to the recording/reproducing apparatus 6 of FIG. 16 yields an imaging apparatus. FIG. 17 shows an internal block diagram of the imaging apparatus 7. The imaging apparatus 7 includes the recording/reproducing apparatus 6 of FIG. 16 and further includes an imaging unit 18, a video signal processing unit 19, and a display unit 20. The imaging apparatus 7 is a digital video camera capable of capturing still images or moving images.

The imaging unit 18 includes an image sensor, such as a CCD (charge-coupled device) image sensor or a CMOS (complementary metal-oxide-semiconductor) image sensor, together with an optical system and a diaphragm, and acquires an image corresponding to the optical image of a subject by converting that optical image into an electrical signal. The video signal processing unit 19 generates a video signal representing the acquired image, applies a predetermined compression process to the video signal, and sends it to the memory card 13. When a still image or a moving image is captured and recorded, the video signal from the video signal processing unit 19 is recorded on the memory card 13.

In particular, when a moving image is captured and recorded, the encrypted audio signal generated by the audio signal processing unit 12 and the video signal from the video signal processing unit 19 (that is, image data corresponding to the image acquired by the imaging unit 18) are associated with each other and stored in a single file. In other words, the encrypted audio signal and the video signal are stored in the main body area of the special file described above. In connection with the privacy protection applied to the audio signal, some processing may also be applied to the image data stored in the special file (for example, mosaic processing may be applied to face regions before the data is saved in the special file).

The display unit 20 displays either the image currently being acquired by the imaging unit 18 or the image represented by the video signal recorded on the memory card 13.

Note that the audio signal processing unit 16 and the reproduction output unit 17 may be omitted from the imaging apparatus 7.

<< Seventh Embodiment >>
Next, a seventh embodiment will be described. The seventh embodiment describes a method of protecting privacy for audio signals that are not intended to be published via the communication network 4 or the like.

First, a method of protecting privacy among a plurality of recording/reproducing apparatuses will be described. Assume that there are two recording/reproducing apparatuses 6 and 6a, as shown in FIG. 18. The recording/reproducing apparatus 6 in FIG. 18 is the same as that shown in FIG. 16. The recording/reproducing apparatus 6a has the same configuration as the recording/reproducing apparatus 6. Because its internal block diagram is the same as that of FIG. 16, a separate illustration is omitted.

Each recording/reproducing apparatus, including the recording/reproducing apparatuses 6 and 6a, is assigned a unique code that differs from apparatus to apparatus. The unique code is, for example, a serial number assigned to each recording/reproducing apparatus. For example, when each recording/reproducing apparatus is shipped, its unique code is stored in a nonvolatile memory (not shown) provided inside that apparatus (6 or 6a). Here, the unique code assigned to the recording/reproducing apparatus 6 is called the first unique code, and the unique code assigned to the recording/reproducing apparatus 6a is called the second unique code. As is clear from the above, the first and second unique codes differ from each other.

In this embodiment, the authority management information of a special file includes an authentication code, and the user cannot change the authentication code. Each recording/reproducing apparatus (6, 6a) writes the unique code assigned to itself into the header area of the special file as the authentication code.

Also in this embodiment, browsing software for special files is installed on each recording/reproducing apparatus (6, 6a), and each apparatus reads the data in a special file through that browsing software (it is assumed that the data in a special file cannot be read without the browsing software). The browsing software runs on the main control unit 14 of FIG. 16.

Now consider the case where the recording/reproducing apparatus 6 acquires an original audio signal and a special file 430 corresponding to that signal is saved on the memory card 13 of the recording/reproducing apparatus 6 (see FIG. 18). The special file 430 has the same data structure as the special file 300 of FIG. 2, but the first unique code assigned to the recording/reproducing apparatus 6 is written in its header area as the authentication code.

The browsing software compares the authentication code written in the header area of the special file 430 with the unique code of the recording/reproducing apparatus on which the browsing software is installed. Only when the two are confirmed to match does it decrypt the encrypted audio signal in the special file 430 based on the decryption information and cause the resulting original audio signal to be reproduced and output by that apparatus. The collation unit, which compares the authentication code with the unique code, and the determination unit, which decides from the comparison result whether to permit or prohibit decryption of the encrypted audio signal and reproduction output of the original audio signal, are both implemented by the main control unit 14.

Specifically, the recording/reproducing apparatus 6 operates as follows. To reproduce and output the audio signal corresponding to the special file 430, the user of the recording/reproducing apparatus 6 performs an operation on the operation unit 15 instructing reproduction output. In response, the browsing software (that is, the main control unit 14 of the recording/reproducing apparatus 6) reads the authentication code written in the header area of the special file 430 from the memory card 13, reads the unique code assigned to the recording/reproducing apparatus 6 from the nonvolatile memory or the like, and compares the two. In this case they match, so the browsing software permits decryption of the encrypted audio signal and reproduction output of the original audio signal. Accordingly, the audio signal processing unit 16 of the recording/reproducing apparatus 6 decrypts the encrypted audio signal in the special file 430, and the resulting original audio signal is reproduced and output from the reproduction output unit 17 of the recording/reproducing apparatus 6.

On the other hand, when the memory card 13 storing the special file 430 is inserted into the recording/reproducing apparatus 6a and the user of that apparatus attempts to reproduce and output the audio signal corresponding to the special file 430, the recording/reproducing apparatus 6a operates as follows. When an operation instructing reproduction output is performed on the recording/reproducing apparatus 6a, the browsing software (that is, the main control unit 14 of the recording/reproducing apparatus 6a) reads the authentication code written in the header area of the special file 430 from the memory card 13, reads the unique code assigned to the recording/reproducing apparatus 6a from the nonvolatile memory or the like, and compares the two. In this case they do not match, so the browsing software prohibits decryption of the encrypted audio signal and reproduction output of the original audio signal. In that case, for example, the encrypted audio signal is supplied as is to the reproduction output unit 17 of the recording/reproducing apparatus 6a, so that the user of the recording/reproducing apparatus 6a cannot make out the human voice in the voice sections. Alternatively, no audio signal at all may be supplied to the reproduction output unit 17.
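The collation and permit/prohibit decision described for the recording/reproducing apparatuses 6 and 6a might be sketched as below. The class `SpecialFile`, the function `audio_for_playback`, the serial-number strings, and the XOR "cipher" are all hypothetical stand-ins chosen for illustration; the specification does not define a concrete file layout or encryption scheme.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class SpecialFile:
    auth_code: str          # unique code of the recording device (header area)
    encrypted_audio: bytes  # encrypted audio signal (main body area)


def audio_for_playback(
    special_file: SpecialFile,
    device_code: str,
    decrypt: Callable[[bytes], bytes],
    mute_on_mismatch: bool = False,
) -> Optional[bytes]:
    """Return what is handed to the reproduction output unit."""
    if special_file.auth_code == device_code:
        # Codes match: decryption and reproduction of the original are permitted.
        return decrypt(special_file.encrypted_audio)
    # Codes differ: either play the encrypted signal as is, or output nothing.
    return None if mute_on_mismatch else special_file.encrypted_audio


def toy_cipher(data: bytes) -> bytes:
    """Toy stand-in for encryption/decryption; XOR is its own inverse."""
    return bytes(b ^ 0x5A for b in data)


recorded = SpecialFile(auth_code="SN-0001", encrypted_audio=toy_cipher(b"hello"))
on_device_6 = audio_for_playback(recorded, "SN-0001", toy_cipher)
on_device_6a = audio_for_playback(recorded, "SN-0002", toy_cipher)
```

The `mute_on_mismatch` flag corresponds to the alternative of supplying no audio signal at all when the codes do not match.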

Of the recording/reproducing apparatuses 6 and 6a, it is preferable that only the recording/reproducing apparatus 6 be able to lift the above-described permit/prohibit control over decryption of the encrypted audio signal. The release is executed in the browsing software on the condition that the authentication code matches the unique code (and therefore cannot be executed on the recording/reproducing apparatus 6a). Once an operation instructing the release has been performed on the recording/reproducing apparatus 6, the original audio signal corresponding to the special file 430 can also be restored and reproduced on the recording/reproducing apparatus 6a.

Although the operation among a plurality of recording/reproducing apparatuses has been described, the same operation can be performed among a plurality of imaging apparatuses that each include a recording/reproducing apparatus.

Next, a privacy protection method on PCs will be described, again taking the special file 430 as an example. Assume that, as shown in FIG. 19, there is a personal computer 2a (hereinafter, "PC 2a") in addition to, and different from, the user PC 2. The PC 2a has the same configuration as the user PC 2. Because its internal block diagram is the same as that of FIG. 7, a separate illustration is omitted. The special file 430 is provided to both the user PC 2 and the PC 2a.

In this embodiment, dedicated software for special files is installed on the user PC 2 and the PC 2a, and reproduction devices (audio signal reproducing apparatuses) capable of accessing special files, including the user PC 2 and the PC 2a, can read and edit the data in a special file only through that dedicated software. The program constituting the dedicated software is stored in the memory 48 of FIG. 7, which consists of a hard disk or the like, and the dedicated software runs on the main control unit 41.

A PC registration code can be registered in the dedicated software. Registration is performed when the dedicated software is installed on a reproduction device (for example, the user PC 2 or the PC 2a). When installing the dedicated software on the user PC 2, the user of the recording/reproducing apparatus 6 and the user PC 2 supplies the first unique code, assigned to the recording/reproducing apparatus 6, to the user PC 2, thereby registering the first unique code as the PC registration code in the dedicated software on the user PC 2. Likewise, when installing the dedicated software on the PC 2a, the user of the recording/reproducing apparatus 6a and the PC 2a supplies the second unique code, assigned to the recording/reproducing apparatus 6a, to the PC 2a, thereby registering the second unique code as the PC registration code in the dedicated software on the PC 2a. The first unique code is not known to the user of the recording/reproducing apparatus 6a and the PC 2a.

The dedicated software compares the authentication code written in the header area of the special file 430 (the first unique code in this example) with the PC registration code registered in that dedicated software. Only when the two are confirmed to match does it decrypt the encrypted audio signal in the special file 430 and cause the resulting original audio signal to be reproduced and output from the reproduction output unit of the reproduction device on which the dedicated software is installed. Decryption of the encrypted audio signal is performed, for example, by the audio signal processing unit 42 (decryption processing unit) of FIG. 7. The collation unit, which compares the authentication code with the PC registration code, and the determination unit, which decides from the comparison result whether to permit or prohibit decryption of the encrypted audio signal and reproduction output of the original audio signal, are both implemented by the main control unit 41.

Specifically, the user PC 2 operates as follows. To reproduce and output the audio signal corresponding to the special file 430, the user of the user PC 2 performs an operation on the PC operation keys 47 instructing reproduction output. In response, the dedicated software (that is, the main control unit 41 of the user PC 2) compares the authentication code written in the header area of the special file 430 with the PC registration code registered in the dedicated software on the user PC 2. In this case they match, so the dedicated software permits decryption of the encrypted audio signal and reproduction output of the original audio signal. Accordingly, the encrypted audio signal in the special file 430 is decrypted on the user PC 2, and the resulting original audio signal is reproduced and output from the reproduction output unit 46 of the user PC 2. Editing of the original audio signal is also permitted on the user PC 2.

On the other hand, when the user of the PC 2a attempts to reproduce and output the audio signal corresponding to the special file 430, the PC 2a operates as follows. The user of the PC 2a performs an operation on the PC operation keys 47 of the PC 2a instructing reproduction output. In response, the dedicated software (that is, the main control unit 41 of the PC 2a) compares the authentication code written in the header area of the special file 430 with the PC registration code registered in the dedicated software on the PC 2a. In this case they do not match, so the dedicated software prohibits decryption of the encrypted audio signal and reproduction output of the original audio signal. In that case, for example, the encrypted audio signal is supplied as is to the reproduction output unit 46 of the PC 2a, so that the user of the PC 2a cannot make out the human voice in the voice sections. Alternatively, no audio signal at all may be supplied to the reproduction output unit 46.

Of the user PC 2 and the PC 2a, it is preferable that only the user PC 2 be able to lift the above-described permit/prohibit control over decryption of the encrypted audio signal. The release is executed in the dedicated software on the condition that the authentication code matches the PC registration code (and therefore cannot be executed on the PC 2a). Once an operation instructing the release has been performed on the user PC 2, the original audio signal corresponding to the special file 430 can also be restored and reproduced on the PC 2a.

With the processing described above, the original audio signal is not reproduced and output anywhere other than on the recording/reproducing apparatus 6 or the user PC 2. Privacy is therefore protected even if the memory card 13 is lost or stolen, or if the special file 430 leaks over the network through a computer virus, misuse of file-sharing software, or the like.

<< Eighth Embodiment >>
The special files generated in the embodiments above may be copied (duplicated) to recording media other than the memory card 13, and a mechanism for protecting privacy can also be introduced for such copies. In this embodiment, a special file can be copied only through specific duplication software. The duplication software is, for example, installed on the user PC 2 and runs on the user PC 2.

In this embodiment, the authority management information of a special file includes authority level information, and the user cannot change the authority level information. The user can attach an authentication code, serving as a password, to a special file created on the recording apparatus 1 of FIG. 1, the recording/reproducing apparatus 6 of FIG. 16, or the imaging apparatus 7 of FIG. 17. The code is attached on the recording apparatus 1, the recording/reproducing apparatus 6, the imaging apparatus 7, or the user PC 2. FIG. 20 shows the data structure of the special file 500 after the code has been attached. Attaching an authentication code is optional.

The header area of the special file 500 stores voice section information, decryption key information, and authority management information, and the authority management information includes the authority level information and the authentication code described above. The main body area of the special file 500 stores the data of the encrypted audio signal. The authority level information indicates one of several graded authority levels. Here, the authority level information takes the value "1", "2", or "3", so that authority levels are classified into three grades; the number of grades may of course be other than three. The authority level is highest when the authority level information is "1" and becomes lower as the value increases.

The authority level information is initially "1". That is, when the special file 500 is created and saved on the memory card 13, "1" is written as the authority level information.

When the special file 500 is duplicated using the duplication software, only the authority level information is changed from that of the source special file. Specifically, the value of the authority level information in the destination special file is made larger by one or more than the value in the source special file. For example, when the authority level information in the source special file is "1", that in the destination special file is set to "2", and when the authority level information in the source special file is "2", that in the destination special file is set to "3". However, if the user performing the duplication supplies the above authentication code to the device on which the duplication software is installed (for example, the user PC 2) before duplicating, the authority level information is not changed.
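A minimal sketch of this copy rule follows. The names `Header`, `copy_header`, and `LOWEST_LEVEL` are hypothetical. The text does not say what happens when a file already at level "3" is copied, so clamping at the lowest level is an assumption made here, as is the fixed increment of exactly one.

```python
from dataclasses import dataclass, replace
from typing import Optional

LOWEST_LEVEL = 3  # assumption: three grades, "1" highest and "3" lowest


@dataclass(frozen=True)
class Header:
    authority_level: int = 1         # "1" is written when the file is created
    auth_code: Optional[str] = None  # optional password attached by the user


def copy_header(src: Header, supplied_code: Optional[str] = None) -> Header:
    """Return the header of the copy made by the duplication software."""
    if src.auth_code is not None and supplied_code == src.auth_code:
        # Correct authentication code supplied: the level is kept unchanged.
        return replace(src)
    # Otherwise lower the authority by one grade (the number grows by one).
    # Clamping at LOWEST_LEVEL is an assumption, not stated in the text.
    return replace(src, authority_level=min(src.authority_level + 1, LOWEST_LEVEL))


original = Header(authority_level=1, auth_code="family-secret")
first_copy = copy_header(original)
second_copy = copy_header(first_copy)
trusted_copy = copy_header(original, supplied_code="family-secret")
```

Because the header is immutable in this sketch, each copy gets a new `Header` object and the source file's level is never altered by duplication.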

In this embodiment, reproduction devices (audio signal reproducing apparatuses) capable of accessing the special file 500, including the recording/reproducing apparatus 6, the imaging apparatus 7, and the user PC 2, perform reproduction control according to the authority level information. To make the description concrete, consider the case where the reproduction device is the user PC 2.

For example, when instructed to reproduce and output the audio signal corresponding to a special file, the user PC 2 refers to the authority level information in the special file 500 supplied to it and then operates as follows according to that information.

When the authority level information is "1", the user PC 2 decrypts the encrypted audio signal in the special file 500 based on the decryption information in the special file 500 and reproduces and outputs the resulting original audio signal from the reproduction output unit 46 of the user PC 2.

When the authority level information is "2", the user PC 2 restores the original audio signal from the encrypted audio signal based on the decryption information, then applies predetermined processing to the original audio signal in the voice sections, as identified by the voice section information, to create a first processed audio signal in which the speaker in the voice sections cannot be identified. This first processed audio signal is the same as that described in the third embodiment and can be created by the method described there (see FIG. 11(b)). The first processed audio signal is then reproduced and output from the reproduction output unit 46 of the user PC 2.

When the authority level information is "3", the user PC 2 reads the encrypted audio signal from the special file 500 and, based on the encrypted audio signal and the voice section information, creates a second processed audio signal in which the utterance content in the voice sections cannot be identified. This second processed audio signal is the same as that described in the third embodiment and can be created by the method described there (see FIG. 11(c)). The second processed audio signal is then reproduced and output from the reproduction output unit 46 of the user PC 2.
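The three cases above amount to a dispatch on the authority level information. The sketch below only selects which signal is to be reproduced; the enum `PlaybackMode` and the function `playback_mode` are hypothetical names, and the actual decryption and processing steps are left out.

```python
from enum import Enum


class PlaybackMode(Enum):
    ORIGINAL = "original audio signal"                # level 1
    SPEAKER_MASKED = "first processed audio signal"   # level 2: speaker not identifiable
    CONTENT_MASKED = "second processed audio signal"  # level 3: content not identifiable


_MODES = {
    1: PlaybackMode.ORIGINAL,
    2: PlaybackMode.SPEAKER_MASKED,
    3: PlaybackMode.CONTENT_MASKED,
}


def playback_mode(authority_level: int) -> PlaybackMode:
    """Map the authority level information to the signal that is reproduced."""
    try:
        return _MODES[authority_level]
    except KeyError:
        raise ValueError(f"unknown authority level: {authority_level}") from None


mode_for_second_copy = playback_mode(3)
```

Rejecting unknown levels with an error, rather than falling back to the original signal, is a design choice of this sketch so that a malformed header never unlocks more than intended.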

A recording medium containing a duplicated special file is first handed to acquaintances or family members, but may later reach unspecified persons. With this in mind, the authority level information is changed each time a copy is made, as described above (that is, the authority level for reproduction is lowered). Ultimately, the speech in a duplicated special file can no longer be reproduced. As a result, the likelihood that voice information circulates among unspecified persons is reduced, and privacy is protected. In addition, using the authentication code as a password makes it possible to copy the file while keeping its authority level, so that the same special file can be distributed to acquaintances, family members, and the like.

<< Modifications, etc. >>
The specific numerical values given in the description above are merely examples and can, of course, be changed to various other values. Notes 1 to 4 below are given as modifications of, or remarks on, the embodiments described above. The contents of the notes may be combined in any way as long as no contradiction arises.

[Note 1]
In the embodiments above, encryption is applied only to the original audio signal in voice sections and not to the original audio signal in non-voice sections, but encryption may instead be applied to the original audio signal in both voice and non-voice sections. Even in that case, for example, the original audio signal as recorded is reproduced and output on the reproduction device for non-voice sections regardless of the authority management information.

[Note 2]
An example has been described in which the speech section information is used when decrypting the encrypted audio signal, but this information can also be used for fast-forward playback and the like. For example, after the original audio signal is decrypted from the encrypted audio signal, the signal in the speech sections is cut out of the entire original audio signal on the basis of the speech section information, and only the cut-out signal is reproduced and output; this achieves fast-forward playback. As another example, the cut-out signal can be stretched so that it extends over the non-speech sections, allowing the speech to be played back slowly.
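The fast-forward case can be illustrated with a short sketch. It assumes the decrypted original audio signal is a sequence of samples and that the speech section information is a list of `(start, end)` sample-index pairs with an exclusive end; both representations and the function name are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def extract_speech(samples: Sequence[int],
                   speech_sections: List[Tuple[int, int]]) -> List[int]:
    """Keep only the samples inside the speech sections, in their original order."""
    output: List[int] = []
    for start, end in speech_sections:
        # Non-speech samples between sections are skipped entirely.
        output.extend(samples[start:end])
    return output
```

Playing back only the returned samples skips every non-speech section, which shortens the total playback time in the way the fast-forward example describes.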

[Note 3]
The recording device 1 of FIG. 1, the recording/reproducing device 6 of FIG. 16, and the imaging device 7 of FIG. 17 can each be realized by hardware or by a combination of hardware and software. In particular, the functions of the main control unit 14 and of the audio signal processing units 12 and 16 can be realized by hardware, software, or a combination of hardware and software.

The user PC 2 of FIG. 7 and the server 3 of FIG. 8 can likewise be realized by hardware or by a combination of hardware and software. In particular, the function of the main control unit 41 of FIG. 7 and the function of the main control unit 61 of FIG. 8 can be realized by hardware, software, or a combination of hardware and software. A block diagram of a part realized by software represents a functional block diagram of that part.

[Note 4]
The following interpretation, for example, is possible. The main control unit 14 shown in FIG. 1 and elsewhere functions as recording control means that records the special file on the memory card 13. In the user PC 2 of FIG. 7, the card slot 44 can function as file input means for receiving the special file. The special file can, of course, also be provided to the user PC 2 by wired or wireless communication. The special file described above is, of course, an electronic file, and it can store image data as well as audio signal data.

FIG. 1 is an internal block diagram of an audio signal recording device according to a first embodiment of the present invention.
FIG. 2 is a diagram showing the data structure of the special file stored on the memory card of FIG. 1.
FIG. 3 is an internal block diagram of the audio signal processing unit of FIG. 1 according to a second embodiment of the present invention.
FIG. 4 is a diagram for explaining the operation of the speech/non-speech discrimination unit of FIG. 3.
FIG. 5 is a diagram for explaining the operation of the speech/non-speech discrimination unit of FIG. 3.
FIG. 6 is an overall configuration diagram of a file management system according to a third embodiment of the present invention.
FIG. 7 is a schematic internal block diagram of the user PC of FIG. 6.
FIG. 8 is a schematic internal block diagram of the server of FIG. 6.
FIG. 9 is a diagram showing an example of the content of the public management relationship information that defines the relationship between the user PC of FIG. 6 and each viewing PC.
FIG. 10 is a diagram showing an example of the content of the public level information in the special file stored on the server HDD of the server of FIG. 8.
FIG. 11 is a diagram showing the audio signals transmitted from the server of FIG. 6 to each viewing PC.
FIG. 12 is a block diagram of a speaker recognition unit according to a fourth embodiment of the present invention.
FIG. 13 is a schematic diagram of an original audio signal according to the fourth embodiment of the present invention.
FIG. 14 is a diagram showing the data structure of a special file according to the fourth embodiment of the present invention.
FIG. 15 is a diagram showing examples of the content of the public level information for registered speakers (a) and the public level information for non-registered speakers (b) according to the fourth embodiment of the present invention.
FIG. 16 is an internal block diagram of an audio signal recording/reproducing device according to a fifth embodiment of the present invention.
FIG. 17 is an internal block diagram of an imaging device according to a sixth embodiment of the present invention.
FIG. 18 is a diagram showing two audio signal recording/reproducing devices according to a seventh embodiment of the present invention.
FIG. 19 is a diagram showing two PCs (personal computers) according to the seventh embodiment of the present invention.
FIG. 20 is a diagram showing the data structure of a special file according to an eighth embodiment of the present invention.

Explanation of symbols

1 Audio signal recording device
2 User PC
3 Server
6 Audio signal recording/reproducing device
7 Imaging device
101, 102, 103 Viewing PC

Claims

1. An audio signal recording device comprising:
speech section detecting means for detecting, from the entire section of an input original audio signal, a speech section including a signal component of human speech;
encryption means for generating an encrypted audio signal from the original audio signal by performing an encryption process on the signal in the speech section of the original audio signal; and
recording control means for recording, in recording means, an electronic file in which the encrypted audio signal and decryption information for decrypting the encrypted audio signal are stored in association with each other.

2. The audio signal recording device according to claim 1, wherein the recording control means further stores, in the electronic file and in association with the encrypted audio signal and the decryption information, authority management information for controlling switching between permission and prohibition of decrypting the encrypted audio signal according to the decryption information and reproducing and outputting the original audio signal obtained by the decryption.

3. The audio signal recording device according to claim 1 or 2, wherein the decryption information includes speech section information indicating which part of the entire section is the speech section.

4. The audio signal recording device according to claim 1, wherein
the speech section is composed of a plurality of mutually different element sections,
the audio signal recording device further comprises speaker recognition means for determining, for each element section, whether the speaker of the speech matches a registered speaker registered in advance,
the decryption information includes speech section information that indicates which part of the entire section is the speech section and that also indicates the determination result of the speaker recognition means for each element section,
the recording control means further stores, in the electronic file and in association with the encrypted audio signal and the decryption information, authority management information for controlling switching between permission and prohibition of reproducing and outputting the original audio signal of each element section obtained by decrypting the encrypted audio signal of that element section according to the decryption information, and
the authority management information separately includes first authority management information for element sections including a signal component of the speech of the registered speaker and second authority management information for the other element sections.

5. An audio signal recording/reproducing device comprising the audio signal recording device according to claim 2, wherein
the authority management information includes an authentication code,
a unique code that differs between the audio signal recording/reproducing device and any other audio signal recording/reproducing device is assigned in advance to the audio signal recording/reproducing device, and
the audio signal recording/reproducing device comprises:
reproduction output means for reproducing and outputting an audio signal;
decryption processing means for decrypting the encrypted audio signal based on the decryption information;
collating means for collating the authentication code with the unique code of the audio signal recording/reproducing device; and
determining means for determining, based on the collation result of the collating means, whether to permit the reproduction output means to reproduce and output the original audio signal obtained by the decryption of the decryption processing means,
the decryption processing means and the reproduction output means being controlled according to the determination result of the determining means.

6. An imaging device that acquires an image corresponding to a subject, the imaging device comprising the audio signal recording device according to any one of claims 1 to 4 or the audio signal recording/reproducing device according to claim 5.

7. An electronic file in which data of an encrypted audio signal, obtained by performing an encryption process on the signal in a speech section of an original audio signal, the speech section including a signal component of human speech, and data of decryption information for decrypting the encrypted audio signal are stored in association with each other.

8. The electronic file according to claim 7, wherein data of authority management information for controlling switching between permission and prohibition of decrypting the encrypted audio signal according to the decryption information and reproducing and outputting the original audio signal obtained by the decryption is further stored in association with the data of the encrypted audio signal and the data of the decryption information.

9. An information providing device that receives the electronic file according to claim 2 or 8 from a provider device and that, in response to a transmission request from a terminal device having a predetermined relationship with the provider device and including reproduction output means for reproducing and outputting an audio signal, transmits information based on the electronic file to the terminal device via a communication network, the information providing device comprising:
decryption processing means for decrypting the encrypted audio signal in the electronic file based on the decryption information in the electronic file,
wherein, when the transmission request is received, the information providing device determines, based on the authority management information in the electronic file and on the relationship, whether to permit transmission to the terminal device of the original audio signal obtained by the decryption of the decryption processing means, and controls the content transmitted to the terminal device according to the determination result.

10. A terminal device comprising reproduction output means for reproducing and outputting an audio signal, wherein the terminal device receives information based on the electronic file from the information providing device according to claim 9 via a communication network, and the reproduction output means reproduces and outputs an audio signal based on the received information.

11. An audio signal reproduction device comprising file input means for receiving the electronic file according to claim 2 and reproduction output means for reproducing and outputting an audio signal, wherein
the authority management information in the electronic file includes an authentication code, and
the audio signal reproduction device further comprises:
decryption processing means for decrypting the encrypted audio signal in the electronic file based on the decryption information in the electronic file;
collating means for collating the authentication code with a code registered in the audio signal reproduction device; and
determining means for determining, based on the collation result of the collating means, whether to permit the reproduction output means to reproduce and output the original audio signal obtained by the decryption of the decryption processing means,
the decryption processing means and the reproduction output means being controlled according to the determination result of the determining means.

12. A method of recording an electronic file, wherein data of an encrypted audio signal, obtained by performing an encryption process on the signal in a speech section of an original audio signal, the speech section including a signal component of human speech, and data of decryption information for decrypting the encrypted audio signal are recorded in association with each other.