JP5521414B2 - Recording device

Info

Publication number: JP5521414B2
Application number: JP2009161597A
Authority: JP
Inventors: 秀樹俊成; 博久青柳
Original assignee: Nakayo Telecommunications Inc
Current assignee: Nakayo Telecommunications Inc
Priority date: 2009-07-08
Filing date: 2009-07-08
Publication date: 2014-06-11
Anticipated expiration: 2029-07-08
Also published as: JP2011019034A

Description

The present invention relates to an audio recording device, and more particularly to a recording device that detects when each conference participant starts and stops speaking, records index information together with the audio information, and, during playback, extracts the specified index information to play back the speech (audio information) of a particular conference participant.

Playback control methods in conventional recording devices are described in the following documents.
Patent Document 1 discloses a method in which a synthesized audio file containing the mixed audio information of multiple speakers (remote parties) is converted into individual audio information files and then played back.

Patent Document 2 discloses a method in which the characteristics of each speaker are recognized by speech recognition processing from a synthesized audio file containing the mixed audio information of multiple speakers (remote parties), and a specific speaker is extracted and played back.

Patent Document 3 discloses a method of recording each conference participant's speech as independent audio information in an individual audio information file.

JP 2007-256498 A
JP 2007-298876 A
JP H09-097220 A

However, in the method of Patent Document 1, the necessary information must be extracted from the mixed audio data file by a personal computer or the like and converted into individual audio information files, so it takes effort before the recorded speech of the desired conference participant can be played back.
In Patent Document 2, because speech recognition is performed, the playback device must be equipped with a high-performance CPU, which drives up the cost of the playback device.

Furthermore, in Patent Document 3, because a separate individual audio information file is recorded for each conference participant, a large amount of storage memory must be provided.
An object of the present invention is to provide a recording device that can play back the speech of a specific conference participant with simple control.

Accordingly, in the present invention, input sound detection means for detecting the presence or absence of speech is provided at the point where each conference participant's audio information is input. When the mixed audio information is recorded, information identifying the speaker and the start and end of each utterance are recorded along with it as an index. During playback, the utterances of the conference participant designated by playback target designation means are searched for, selected, and played back on the basis of the recorded index.

The above object can be achieved by a recording device comprising: specific-speaker sound detection means for detecting sound in an audio signal input from a specific audio input unit, among a plurality of audio input units, that corresponds to a specific speaker; non-specific-speaker sound detection means for detecting sound in the audio signal input from each of the audio input units other than that of the specific speaker; and mixed audio recording means for mixing and recording the audio from the plurality of audio input units, wherein the mixed audio recording means records delimiter information indicating a boundary of a recording section each time the specific-speaker sound detection means detects sound, and the position indicating the end of speech in the recording section indicated by the delimiter information is the position at the time when the non-specific-speaker sound detection means no longer detects sound in any of the audio signals input from the audio input units other than that of the specific speaker.

According to the present invention, a recording device that can play back the speech of a specific conference participant with simple control can be provided.

A functional block diagram illustrating the configuration of the conference telephone device.
A diagram illustrating the recorded contents of the audio information recording memory.
A diagram illustrating the delimiter information.
A diagram illustrating the speaking flag.
A flowchart illustrating the operation of the recording control unit.
A diagram illustrating the playback target management memory.
A flowchart illustrating the operation of the playback control unit.
A diagram illustrating the delimiter information (part 2).

Embodiments of the present invention are described in detail below by way of examples with reference to the drawings. Substantially identical parts are given the same reference numerals, and their description is not repeated.

Referring to FIG. 1, the connection between a conference telephone device with a recording function and the telephone network is described. In FIG. 1, the conference telephone device 30 is connected to three remote terminals 10 via the telephone network 20. The conference telephone device 30 comprises a line interface unit 301, a call control unit 302, three call circuits 31, sound detection units 32 connected one-to-one to the call circuits 31, an audio mixing unit 314, a recording control unit 325, a delimiter information generation unit 326, an audio information recording memory 315, a playback control unit 327, a speaker 332, an operation unit 331, a playback call circuit 317, and a playback target designation operation detection unit 328.

The line interface unit 301 connects the conference telephone device 30 to the telephone network 20. The call control unit 302 controls incoming calls, answering, and call termination. Each call circuit 31 relays audio signals between a remote terminal 10 and the audio mixing unit 314 via the line interface unit 301. The sound detection units 32-1 to 32-3 detect, for each of the call circuits 31-1 to 31-3, whether the audio signal input from the corresponding remote terminal 10-1 to 10-3 contains sound or silence.

The audio mixing unit 314 mixes (synthesizes) the audio signals input from remote terminals 10-1 and 10-2 and outputs the result to remote terminal 10-3 (call circuit 3). Likewise, it mixes the audio signals input from remote terminals 10-1 and 10-3 and outputs the result to remote terminal 10-2 (call circuit 2), and it mixes the audio signals input from remote terminals 10-2 and 10-3 and outputs the result to remote terminal 10-1 (call circuit 1).
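The per-terminal mixing described above, in which each terminal receives the sum of the other terminals' audio, can be sketched as follows. This is only an illustration under stated assumptions: the function name mix_minus, the representation of audio as lists of integer samples, and the absence of clipping or gain control are all hypothetical and are not taken from the specification.

```python
def mix_minus(frames):
    """Return, for each call circuit, the mix of every OTHER circuit's samples.

    frames: list of equal-length sample lists, one per call circuit
    (hypothetical representation; the specification does not define one).
    """
    # Sum of all circuits, sample by sample.
    total = [sum(samples) for samples in zip(*frames)]
    # Subtracting a circuit's own samples leaves the mix of the others.
    return [[t - s for t, s in zip(total, frame)] for frame in frames]


# Three circuits, two samples each (made-up values).
outputs = mix_minus([[1, 2], [10, 20], [100, 200]])
print(outputs[2])  # what terminal 10-3 would hear: circuits 1 and 2 mixed
```

The audio stored for recording would instead be the full sum of all circuits, which is the intermediate `total` list in this sketch.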

The audio information recording memory 315 records the synthesized audio obtained by mixing the audio signals input from all remote terminals 10, together with the delimiter information. The recording control unit 325 controls the recording (storage) in the audio information recording memory 315 of the conference audio information mixed by the audio mixing unit 314. The playback control unit 327 controls playback of the conference audio information stored in the audio information recording memory 315.

When a conference call starts, the recording control unit 325 enables the audio mixing unit 314 and stores in the audio information recording memory 315 the conference audio signal in which the audio signals input from all remote terminals 10 (call circuits 31) are mixed. At this time, whenever a sound detection unit 32 detects sound, the recording control unit 325 acquires a memory address (speech start position information) from the delimiter information generation unit 326 and, from the call control unit 302, the telephone number of the remote terminal (conference participant) corresponding to the call circuit on which sound was detected. The recording control unit 325 stores the memory address and the telephone number, associated with each other, in the audio information recording memory 315 as delimiter information (an index). Likewise, when it detects that a sound detection unit has changed from sound to silence, the recording control unit 325 generates delimiter information indicating the interruption of speech and stores it in the audio information recording memory 315.

Here, a change from sound to silence means that a transition from the sound state to the silent state has been detected and the silence has then continued for a fixed time (here, 1 second).
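One way to picture this rule is a detector that reports the start of speech immediately but reports the end only after silence has persisted for a hold time. The sketch below is an assumption-laden illustration: the class name SoundDetector, the amplitude threshold, and the frame-based timing (for example, 50 frames of 20 ms each for 1 second) are hypothetical; the specification states only the 1-second condition.

```python
class SoundDetector:
    """Hypothetical sound/silence detector with a silence hold time."""

    def __init__(self, threshold=500, hangover_frames=50):
        self.threshold = threshold              # assumed amplitude threshold
        self.hangover_frames = hangover_frames  # e.g. 50 x 20 ms = 1 second
        self.active = False                     # True while "sound" is reported
        self.silent_run = 0                     # consecutive silent frames

    def update(self, frame):
        """Feed one frame of samples; return 'start', 'stop', or None."""
        loud = max((abs(s) for s in frame), default=0) >= self.threshold
        if loud:
            self.silent_run = 0
            if not self.active:
                self.active = True
                return "start"
            return None
        if self.active:
            self.silent_run += 1
            if self.silent_run >= self.hangover_frames:
                # Silence has lasted the full hold time: report the change.
                self.active = False
                self.silent_run = 0
                return "stop"
        return None


det = SoundDetector(threshold=500, hangover_frames=3)
events = [det.update(f) for f in [[0], [900], [0], [0], [800], [0], [0], [0], [0]]]
print(events)
```

In this run the two-frame pause between the loud frames is shorter than the hold time, so it does not split the utterance; only the later three-frame silence produces a "stop".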

When the playback control unit 327 detects an instruction to play back conference audio information from the operation unit 331 or the playback target designation operation detection unit 328, it searches the audio information recording memory 315 for the delimiter information of the designated conference participant, starts playback from the extracted speech start position, and outputs the stored conference audio information via the speaker 332 or the playback call circuit 317. When the playback control unit 327 detects that the speech interruption position has been reached, it stops playback of that conference audio information.

The playback target designation operation detection unit 328 receives PB (DTMF) signals and the like from a remote terminal that is reading out the stored conference audio information, and detects instructions such as selecting the speaker to be played back, repeating playback, stopping, and ending.

Although the example shows connection to three remote parties through the call circuits 31-1 to 31-3, further call circuits and sound detection units may be added to connect to a larger number of remote parties.

Referring to FIG. 2, the recorded contents of the audio information recording memory are described. In FIG. 2A, the audio information recording memory 315 consists of an area 3151 that stores the mixed conference audio information and an area 3152 that stores the delimiter information 350.

In FIG. 2B, the delimiter information 350 consists of a speaker category 351, a speech start position 352, and a speech interruption position 353. The speaker category 351 records the speaker's telephone number. The speech start position 352 and the speech interruption position 353 record the memory addresses in the area 3151, which stores the mixed conference audio information, at which recording of the utterance started and was interrupted.

In the first record of a conference recording, the speaker category 351 records "recording start" and the speech start position 352 records the first memory address of the area 3151 that stores the conference audio information. In the last record of a conference recording, the speaker category 351 records "recording end" and the speech interruption position 353 records the last memory address of the area 3151.
Because participants may speak at the same time, the ranges between the speech start position 352 and the speech interruption position 353 recorded in the delimiter information 350 may overlap from one record to another.
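The record layout described for FIG. 2B might be modelled as below. This is a sketch only: the class name Delimiter, the marker strings "REC_START" and "REC_END", the telephone numbers, and all addresses are made-up illustrations, and the specification does not say how an unused position field is stored (None is used here as an assumption).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Delimiter:
    speaker: str                 # speaker category 351: phone number or marker
    start: Optional[int] = None  # speech start position 352 (memory address)
    stop: Optional[int] = None   # speech interruption position 353


# Hypothetical index for one short conference.
index = [
    Delimiter("REC_START", start=0),
    Delimiter("0311112222", start=100, stop=900),
    Delimiter("0333334444", start=700, stop=1500),  # overlaps the record above
    Delimiter("REC_END", stop=2000),
]


def overlaps(a, b):
    """True if two utterance records cover a common address range."""
    return a.start < b.stop and b.start < a.stop


print(overlaps(index[1], index[2]))
```

Because every record points into the single mixed recording in area 3151, overlapping records cost only a few extra index entries rather than a second copy of the audio.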

Referring to FIG. 3, the speaking flag is described. The recording control unit uses this flag, when storing delimiter information for the conference audio information in the audio information recording memory, to identify whether a speaker (remote terminal) is currently speaking. In FIG. 3, the speaking flag 360 consists of a speaker category 361 and a speaking identification flag 362. The speaker category 361 records the speaker's telephone number. Based on the detection result of the sound detection unit 32, the speaking identification flag 362 records "1" while sound is being detected and "0" while it is not.

Referring to FIG. 4, the processing of the recording control unit is described. In FIG. 4, the recording control unit 325 waits for a conference call to start (S101). On recognizing that a conference call has started (YES), the recording control unit 325 acquires memory position information from the delimiter information generation unit 326 (S102). It records delimiter information indicating the beginning of the recording in the audio information recording memory 315 (S103). It then enables the audio mixing unit 314 and starts recording the mixed conference audio information (S104).

The recording control unit 325 sets the counter i to 1 (S106) and determines whether the sound detection unit 32-i is detecting sound (S107). If sound is being detected (YES in S107), it determines whether the speaking flag of call circuit 31-i is set (S108). If the speaking flag is not set (NO in S108), it acquires the memory position and the telephone number (S109), records speech-start delimiter information (S111), and sets the speaking flag of call circuit 31-i (S112). It then increments the counter i (S113) and determines whether i has reached 4 (S114). If NO, it returns to step S107. If the speaking flag is already set (YES in S108), it proceeds directly to step S113.

If silence is being detected (NO in S107), the recording control unit 325 determines whether the speaking flag of call circuit 31-i is set (S116). If YES, it acquires the memory position and the telephone number (S117), records speech-interruption delimiter information (S118), resets the speaking flag of call circuit 31-i (S119), and proceeds to step S113. If NO in step S116, it proceeds directly to step S113.

If YES in step S114, the recording control unit 325 determines whether the conference call has ended (S121). If YES, it stops recording the mixed audio (S122), acquires the memory position (S123), records the recording-end delimiter position (S124), and ends. If NO in step S121, it returns to step S106.
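The polling loop of FIG. 4 can be approximated in a few lines. The sketch below is illustrative and rests on several assumptions not stated in the specification: detection results arrive as a list of (memory address, per-circuit booleans) ticks containing at least one tick, records are plain dictionaries, and a speech-interruption delimiter is written by filling in the stop field of that speaker's open record. The function name record_index and the marker strings are hypothetical.

```python
def record_index(ticks, numbers):
    """Build delimiter records from sound-detection results.

    ticks:   list of (memory_address, detected), where detected[i] is True
             while sound detection unit i reports sound (hypothetical input).
    numbers: telephone number associated with each call circuit.
    """
    index = [{"speaker": "REC_START", "start": ticks[0][0], "stop": None}]  # S103
    speaking = [False] * len(numbers)  # speaking flags (FIG. 3)
    for address, detected in ticks:
        for i, has_sound in enumerate(detected):       # loop of S106 to S114
            if has_sound and not speaking[i]:          # S107 YES, S108 NO
                index.append({"speaker": numbers[i], "start": address, "stop": None})
                speaking[i] = True                     # S112
            elif not has_sound and speaking[i]:        # S107 NO, S116 YES
                # Close this speaker's most recent open record (S117, S118).
                for rec in reversed(index):
                    if rec["speaker"] == numbers[i] and rec["stop"] is None:
                        rec["stop"] = address
                        break
                speaking[i] = False                    # S119
    index.append({"speaker": "REC_END", "start": None, "stop": ticks[-1][0]})  # S124
    return index


ticks = [
    (0, [False, False, False]),
    (10, [True, False, False]),
    (20, [True, True, False]),
    (30, [False, True, False]),
    (40, [False, False, False]),
]
result = record_index(ticks, ["A", "B", "C"])
for rec in result:
    print(rec)
```

The speaking flag is what keeps a continuing utterance from generating a new record on every pass: a record is written only when the detection result disagrees with the flag.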

Referring to FIG. 5, the contents of the playback position memory are described. The playback control unit uses this memory when it plays back the conference audio information stored in the audio information recording memory 315 and searches for the preceding or following utterance. In FIG. 5, the playback position memory 370 consists of a playing speaker category 371, a playback start position 372, and a playback interruption position 373, all of which are copied from a record of the delimiter information 350. Here, the second record of the delimiter information 350 in FIG. 2B is recorded. The playing speaker category 371 therefore records the telephone number of the speaker being played back, and the playback start position 372 and the playback interruption position 373 record the memory addresses in the area 3151 at which recording of the utterance being played back started and was interrupted.

Referring to FIG. 6, the operation flow of the playback control unit is described. In FIG. 6, the playback control unit 327 waits for a conference to be designated for playback (S201). On recognizing an instruction to play back recorded conference audio information (YES), it connects the playback call circuit 317 (S202) and resets the playback position memory 370 (S203). It then determines whether a playback instruction for speaker n has been detected (S204). If YES, it extracts the next delimiter information for speaker n (S206), stores that delimiter information in the playback position memory 370 (S207), and starts playback from the corresponding playback start position 372 (S208).

Following step S208, or if NO in step S204, the playback control unit 327 determines whether a rewind playback instruction has been detected (S209). If YES, it extracts the preceding delimiter information for the speaker in question (S211), stores that delimiter information in the playback position memory 370 (S212), and starts playback from the corresponding playback start position 372 (S213).

Following step S213, or if NO in step S209, the playback control unit 327 determines whether a playback stop instruction has been detected (S214). If YES, it stops playback of the audio information (S216).

Following step S216, or if NO in step S214, the playback control unit 327 determines whether the playback interruption position 373 has been reached (S217). If YES, it stops playback of the audio information (S218).

Following step S218, or if NO in step S217, the playback control unit 327 determines whether a playback end instruction has been detected (S219). If YES, it stops playback of the target audio information (S221), disconnects the playback call circuit 317 (S222), and ends. If NO in step S219, it returns to step S204.
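The search for a speaker's next or preceding utterance in FIG. 6 amounts to a lookup over the delimiter records. The sketch below shows one possible reading, with assumptions labelled: the index holds only utterance records as dictionaries with integer start and stop addresses, the current record plays the role of the playback position memory 370, the stop address is treated as exclusive, and the function names are hypothetical.

```python
def next_segment(index, speaker, current=None):
    """Next record for `speaker` after the current one (compare S206)."""
    after = -1 if current is None else current["start"]
    candidates = [r for r in index if r["speaker"] == speaker and r["start"] > after]
    return min(candidates, key=lambda r: r["start"], default=None)


def previous_segment(index, current):
    """Record preceding `current` for the same speaker (compare S211)."""
    candidates = [
        r for r in index
        if r["speaker"] == current["speaker"] and r["start"] < current["start"]
    ]
    return max(candidates, key=lambda r: r["start"], default=None)


def play(audio, segment):
    """Return the samples from start position up to the interruption position."""
    return audio[segment["start"]:segment["stop"]]


index = [
    {"speaker": "A", "start": 10, "stop": 30},
    {"speaker": "B", "start": 20, "stop": 40},
    {"speaker": "A", "start": 50, "stop": 60},
]
audio = list(range(100))              # stand-in for the mixed recording
first = next_segment(index, "A")      # speaker A's first utterance
second = next_segment(index, "A", first)
print(play(audio, first)[:3], second["start"])
```

No audio is separated or recognized at playback time; the player only seeks within the one mixed recording, which is the "simple control" the description aims for.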

According to the embodiment described above, a recording device that can play back the speech of a specific conference participant with simple control can be provided. Although a conference telephone device has been described as the embodiment, the invention is not limited to this and may simply be a recording device.

In Embodiment 1, the audio input from a specific terminal is played back in sequence. In Embodiment 2, by contrast, the call circuit of the conference organizer is identified, the recording is divided into sections by the organizer's speech, and the mixed audio is played back.

In FIG. 7, the delimiter information 350 consists of a speaker category 351, a speech start position 352, and a speech interruption position 353. Here, the speaker category 351 records the telephone number of the conference organizer. The speech start position 352 and the speech interruption position 353 record the memory addresses in the area 3151, which stores the mixed conference audio information, at which recording of the section started and was interrupted.

In the first record of a conference recording, the speaker category 351 records "recording start" and the speech start position 352 records the first memory address of the area 3151 that stores the conference audio information. In the last record of a conference recording, the speaker category 351 records "recording end" and the speech interruption position 353 records the last memory address of the area 3151.

Each record other than the recording-start and recording-end records is written when the conference organizer (the specific speaker) starts speaking: the speech start position 352 of that record is entered and, if a preceding record exists, so is the speech interruption position 353 of that preceding record. That is, the speech start position 352 records the position at which the single conference organizer started speaking, whereas the speech interruption position 353 may record either the point at which the speech of all conference participants was interrupted or the memory address one less than the organizer's next speech start position.
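The second option for the speech interruption position, one address before the organizer's next speech start, can be illustrated as follows. This is a sketch under assumptions: the function name organizer_segments is hypothetical, positions are plain integers, and the last section is assumed to close at the end-of-recording address.

```python
def organizer_segments(organizer_starts, recording_end):
    """Split a recording into sections that each begin when the organizer speaks.

    organizer_starts: ascending memory addresses where the organizer started
                      speaking (hypothetical input).
    recording_end:    last memory address of the recording.
    """
    segments = []
    for i, start in enumerate(organizer_starts):
        if i + 1 < len(organizer_starts):
            stop = organizer_starts[i + 1] - 1   # next start position minus 1
        else:
            stop = recording_end                 # assumed for the final section
        segments.append((start, stop))
    return segments


sections = organizer_segments([100, 400, 900], 1200)
print(sections)
```

With this variant the sections are contiguous, so playing them in order covers everything from the organizer's first utterance to the end of the recording, including other participants' replies.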

Referring again to FIG. 4, the processing of the recording control unit in Embodiment 2 is described in terms of its differences from Embodiment 1. Steps S106, S113, S114, S117, and S118 are skipped in Embodiment 2. In step S111, the recording control unit 325 records not only the speech-start delimiter information but also, if a preceding record exists, the speech-interruption delimiter information of that record. In step S124, it records the recording-end delimiter position (speech interruption position) and the speech interruption position of the preceding delimiter, and then ends.
According to Embodiment 2, the entire conference can be played back in order, and playback can be skipped forward or rewound in units of the conference organizer's utterances.

DESCRIPTION OF SYMBOLS: 10: remote terminal, 20: telephone network, 30: conference telephone device, 31: call circuit, 32: sound detection unit, 301: line interface unit, 302: call control unit, 314: audio mixing unit, 315: audio information recording memory, 317: playback call circuit, 325: recording control unit, 326: delimiter information generation unit, 327: playback control unit, 328: playback target designation operation detection unit, 331: operation unit, 332: speaker.

Claims

A recording device that records audio from a plurality of audio input units, comprising:
specific-speaker sound detection means for detecting sound in an audio signal input from a specific audio input unit, among the plurality of audio input units, that corresponds to a specific speaker; non-specific-speaker sound detection means for detecting sound in the audio signal input from each of the audio input units other than that of the specific speaker; and mixed audio recording means for mixing and recording the audio from the plurality of audio input units,
wherein the mixed audio recording means records delimiter information indicating a boundary of a recording section each time the specific-speaker sound detection means detects sound, and the position indicating the end of speech in the recording section indicated by the delimiter information is the position at the time when the non-specific-speaker sound detection means no longer detects sound in any of the audio signals input from the audio input units other than that of the specific speaker.

The recording device according to claim 1,
further comprising audio playback control means for controlling playback of the audio signal recorded by the mixed audio recording means,
wherein the audio playback control means starts playback from a position at which the delimiter information stored on detection by the specific-speaker sound detection means indicates the start of speech, and, each time it detects a position at which the delimiter information indicates the end of speech, has a function of stopping playback, skipping playback up to the next delimiter information, or returning to the position of the preceding delimiter information and playing the audio again.