JPS63201821A - Voice memorandum system

Info

Publication number: JPS63201821A
Application number: JP62033257A
Authority: JP
Inventors: Hiroshi Ichikawa; 市川　熹; Shigeru Yabuuchi; 薮内　繁
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1987-02-18
Filing date: 1987-02-18
Publication date: 1988-08-19

Abstract

PURPOSE: To allow parts of a recorded voice memo to be rearranged and edited, by providing voice input and recording/reproducing means and means for associating recorded voice with the display. CONSTITUTION: A document stored in an image memory 34 is displayed on an image display 36 through an image buffer 35 under the control of an image control processor 30. Voice input as a voice memo passes through the encoding section 24 of a codec to a voice detecting section 26 and an input voice buffer 27. A voice processing processor 20 examines the output of the detecting section 26 and stores the input voice signal held in the buffer 27 in a voice memory 29. A control processor 11 associates the image display with the voice memos and controls the whole system while communicating with the processors 20 and 30 through a voice interface section 23 and an image interface section 33. Input from a keyboard 13 and from a pointing device 14 is also managed by the processor 11 so that voice and image are associated with each other.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a voice memo system, and in particular to a system suitable for attaching voice memos to electronic documents.

[Conventional technology]

Conventional devices indicate with a mark the position where a voice memo has been attached, or display the length of the voice memo as a bar so that, during playback, the user can see how far playback has progressed and how much remains. Related devices of this type are described in, for example, Japanese Patent Application Laid-Open No. 61-13730 and Japanese Patent Application Laid-Open No. 60-248056.

[Problem that the invention seeks to solve]

When attaching a memo by voice, the speaker cannot, as in ordinary conversation, change what is said or adjust the level of explanation while watching the listener's response. Unless the content is organized properly in a reasonably logical order, the memo becomes very hard to listen to. A voice memo that has already been attached therefore needs to be re-edited into something easier to follow, by rephrasing parts of it, changing the order, or adding material, but the techniques above give no consideration to this point.

In addition, voice is a one-dimensional signal in time. Unlike an image, it cannot be scanned at a glance to find the desired information, so every voice memo has to be listened to in turn, which is inefficient. To address this, a method has been proposed in which voice memos are played back simultaneously, separated spatially by stereo techniques, in the hope of exploiting the cocktail party effect. This is not effective, however, because the cocktail party effect is small when voices of the same speaker are heard at the same time.

An object of the present invention is to provide means that make it easy to attach voice memos and that allow the voice memos that are needed to be retrieved efficiently.

[Means for solving problems]

The above object is achieved by providing a simple voice editing function for rephrasing or replacing part of a memo or changing its order, together with a function for efficiently playing back the main parts of each memo, such as its key sentences.

[Operation]

To achieve the above object, the present invention provides, in addition to voice input and recording/playback means, means for associating recorded voice with the display, and various pointing and function selection means, voice editing means and voice fast-forward playback means. The voice input, recording, and playback means are further provided with means for detecting voice segments. Providing a function that notifies the speaker, 1 to 2 seconds after the end of a voice segment is detected, that the end of the segment has been confirmed makes the system easier to use. Needless to say, using high-efficiency voice encoding and decoding means for recording and playback is advantageous in terms of memory capacity.

The editing means is combined with the voice input/recording/playback means, the means for associating voice with the display, the pointing and function selection means, and so on, to form an easy-to-use editing facility. When voice input mode or editing mode is selected with the function selection means, an editing screen is set up near the part of the document screen to which the voice memo is attached. The editing screen can be implemented as a so-called multi-window configuration. It displays the memo voice in units such as sentences and, in combination with the pointing means, provides editing functions such as inserting, reordering, and deleting voice and designating key sentences.

The voice fast-forward means plays back, one after another, only the main part of each voice memo attached to the document while displaying the playback position, and offers several playback modes. The first mode plays about 1 to 2 seconds from the head of each memo. The second likewise plays 1 to 2 seconds from the head, but at a fast speaking rate. The third and fourth play only the key sentences, from their heads, at normal speed and at a fast speaking rate respectively. By gradually lowering the playback level at the tail of each played sentence and overlapping the next sentence on it, natural and easy-to-follow fast-forward playback becomes possible.

Furthermore, a mark is normally displayed to show where in the document a voice memo is attached. When voice memos are not being used, display of these marks can be suppressed. During playback, the playback position can be indicated not by a mark but by changing the color of the corresponding part of the document or making it blink, so that voice and document are associated more directly and smoothly. Lowering the brightness of the display outside the target part, or slightly blurring it, so that the position of interest is indicated more naturally, is also highly effective for a natural association.

The pointing and function selection means may use a dedicated device such as a light pen, or a mouse, and needless to say the functions can also be implemented by menu selection.

By using these functions in combination, an effective and easy-to-use voice memo system can be realized.

[Example]

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram for explaining the present invention.

In FIG. 1, a document stored in image memory 34 is displayed on image display 36 via image buffer 35 under the control of image control processor 30. This configuration is implemented as an ordinary image terminal with a multi-window function. Image memory 34 stores the image information of each segment needed for multi-window display. The segments are composed into the display screen in image buffer 35 and output.

Voice input as a voice memo passes through the encoding section 24 of a codec and is fed to voice detection section 26 and input voice buffer 27. Voice processing processor 20 examines the output of voice detection section 26 and, when it judges that voice has been input, stores the input voice signal held in input voice buffer 27 in voice memory 29 via voice memory interface 28.

Reference numeral 21 denotes the memory section for the voice processing processor, in which programs and the like are stored. Voice detection section 26 computes the short-time power of the input voice and judges that voice is present when the short-time power exceeds a predetermined threshold and stays above it for at least a fixed time. Testing the power against a threshold over at least a fixed time is a well-known voice detection technique, so this section can be realized with existing technology.
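The threshold-and-duration test described for voice detection section 26 can be sketched as follows. This is a minimal illustration, not the patent's circuit: the frame length, the function names, and in particular the rule for ending a segment (power at or below the threshold for the same number of frames) are assumptions, since the text only states the condition for detecting the presence of voice.

```python
def short_time_power(samples, frame_len):
    """Mean squared amplitude of each consecutive, non-overlapping frame."""
    powers = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        powers.append(sum(x * x for x in frame) / frame_len)
    return powers


def detect_voice_segments(powers, threshold, min_frames):
    """Return (start_frame, end_frame) pairs, with end_frame exclusive.

    Voice starts once the power has stayed above `threshold` for
    `min_frames` frames. Assumption: it ends once the power has stayed
    at or below the threshold for `min_frames` frames.
    """
    segments = []
    in_voice = False
    run = 0          # length of the current run that contradicts the state
    seg_start = 0
    for i, p in enumerate(powers):
        above = p > threshold
        if above != in_voice:
            run += 1
            if run >= min_frames:
                boundary = i - run + 1   # first frame of the contradicting run
                if in_voice:
                    segments.append((seg_start, boundary))
                else:
                    seg_start = boundary
                in_voice = not in_voice
                run = 0
        else:
            run = 0
    if in_voice:
        segments.append((seg_start, len(powers)))
    return segments
```

Short bursts and short dropouts that do not last `min_frames` frames are ignored, which is the point of requiring the condition to hold for a fixed time.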

Voice memos are played back under the control of voice processing processor 20: the data is read from voice memory 29, passes through voice memory interface 28 and output voice buffer 22, and is output through the decoding section 25 of the codec.

The association between the image display and the voice memos, and overall control, are carried out by control processor 11, which communicates with voice processing processor 20 and image control processor 30 through voice interface section 23 and image interface section 33. The order in which the memos were created is also recorded in the correspondence table. Input from keyboard 13 and from pointing device 14 is likewise managed centrally by control processor 11, which associates it with voice and image in the same way. Reference numeral 12 denotes the memory section for control processor 11, and 31 the memory section for image control processor 30.

Memory section 12 also stores the correspondence tables and similar data that associate voice memos with parts of the image.

First, the case of attaching a voice memo to a specific location on the displayed screen is described with reference to FIG. 2.

Using pointing device 14, the user first selects the voice memo attachment function from the menu at the bottom of the screen and then specifies the position ■ at which the memo is to be attached. Voice memo editing screen B is then displayed.

A device already in wide use, such as a mouse, can serve as the pointing device. The display position of voice memo editing screen B is determined automatically according to where on the screen the memo is to be attached. This can be done by dividing the screen roughly into several regions and looking up the display position in a predetermined table according to the region that contains the memo position ■.
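The region lookup for placing editing screen B might look like the sketch below. The 2 x 2 grid, the table contents, and the function names are all hypothetical; the text only says that the screen is divided into several regions and that a predetermined table gives the display position for each.

```python
# Hypothetical table: where editing window B opens relative to the memo
# position, keyed by the (column, row) region containing that position.
# Regions in the upper half open the window below the position, and so on,
# so that the window stays near the position and on the screen.
WINDOW_PLACEMENT = {
    (0, 0): "below-right", (1, 0): "below-left",
    (0, 1): "above-right", (1, 1): "above-left",
}


def region_of(x, y, screen_w, screen_h, cols=2, rows=2):
    """Map a pixel position to its (column, row) region on a cols x rows grid."""
    col = min(x * cols // screen_w, cols - 1)
    row = min(y * rows // screen_h, rows - 1)
    return col, row


def editing_window_placement(x, y, screen_w, screen_h):
    """Look up the placement of the editing window for a memo at (x, y)."""
    return WINDOW_PLACEMENT[region_of(x, y, screen_w, screen_h)]
```

A table keeps the placement policy separate from the lookup code, so a finer grid or different placements only require changing the table.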

As shown in the figure, editing screen B displays bars enclosed by broken lines (auxiliary marks) and bars enclosed by solid lines (voice recording marks), arranged alternately in a horizontal row. This screen is stored as an image segment separate from the original document data, at a different location in the image memory from the original document, and is composed with it in the screen buffer and displayed as a single screen. When memo voice is input, the inside of one solid-line bar is filled in for each voice segment (normally one spoken sentence). Control processor 11 creates a correspondence table between the position on the screen of the data to which the memo is attached and the voice memo, stores the table in memory 12, and instructs voice processing processor 20 to store the input voice in voice memory 29. As shown in FIG. 3, this correspondence table also holds order and position information for each of the voice sentences that make up the memo.

An example of the correspondence table is described with reference to FIG. 3. Table (A) shows the state of the voice memos on a screen as a whole, and table (B) shows the state of an individual voice memo. In (A), the voice memos displayed on the screen are given serial numbers and the information is arranged in that order. The entry for each voice memo consists of a pointer indicating the number of the voice memo that comes next, and a correspondence table pointer to the table (B) that is prepared for each voice memo and describes its state. The entry for voice memo number 0 holds pointer information P0, which is the head information for the start of the voice memos, and the number of memos n.

When voice memo number i is selected during editing or a similar operation, the correspondence table pointer ti selects the individual voice memo information table (B) for that memo.

The individual voice memo information table (B) consists of the voice memo number i; the creation order number of the voice memo (its order of creation within table (A), which equals the memo creation order on that screen); information on the location in voice memory 29 where the voice data is actually stored; information linking the memo to the position of the image in image memory 34, which is used to associate the memo with the position on the screen where it is attached; the number N of voice sentences that make up the voice memo; and information on each voice sentence.

The information on each voice sentence consists of the head position at which the voice is actually stored (this information may instead be kept in the voice processing section), an importance mark indicating whether the sentence is a key sentence, and a pointer indicating which voice sentence comes next. The order within a voice memo can be edited by manipulating these pointers.
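As a rough illustration of tables (A) and (B), the sketch below models them with Python dataclasses. The field names are invented for this example, and the `first` field (an index to the first sentence in playback order) is an assumption: the text specifies a next pointer per sentence but does not say how the head of the chain is recorded.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class VoiceSentence:
    storage_start: int            # head position of the audio in voice memory
    is_key: bool = False          # importance mark (key sentence)
    next: Optional[int] = None    # index of the next sentence, None at the end


@dataclass
class MemoTable:                  # table (B), one per voice memo
    memo_number: int
    creation_order: int
    storage_info: int             # location of the memo's audio in voice memory
    image_position: Tuple[int, int]   # associated position in the image
    sentences: List[VoiceSentence] = field(default_factory=list)
    first: Optional[int] = None   # assumed: index of the first sentence

    def playback_order(self) -> List[int]:
        """Follow the next pointers from the first sentence."""
        order, i = [], self.first
        while i is not None:
            order.append(i)
            i = self.sentences[i].next
        return order


@dataclass
class ScreenTable:                # table (A), one per screen
    head: Optional[int] = None    # P0: number of the first memo
    next_memo: Dict[int, Optional[int]] = field(default_factory=dict)
    tables: Dict[int, MemoTable] = field(default_factory=dict)  # pointers t_i

    @property
    def count(self) -> int:       # n: number of memos on the screen
        return len(self.tables)
```

Because the order lives in the pointers rather than in the storage layout, reordering sentences only rewrites a few `next` values and never moves audio data in voice memory.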

With the information held in this structure, it is easy to see that voice memos can be edited and that the playback order can be chosen freely, for example input order or screen position order.

Many variations in the structure of the tables are possible, such as moving part of them into the voice memory or the image memory, but it is clear that the specific structure chosen does not restrict the essence of the present invention.

When an instruction to finish attaching the voice memo is entered with pointing device 14 or the like, the editing screen disappears and a mark showing that a voice memo has been attached is displayed at the position on the screen where the memo was attached. The data for this mark is also recorded in image memory 34 as an image segment separate from the original document. As a result, when only the original document is to be displayed, marks such as those for attached voice memos can easily be hidden.

Next, editing within a single voice memo is described: reordering, inserting, and deleting voice sentences.

When the editing function is selected from the menu, the menu changes to a menu of editing functions, and editing screen B, the same as the one used when attaching a voice memo, is displayed at a location that depends on the specified position. The solid-line bars are shown filled in for as many entries as have voice recorded.

To listen to the whole memo first, the user selects the playback function from the menu. The voice sentences are then played back in order, and the bar for the sentence being played blinks.

To delete a particular voice sentence, the user selects the bar for that sentence and the delete function. The voice sentence is deleted and the filled-in bars that follow it each move up by one.

To insert a voice sentence, the user selects the broken-line bar at the desired insertion point and the insert function. The filled-in bars that follow each move down by one, and a single unfilled solid-line bar is displayed at the insertion point. When a voice sentence is spoken, the voice is inserted there and the bar is filled in. To move a voice sentence to a different position, the user selects the filled-in bar for that sentence, the broken-line bar at the destination, and the move function.
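The effect of the delete, insert, and move operations on the row of bars can be sketched with a plain list, where each element stands for one filled-in (recorded) bar and the broken-line bars are the gaps numbered 0 to len(bars). This is an illustrative model only: the function names are invented, and the patent itself realizes the reordering by rewriting next pointers in the correspondence table.

```python
def delete_sentence(bars, index):
    """Remove the sentence at `index`; the bars after it move up by one."""
    return bars[:index] + bars[index + 1:]


def insert_sentence(bars, gap, sentence):
    """Insert at broken-line bar `gap` (0 = before the first sentence)."""
    return bars[:gap] + [sentence] + bars[gap:]


def move_sentence(bars, index, gap):
    """Move the sentence at `index` to broken-line bar `gap`.

    `gap` is numbered on the row as displayed before the move, so it is
    shifted by one when it lies after the sentence being removed.
    """
    sentence = bars[index]
    rest = delete_sentence(bars, index)
    if gap > index:
        gap -= 1
    return insert_sentence(rest, gap, sentence)
```

For example, `move_sentence(["s1", "s2", "s3"], 0, 3)` moves the first sentence to the end. Each function returns a new list, so the previous state remains available if an edit needs to be undone.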

To mark a particular voice sentence as especially important within the memo, the user selects key sentence designation from the menu and points to the bar for the voice sentence that is to be the key sentence.

Next, to make it easier for the person attaching a voice memo to speak, a function that responds to voice input is provided. In FIG. 1, 1 to 2 seconds after voice detection section 26 detects the tail of a voice segment, control processor 11 instructs image control processor 30 to display a response confirming that voice input has been received. On this instruction, image control processor 30 shows a confirmation display on part of the screen for several hundred milliseconds, rising and falling with a time constant of several tens of milliseconds.

The confirmation can be shown, for example, by changing the brightness or size of the mark that indicates the presence of the voice memo. Alternatively, instead of an image display, the voice processing processor may be instructed to output a spoken response such as "Yes" or "And then?". This feature makes use of the finding, known from the research of Ishii et al., that speaking is easier when a response comes 1 to 2 seconds after an utterance (see, for example, "Human Science", Vol. 2, 中白書店; the rest of the citation is missing from the source text).

The results of all these operations are managed by control processor 11. The necessary correspondence tables are updated each time, commands for the memory management that accompanies each update are issued to image control processor 30 and voice processing processor 20, and each processor carries out the required processing. The processing itself can easily be implemented with the techniques used in current multi-window display terminals, so its description is omitted here.

To listen to a voice memo, the user can select voice memo playback from the menu and specify the position. In addition, a mode is provided in which a memo is played back simply by pointing directly at the mark that shows a voice memo is attached.

Next, fast-forward playback of voice is described.

When the fast-forward playback function is selected from the menu, a menu is displayed that offers, in addition to normal fast-forward, fast-speech playback, fast-forward by key sentence, and so on. For the playback order, a menu item for playback in order of voice memo creation is provided in addition to screen position order. In normal fast-forward playback, about 2 seconds from the head of each memo is played, proceeding from the top left of the screen toward the bottom right. During playback, the mark at the corresponding location on the screen blinks so that the location being played can be identified. Alternatively, light halftone shading may be applied to everything except the vicinity of the target location.

In the recording-order playback mode, fast-forward playback or playback of all memos is performed in the order in which the voice memos were created. This function is useful when the order of the author's thoughts at the time the memos were created needs to be reproduced.
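The two playback orders can be sketched as two sort keys over the memo entries. The `Memo` record and its fields are hypothetical, and reading "from the top left toward the bottom right" as row first, then column, is an assumption; the text does not define the ordering more precisely.

```python
from collections import namedtuple

# Hypothetical record: memo number, screen position, and creation order.
Memo = namedtuple("Memo", "number x y creation_order")


def by_screen_position(memos):
    """Top left to bottom right, read as: sort by row (y), then column (x)."""
    return sorted(memos, key=lambda m: (m.y, m.x))


def by_creation_order(memos):
    """The order in which the memos were attached."""
    return sorted(memos, key=lambda m: m.creation_order)
```

Both orders are derived from the same stored entries, which is why recording the creation order in the correspondence table is enough to support the second mode.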

Fast-speech playback raises the speaking rate when a voice memo is played back. If the voice memos are recorded with a parameter coding method such as the well-known PARCOR method, this can be achieved by reading out the parameters at shorter intervals, without the change in voice pitch that accompanies fast-forwarding on a tape recorder.

Fast-forward playback by key sentence plays a fixed length (1 to 2 seconds) from the head of each key sentence in turn, rather than from the head of each memo.
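The four fast-forward modes (head or key sentence, at normal or fast speaking rate) can be summarized as a function that builds a playback plan. This is a sketch under stated assumptions: the mode names, the input format, the 1.5x rate, and the choice to skip memos that have no key sentence in the key-sentence modes are all illustrative and are not specified in the text.

```python
from enum import Enum


class Mode(Enum):
    HEAD = "head"            # excerpt from the head of each memo
    HEAD_FAST = "head_fast"  # same, at a fast speaking rate
    KEY = "key"              # excerpt from the head of each key sentence
    KEY_FAST = "key_fast"    # same, at a fast speaking rate


def fast_forward_plan(memos, mode, excerpt_seconds=2.0, fast_rate=1.5):
    """Build the list of excerpts to play.

    memos: list of (memo_number, [is_key flag for each sentence]).
    Returns (memo_number, sentence_index, seconds, rate) tuples.
    """
    use_key = mode in (Mode.KEY, Mode.KEY_FAST)
    rate = fast_rate if mode in (Mode.HEAD_FAST, Mode.KEY_FAST) else 1.0
    plan = []
    for number, key_flags in memos:
        if use_key:
            starts = [i for i, is_key in enumerate(key_flags) if is_key]
        else:
            starts = [0] if key_flags else []   # skip empty memos
        for i in starts:
            plan.append((number, i, excerpt_seconds, rate))
    return plan
```

Separating the plan from the audio output keeps the mode logic independent of how each excerpt is actually decoded and mixed.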

In these fast-forward modes, switching abruptly to the next memo at fixed intervals sounds unnatural and is very hard to follow. In the present invention, the volume of the preceding playback sound is gradually lowered near the end of its playback and the playback sound of the next memo is laid over it, so that the voice memos can be heard one after another with a natural transition. To realize this, in FIG. 1, the parameter that determines the voice level of the voice data read from voice memory 29 is gradually reduced in voice memory interface section 28, and the data is read out at about twice real-time speed and written into one of the two buffers of a double buffer provided in output control section 22. It is then added, by an adder provided in output control section 22, to the data of the preceding voice memo already written in the other buffer, and the result is sent to the decoding section 25 of the codec, reproduced as voice, and output. With some voice compression methods, addition in the parameter domain is not possible. In that case, needless to say, the order and configuration of output control section 22 and decoding section 25 must be modified to suit the method, for example by decoding to a linear waveform first and then weighting the amplitudes and adding the two signals.
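The fade-and-overlap step can be sketched on decoded linear waveforms, which corresponds to the variant mentioned at the end of the paragraph rather than to addition in the parameter domain. The linear fade shape, the overlap length, and the function name are assumptions made for illustration.

```python
def crossfade_concat(excerpts, overlap):
    """Join sample lists into one list with overlapping transitions.

    The tail of every excerpt except the last is faded out linearly over
    `overlap` samples, and the head of the following excerpt is added on
    top of that fading tail.
    """
    out = []
    for k, excerpt in enumerate(excerpts):
        samples = list(excerpt)              # copy so inputs are not modified
        n = min(overlap, len(samples))
        if k < len(excerpts) - 1:
            for j in range(n):
                # gain falls from just under 1 down to 0 across the tail
                samples[len(samples) - n + j] *= 1.0 - (j + 1) / n
        if k == 0:
            out = samples
        else:
            m = min(overlap, len(out), len(samples))
            start = len(out) - m
            for j in range(m):
                out[start + j] += samples[j]  # mix new head over old tail
            out.extend(samples[m:])
    return out
```

Two excerpts of length 4 with an overlap of 2 produce 6 output samples, since the overlapped region is shared. In this sketch the incoming excerpt starts at full level while the previous one fades out, matching the description of laying the next memo over the fading tail.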

To listen only to the important memos, the user selects important memo selection from the menu, and only the voice memos that contain a voice sentence designated as important are played back in order.

In addition, a mode selectable from the menu is provided that displays only the original document information, with everything related to voice memos removed.

The main steps of the editing process described above are shown in FIG. 4 in PAD notation. For the terms used, see FIG. 3. Just as there are already many ways of implementing procedures driven by a pointing device, this procedure can of course be varied in many ways.

It is clear that the substance of this patent does not depend on the means or steps by which the procedure itself is implemented.

[Effects of the Invention]

According to the present invention, arbitrary voice memos can easily be attached to documents, images (such as medical images), and drawings (such as maps and design drawings) displayed on the screen, can be corrected simply, and can be skimmed efficiently by listening to their outlines, so that information can be handled in a way that is easy for people to work with.

[Brief explanation of the drawing]

FIG. 1 is a block diagram explaining one embodiment of the present invention, FIG. 2 is a diagram explaining the present invention in detail, FIG. 3 is a diagram showing the correspondence between voice memos and memos on the image in the present invention, and FIG. 4 is a diagram showing the flow of voice memo editing in the present invention.

11: control processor, 13: keyboard, 14: pointing device, 20: voice processing processor, 29: voice memory, 30: image control processor, 34: image memory, 35: image buffer, 36: image display.

Claims

[Scope of Claims]

1. A voice memo system for attaching a memo by voice to a displayed predetermined image, characterized by comprising: means for specifying an arbitrary position on the image to which a voice memo is to be attached; means for inputting a voice to be attached to the specified position; means for recording the input voice as a voice memo; means for associating the position on the image with the recorded voice memo; means for editing the recorded voice memo; and means for reproducing the edited voice memo based on designation of a predetermined position on the image.

2. The voice memo system according to claim 1, wherein the means for editing the recorded voice memo includes means for displaying an editing screen.

3. The voice memo system according to claim 2, wherein the editing screen has a plurality of recording marks, each indicating recorded voice in units of voice sentences, and auxiliary marks for editing placed between the recording marks.

4. The voice memo system according to claim 3, wherein the editing screen has voice recording marks whose display state changes as voice is actually input.

5. The voice memo system according to claim 1, wherein the means for editing the recorded voice memo has a function of selecting and designating important ones among the voice sentences constituting the voice memo.

6. The voice memo system according to claim 1, wherein the means for editing the recorded voice memo has a function of rearranging the order of the voice sentences constituting the voice memo.

7. The voice memo system according to claim 1, wherein the means for editing the recorded voice memo has a function of inserting a new voice sentence at any position between the voice sentences constituting the voice memo.

8. The voice memo system according to claim 1, wherein the means for editing the recorded voice memo has a function of erasing any of the voice sentences constituting the voice memo.

9. The voice memo system according to claim 1, wherein the means for editing the recorded voice memo has a function of correcting the display state of the voice recording marks as voice sentences constituting the voice memo are deleted or inserted.

10. The voice memo system according to claim 1, wherein the means for inputting the voice includes means for outputting a confirmation that voice input has occurred 1 to 2 seconds after detecting the tail of a segment of the input voice.

11. The voice memo system according to claim 1, wherein the reproducing means includes fast-forward reproducing means for sequentially reproducing only a range of about 2 seconds from each of the series of voice memos on the display screen.

12. The voice memo system according to claim 11, wherein the reproducing means includes means for adjusting the voice memo to a rate faster than the utterance rate of the input voice and reproducing it.

13. The voice memo system according to claim 11, wherein the reproducing means has means for selecting only the voice sentences designated as important from among the voice sentences constituting the voice memo and performing fast-forward reproduction of them.

14. The voice memo system according to claim 11, wherein the reproducing means has means for gradually reducing the voice intensity of the tail of the voice being reproduced and handing over to reproduction of the next voice memo.