JPH0797277B2

JPH0797277B2 - Computer and communication device for unspecified speakers

Info

Publication number: JPH0797277B2
Application number: JP60231483A
Authority: JP
Inventors: フアンホウキツト
Original assignee: キットファンホウ
Priority date: 1984-10-18
Filing date: 1985-10-18
Publication date: 1995-10-18
Anticipated expiration: 2010-10-18
Also published as: GB2165974A; GB2165974B; HK99093A; CN85107539B; JPS61159699A; GB8525606D0; CN85107539A

Description

TECHNICAL FIELD

The present invention relates to communication with machines, and more particularly to voice communication of commands to electronic computers.

BACKGROUND ART

Human voice communication with electronic computers has attracted a great deal of interest and research. The advantages, applications, and typical equipment of voice communication with computers are described in "Toss Your Keyboards and Just Tell Your Computer What to Do" (A. E. Conrad, Research & Development, January 1984, Vol. 26, No. 1, pp. 86-89).

As pointed out in that article, most research in this field has been directed at speech recognition devices of two kinds: speaker-specific (speaker-dependent) devices and speaker-independent devices. The former can recognize only words spoken by one particular individual; the latter can recognize the same words spoken by different people. A speaker-dependent device can recognize (or learn) more words at one time than a speaker-independent device, and can therefore control more functions. Both kinds of device work by creating templates of spoken words and matching received words against those templates.

A great deal of effort has been devoted to this field, and there are many patents and articles, as evidenced by the survey of the technical literature in the prior-art discussion of U.S. Pat. No. 4,284,846 to Marley. That prior-art patent teaches an apparatus for analyzing and comparing words by comparing certain waveform characteristics with pre-stored ratios.

Other patents disclosing speech recognition devices include U.S. Pat. Nos. 4,286,114 and 4,319,221 to Sahoe; U.S. Pat. No. 4,292,470 to B. H. An; U.S. Pat. Nos. 4,319,085 and 4,336,421 to Welch et al.; U.S. Pat. No. 4,343,969 to Kellett; U.S. Pat. No. 4,349,700 to Pirz; U.S. Pat. No. 4,389,109 to Taniguchi et al.; U.S. Pat. No. 4,388,495 to Hitchock; U.S. Pat. No. 4,384,335 to Duifhuis et al.; and U.S. Pat. No. 4,399,732 to Rothschild et al.

These devices are either extremely complex and expensive or very limited in capability. For example, a typical speaker-independent device may recognize about 10 words (for example, the ten digits), while some speaker-dependent devices recognize an order of magnitude more, 100 to 200 words.

SUMMARY OF THE INVENTION

The present invention differs from the devices disclosed in the patents referenced above in its relatively low complexity. To realize economically a speaker-independent device capable of recognizing a large number of different voice commands, the invention exploits a physical advantage: the human voice can be used over a range far wider than that needed, or normally used, in ordinary conversation. An apparatus in accordance with the teachings of the invention senses and recognizes differences between sounds, which are treated as coded signals for performing predefined functions.

According to a feature of the invention, these differences between sounds are musical intervals between notes within a scale, such as the ordinary diatonic scale. Means are provided for setting the key (tonality) for each user, for example by comparing the user's own middle C (do) with a subsequently voiced D (re). The apparatus of the invention uses and recognizes musical intervals as commands or inputs.

The apparatus has many applications, including input to and control of electronic computers in environments where other input devices are impractical. For example, the invention is useful in a darkroom (for example, an electron microscope room), or where the user's hands must be occupied with other work, such as in a manufacturing process. The apparatus is also convenient for people with disabilities, particularly those who cannot easily operate a conventional computer terminal.

The recognition apparatus of the invention, together with its advantages, can best be understood from the following description with reference to the accompanying drawings, in which like components bear like reference numerals.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The principles of the invention can be applied in various ways in hardware and/or software; several examples are described below.

The apparatus according to the invention is designated as a whole by reference numeral 10, as shown in FIG. 1. Apparatus 10 includes a transducer 12, such as a microphone, which picks up a sound wave such as that shown by waveform 14. When a person sings a note such as C (do), the waveform has a fundamental period T corresponding to the rate of its fundamental frequency (pitch). To simplify the electronics, the electrical analog signal of the received sound waveform 14 is converted by waveform shaping circuit 18 into a pulse train 16 having period T.

An auxiliary electrical signal input 20 may be provided in place of transducer 12. This may be, for example, a telephone input for remote activation, or a sound generator.

Pulse train 16 is received by period measuring circuit 22, where its period is measured and digitized; the result is passed to input/output interface 24, which interfaces with digital computer 26.

In overall operation, apparatus 10 receives a sound signal at microphone 12 or input 20 and, in response to selected ones of the received signals, executes routines pre-programmed in the computer.

To explain the principle on which apparatus 10 operates, an application is described with reference to the ordinary diatonic scale; it will be understood that the invention is equally applicable to other scales. The modern diatonic scale has intervals that do not depend on the frequency of any particular note (although once one frequency is set, the frequencies of the other notes are determined by it), and can be partially represented by the following table.

Intervals of the diatonic scale (arbitrary key)

Note    Ratio to reference
MI′     2.5000
RE′     2.2500
DO′     2.0000
TI      1.8750
LA      1.6667
SO      1.5000
FA      1.3333
MI      1.2500
RE      1.1250
DO      1.0000 (reference note)
TI,     0.93750
LA,     0.83333
SO,     0.75000
FA,     0.66667
MI,     0.62500
RE,     0.56250
DO,     0.50000
TI,,    0.46875

Most people are familiar with these intervals; even a child can readily sing do, re, mi, fa, so, la, ti, do (in doing so, the singer unconsciously picks a reference frequency and relates the other notes to it by the intervals above). The present invention therefore selects musical intervals as the means of receiving information and processes them.
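By way of illustration only, the table can be read as a lookup from a measured frequency ratio to a scale degree. The following Python sketch assumes a hypothetical function name `nearest_note` and a hypothetical 3% tolerance; neither appears in the specification, which does not state how closely a sung note must match.

```python
# Ratios from the table above, relative to the reference note DO = 1.0.
DIATONIC_RATIOS = {
    "DO": 1.0, "RE": 1.125, "MI": 1.25, "FA": 4 / 3,
    "SO": 1.5, "LA": 5 / 3, "TI": 1.875, "DO'": 2.0,
}


def nearest_note(reference_hz, sung_hz, tolerance=0.03):
    """Return the scale degree whose ratio best matches sung/reference.

    `tolerance` is an assumed relative error bound, not a value from the
    patent. Returns None when no table entry is close enough.
    """
    ratio = sung_hz / reference_hz
    name, target = min(DIATONIC_RATIOS.items(),
                       key=lambda item: abs(item[1] - ratio))
    if abs(ratio - target) / target <= tolerance:
        return name
    return None
```

Because only the ratio is used, the same result is obtained whatever absolute frequency the singer chooses for do.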

Thus, referring to the flow chart of FIG. 5, the start command may be the receipt of a sound signal (re) detected in step A. In step B, this signal is measured to determine whether its period persists for a sufficient duration (to prevent false triggering). If a reference signal (for example, do) is present and stored, the two are compared; when an interval of 1.125 is calculated in step C, the fetch/execute subroutine runs in step D, calling and executing the pre-recorded subroutine for the interval 1.125. Finally, the system is reset in step E, ready to receive a second signal (for example, fa) and respond to it in the same general manner.
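Steps B to D of FIG. 5 might be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `process_note`, the minimum duration, the tolerance, and the use of a dictionary keyed by interval are all hypothetical and are not taken from the specification or its listings.

```python
def process_note(period_s, duration_s, reference_period_s, subroutines,
                 min_duration_s=0.2, tolerance=0.03):
    """Dispatch one detected note against a stored reference.

    period_s            measured fundamental period of the note
    duration_s          how long the note persisted
    reference_period_s  stored period of the reference note (e.g. do)
    subroutines         {interval ratio: callable}, an assumed structure
    """
    # Step B: ignore sounds that are too short (false-trigger guard).
    if duration_s < min_duration_s:
        return None
    # Step C: the interval is a frequency ratio, i.e. an inverse period ratio.
    interval = reference_period_s / period_s
    # Step D: fetch and execute the routine registered for this interval.
    for target, routine in subroutines.items():
        if abs(interval - target) / target <= tolerance:
            return routine()
    return None
```

For example, with do at 200 Hz as the reference, a sufficiently long re at 225 Hz yields the interval 1.125 and runs the routine registered under that ratio.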

To keep the system from becoming speaker-specific, and to make it usable by anyone with minimal musical ability, the reference signal is set by the same process. At start-up the user need only sing do briefly, followed by re, into the microphone. The system uses the first received sound signal to set the reference and treats the second and subsequent signals as command signals.

FIG. 2 shows a preferred embodiment of waveform shaping circuit 18 and its interconnection to microphone 12 and auxiliary input 20. The figure gives specific electrical values for the components used, but, as those skilled in the art will appreciate, many other values could of course be used. The component values and connections of FIG. 2 did, however, work well in a prototype.

Specifically, microphone 12 is connected between chassis ground and a reactive impedance 27 leading to circuit node 28. Node 28 is connected to chassis ground through resistor 32 and to auxiliary input 20 through capacitor 34.

When a signal is received at input 12 or 20, it is passed to low-frequency amplifier 36, which includes operational amplifier 38. The negative input (pin 6) of amplifier 38 is connected to node 28; the positive input (pin 5) is connected through resistor 40 to a bias voltage (+5 V) and through the parallel combination of capacitor 42 and resistor 44 to chassis ground. Pin 4 of operational amplifier 38 is connected through current-limiting resistor 45 to the positive bias supply (12 V) and through capacitor 46 to chassis ground. Part of the output of operational amplifier 38 is fed back to the negative input through the parallel combination of resistor 47 and capacitor 48.

The main output of low-frequency amplifier 36 is fed to low-frequency filter 50, which includes resistor 51. One end of resistor 51 is connected to the output of amplifier 36; the other end is connected to ground through capacitor 52 and to circuit node 54 through resistor 53. The output of filter 50 appears at node 54 and is applied both to comparator 55 and to maximum-voltage follower circuit 56, which has a decay. The signal is (a) applied through current-isolating diode 57 to the main positive signal input (pin 12) of operational amplifier 58 of comparator 55, that input being grounded through resistor 59, and (b) applied to maximum-voltage follower circuit 56. The output of circuit 56 forms the main negative signal input (pin 13) of comparator 55.

Circuit 56 preferably includes operational amplifier 60. The main positive input (pin 3) of amplifier 60 is connected to node 54; its output (pin 1) is returned directly to its negative input (pin 2) and is also coupled through isolating diode 61 to the negative signal input of amplifier 58. The output of amplifier 60 is grounded through resistor 63 and capacitor 64. The discharge time constant of the parallel-connected resistor 63 and capacitor 64 is chosen so that, except at the peak of each cycle of the signal appearing at node 54, the inverting input of comparator 55 always outweighs the non-inverting input.
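The effect of the decaying maximum-voltage follower and comparator, one output pulse per cycle at the signal peak, can be imitated in software. The Python sketch below is only a discrete-time analogy of the analog circuit, not a model of the component values in FIG. 2; the function name `comparator_pulses` and the per-sample decay factor are assumptions.

```python
import math


def comparator_pulses(samples, decay=0.999):
    """Return the sample indices at which the comparator output goes high.

    A held peak value decays by `decay` per sample (standing in for the
    discharge of resistor 63 and capacitor 64). The output is high only
    while the input exceeds the held value, i.e. near each cycle's peak.
    """
    held = 0.0
    previous_high = False
    edges = []
    for index, value in enumerate(samples):
        high = value > held
        if high:
            held = value      # the follower tracks a new maximum
        else:
            held *= decay     # otherwise the held peak slowly decays
        if high and not previous_high:
            edges.append(index)
        previous_high = high
    return edges


# A pure tone with a period of 100 samples, ten cycles long.
tone = [math.sin(2 * math.pi * n / 100) for n in range(1000)]
edges = comparator_pulses(tone)
```

After the first cycle, successive rising edges in `edges` are spaced by the tone's period in samples, which is the quantity the period measuring circuit then counts.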

Waveform 16, the output of comparator 55, is taken from the junction of a pair of resistors 65 and 66 connected in series from the output (pin 14) of operational amplifier 58 to ground. This output is delivered to input ST of period measuring circuit 22.

FIG. 3 shows a preferred embodiment of period measuring circuit 22 and its interconnection to input/output interface circuit 24 and computer 26. The particular computer 26 is preferably an Apple II+, and the specific interconnections to that computer are shown here.

Pulse train 16 is coupled to shift register 72 through operational amplifier 70, which acts as a Schmitt trigger. The output of register 72 is coupled to counter 78 through inverter 74 and period measurement gate 75. Counter 78 is connected to computer 26 through buffer 80.

The outputs of computer 26 are taken from the R/W, A0, A1 and other outputs, the bias source (5 V), and the timing pulses, and are supplied as shown in FIG. 3. Gates 81 and 82 serve to send a reset command over signal line 83.

The functions of the circuits of FIGS. 2 and 3 and of computer 26 become clearer with reference to FIG. 4. FIG. 4 shows the relationship among waveforms 14 and 16, the input at CP of shift register 72, the timing pulses sent from computer 26 to period measurement gate 75, and the outputs of shift register 72 corresponding to the beginning and end of the measured period T.

The operation is as follows. The circuits of FIGS. 2 and 3 first shape input signal 14 into pulse train 16, so as to obtain one pulse per cycle at input point ST (FIGS. 2 and 3) of Schmitt trigger 70. Computer 26 resets shift register 72 and the counter (over signal line 83, through inverter 86) and sets the circuit up for the next period measurement.

When the first rising edge of shaped pulse train 16 appears at CP of shift register 72, output Q0 changes from a low (voltage) level to a high level, and counter 78 begins measuring the current period of the input wave train.

When the second rising edge of shaped pulse train 16 arrives at CP of shift register 72, output Q1 goes from low to high and counter 78 stops counting. At the same time, the most significant bit input of high-byte buffer 90 of buffer 80 is set from low to high, indicating that the period measurement is complete.
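The start/stop behaviour of the counter between two successive rising edges can be mimicked in software. In the sketch below the pulse train is assumed to be sampled once per clock tick as a list of 0/1 values; the function name `measure_period` is hypothetical, and the clock rate of the actual circuit is not stated here.

```python
def measure_period(pulses):
    """Count clock ticks between the first two rising edges.

    `pulses` is a sequence of 0/1 samples, one per clock tick. Returns
    None if fewer than two rising edges occur, mirroring the case where
    the measurement has not yet completed.
    """
    start = None
    previous = 0
    for tick, level in enumerate(pulses):
        if level and not previous:       # rising edge
            if start is None:
                start = tick             # first edge: counting begins
            else:
                return tick - start      # second edge: counting stops
        previous = level
    return None
```

The returned tick count plays the role of the 15-bit counter value that the computer later reads through the buffers.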

Computer 26 reads high-byte buffer 90. If the most significant bit is high, that bit is disregarded and computer 26 takes the lower 7 bits as bits Q8 to Q14 of a 15-bit binary number. (If the most significant bit is low, the period measurement is not yet complete.) Computer 26 also reads low-byte buffer 92 and takes its reading as the lower 8 bits, Q0 to Q7, of the 15-bit binary number. Computer 26 treats the magnitude of this 15-bit binary number as the magnitude of the period just measured.
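The byte handling just described can be expressed compactly. The following Python sketch assumes the two buffer readings are available as integers in the range 0 to 255; the function name `decode_period` is hypothetical.

```python
def decode_period(high_byte, low_byte):
    """Assemble the 15-bit period count from the two buffer readings.

    Bit 7 of the high byte is the "measurement complete" flag. When it
    is clear the reading is not yet valid and None is returned.
    """
    if not high_byte & 0x80:
        return None
    # Lower 7 bits of the high byte are Q8..Q14; the low byte is Q0..Q7.
    return ((high_byte & 0x7F) << 8) | low_byte
```

The largest representable count is 32767, which bounds the longest period the counter can report.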

The operation of apparatus 10 can be understood more fully from the flow chart of FIG. 6. In the figure, start sequence A proceeds to the measurement-circuit reset (gates 81 and 82 and associated leads); then at B1 it is determined whether a periodic signal has been received. If not, the system waits for the signal. If so, a reading is taken at B3, and at B4 it is determined whether the J-th pulse of the same pulse period (J may be any number, for example 30) is available for the average calculation in step C.

The flow chart of FIG. 6 is for single-interval messages. It assumes separate reference and signal wave-train inputs, with a break always present between successive wave trains. Block I is the step that detects these breaks.

If the output of block I is the first sound signal detected, that signal is treated as the reference signal and stored. When second and subsequent signals are identified, they are compared with the reference signal and the interval is calculated. The calculated interval is checked for a match; when a match is found, the associated subroutine is executed and the program is reset.
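The single-interval logic of FIG. 6 (average the readings of a wave train, store the first train as the reference, match later trains by interval) might be sketched as follows. The class name `IntervalDecoder`, the tolerance, and the command table are assumptions for illustration and do not reproduce the BASIC listing referred to in the text.

```python
from statistics import mean


class IntervalDecoder:
    """First wave train sets the reference; later ones are commands."""

    def __init__(self, commands, tolerance=0.03):
        self.commands = commands    # {interval ratio: command name}
        self.tolerance = tolerance  # assumed relative error bound
        self.reference = None       # average period of the first train

    def feed(self, readings):
        """Process the period readings of one wave train."""
        average = mean(readings)
        if self.reference is None:
            self.reference = average   # first train: store as reference
            return None
        interval = self.reference / average
        for target, name in self.commands.items():
            if abs(interval - target) / target <= self.tolerance:
                return name
        return None
```

Setting `reference` back to `None` after each command would correspond to taking a fresh reference for every command, the variation the text describes for line 93 of the listing.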

In a practical example, the computer terminal displayed the following menu.

I AM AT YOUR SERVICE. 'SING' YOUR CHOICE
(DO DO) FOR (LIST PROGRAMME IN MEMORY)
(DO RE) FOR (DISPLAY PATTERN 'HO')
(DO ME) FOR (TEXT MODE DISPLAY)
(DO FA) FOR (FLASH MODE DISPLAY)
(DO SO) FOR (PLAY RUNNING TONES)
(DO LA) FOR (ACTIVATE EXTERNAL DRIVE TO CATALOG PROGRAMMES ON DISK)
(DO TE) FOR (DISPLAY 'TE')
(DO DO′) FOR (ACTIVATE EXTERNAL DRIVE TO SAVE THIS PROGRAMME ON DISK, AND EXECUTE ANOTHER PROGRAMME ON DISK, AND RETURN)

To activate the computer, the user need only voice the desired command. A suitable listing for the program of FIG. 6 is shown below.

Incidentally, if line 93 of the above listing is modified as follows, apparatus 10 operates with a new reference signal for each command.

93 W=0

This allows the same speaker, or different speakers, to change the reference signal freely from one command to the next.

FIG. 7 shows another flow chart 100 for the apparatus of the invention. It is a program for N-interval messages, that is, multi-note coding. For example, do-re-mi and do-re-fa are here different signals.
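Multi-note coding can be pictured as mapping a sequence of intervals to a command. The sketch below assumes that every interval is measured against the first (reference) note of the message, which the text does not state explicitly; the function name `decode_message`, the tolerance, and the codebook layout are likewise hypothetical.

```python
def decode_message(periods, codebook, ratios, tolerance=0.03):
    """Decode an N-interval message from a list of average periods.

    periods   average period of each wave train; periods[0] is the reference
    codebook  {tuple of note names: command}, an assumed structure
    ratios    {note name: interval ratio relative to the reference}
    """
    reference = periods[0]
    notes = []
    for period in periods[1:]:
        interval = reference / period
        match = next((name for name, ratio in ratios.items()
                      if abs(interval - ratio) / ratio <= tolerance), None)
        if match is None:
            return None          # unrecognized interval: reject the message
        notes.append(match)
    return codebook.get(tuple(notes))
```

With such a codebook, do-re-mi and do-re-fa resolve to different entries even though they share their first two notes.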

A listing suitable for running this program is shown below.

FIG. 8 shows yet another flow chart, for a voice program. By inputting a series of sounds, the speaker effectively defines a voice program (for example, a procedure) containing a desired sequence of voice commands to be executed later.

A listing suitable for running this program in accordance with the flow chart of FIG. 8 is shown below.

FIG. 9 shows another subroutine, II, which can be substituted for block I in the flow charts of FIGS. 6 to 8. Substituting it for block I allows slurred input, that is, slurred wave trains, to be processed. This is advantageous for the speaker, since slurred (connected) sounds are easier to produce than separate ones.

A listing suitable for implementing the program of block II is shown below.

2420 PRINT CHR$(7):REM BEEP FOR NEXT WAVE TRAIN
2460 FOR PS=1 TO 500:NEXT:REM PAUSE

This can be substituted, for example, for lines 2420 to 2485 of the program listing for the flow chart of FIG. 6. Note that the pause in FIG. 9 should be long enough to prevent the current wave train (flow charts of FIGS. 6 to 8) from being mistaken for a subsequent reading, for example the next wave train, and that the user should take care not to keep producing such a wave train beyond the pause.

FIG. 10 shows another alternative subroutine, III. Block III modifies apparatus 10 so that the user can in effect extend his or her frequency range by producing and holding a wave-train output longer than is normally required, that is, beyond the pause time. When the apparatus detects a wave train that continues beyond a predetermined duration, it modifies the data by a conversion factor before message identification (that is, before interpretation) to obtain an effective interval.

The symbol m shown in block III of FIG. 10 is the conversion factor, which may take any value. For a human speaker, two particular values, 0.5 and 2, are especially useful. With m = 0.5 the listening part of the apparatus transposes up an octave; with m = 2 it shifts down an octave. (For example, do-re is the interval 1.125; if m = 0.5 and re is held, the interval measured is 2.25, that is, DO-RE′.) The transposition is repeated if the speaker or singer continues the wave train still longer. The speaker's frequency range is thus effectively extended. Moreover, voice input lets the user operate within a comfortable frequency range while still producing as many intervals as if the speaker had a very wide range. Consequently, many more distinct word signals can be obtained than before, using only a few notes.
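The arithmetic of the conversion factor can be checked with a short sketch. It assumes, consistently with the halving of the averaged period in the block III listing, that m multiplies the measured period once for each additional hold; the function name `effective_interval` and the parameter `extra_holds` are hypothetical.

```python
def effective_interval(reference_period, average_period, extra_holds, m=0.5):
    """Interval after applying conversion factor m once per extra hold.

    The measured period is multiplied by m for every time the wave train
    is found to be still sounding after the pause, so m = 0.5 raises the
    effective note by one octave per hold and m = 2 lowers it.
    """
    adjusted_period = average_period * (m ** extra_holds)
    return reference_period / adjusted_period
```

With do as reference and re held through one extra pause at m = 0.5, the interval 1.125 becomes 2.25, matching the RE′ entry of the interval table.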

A listing suitable for the program of block III of FIG. 10 is shown below.

2420 PRINT CHR$(7):REM BEEP FOR NEXT WAVE TRAIN
2464 FOR PS=1 TO 1000:NEXT:REM PAUSE
2466 A=PEEK(49348):REM RESET CIRCUIT AND TEST FOR SILENCE
2470 FOR PS=1 TO 100:NEXT:REM BRIEF PAUSE
2472 REM "WAVE TRAIN STILL DETECTED?"
2475 L=PEEK(49345):H=PEEK(49346)
2478 IF L=0 THEN GO TO 2495:REM NO WAVE DETECTED
2480 AVE(W)=AVE(W)/2:REM MULTIPLYING FACTOR=1/2
2485 GO TO 2420

It is clear from the foregoing that the present invention teaches a new apparatus for communicating with electronic computers. The apparatus uses the concept of communication by interval codes, which substantially increases both the number of voice commands and the ease of voice recognition.

The apparatus can be realized economically and effectively, and its operation is easily learned and put to good use. It provides a large number of recognizable words (intervals) without being specific to any user, and is economical to manufacture and use.

Although several embodiments of the apparatus of the invention have been shown and described, it will be apparent to those skilled in the art that changes and modifications to the apparatus 10, 100 described herein, or to variations thereof, can be made without departing from the teachings of the invention. The scope of the invention is therefore limited only by the matters set forth in the claims.

In summary, the present invention relates to an apparatus for communicating with a machine that uses musical intervals as codes for conveying preselected commands or inputs to an electronic computer. The apparatus can use the voice of a person producing notes (do-re, etc.) to establish intervals in succession, converts the received notes or sounds into a pulse train of the same fundamental period, and can use electrical circuits and programs to store and calculate periods and intervals. When a specific interval is received, the apparatus, which includes a digital microcomputer such as the Apple II+ (TM), executes a pre-stored subroutine and can then reset the circuit to receive a second interval.

[Brief description of drawings]

FIG. 1 is a block diagram of a voice recognition apparatus constructed in accordance with the teachings of the invention, showing the waveforms at various points. FIG. 2 is a circuit diagram of the waveform shaping circuit of the apparatus of FIG. 1. FIG. 3 is a circuit diagram of the period measuring circuit and other parts of the apparatus of FIG. 1. FIG. 4 is a waveform diagram showing a set of waveforms useful in understanding the operation of the circuits of FIGS. 2 and 3. FIG. 5 is a flow chart illustrating the overall operation of the apparatus shown in FIGS. 1 to 3. FIGS. 6 to 8 are flow charts showing the operation of other embodiments of the apparatus. FIGS. 9 and 10 are subroutine flow charts that can be substituted for parts of the flow charts of FIG. 6, FIG. 7, or FIG. 8.

Explanation of reference numerals for main parts
12: transducer
18: waveform shaping circuit
20: auxiliary input terminal
22: period measuring circuit
24: input/output interface
26: electronic computer
36: low-frequency amplifier
50: low-frequency filter
55: comparator
72: shift register
75: period measurement gate
78: 15-bit counter
80: 8-bit buffer

Claims

1. An apparatus for communication between an electronic computer and an unspecified speaker, for communicating by means of the voice of an unspecified speaker with an electronic computer in which a series of subroutines corresponding to various intervals is stored in advance, the apparatus comprising: conversion means for converting an input sound having a repetition period into a signal representing that repetition period; interval calculation means, coupled to the conversion means, for calculating the interval between a received sound and a reference sound signal; and selection and execution means, responsive to the conversion means and the interval calculation means, for selecting and running the pre-stored subroutine corresponding to the specific interval calculated by the interval calculation means.

2. The apparatus according to claim 1, wherein the apparatus is capable of responding to a reference voice signal that is set by a voice input to the conversion means in sequence with the voices of the commands.

3. The apparatus according to claim 2, wherein the apparatus is capable of responding to musical tones within a musical scale.

4. The apparatus according to claim 3, wherein the scale is a modern scale having pitches including 1.000, 1.250, 1.500 and 2.000, within a permissible error range.

5. A voice-activated computer control device comprising: conversion means for converting a voice into a pulse train having the repetition period of the voice; comparison means, coupled to the conversion means, for measuring the period of the pulse train and comparing it with a reference period; pitch calculation means for calculating the pitch between the period of the pulse train and the reference period; and means, responsive to all of the above means, for causing an electronic computer to execute various routines in response to selected, calculated specific pitch values.

6. A human voice recognition device comprising means for responding to a command given as a voice corresponding to a specific pitch, wherein, in order to recognize a plurality of specific, different commands, the device responds to the same number of different pitches.

7. The device according to claim 6, wherein, when one of the voices is repeated continuously over a preselected time, the device operates to expand or contract the detected pitch so as to generate a different pitch.

8. An apparatus for recognizing the voice of an unspecified speaker, comprising: a waveform shaping circuit that responds to a person's voice input and generates a pulse train having the same repetition period as the voice input; and a period measuring and comparing circuit that receives and measures the pulse train output of the waveform shaping circuit, the period measuring and comparing circuit comprising means for comparing the period of the pulse train with a reference period and means for calculating its pitch; the apparatus further comprising means for responding only to specific pitches and not to other pitches.

9. A method of communicating by human voice with an electronic computer through an interface circuit, the method comprising the steps of: providing a command as a voice to the interface circuit; calculating the pitch of the received voice with respect to a reference value of the voice; and selecting and executing, in the computer, a pre-recorded subroutine protocol corresponding to the calculated specific pitch.

10. A method according to claim 9, wherein the voice is a musical tone in a musical scale.

11. The method according to claim 10, wherein the scale is a modern scale having pitches including 1.000, 1.250, 1.500 and 2.000, within a permissible error range.

12. The method according to claim 11, comprising the steps of: first converting the reference voice signal into a signal indicating the repetition period of the reference voice signal; then converting the voices of the commands, in order, into signals indicating their repetition periods; and then calculating the pitch.

13. The method according to claim 9, comprising the steps of: detecting when a voice is repeated over a continuous period; and expanding or contracting the detected repeated pitch to generate a different pitch.

14. The method according to claim 9, comprising the step of converting each voice into a pulse train having a specific period corresponding to that voice.

15. The method according to claim 12, wherein the converting steps comprise converting the reference voice signal into a pulse train having a specific period corresponding to the voice of the reference voice signal, and converting each voice into a pulse train corresponding to that voice, followed by the step of calculating the pitch.
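
As an illustration only, the method of claims 9 to 11 can be sketched in Python as follows. The sketch assumes that the pitch is the ratio of the reference period to the command period (a frequency ratio), that the permissible error range is 3 percent, and that subroutines are held in a dictionary keyed by pitch. None of these choices is stated in the claims, and the names `pitch_ratio`, `match_ratio`, `dispatch` and `TOLERANCE` are hypothetical.

```python
from typing import Callable, Dict, Optional

# Pitch values named in claims 4 and 11.
SCALE_RATIOS = (1.000, 1.250, 1.500, 2.000)

# Assumed value: the claims say only "within a permissible error range".
TOLERANCE = 0.03


def pitch_ratio(reference_period: float, command_period: float) -> float:
    """Pitch of the command voice relative to the reference voice.

    Assumption: frequency is the reciprocal of the period, so a shorter
    command period gives a ratio above 1.
    """
    if reference_period <= 0 or command_period <= 0:
        raise ValueError("periods must be positive")
    return reference_period / command_period


def match_ratio(ratio: float, tolerance: float = TOLERANCE) -> Optional[float]:
    """Return the nearest scale pitch, or None if it is out of tolerance."""
    nearest = min(SCALE_RATIOS, key=lambda r: abs(r - ratio))
    if abs(nearest - ratio) <= tolerance * nearest:
        return nearest
    return None


def dispatch(
    reference_period: float,
    command_period: float,
    subroutines: Dict[float, Callable[[], object]],
) -> Optional[object]:
    """Run the pre-stored subroutine for the calculated pitch, if any."""
    matched = match_ratio(pitch_ratio(reference_period, command_period))
    if matched is None:
        return None  # respond only to specific pitches
    routine = subroutines.get(matched)
    return routine() if routine is not None else None
```

With a reference period of 8.0 ms, for example, a command period of 6.4 ms gives a ratio of about 1.25 and selects the subroutine registered under 1.250, while a command period of 7.0 ms matches no pitch within the assumed tolerance and is ignored.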