TW446934B - Speech recognition and control system and telephone

Info

Publication number: TW446934B
Application number: TW88117666A
Authority: TW
Inventors: Hans Hermansson; Stefan Bruhn; Hans-Guenter Hirsch; Stefan Dobler
Original assignee: Ericsson Telefon Ab L M
Priority date: 1998-10-13
Filing date: 1999-10-13
Publication date: 2001-07-21
Also published as: WO2000022609A1; AU1424400A

Abstract

A speech recognition and control system suitable for mobile telephones has, for each of four or five major languages, a preprogrammed store containing many variations of a set of telephone operating commands. The user can manually select one of these four or five major languages, and the selected preprogrammed language store will be consulted when the user utters a word into the telephone. A match (recognition) prompts execution of the desired telephone function. The user can replace each of the preprogrammed commands with his own user-chosen and -spoken commands to create his own set of commands specific to his own native language/dialect and/or pronunciation. The user can also add additional user-dependent commands and a personal user-defined telephone directory.

Description

Technical Field

The present invention relates to a speech recognition and control system, and in particular to a telephone that the user can control by spoken commands or by a combination of spoken and manual commands.

Background of the Invention

A telephone device that is to be controlled by spoken commands must have a system for recognizing speech received by a microphone. There are two categories of speech recognition system: speaker-independent systems and speaker-dependent systems.

A speaker-dependent speech recognition system is adapted to the individual user of the system, for example the owner of a mobile telephone that is to be operated in response to spoken commands. Such a system recognizes command words as spoken by the individual user, who may also use his own language, or a language he has made up, to control the different functions of a device such as a mobile telephone. To achieve this, the user must train the recognition system in a long and cumbersome programming procedure in which each command must be repeated several times by the user. This procedure must be completed before the device can be used. Such a system does not allow anyone else to use it without going through the same cumbersome initial procedure.

A speaker-independent system recognizes the words in its vocabulary regardless of the sex, age and accent of the speaker. Many different speakers must be sampled in order to give the system a broad range of the particular pronunciations it is to recognize. The advantage of a speaker-independent speech recognition system is that it can be used immediately, without any initial "training" of the system to recognize the words in the vocabulary. Ideally, a speaker-independent speech recognition system should be broad enough to recognize all possible pronunciations and accents, and should have an individual language mode for every language in the world. Such an ideal system is difficult to implement in practice, even when the attempt is limited to the languages used in Europe.

The broader the recognition base and the more languages it includes, the more complicated and expensive a speaker-independent system is to build. Incorporating further functions and modifications into the speech recognition and control system involves still more work.

Description of Related Art

Many different speech recognition systems of the above types have been developed. One such system is disclosed in WO 96/13827 (PCT/GB95/02563) to Ringland et al. This known system is based on the recognition of individual speech sounds (sub-words), which are then combined to form the commands for the different functions of the controlled device. Instead of recognizing whole words, of which there could in theory be an unlimited number, the known system recognizes the individual sounds that make up words, of which there is a limited number. After correctly recognizing a sound, a processor combines it with other correctly recognized adjacent sounds to build up a word or phrase. The system is economical in terms of memory capacity. When a particular sound is recognized during use, by comparison with the predefined stored standard sounds, the actual user's pronunciation of that sound, with his particular accent, is stored in a separate memory, which improves later recognition of the sound when it is spoken by that particular speaker.

The known system does not permit the user to substitute another language, for example a Swedish user who wishes to say "sla" instead of "dial". Nor does it permit deviation from the standard pronunciation of a particular sound.

Summary of the Invention

The speech recognition and control system of the present invention, and a telephone incorporating the system, overcome these disadvantages. The speech recognition and control system included in the telephone of the invention is preprogrammed to recognize at least one speaker-independent set of audible commands.

Several different sets of basic speaker-independent audible commands, one set for each of several major languages such as English, French, German, Spanish and Japanese, can be produced by the manufacturer of the system and telephone and preprogrammed into the system. When the system is incorporated in a mobile telephone, the appropriate language can be selected manually through a language-selection function in the menu. The user can then speak to the system any one of the limited number of preprogrammed audible commands in the selected major language. This requires no initialization or programming of the system by the user. For each word in the preprogrammed set of commands, a broad range of different standard pronunciations is recognized in the manually selected major language.

While using the system, the user can add user-dependent commands, for example relating to entries in a personal telephone directory. The system of the invention is configured so that each preprogrammed command can be replaced by a user-specific utterance. This may be the user's dialect pronunciation of a command in the selected preprogrammed set, or it may be the corresponding command in the user's native language when that language is not one of the major preprogrammed languages. A Swedish user can replace the English command "dial" with the Swedish command "sla", so the system can be adapted to any language. The replacement feature also lets the user replace standard commands with any user-chosen code words or invented commands.

The system of the invention also has a user-activated function that returns the system, temporarily or permanently, to the original preprogrammed set of recognizable commands.

The speech recognition and command system of the invention therefore offers the advantage of a speaker-independent system, which is ready for use at once, together with the versatility and customization of a user-specific, user-defined system, without the drawbacks of either.

Description of the Drawing

The accompanying drawing schematically illustrates an embodiment of the speech recognition and command system incorporated in a mobile telephone.
Detailed Description

The drawing shows a block diagram of the speech recognition and control system of the invention incorporated in a telephone. A speech recognition unit 2 and a speech control unit 5 are incorporated in the telephone. Preprogrammed command memories 3 for several languages, such as English, German, Spanish and Japanese, are coupled to the speech recognition unit. In each language memory, a set of command words is stored in different standard variants covering a broad range of pronunciations, intonations and so on for each command to be recognized.

The user first selects the language he wishes to use initially. This can be done by manually selecting the desired language from a list in the telephone display window. The arrow in this example shows that French has been selected as the main language for operating the telephone by voice command. The user speaks commands in standard French into the telephone transmitter/microphone. If the user speaks the command "compose" (dial), the speech recognition unit checks the commands stored for the manually selected language, French, to see whether the audio signal produced by the spoken "compose" matches any of the command variants stored in the French command memory. If there is a match, the speech control unit sends an "execute" signal to the operating unit 6 of the telephone to carry out the "dial" operation.

A user-programmed command memory 4 is connected in parallel with the preprogrammed major-language command memories 3.
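The lookup performed by the speech recognition unit 2 and the speech control unit 5 can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: the names `LANGUAGE_STORES`, `recognize` and `control` are hypothetical, the stored variants other than "compose" are invented, and plain strings stand in for the acoustic patterns that a real recognizer would compare.

```python
# Hypothetical stand-in for the preprogrammed command memories 3.
# Each language maps a telephone function to its stored variants.
# Assumption: strings replace the acoustic templates a real system would hold.
LANGUAGE_STORES = {
    "english": {"dial": {"dial", "call"}, "hang_up": {"hang up"}},
    "french": {"dial": {"compose", "composer"}, "hang_up": {"raccroche"}},
}


def recognize(utterance, language):
    """Sketch of recognition unit 2: return the matching function, or None."""
    heard = utterance.strip().lower()
    for function, variants in LANGUAGE_STORES[language].items():
        if heard in variants:
            return function
    return None


def control(utterance, language):
    """Sketch of control unit 5: emit an execute signal only on a match."""
    function = recognize(utterance, language)
    if function is None:
        return "no match"
    return "execute:" + function


print(control("compose", "french"))  # execute:dial
print(control("dial", "french"))     # no match
```

Only the store for the manually selected language is consulted, which is why the English word "dial" is not matched while French is selected.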
The user can enter utterances into the user-programmed memory to replace commands in the selected major-language command memory. With the telephone in "replace" mode, the user speaks the command "compose" in the initial operating language and then speaks the word for "dial" in his own Swedish, "sla". In this way he can replace all the commands of the standard preprogrammed set with commands in his own language, dialect or pronunciation. If desired, a code word can be entered instead.

The system also has an override function that disregards the commands entered in the user-programmed command memory and uses the commands of the selected major language. This allows another user to use the speech recognition and control functions of the telephone.

The user can also enter additional commands, recognizable only for him, to build up a personal telephone directory. Each stored number can be coupled to a name spoken by the user, and this name can follow the command "dial/compose/sla" to dial the desired person. The numbers in the personal directory can be entered manually, automatically or by speech recognition.

It is also contemplated that the user can add further commands to the user-programmed command memory. For example, a pause (idle) function could be activated and controlled by a user-defined command. A simple example is speaking a chosen user-defined command to display the remaining battery power. The current time in any time zone could likewise be displayed in response to a user-defined command.
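The interplay between the preprogrammed memory 3, the user-programmed memory 4, the override function and the personal directory can be sketched as below. This is an illustrative model under stated assumptions, not the patent's design: the class `VoiceCommandPhone` and its method names are hypothetical, utterances are plain strings, the telephone number is a placeholder, and the choice that a replaced standard command stops being recognized until override is switched on is an assumption, since the text does not say how the two memories are prioritized.

```python
class VoiceCommandPhone:
    """Toy model of memories 3 and 4 with an override switch (assumed behaviour)."""

    def __init__(self, language_store):
        self.language_store = dict(language_store)  # memory 3: utterance -> function
        self.user_store = {}                        # memory 4: utterance -> function
        self.replaced = set()                       # functions the user has re-voiced
        self.directory = {}                         # spoken name -> stored number
        self.override = False                       # True: ignore memory 4

    def replace(self, standard_utterance, user_utterance):
        """'Replace' mode: speak the standard command, then the user's own word."""
        function = self.language_store[standard_utterance]
        self.user_store[user_utterance] = function
        self.replaced.add(function)

    def add_command(self, user_utterance, function):
        """Add an extra user-defined command, e.g. a battery display."""
        self.user_store[user_utterance] = function

    def add_contact(self, name, number):
        self.directory[name] = number

    def _resolve(self, word):
        if not self.override and word in self.user_store:
            return self.user_store[word]
        function = self.language_store.get(word)
        # Assumption: a replaced standard command is inactive unless overridden.
        if function is not None and not self.override and function in self.replaced:
            return None
        return function

    def handle(self, utterance):
        """Return (function, number) for a recognized utterance, else None."""
        words = utterance.lower().split()
        if not words:
            return None
        function = self._resolve(words[0])
        if function is None:
            return None
        if function == "dial" and len(words) > 1:
            # The spoken name follows the dial command.
            return (function, self.directory.get(" ".join(words[1:])))
        return (function, None)


phone = VoiceCommandPhone({"compose": "dial"})
phone.replace("compose", "sla")
phone.add_contact("anna", "555-0100")
print(phone.handle("sla anna"))      # ('dial', '555-0100')
phone.override = True                # a second user borrows the phone
print(phone.handle("compose anna"))  # ('dial', '555-0100')
```

Keeping the user entries in a separate store rather than overwriting memory 3 is what makes the override and the return to the original preprogrammed set cheap: the factory commands are never modified.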

Claims

1. A speech recognition and control system, comprising: a microphone for receiving audible commands spoken by a user and generating electrical audio signals; and a processor, including means for recognizing the type of electrical audio signal received and means for generating command signals in response, the processor being coupled to the microphone to receive the generated electrical audio signals, wherein the processor is preprogrammed to recognize at least one speaker-independent set of audible commands, and any or all of the audible commands can be replaced by user-specific, user-selected and user-spoken audible commands.

2. The speech recognition and control system of claim 1, wherein the processor is preprogrammed to recognize a plurality of speaker-independent sets of audible commands in different languages.

3. The speech recognition and control system of claim 1, wherein the processor is configured to incorporate additional user-specific audible commands entered by the user.

4. A telephone comprising a speech recognition and control system according to claim 1, 2 or 3, wherein the microphone is the telephone transmitter and the processor generates command signals to operate the telephone.

5. The telephone of claim 4, wherein one of the preprogrammed speaker-independent language sets of audible commands can be put into use by manually selecting the desired language mode on the telephone.

6. The telephone of claim 4, wherein the telephone is a mobile telephone.

7. The telephone of claim 4, wherein the user-specific audible commands include a list of names with associated telephone numbers.