WO2007066433A1 - Speech recognition device - Google Patents
Speech recognition device
- Publication number
- WO2007066433A1 (international application PCT/JP2006/316257)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- voice recognition
- external
- vocabulary
- document
- data
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
- G10L2015/228—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context
Definitions
- The present invention relates to a speech recognition device provided with a plurality of speech recognition dictionaries, and particularly to a speech recognition device provided with speech recognition dictionaries corresponding to external devices such as mobile phones.
- Speech recognition is performed by referring to the words (vocabulary) registered in a speech recognition dictionary, and in order to improve the recognition rate, many words need to be registered in the dictionary. For this reason, some speech recognition devices have a function of adding words to an existing speech recognition dictionary to update it and/or a function of creating a new speech recognition dictionary.
- There is a known speech recognition device which, upon detecting that the process of generating an electronic message by voice input has started, reads the document data acquired by the application, analyzes the document data, extracts unknown words that do not exist in the existing speech recognition dictionary, and creates or updates a speech recognition dictionary that includes the extracted words (see, for example, the patent literature cited below).
- In such conventional devices, the target when creating or updating a speech recognition dictionary is an application that can receive document data.
- However, the frequency of the words used for recognition may vary greatly depending on the external device connected to the speech recognition device. Considering that the connected devices can be different, a single application (that is, a single dictionary) cannot perform speech recognition efficiently for every device.
- The present invention has been made to solve the above problems, and an object thereof is to provide a speech recognition device capable of efficiently performing speech recognition according to the connected external device.
- In the speech recognition device according to the present invention, when an external device is connected, data and identification information are acquired from the external device.
- The device comprises external data acquisition means for acquiring the data, an extraction step of extracting vocabulary from the data as extracted vocabulary, an analysis step of assigning readings to the extracted vocabulary to produce analysis data, and a dictionary generation step of storing the analysis data in the speech recognition dictionary corresponding to the external device.
- The device judges, based on the speech recognition result, whether a misrecognition has occurred and, if it is judged to be a misrecognition, switches the speech recognition dictionary and continues the speech recognition processing.
- It is thus possible to switch the speech recognition dictionary and perform speech recognition without any special user operation; as a result, the speech recognition time can be shortened and the recognition rate improved.
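The switching behavior described above can be sketched as follows. This is an illustrative model only: the recognizer is reduced to a function returning a (word, confidence) pair, and `recognize_with_switching`, `toy_recognize`, and the confidence threshold are hypothetical names and values, not taken from the patent.

```python
def recognize_with_switching(utterance, dictionaries, recognize, threshold=0.5):
    """Try each per-device dictionary until a result is accepted.

    `recognize(utterance, dictionary)` stands in for the speech
    recognition unit; it returns a (word, confidence) pair. A result
    below `threshold` is treated as a misrecognition, and the next
    dictionary is tried.
    """
    best = (None, -1.0)
    for dictionary in dictionaries:
        word, confidence = recognize(utterance, dictionary)
        if confidence >= threshold:
            return word, confidence  # accepted: stop switching
        if confidence > best[1]:
            best = (word, confidence)  # remember the best rejected result
    return best  # every dictionary was tried; fall back to the best


# Toy recognizer: an exact match in the dictionary scores 1.0, else 0.0.
def toy_recognize(utterance, dictionary):
    return (utterance, 1.0 if utterance in dictionary else 0.0)
```

Because the switch happens inside the recognition loop, no user operation is needed when the wrong dictionary is tried first.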
- FIG. 4 is a block diagram showing an example of a speech recognition device according to Embodiment 2 of the present invention together with external devices.
- FIG. 6 is a block diagram showing an example of a speech recognition device according to Embodiment 3 of the present invention, together with external devices.
- FIGS. 8A and 8B are diagrams for explaining the operation of vocabulary candidate selection in the speech recognition device shown in FIG. 6; (a) and (b) show examples of the candidates presented.
- The speech recognition device is equipped with a voice input unit such as a microphone, a speech recognition unit 2, n speech recognition dictionaries (hereinafter simply called dictionaries) 3₁ to 3ₙ (n is an integer of 2 or more), external data acquisition means 4, an external data extraction unit 5, an external data analysis unit 6, an external dictionary generation unit (dictionary generation unit) 7, a vocabulary dictionary 8, and a CDDB 9 (a database that provides information on the music recorded on compact discs).
- External devices 2₁ to 2ₙ are connected to the external data acquisition means 4.
- The vocabulary dictionary 8 and the CDDB 9 are referred to in the analysis step that produces the analysis data.
- The external devices 2₁ to 2ₙ are mutually different devices such as, for example, a mobile phone, a small portable music player (for example, an iPod (product name)), a keyboard, or a PDA (Personal Digital Assistant), and the dictionaries 3₁ to 3ₙ correspond to them respectively.
- The voice input from the voice input unit is given to the speech recognition unit 2, which refers to one of the dictionaries 3₁ to 3ₙ generated as described later, recognizes the input voice, and produces a speech recognition result. In other words, when performing speech recognition for the external device 2ᵢ (i being any of 1 to n), the speech recognition unit 2 uses the corresponding dictionary 3ᵢ.
- The external data acquisition means 4 acquires the data stored in the external device 2ᵢ (for example, received mail messages for a mobile phone, or artist and album names for a portable music player) (step S1), and then notifies the external dictionary generation unit 7 of the identification of the external device 2ᵢ (step S2).
- Steps S1 and S2 are executed in the same way for each connected external device.
- The data acquired by the external data acquisition means 4 are passed to the external data extraction unit 5.
- The external data extraction unit 5 extracts the parts to be analyzed (for example, the body text of a received message, or song, artist, and album names) and passes them to the external data analysis unit 6 as extracted data (step S3).
- In step S4, when the extracted data are sentences of text, the external data analysis unit 6 divides each sentence into words by morphological analysis with reference to the vocabulary dictionary 8, assigns a reading to each word obtained in the analysis, and outputs the words with their readings as analysis data.
- When the extracted data are, for example, artist names, the CDDB 9 is searched using that notation, the reading is obtained, and the word with its reading is given as analysis data.
- In step S5, the external dictionary generation unit 7 stores the analysis data, so that the dictionaries 3₁ to 3ₙ are generated corresponding to the external devices 2₁ to 2ₙ.
- The dictionaries 3₁ to 3ₙ are saved in a dedicated area of the memory and are not deleted for any other purpose. Every time speech recognition is activated, or when the external device is switched, the dictionary 3ᵢ corresponding to the external device 2ᵢ is used.
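The generation flow S1 to S5 can be sketched as follows. This is a minimal illustration, assuming that morphological analysis can be modeled by whitespace splitting and that readings come from a small vocabulary-dictionary mapping; `generate_dictionary` and all data shown are hypothetical, not names from the patent.

```python
def generate_dictionary(device_id, raw_items, vocabulary, dictionaries):
    """Store (word, reading) analysis data in the dictionary for device_id.

    raw_items    - data acquired from the external device (S1)
    vocabulary   - maps a word to its reading (stand-in for the
                   vocabulary dictionary / CDDB lookup of S4)
    dictionaries - maps a device id to its dictionary (S5)
    """
    # S3: extract the parts to be analyzed (here, the items themselves).
    extracted = [item for item in raw_items if item]

    # S4: divide each item into words and assign a reading to each word.
    # Whitespace splitting stands in for morphological analysis; the
    # notation itself is the fallback when no reading is registered.
    analysis_data = []
    for item in extracted:
        for word in item.split():
            analysis_data.append((word, vocabulary.get(word, word)))

    # S5: store in the dictionary corresponding to the external device.
    dictionaries.setdefault(device_id, set()).update(analysis_data)
    return dictionaries[device_id]
```

Each connected device thus ends up with its own small dictionary built only from its own data.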
- When the external dictionary generation unit 7 stores the analysis data in a dictionary, it first determines, based on the identification of the external device 2ᵢ, whether a corresponding dictionary 3ᵢ already exists (step S6).
- If the corresponding dictionary 3ᵢ does not exist in step S6, a new dictionary 3ᵢ is created (step S7) and the analysis data are stored in this new dictionary.
- If the corresponding dictionary 3ᵢ exists, the words in the dictionary 3ᵢ are compared with the analysis data: the data that do not exist in the dictionary 3ᵢ are extracted from the analysis data (step S8), only those data are stored in the dictionary 3ᵢ, and the dictionary 3ᵢ is thereby updated (step S9). The external dictionary generation unit 7 then discards the analysis data already existing in the dictionary 3ᵢ.
- In this way, analysis data that already exist in the dictionary are not registered again.
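The update flow S6 to S9 amounts to a create-or-merge step: make a new dictionary when none exists for the device, otherwise register only the entries not already present. Below is a minimal sketch under the same illustrative data model as above (entries are (word, reading) pairs; `update_dictionary` is a hypothetical name):

```python
def update_dictionary(device_id, analysis_data, dictionaries):
    """Create or update the dictionary for device_id (steps S6-S9)."""
    if device_id not in dictionaries:
        # S6/S7: no corresponding dictionary exists, so create one
        # and store all of the analysis data in it.
        dictionaries[device_id] = set(analysis_data)
        return dictionaries[device_id]
    existing = dictionaries[device_id]
    # S8: extract only the entries that do not yet exist in the dictionary.
    new_entries = [entry for entry in analysis_data if entry not in existing]
    # S9: store only those entries; duplicates are implicitly discarded.
    existing.update(new_entries)
    return existing
```

Keeping the dictionary duplicate-free keeps it small, which is what shortens the recognition time.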
- As described above, the dictionaries are generated according to the external devices. Therefore, when speech recognition is performed using the dictionary for the connected external device, the recognition rate is improved; in addition, the recognition time is shortened, and the external device can easily be operated by voice input.
- In Embodiment 2, the speech recognition device further has a dictionary activation unit, and the external data acquisition means 4 also notifies the dictionary activation unit of the connected external device 2ᵢ.
- The external data acquisition means 4 acquires data from the external device 2ᵢ (step S11), and then the identification of the external device 2ᵢ is notified to the external dictionary generation unit 7 and to the dictionary activation unit (step S12).
- The part to be analyzed is extracted by the external data extraction unit 5 and passed to the external data analysis unit 6 as extracted data (step S13).
- The external data analysis unit 6 refers to the vocabulary dictionary 8 or the CDDB 9 to obtain the analysis data (the vocabulary with readings attached), in the same manner as in Embodiment 1.
- The external dictionary generation unit 7 stores the analysis data in the dictionary 3ᵢ corresponding to the external device 2ᵢ, based on the identification sent from the external data acquisition means 4 (step S14).
- In step S15, when a plurality of external devices are connected to the external data acquisition means 4, the dictionaries corresponding to each of these devices are activated.
- In this way, the dictionary corresponding to the connected external device is activated. Therefore, when speech recognition is performed while the external device is connected to the external data acquisition means 4, the appropriate dictionary can be used; as a result, the user can input by voice without being aware of changing the dictionary.
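The activation idea can be sketched as follows: only the dictionaries of the currently connected devices contribute to the vocabulary the recognizer consults. `active_vocabulary` is a hypothetical helper name, not a term from the patent.

```python
def active_vocabulary(connected_devices, dictionaries):
    """Union of the dictionaries for every connected external device.

    Dictionaries of devices that are not connected stay inactive,
    which keeps the recognizer's search space small.
    """
    active = set()
    for device_id in connected_devices:
        active |= dictionaries.get(device_id, set())
    return active
```

Because activation follows connection automatically, the user never has to select a dictionary by hand.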
- FIG. 6 is a block diagram showing an example of a speech recognition device according to Embodiment 3 of the present invention.
- In the speech recognition device 40, the same elements as in the speech recognition devices described above are given the same reference symbols.
- The speech recognition device 40 further has a vocabulary candidate selection unit 41.
- The vocabulary candidate selection unit 41 allows one of multiple presented vocabulary candidates to be selected as a choice.
- The external data analysis unit 6 passes the selection to the external dictionary generation unit 7 as analysis data.
- The external dictionary generation unit 7 stores the analysis data in the dictionary 3ᵢ corresponding to the external device 2ᵢ, based on the identification sent from the external data acquisition means 4.
- The external data acquisition means 4 notifies the external dictionary generation unit 7 of the identification of the external device 2ᵢ and acquires the data from the external device 2ᵢ as described above (step S16), then sends these data to the external data extraction unit 5.
- The external data extraction unit 5 extracts the part to be analyzed from the data acquired by the external data acquisition means 4, for example the artist and album data, and sends it to the external data analysis unit 6 (step S17).
- The external data analysis unit 6 first judges whether the extracted data are sentences of text (step S18). If the extracted data are sentences, the external data analysis unit 6 refers to the vocabulary dictionary 8, divides the sentences into words by morphological analysis, and assigns readings to the words obtained in the analysis to obtain analysis data (step S19).
- The analysis data are then stored in the dictionary 3ᵢ corresponding to the external device 2ᵢ (step S20).
- If the extracted data are judged not to be sentences in step S18, the CDDB 9 is searched using the notation of the extracted data (step S21), and it is judged whether a matching entry exists as a result of the search (step S22). If there is a match, the external data analysis unit 6 attaches the reading to produce the analysis data (step S23). Then the storage by the external dictionary generation unit 7 is performed (step S25).
- If there is no match, the external data analysis unit 6 selects vocabulary candidates similar to the extracted data (step S24), and the vocabulary candidate presentation unit 42 presents the candidates on the display 43.
- When the user selects one of the presented candidates with the vocabulary candidate selection unit 41, the selection is processed in step S23: the external data analysis unit 6 attaches the reading to the selection, which becomes the analysis data.
- Then the storage by the external dictionary generation unit 7 is performed (step S25).
- The similarity between the extracted data and each registered entry is calculated, for example from the number of characters by which the two notations differ after conversion to a common character form.
- The external data analysis unit 6 presents the most similar registered entries on the display 43 as vocabulary candidates.
- Steps S22 to S24 correspond to the candidate selection step.
- By presenting the vocabulary candidates, the user can register the intended vocabulary even when the extracted notation does not exactly match a registered entry.
- Since the selected candidate is registered in the dictionary, each dictionary can be kept accurate and misrecognition can be reduced.
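The candidate-selection step can be approximated with a standard string-similarity ranking. The sketch below uses Python's difflib, whose ratio-based comparison stands in for the character-count comparison described above; `select_candidates` and the cutoff value are illustrative assumptions, not the patent's method.

```python
import difflib

def select_candidates(extracted, database, n=3, cutoff=0.5):
    """Return candidates for an extracted notation.

    If the notation exists in the database, it is the only candidate;
    otherwise the most similar registered entries are returned for
    the user to choose from.
    """
    if extracted in database:
        return [extracted]  # exact match: no selection needed
    # Rank registered entries by string similarity and keep the top n.
    return difflib.get_close_matches(extracted, database, n=n, cutoff=cutoff)
```

In the apparatus, the user's choice among the returned candidates would then be given a reading and stored in the dictionary corresponding to the external device.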
- As described above, the speech recognition device according to the present invention can perform speech recognition efficiently according to the external device, and is therefore suitable for use, for example, as a speech recognition device used with a mobile phone.
Abstract
Description
Claims
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE112006002979T DE112006002979T5 (de) | 2005-12-07 | 2006-08-18 | Spracherkennungsvorrichtung |
JP2007549020A JP4846734B2 (ja) | 2005-12-07 | 2006-08-18 | 音声認識装置 |
US11/992,938 US8060368B2 (en) | 2005-12-07 | 2006-08-18 | Speech recognition apparatus |
CN2006800464353A CN101326571B (zh) | 2005-12-07 | 2006-08-18 | 声音识别装置 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2005-353695 | 2005-12-07 | ||
JP2005353695 | 2005-12-07 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2007066433A1 true WO2007066433A1 (ja) | 2007-06-14 |
Family
ID=38122585
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2006/316257 WO2007066433A1 (ja) | 2005-12-07 | 2006-08-18 | 音声認識装置 |
Country Status (5)
Country | Link |
---|---|
US (1) | US8060368B2 (ja) |
JP (1) | JP4846734B2 (ja) |
CN (1) | CN101326571B (ja) |
DE (1) | DE112006002979T5 (ja) |
WO (1) | WO2007066433A1 (ja) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009204872A (ja) * | 2008-02-28 | 2009-09-10 | Alpine Electronics Inc | 音声認識用辞書生成システム |
WO2017179335A1 (ja) * | 2016-04-11 | 2017-10-19 | ソニー株式会社 | 情報処理装置、情報処理方法およびプログラム |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5465926B2 (ja) * | 2009-05-22 | 2014-04-09 | アルパイン株式会社 | 音声認識辞書作成装置及び音声認識辞書作成方法 |
US9230538B2 (en) * | 2011-04-08 | 2016-01-05 | Mitsubishi Electric Corporation | Voice recognition device and navigation device |
US9235565B2 (en) * | 2012-02-14 | 2016-01-12 | Facebook, Inc. | Blending customized user dictionaries |
TWI508057B (zh) * | 2013-07-15 | 2015-11-11 | Chunghwa Picture Tubes Ltd | 語音辨識系統以及方法 |
DE102014209358A1 (de) * | 2014-05-16 | 2015-11-19 | Ford Global Technologies, Llc | Vorrichtung und Verfahren zur Spracherkennung, insbesondere in einem Fahrzeug |
KR102095514B1 (ko) * | 2016-10-03 | 2020-03-31 | 구글 엘엘씨 | 디바이스 토폴로지에 기초한 음성 명령 프로세싱 |
US10572586B2 (en) * | 2018-02-27 | 2020-02-25 | International Business Machines Corporation | Technique for automatically splitting words |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08248980A (ja) * | 1995-03-06 | 1996-09-27 | Fuji Xerox Co Ltd | 音声認識装置 |
JPH09171395A (ja) * | 1995-12-20 | 1997-06-30 | Oki Electric Ind Co Ltd | 音声認識装置 |
JPH11231886A (ja) * | 1998-02-18 | 1999-08-27 | Denso Corp | 登録名称認識装置 |
JPH11312073A (ja) * | 1998-04-27 | 1999-11-09 | Fujitsu Ltd | 意味認識システム |
JPH11311996A (ja) * | 1997-10-23 | 1999-11-09 | Sony Internatl Europ Gmbh | 音声装置及び遠隔制御可能なネットワーク機器 |
JP2001022374A (ja) * | 1999-07-05 | 2001-01-26 | Victor Co Of Japan Ltd | 電子番組ガイドの操作装置および電子番組ガイドの送信装置 |
JP2001042884A (ja) * | 1999-07-27 | 2001-02-16 | Sony Corp | 音声認識制御システム及び音声認識制御方法 |
JP2001092485A (ja) * | 1999-09-10 | 2001-04-06 | Internatl Business Mach Corp <Ibm> | 音声情報の登録方法、認識文字列の特定方法、音声認識装置、音声情報の登録のためのソフトウエア・プロダクトを格納した記憶媒体、及び認識文字列の特定のためのソフトウエア・プロダクトを格納した記憶媒体 |
WO2002001550A1 (fr) * | 2000-06-26 | 2002-01-03 | Mitsubishi Denki Kabushiki Kaisha | Procede et systeme de commande d'un dispositif |
JP2002091755A (ja) * | 2000-05-09 | 2002-03-29 | Internatl Business Mach Corp <Ibm> | サービス・ディスカバリー・ネットワークで装置の音声制御を使用可能にするための方法およびシステム |
JP2002351652A (ja) * | 2001-05-23 | 2002-12-06 | Nec System Technologies Ltd | 音声認識操作支援システム、音声認識操作支援方法、および、音声認識操作支援プログラム |
JP2003255982A (ja) * | 2002-02-28 | 2003-09-10 | Fujitsu Ltd | 音声認識システムおよび音声ファイル記録システム |
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH05342260A (ja) | 1992-06-08 | 1993-12-24 | Sharp Corp | 単語綴りチェック装置 |
US5825306A (en) | 1995-08-25 | 1998-10-20 | Aisin Aw Co., Ltd. | Navigation system for vehicles |
US5809471A (en) | 1996-03-07 | 1998-09-15 | Ibm Corporation | Retrieval of additional information not found in interactive TV or telephony signal by application using dynamically extracted vocabulary |
US6173266B1 (en) * | 1997-05-06 | 2001-01-09 | Speechworks International, Inc. | System and method for developing interactive speech applications |
JP4201870B2 (ja) | 1998-02-24 | 2008-12-24 | クラリオン株式会社 | 音声認識による制御を用いるシステム及び音声認識による制御方法 |
US6233559B1 (en) * | 1998-04-01 | 2001-05-15 | Motorola, Inc. | Speech control of multiple applications using applets |
DE19910236A1 (de) * | 1999-03-09 | 2000-09-21 | Philips Corp Intellectual Pty | Verfahren zur Spracherkennung |
WO2000058942A2 (en) * | 1999-03-26 | 2000-10-05 | Koninklijke Philips Electronics N.V. | Client-server speech recognition |
US6360201B1 (en) * | 1999-06-08 | 2002-03-19 | International Business Machines Corp. | Method and apparatus for activating and deactivating auxiliary topic libraries in a speech dictation system |
JP2001296881A (ja) | 2000-04-14 | 2001-10-26 | Sony Corp | 情報処理装置および方法、並びに記録媒体 |
JP3911178B2 (ja) | 2002-03-19 | 2007-05-09 | シャープ株式会社 | 音声認識辞書作成装置および音声認識辞書作成方法、音声認識装置、携帯端末器、音声認識システム、音声認識辞書作成プログラム、並びに、プログラム記録媒体 |
EP1575031A3 (en) * | 2002-05-15 | 2010-08-11 | Pioneer Corporation | Voice recognition apparatus |
US7003457B2 (en) * | 2002-10-29 | 2006-02-21 | Nokia Corporation | Method and system for text editing in hand-held electronic device |
JP4217495B2 (ja) | 2003-01-29 | 2009-02-04 | キヤノン株式会社 | 音声認識辞書作成方法、音声認識辞書作成装置及びプログラム、記録媒体 |
JP2005148151A (ja) * | 2003-11-11 | 2005-06-09 | Mitsubishi Electric Corp | 音声操作装置 |
-
2006
- 2006-08-18 DE DE112006002979T patent/DE112006002979T5/de not_active Withdrawn
- 2006-08-18 WO PCT/JP2006/316257 patent/WO2007066433A1/ja active Application Filing
- 2006-08-18 CN CN2006800464353A patent/CN101326571B/zh not_active Expired - Fee Related
- 2006-08-18 US US11/992,938 patent/US8060368B2/en active Active
- 2006-08-18 JP JP2007549020A patent/JP4846734B2/ja not_active Expired - Fee Related
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08248980A (ja) * | 1995-03-06 | 1996-09-27 | Fuji Xerox Co Ltd | 音声認識装置 |
JPH09171395A (ja) * | 1995-12-20 | 1997-06-30 | Oki Electric Ind Co Ltd | 音声認識装置 |
JPH11311996A (ja) * | 1997-10-23 | 1999-11-09 | Sony Internatl Europ Gmbh | 音声装置及び遠隔制御可能なネットワーク機器 |
JPH11231886A (ja) * | 1998-02-18 | 1999-08-27 | Denso Corp | 登録名称認識装置 |
JPH11312073A (ja) * | 1998-04-27 | 1999-11-09 | Fujitsu Ltd | 意味認識システム |
JP2001022374A (ja) * | 1999-07-05 | 2001-01-26 | Victor Co Of Japan Ltd | 電子番組ガイドの操作装置および電子番組ガイドの送信装置 |
JP2001042884A (ja) * | 1999-07-27 | 2001-02-16 | Sony Corp | 音声認識制御システム及び音声認識制御方法 |
JP2001092485A (ja) * | 1999-09-10 | 2001-04-06 | Internatl Business Mach Corp <Ibm> | 音声情報の登録方法、認識文字列の特定方法、音声認識装置、音声情報の登録のためのソフトウエア・プロダクトを格納した記憶媒体、及び認識文字列の特定のためのソフトウエア・プロダクトを格納した記憶媒体 |
JP2002091755A (ja) * | 2000-05-09 | 2002-03-29 | Internatl Business Mach Corp <Ibm> | サービス・ディスカバリー・ネットワークで装置の音声制御を使用可能にするための方法およびシステム |
WO2002001550A1 (fr) * | 2000-06-26 | 2002-01-03 | Mitsubishi Denki Kabushiki Kaisha | Procede et systeme de commande d'un dispositif |
JP2002351652A (ja) * | 2001-05-23 | 2002-12-06 | Nec System Technologies Ltd | 音声認識操作支援システム、音声認識操作支援方法、および、音声認識操作支援プログラム |
JP2003255982A (ja) * | 2002-02-28 | 2003-09-10 | Fujitsu Ltd | 音声認識システムおよび音声ファイル記録システム |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009204872A (ja) * | 2008-02-28 | 2009-09-10 | Alpine Electronics Inc | 音声認識用辞書生成システム |
WO2017179335A1 (ja) * | 2016-04-11 | 2017-10-19 | ソニー株式会社 | 情報処理装置、情報処理方法およびプログラム |
Also Published As
Publication number | Publication date |
---|---|
CN101326571B (zh) | 2012-05-23 |
US8060368B2 (en) | 2011-11-15 |
DE112006002979T5 (de) | 2008-10-09 |
JP4846734B2 (ja) | 2011-12-28 |
US20090228276A1 (en) | 2009-09-10 |
CN101326571A (zh) | 2008-12-17 |
JPWO2007066433A1 (ja) | 2009-05-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2007066433A1 (ja) | 音声認識装置 | |
US8352268B2 (en) | Systems and methods for selective rate of speech and speech preferences for text to speech synthesis | |
US8355919B2 (en) | Systems and methods for text normalization for text to speech synthesis | |
US8396714B2 (en) | Systems and methods for concatenation of words in text to speech synthesis | |
US8583418B2 (en) | Systems and methods of detecting language and natural language strings for text to speech synthesis | |
US8352272B2 (en) | Systems and methods for text to speech synthesis | |
US8712776B2 (en) | Systems and methods for selective text to speech synthesis | |
US20100082327A1 (en) | Systems and methods for mapping phonemes for text to speech synthesis | |
US20100082328A1 (en) | Systems and methods for speech preprocessing in text to speech synthesis | |
EP2207165A1 (en) | Information processing apparatus and text-to-speech method | |
JPH0916602A (ja) | 翻訳装置および翻訳方法 | |
US20090204401A1 (en) | Speech processing system, speech processing method, and speech processing program | |
JP5221768B2 (ja) | 翻訳装置、及びプログラム | |
JP2015014665A (ja) | 音声認識装置及び方法、並びに、半導体集積回路装置 | |
JP5152588B2 (ja) | 声質変化判定装置、声質変化判定方法、声質変化判定プログラム | |
WO2008018287A1 (fr) | dispositif de recherche et dispositif de génération de base de données de recherche | |
JP2008243076A (ja) | 翻訳装置、方法及びプログラム | |
JP2006065651A (ja) | 商標称呼検索プログラム、商標称呼検索装置及び商標称呼検索方法 | |
JP2002358091A (ja) | 音声合成方法および音声合成装置 | |
JP6567372B2 (ja) | 編集支援装置、編集支援方法及びプログラム | |
JP2002049386A (ja) | テキスト音声合成装置、テキスト音声合成方法及びその方法を記録した記録媒体 | |
JPH08185197A (ja) | 日本語解析装置、及び日本語テキスト音声合成装置 | |
JP2008164785A (ja) | 読み情報生成装置、読み情報生成方法、読み情報生成プログラムおよび音声合成装置 | |
JP2007171275A (ja) | 言語処理装置及び現後処理方法 | |
KR20090054616A (ko) | 시각장애인을 위한 음성낭독 단말기용 색인어 검색방법 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 200680046435.3 Country of ref document: CN |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase |
Ref document number: 2007549020 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 11992938 Country of ref document: US |
|
RET | De translation (de og part 6b) |
Ref document number: 112006002979 Country of ref document: DE Date of ref document: 20081009 Kind code of ref document: P |
|
WWE | Wipo information: entry into national phase |
Ref document number: 112006002979 Country of ref document: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 06782818 Country of ref document: EP Kind code of ref document: A1 |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8607 |