CN101000767A - Speech recognition equipment and method - Google Patents

Speech recognition equipment and method

Info

Publication number
CN101000767A
Authority
CN
China
Prior art keywords
information
phonetic feature
received pronunciation
voice
received
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2006100005316A
Other languages
Chinese (zh)
Inventor
陈刚
陈骧
吕凡
王强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HANGZHOU SHIDAO SCIENCE AND TECHNOLOGY Co Ltd
Original Assignee
HANGZHOU SHIDAO SCIENCE AND TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2006-01-09
Filing date
2006-01-09
Publication date
2007-07-18
Application filed by HANGZHOU SHIDAO SCIENCE AND TECHNOLOGY Co Ltd
Priority to CNA2006100005316A
Publication of CN101000767A
Legal status: Pending

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

A speech recognition method comprises: extracting phonetic feature information from a user's voice information with a phonetic feature extraction module; sending the phonetic feature information to a phonetic feature recognition unit; and having the recognition unit determine, from the received phonetic feature information, whether the user's voice information is valid. A speech recognition apparatus for implementing the method is also disclosed.

Description

Speech recognition apparatus and method
Technical field
The present invention relates to a speech recognition apparatus and method and, more specifically, to a speech recognition apparatus and method for the communications field.
Background art
In many existing telecommunication services, the operator's platform must exchange a large amount of varied voice information with users, and nearly all of that voice information is produced by manual recording. This greatly increases the operator's costs and lengthens the time needed to publish and update information, making it difficult to meet users' changing and diverse needs in a timely manner. Most of this material is pre-recorded rather than generated by a speech synthesis system because traditional speech synthesis systems tend to produce a flat timbre and a strongly mechanical, characterless voice.
Moreover, when selecting information by the traditional method, the user must listen to each voice prompt to the end and then press a key to move on to the next prompt, over and over; only after a series of tedious key presses does the user obtain the desired information. Such multi-level, complex operation often annoys users, wastes time, and is inefficient.
The traditional method therefore suffers from high cost, cumbersome technical means, and inconvenient use.
Summary of the invention
The present invention aims to overcome one or more of the above problems of the prior art. To this end, it provides a speech recognition apparatus and method. The method uses TTS (text-to-speech synthesis) technology to build a standard voice information database within the communication network, so that users can select information over the communication network by means of speech recognition.
To achieve the above object, the invention provides a speech recognition apparatus comprising: a standard voice information database for storing standard voice information obtained by converting text information with TTS technology; a phonetic feature extraction device for extracting phonetic features from the user's voice; and a phonetic feature recognition device for recognizing the user's voice information. Optionally, the apparatus includes a phonetic feature information memory for storing the phonetic feature information extracted by the phonetic feature extraction device.
Preferably, the speech recognition apparatus according to the invention further comprises: a domain-specific speech synthesis device, which performs speech synthesis separately for each specific domain so that the synthesized voice sounds more professional while remaining natural, fluent, and close to a real human voice; and a background music adding device, which adds background music to the standard voice information during speech synthesis, making the communication process richer and more varied.
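One plausible reading of the background music adding device is a mixer that overlays attenuated music on the synthesized speech. The sketch below is only an illustration under that assumption: the patent does not specify a mixing method, and the function name `add_background_music`, the `music_gain` parameter, and the use of 16-bit PCM sample lists at a common sample rate are all hypothetical.

```python
def add_background_music(speech, music, music_gain=0.25):
    """Mix attenuated music into speech; both are lists of 16-bit PCM samples.

    Hypothetical helper: the mixing rule and the default gain are assumptions.
    """
    if not music:
        return list(speech)
    mixed = []
    for i, sample in enumerate(speech):
        # Loop the music if it is shorter than the speech.
        value = sample + int(music[i % len(music)] * music_gain)
        # Clip to the signed 16-bit range so the sum cannot overflow.
        mixed.append(max(-32768, min(32767, value)))
    return mixed


speech = [1000, -2000, 32000, 0]
music = [400, -400]
print(add_background_music(speech, music))  # [1100, -2100, 32100, -100]
```

Keeping the music gain well below 1 is a common way to keep the spoken content intelligible; the right value would depend on the material.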
To the same end, the invention also provides a speech recognition method that performs recognition using the established standard voice information database. The method comprises the following steps. First, the standard voice information database is built using TTS technology. When the user inputs information by voice, the phonetic feature extraction device extracts phonetic feature information from the user's voice and sends it to the phonetic feature recognition device; alternatively, the extraction device stores the extracted phonetic feature information in the phonetic feature information memory, and the recognition device reads it from there. The recognition device then searches the standard voice information database for information corresponding to the received (or read) phonetic feature information, thereby recognizing the user's speech. Finally, the information is provided to the user over the communication network.
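The steps of the method can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: `synthesize` is a hypothetical stand-in for a TTS engine, and `extract_features` treats a normalized text transcript as the "phonetic feature", whereas a real system would work on audio. All names here are assumptions introduced for the example.

```python
from dataclasses import dataclass, field


def synthesize(text: str) -> bytes:
    """Hypothetical TTS stand-in: returns placeholder bytes, not real audio."""
    return f"<speech:{text}>".encode("utf-8")


def extract_features(utterance: str) -> str:
    """Toy extractor: a normalized transcript stands in for acoustic features."""
    return " ".join(utterance.lower().split())


@dataclass
class StandardVoiceDatabase:
    """Maps a feature key to the standard voice information built from text."""
    entries: dict = field(default_factory=dict)

    def add_text(self, text: str) -> None:
        # Step 1: build the database by converting text with (stubbed) TTS.
        self.entries[extract_features(text)] = synthesize(text)

    def search(self, key: str):
        return self.entries.get(key)


def recognize(database: StandardVoiceDatabase, utterance: str):
    """Extract features, search the database, return (is_valid, voice_info)."""
    key = extract_features(utterance)
    voice_info = database.search(key)
    return voice_info is not None, voice_info


db = StandardVoiceDatabase()
for item in ["Weather Forecast", "Train Timetable"]:
    db.add_text(item)

print(recognize(db, "  weather   forecast "))  # (True, b'<speech:Weather Forecast>')
print(recognize(db, "stock prices"))           # (False, None)
```

The returned voice information is what would then be delivered to the user over the communication network.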
The beneficial effects of the invention are as follows. The speech recognition apparatus converts text information into voice information, so that the user can find the required information quickly and easily. The method also breaks through the limitations of the telephone numeric keypad, sparing the user inconvenient and tedious key presses. Applying voice technology gives the user a convenient mode of interaction and improves the dynamism, timeliness, simplicity, and entertainment value of that interaction, while also paving the way for new telecommunication services.
Brief description of the drawings
Fig. 1 is a block diagram of the speech recognition apparatus of the present invention;
Fig. 2 is a flowchart of the speech recognition process;
Fig. 3 is the main flowchart of the first embodiment of the present invention;
Fig. 4 is a flowchart of selecting a song by speaking the song title according to the first embodiment;
Fig. 5 is a flowchart of selecting a song by speaking the singer's name according to the first embodiment;
Fig. 6 is a flowchart of reporting results according to the first embodiment.
Detailed description of the embodiments
Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Fig. 1 is a block diagram of the speech recognition apparatus of the present invention. The phonetic feature extraction device 102, which extracts phonetic feature information from the user's voice information, is connected to the phonetic feature recognition device 104, which determines whether the user's voice information is valid; the phonetic feature recognition device 104 is in turn connected to the standard voice information database 106. Optionally, the apparatus includes a phonetic feature information memory 108 for storing the phonetic feature information extracted by the extraction device, from which the recognition device reads that information.
Fig. 2 shows the process of the speech recognition method according to the invention. When the user inputs information by voice, the phonetic feature extraction device 102 extracts phonetic feature information from the user's voice information and sends it to the phonetic feature recognition device 104; alternatively, device 102 stores the extracted information in the phonetic feature information memory 108, and device 104 reads it from the memory. The recognition device 104 then compares the received (or read) phonetic feature information with the standard voice information in the standard voice information database 106, using the phonetic feature information as a keyword to search the database for corresponding information, and thereby determines whether the user's voice information is valid. In other words, if the recognition device 104 finds information in database 106 that corresponds to the phonetic feature information, the user's voice information is valid; otherwise it is invalid.
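The validity check of Fig. 2, including the optional path through memory 108, might look roughly like the sketch below. Several things here are assumptions made for illustration: the patent does not say how features are compared, so a string-similarity score from Python's `difflib.SequenceMatcher` with an arbitrary `threshold` stands in for the comparison, and normalized text stands in for acoustic features. `FeatureMemory`, `extract_features`, and `is_valid` are hypothetical names.

```python
from difflib import SequenceMatcher


class FeatureMemory:
    """Optional buffer (memory 108 in Fig. 1) between extractor and recognizer."""

    def __init__(self):
        self._feature = None

    def store(self, feature):
        self._feature = feature

    def read(self):
        return self._feature


def extract_features(utterance):
    # Toy stand-in: a normalized transcript plays the role of acoustic features.
    return " ".join(utterance.lower().split())


def is_valid(feature, database, threshold=0.8):
    """Compare the feature with every database key; return (valid, best_match)."""
    best_key, best_score = None, 0.0
    for key in database:
        score = SequenceMatcher(None, feature, key).ratio()
        if score > best_score:
            best_key, best_score = key, score
    if best_key is not None and best_score >= threshold:
        return True, database[best_key]
    return False, None


database = {"yesterday": "song #1", "hey jude": "song #2"}
memory = FeatureMemory()
memory.store(extract_features("  Hey  Jude "))               # extractor 102 writes
print(is_valid(memory.read(), database))                     # recognizer 104 reads
print(is_valid(extract_features("unknown tune"), database))  # no close match
```

With an exact-match database lookup the threshold would be unnecessary; it is included only because spoken input rarely matches a stored pattern exactly.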
According to Embodiment 1 of the invention, the standard voice information database 106 is a song database, preferably using song titles and singer names as keywords; according to Embodiment 2, the standard voice information database 106 is a phone book database.
Fig. 3 shows the overall flow of voice-based song selection using the speech recognition apparatus of the invention. After connecting to the operator's platform, the user enters the song selection flow shown in Fig. 3 and immediately hears a system prompt asking the user to select a song either by song title or by singer name.
When the user selects by song title, that is, speaks a song title, the phonetic feature extraction device 102 extracts song title feature information from the user's voice information and sends it to the phonetic feature recognition device 104. Using the received song title feature information as a keyword, device 104 then searches the song database 106 for the corresponding song title. If a corresponding song title is found, the user's voice information is recognized as valid and the result is reported; if not, the user's voice information is recognized as invalid.
When the user selects by singer name, that is, speaks a singer's name, the flow for selecting by singer name shown in Fig. 5 (flow 5) is entered. The phonetic feature extraction device 102 extracts singer name feature information from the user's voice information and sends it to the phonetic feature recognition device 104. Using the received singer name feature information as a keyword, device 104 then searches the song database 106 for the corresponding singer name. If a corresponding singer name is found, the user's voice information is recognized as valid; if not, it is recognized as invalid.
In the result reporting step of the song-title flow shown in Fig. 4, when the phonetic feature recognition device 104 finds more than one result in the song database 106, the following step is preferably included: the system informs the user that there are multiple results and prompts the user to choose among them by pressing different number keys (step 4-1).
The result reporting step of the singer-name flow shown in Fig. 5 preferably includes the following: determine whether the singer has more than one song; if so, go to step 4-1 shown in Fig. 4; if not, report the result directly.
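The two selection flows and the result reporting step can be sketched together as follows. This is only an illustrative sketch: the song data is invented, normalized text again stands in for phonetic features, and the function names `search_songs`, `report`, and `choose`, as well as the prompt wording, are hypothetical.

```python
# Invented sample data; a real song database would hold far more entries.
SONGS = [
    {"title": "Blue Sky", "singer": "Li Ming"},
    {"title": "Blue Sky", "singer": "Wang Fang"},
    {"title": "River", "singer": "Li Ming"},
    {"title": "Spring", "singer": "Zhao Lei"},
]


def normalize(text):
    # Toy stand-in for extracting phonetic features from speech.
    return " ".join(text.lower().split())


def search_songs(spoken, field):
    """Search by 'title' or 'singer'; an empty result means invalid input."""
    key = normalize(spoken)
    return [song for song in SONGS if normalize(song[field]) == key]


def report(results):
    """Build the prompt played to the caller for a list of search results."""
    if not results:
        return "Invalid input, please try again."
    if len(results) == 1:
        song = results[0]
        return f"Playing {song['title']} by {song['singer']}."
    # Step 4-1: several matches, so ask the caller to choose with a number key.
    options = ", ".join(
        f"press {i} for {song['title']} by {song['singer']}"
        for i, song in enumerate(results, start=1)
    )
    return f"{len(results)} results found: {options}."


def choose(results, key_pressed):
    """Map a pressed number key (1-based) to a result, or None if out of range."""
    index = key_pressed - 1
    return results[index] if 0 <= index < len(results) else None


print(report(search_songs("blue sky", "title")))
print(report(search_songs("Zhao Lei", "singer")))
print(choose(search_songs("li ming", "singer"), 2))
```

Both flows share the same reporting logic, which mirrors how the singer-name flow falls back to step 4-1 of the song-title flow when a singer has more than one song.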
As this embodiment shows, with the speech recognition apparatus and method of the invention the user can find the required information quickly and simply, without complex operating steps or complicated key selections.
Embodiment 2: a personal voice phone book built with speech recognition
Mobile phones are currently replaced quite frequently, and their loss and damage rate is also relatively high, reaching 10%. When changing phones, the user must re-enter or import the contact list, which is very inconvenient. A personal voice phone book built with speech recognition can provide the user with a contact list that is never lost. With this personal voice phone book, the user can save all contacts in the system and, when needing to place a call, log in to the system via WAP, SMS, voice, or similar means to look up the required number, or place the call directly.
This embodiment is similar to Embodiment 1, with the following difference: each user is allocated a storage area in the personal phone book database, with the user's calling phone number as the key. Using TTS technology, the phonetic features associated with the user are generated from the names and telephone numbers spoken by the user and stored in that user's personal phone book storage area. When the user dials the service number and speaks a name, the phonetic feature extraction device 102 extracts the name's phonetic feature from the user's voice information and sends it to the phonetic feature recognition device 104. The recognition device 104 then searches the database in the user's storage area for the name corresponding to the received name feature information and, according to the user's request, either sends the telephone number for that name to the user by SMS or calls the number directly.
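The per-user storage areas of Embodiment 2 can be pictured with the following sketch. It is an assumption-laden illustration rather than the patent's design: the class `PersonalPhoneBook`, the function `handle_request`, the `action` values, and all phone numbers are hypothetical, and a normalized name string stands in for the phonetic feature of a spoken name.

```python
from collections import defaultdict


class PersonalPhoneBook:
    """Per-caller storage areas keyed by the caller's own phone number."""

    def __init__(self):
        self._areas = defaultdict(dict)

    @staticmethod
    def _feature(spoken_name):
        # Toy stand-in for the phonetic feature of a spoken name.
        return " ".join(spoken_name.lower().split())

    def add_contact(self, caller_number, spoken_name, contact_number):
        self._areas[caller_number][self._feature(spoken_name)] = contact_number

    def lookup(self, caller_number, spoken_name):
        """Search only the caller's own area; None means the input is invalid."""
        area = self._areas.get(caller_number, {})
        return area.get(self._feature(spoken_name))


def handle_request(book, caller_number, spoken_name, action="sms"):
    """Return the number by SMS or dial it directly, as the caller requests."""
    number = book.lookup(caller_number, spoken_name)
    if number is None:
        return "Name not found."
    if action == "dial":
        return f"Dialing {number}"
    return f"SMS to {caller_number}: {number}"


book = PersonalPhoneBook()
book.add_contact("13800000000", "Zhang Wei", "01012345678")  # made-up numbers
print(handle_request(book, "13800000000", "zhang  wei"))
print(handle_request(book, "13800000000", "Zhang Wei", action="dial"))
print(handle_request(book, "13900000000", "Zhang Wei"))
```

Keying each storage area by the caller's number means the same spoken name can resolve to different contacts for different users, and one user's lookup never searches another user's entries.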
The above are only preferred embodiments of the present invention and are not intended to limit it. The invention can also be implemented in many other ways; for example, it can be used for voice SMS, voice mail, and voice secretary services. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (13)

1. A speech recognition apparatus for the communications field, characterized in that it comprises:
a standard voice information database for storing standard voice information converted from text information;
a phonetic feature extraction device for extracting phonetic feature information from a user's voice information; and
a phonetic feature recognition device, connected to the standard voice information database, for receiving phonetic feature information from the phonetic feature extraction device and comparing the received phonetic feature information with the standard voice information stored in the standard voice information database, thereby determining whether the user's voice information is valid.
2. The speech recognition apparatus according to claim 1, characterized in that the phonetic feature recognition device uses the phonetic feature information received from the phonetic feature extraction device as a keyword and searches the standard voice information database for standard voice information related to the keyword.
3. The speech recognition apparatus according to claim 1, characterized in that it further comprises a phonetic feature information memory for storing the phonetic feature information extracted by the phonetic feature extraction device, the phonetic feature recognition device being able to read the phonetic feature information from the phonetic feature information memory.
4. The speech recognition apparatus according to claim 1, characterized in that it further comprises:
a domain-specific speech synthesis device for performing speech synthesis separately for specific domains, so that the synthesized voice is more professional while remaining natural and fluent.
5. The speech recognition apparatus according to any one of claims 1 to 4, characterized in that it further comprises:
a background music adding device for adding background music to the standard voice information in the standard voice information database during speech synthesis.
6. A speech recognition method for the communications field, characterized in that it comprises the following steps:
an extraction step of using a phonetic feature extraction device to extract phonetic feature information from a user's voice information and sending the extracted phonetic feature information to the phonetic feature recognition device; and
a recognition step of using the phonetic feature recognition device to determine, according to the received phonetic feature information, whether the user's voice information is valid.
7. The speech recognition method according to claim 6, characterized in that, before the extraction step and the recognition step, it further comprises the following step:
a storing step of using a standard information database to store standard voice information converted from text information.
8. The speech recognition method according to claim 7, characterized in that it further comprises the following step:
a comparison step in which the speech recognition device compares the received phonetic feature information with the standard voice information stored in the standard voice information database, thereby determining whether the user's voice information is valid.
9. The speech recognition method according to claim 6, characterized in that the storing step further comprises the following step:
a domain-specific speech synthesis step of using a domain-specific speech synthesis device to perform speech synthesis separately for specific domains.
10. The speech recognition method according to any one of claims 6 to 9, characterized in that it further comprises the following step:
a background music adding step of using a background music adding device to add background music to the standard voice information in the standard voice information database during speech synthesis.
11. The speech recognition method according to claim 6, characterized in that it further comprises the following step:
a phonetic feature information storing step of using a phonetic feature information memory to store the phonetic feature information extracted in the extraction step, the speech recognition device reading the stored phonetic feature information from the phonetic feature memory.
12. The speech recognition method according to claim 6, characterized in that the recognition step further comprises the following step: the phonetic feature recognition device uses the received phonetic feature information as a keyword and searches the standard voice database for standard voice information related to the keyword.
13. The speech recognition method according to claim 6, characterized in that: when the phonetic feature recognition device finds standard voice information corresponding to the received phonetic feature information in the standard voice database, the user's voice information is recognized as valid; conversely, when the speech recognition device finds no standard voice information corresponding to the received phonetic feature information in the standard information database, the user's voice information is recognized as invalid.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2006100005316A CN101000767A (en) 2006-01-09 2006-01-09 Speech recognition equipment and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA2006100005316A CN101000767A (en) 2006-01-09 2006-01-09 Speech recognition equipment and method

Publications (1)

Publication Number Publication Date
CN101000767A (en) 2007-07-18

Family

ID=38692706

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2006100005316A Pending CN101000767A (en) 2006-01-09 2006-01-09 Speech recognition equipment and method

Country Status (1)

Country Link
CN (1) CN101000767A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102722539A (en) * 2012-05-23 2012-10-10 华为技术有限公司 Query method and device based on voice recognition
CN105161112A (en) * 2015-09-21 2015-12-16 百度在线网络技术(北京)有限公司 Speech recognition method and device
CN105161112B (en) * 2015-09-21 2019-04-02 百度在线网络技术(北京)有限公司 Audio recognition method and device
CN107331378A (en) * 2017-08-07 2017-11-07 北京小米移动软件有限公司 Microphone
CN109036373A (en) * 2018-07-31 2018-12-18 北京微播视界科技有限公司 A kind of method of speech processing and electronic equipment
WO2022037383A1 (en) * 2020-08-17 2022-02-24 北京字节跳动网络技术有限公司 Voice processing method and apparatus, electronic device, and computer readable medium
CN113113019A (en) * 2021-03-27 2021-07-13 上海红阵信息科技有限公司 Voice library generating system and method

Similar Documents

Publication Publication Date Title
US7980465B2 (en) Hands free contact database information entry at a communication device
US6370506B1 (en) Communication devices, methods, and computer program products for transmitting information using voice activated signaling to perform in-call functions
US20060093099A1 (en) Apparatus and method for managing call details using speech recognition
CN103078995A (en) Customizable individualized response method and system used in mobile terminal
CN103139404A (en) System and method for generating interactive voice response display menu based on voice recognition
US20100119046A1 (en) Caller identification using voice recognition
CN101000767A (en) Speech recognition equipment and method
US20070197233A1 (en) Method of location-oriented call screening for communication apparatus
CN103873706A (en) Dynamic intelligent voice recognition IVR (Interactive Voice Response) service system
CN101605307A (en) Test short message service (SMS) voice play system and method
CN103065640B (en) The visual implementation method of voice messaging
CN102567402B (en) Electronic equipment and information processing method
CN102025834A (en) Mobile terminal voice operation method and device
CN101354886A (en) Apparatus for recognizing speech
CN106603792B (en) A kind of number searching equipment
JP6606697B1 (en) Call system and call program
CN102045454A (en) Seat system and method for realizing seat call
CN101763427A (en) Input method and system with fast calling function
US20020085688A1 (en) Apparatus and method for providing call return service
CN1893482B (en) System and method for realizing voice dialling by intelligent network
CN105007365A (en) Method and apparatus for dialing extension number
CN104869210B (en) A kind of communication information extracting method and information extraction terminal
CN101873548A (en) System and method for indicating instant messaging on-line state of user by using ring tone
CN101426047A (en) Intelligent voice control telephone
CN201075286Y (en) Apparatus for speech voice identification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20070718