CN1217313C - Method for recognizing voice modified tone in system of recognizing voice of the Chinese language - Google Patents

Method for recognizing voice modified tone in system of recognizing voice of the Chinese language

Publication number
CN1217313C
Authority
CN
China
Prior art keywords
data
sound
voice
words
continuous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN021447640A
Other languages
Chinese (zh)
Other versions
CN1416111A (en)
Inventor
林宽农
陈秋涌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN021447640A
Publication of CN1416111A
Application granted
Publication of CN1217313C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Machine Translation (AREA)

Abstract

The present invention discloses a method for recognizing tone changes in a Chinese speech recognition system. When Chinese speech contains consecutive third-tone characters or identical reduplicated characters, the words and phrases whose tones change are recognized automatically according to the relationship between the character string and the word boundaries, which improves the recognition accuracy of the system.

Description

Method for recognizing tone changes in a Chinese speech recognition system
Technical field
The present invention relates to a method for recognizing tone changes in a Chinese speech recognition system, and in particular to a method that, in Chinese speech, automatically uses character string relationships and word boundaries to recognize the words and phrases whose tones change when consecutive third-tone characters or identical characters are joined, thereby improving the recognition accuracy of the Chinese speech recognition system.
Background technology
Speech recognition systems at home and abroad have developed in many directions. Most of them address the differences between the speech and the word meanings of the words and phrases of each language, and provide further recognition methods to improve system accuracy. In a Chinese speech recognition system, however, besides polyphonic characters and the differences between speech (the spoken sound) and pronunciation (the phonetic notation), which already make recognition difficult, there are special cases of tonal variation that often become a significant obstacle to recognition. Every Chinese character has an assigned pronunciation carrying one of four tones, namely the first tone (ˉ), the second tone (ˊ), the third tone (ˇ) and the fourth tone (ˋ), or the neutral tone (˙). In an ordinary Chinese sentence, reading each character with the first to fourth tone or the neutral tone of its notated pronunciation causes no problem, and speech and pronunciation are then identical. When a sentence contains a word with consecutive third-tone characters, or an identical reduplicated word used as a form of address, however, the tones change automatically in speech, so that speech and pronunciation differ. The change also depends on the number of characters in the word. Several common cases are described below:
(1) In a two-character string, if both characters are third tone, the first character is usually read as second tone. Examples are two-character strings such as "hello" and "miss you": the notated pronunciation of "miss you" is (ㄒㄧㄤˇ ㄋㄧˇ), but in speech it is read as (ㄒㄧㄤˊ ㄋㄧˇ).
(2) In a three-character string, if it contains two consecutive third-tone characters, the first of them must be read as second tone; if all three characters are third tone, the first and second characters must be read as second tone. For example, the notated pronunciation of the three-character string "submarine" is (ㄑㄧㄢˊ ㄕㄨㄟˇ ㄊㄧㄥˇ), but in speech it is read as (ㄑㄧㄢˊ ㄕㄨㄟˊ ㄊㄧㄥˇ). Likewise, the notated pronunciation of the three-character string "president's prize" is (ㄗㄨㄥˇ ㄊㄨㄥˇ ㄐㄧㄤˇ), but in speech it is read as (ㄗㄨㄥˊ ㄊㄨㄥˊ ㄐㄧㄤˇ).
(3) In a four-character string, if all four characters are third tone, the first and third characters must be read as second tone. For example, in a four-character string such as "very few", the notated pronunciation gives all four characters the third tone (ˇ ˇ ˇ ˇ), but in speech the string is read with the tones (ˊ ˇ ˊ ˇ).
(4) In a five-character string, if all five characters are third tone, the first, third and fourth characters must be read as second tone. For example, the notated pronunciation of the five-character string "99999" (including the digit form of this example) is (ㄐㄧㄡˇ ㄐㄧㄡˇ ㄐㄧㄡˇ ㄐㄧㄡˇ ㄐㄧㄡˇ), but in speech it is read as (ㄐㄧㄡˊ ㄐㄧㄡˇ ㄐㄧㄡˊ ㄐㄧㄡˊ ㄐㄧㄡˇ).
(5) If the number of consecutive third-tone characters is even and is six or more, the characters are grouped in pairs from the front, and each pair is read according to the rule for two consecutive third tones. For example, the notated pronunciation of the six-character string "999999" is (ㄐㄧㄡˇ ㄐㄧㄡˇ ㄐㄧㄡˇ ㄐㄧㄡˇ ㄐㄧㄡˇ ㄐㄧㄡˇ), but in speech it is read as (ㄐㄧㄡˊ ㄐㄧㄡˇ ㄐㄧㄡˊ ㄐㄧㄡˇ ㄐㄧㄡˊ ㄐㄧㄡˇ).
(6) If the number of consecutive third-tone characters is odd and is seven or more, the characters are grouped in pairs from the front and each pair is read according to the rule for two consecutive third tones, except that the last group has three characters and is read according to the rule for three consecutive third tones. For example, the notated pronunciation of "5555555" is (ㄨˇ ㄨˇ ㄨˇ ㄨˇ ㄨˇ ㄨˇ ㄨˇ), but in speech it is read as (ㄨˊ ㄨˇ ㄨˊ ㄨˇ ㄨˊ ㄨˊ ㄨˇ).
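The grouping in rules (1) to (6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the names `sandhi_positions` and `spoken_tones` are hypothetical, tones are written as the numbers 2 and 3, and the input is assumed to be a run made up solely of third-tone characters with no word boundary inside it.

```python
def sandhi_positions(n):
    """Return the 1-based positions, in a run of n consecutive third-tone
    characters, that are read as second tone under the grouping rules."""
    if n < 2:
        return []  # a single third tone does not change
    positions = []
    start, remaining = 1, n
    while remaining > 0:
        if remaining == 3:
            # Final group of three: its first two characters change.
            positions += [start, start + 1]
            break
        # Group of two: only its first character changes.
        positions.append(start)
        start += 2
        remaining -= 2
    return positions


def spoken_tones(n):
    """Spoken tone numbers for a run of n third-tone characters."""
    changed = set(sandhi_positions(n))
    return [2 if i in changed else 3 for i in range(1, n + 1)]


for n in range(2, 8):
    print(n, sandhi_positions(n), spoken_tones(n))
```

For a run of seven the sketch yields positions 1, 3, 5 and 6, which agrees with the "5555555" example in rule (6).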
As can be seen from the above, these tonal variation rules apply to general words and phrases and to proper nouns, and also to combinations of digits and ordinary characters. When a three-character string contains a word boundary, the string must first be divided at the word boundary before the above rules are applied. The rules also apply to personal names, in which case the tones additionally change as follows:
(1) When recognizing a personal name, the surname and the given name must be separated. In "Yo-yo Ma" or "Liu Shuibian", for example, all three characters are third tone, and the notated pronunciations are (ㄇㄚˇ ㄧㄡˇ ㄧㄡˇ) and (ㄌㄧㄡˇ ㄕㄨㄟˇ ㄅㄧㄢˇ). However, "Ma" and "Liu" are surnames, so they keep the third tone, while in the two-character given names "Yo-yo" and "Shuibian" the first character is read as second tone according to the rule for two consecutive third tones. The names are therefore read as (ㄇㄚˇ ㄧㄡˊ ㄧㄡˇ) and (ㄌㄧㄡˇ ㄕㄨㄟˊ ㄅㄧㄢˇ).
(2) Other three-character strings may also contain a word boundary, as in "President Jiang" or "president's prize". All three characters of "president's prize" are third tone, and the notated pronunciation is (ㄗㄨㄥˇ ㄊㄨㄥˇ ㄐㄧㄤˇ); because the string contains no surname, the above rule still applies, the first two characters are read as second tone, and the string is read as (ㄗㄨㄥˊ ㄊㄨㄥˊ ㄐㄧㄤˇ). All three characters of "President Jiang" are also third tone, and the notated pronunciation is (ㄐㄧㄤˇ ㄗㄨㄥˇ ㄊㄨㄥˇ), but "Jiang" is a surname and not part of the word, so it keeps the third tone, while in the following two characters, "president", the first is read as second tone according to the rule for two consecutive third tones. The string is therefore read as (ㄐㄧㄤˇ ㄗㄨㄥˊ ㄊㄨㄥˇ).
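The effect of a word boundary can be illustrated with a small Python sketch. It is a hedged example, not the method as claimed: the helper names `sandhi_positions`, `read_segment` and `read_string` are hypothetical, tones are written as the numbers 1 to 4, and the caller is assumed to have already split the string at its word boundaries (for instance, surname first, then the rest).

```python
def sandhi_positions(n):
    """1-based positions read as second tone in a run of n third tones."""
    if n < 2:
        return []
    if n % 2:  # odd: pairs from the front, last group has three characters
        return list(range(1, n - 2, 2)) + [n - 2, n - 1]
    return list(range(1, n, 2))  # even: pairs only


def read_segment(tones):
    """Apply the third-tone rules to every run of 3s inside one segment."""
    spoken = list(tones)
    i = 0
    while i < len(tones):
        if tones[i] != 3:
            i += 1
            continue
        j = i
        while j < len(tones) and tones[j] == 3:
            j += 1  # extend the run of third tones
        for p in sandhi_positions(j - i):
            spoken[i + p - 1] = 2
        i = j
    return spoken


def read_string(segments):
    """Read each word-boundary segment independently, then join them."""
    spoken = []
    for segment in segments:
        spoken.extend(read_segment(segment))
    return spoken


print(read_string([[3, 3, 3]]))    # one word, e.g. "president's prize"
print(read_string([[3], [3, 3]]))  # surname, then a two-character word
```

The same three third tones give different spoken tones depending only on where the boundary falls, which is why the boundary has to be decided before the rules are applied.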
Another case is the reduplicated form of address. If a Chinese form of address contains two identical characters, it is a reduplicated word, and even if both characters are third tone the speech does not follow the rules above: the first character of the reduplicated pair is read with its original tone, and the second character is read in the neutral tone. Examples: "grandfather", "grandmother", "father", "mother", "elder brother", "elder sister", "younger brother", "younger sister", "orangutan" and so on.
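The reduplication rule is simple enough to sketch directly. The following Python fragment is an assumption-laden illustration: the function name `read_reduplicated` and the numeric code 5 for the neutral tone are hypothetical, the caller is assumed to have already decided that the string is a form of address, and comparing syllables stands in for comparing the written characters (two different characters with the same syllable would need a character-level check instead).

```python
NEUTRAL = 5  # assumed numeric code for the neutral tone


def read_reduplicated(syllables, tones):
    """Give the second member of each identical adjacent pair the neutral
    tone; the first member keeps its notated tone."""
    spoken = list(tones)
    for i in range(1, len(syllables)):
        if syllables[i] == syllables[i - 1]:
            spoken[i] = NEUTRAL
    return spoken


# "younger sister": notated as fourth tone + fourth tone
print(read_reduplicated(["ㄇㄟ", "ㄇㄟ"], [4, 4]))
# "elder sister": notated as third tone + third tone, no second-tone change
print(read_reduplicated(["ㄐㄧㄝ", "ㄐㄧㄝ"], [3, 3]))
```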
In summary, tonal variation is unavoidable in Chinese words and phrases, and the resulting difference between speech and pronunciation clearly makes speech recognition more difficult and less accurate, which causes trouble in practical use. In a Chinese speech recognition system, a Chinese database is typically entered first by pronunciation (notation), and a Chinese word or phrase (or character string) is then input to the system by voice. The recognition processing of the system is expected to retrieve the correct Chinese word, phrase or signal from the database and output it, which is what makes voice control convenient. In practice, however, the speech of a given word or phrase (or character string) often differs from its stored pronunciation because of tone changes, so the correct word, phrase or signal cannot readily be output. Recognition becomes difficult or erroneous, and subsequent operations are affected. Practical examples follow:
(1) A company telephone system that forwards calls to extensions by voice. Such a system is equipped with a speech recognition system, and speech data spoken by the caller, such as a name, enters the speech recognition system through a microphone. The company's speech recognition system has generally had a Chinese database entered by pronunciation beforehand, for example the name of each person in the company together with that person's extension. In use, the system must recognize the caller's speech data in order to retrieve the correct Chinese word, phrase or signal from the database for output. Suppose the caller intends "Yo-yo Ma": the speech data actually input is (ㄇㄚˇ ㄧㄡˊ ㄧㄡˇ), whereas the database of the speech recognition system stores "Yo-yo Ma" with the pronunciation (ㄇㄚˇ ㄧㄡˇ ㄧㄡˇ). The input speech data (ㄇㄚˇ ㄧㄡˊ ㄧㄡˇ) therefore cannot readily be recognized, which hampers the operation of the downstream system: because the name cannot be recognized correctly, the call cannot be forwarded quickly and automatically to the correct extension, and the usefulness of the voice forwarding system is reduced.
(2) A voice registration system used in a hospital. Such a system is equipped with a speech recognition system, and speech data spoken by the caller, such as a name, enters the speech recognition system through a microphone. The hospital's speech recognition system has generally had a Chinese database entered by pronunciation beforehand, for example the name of each person at the hospital together with that person's registration form. In use, the system must recognize the caller's speech data in order to retrieve the correct Chinese word, phrase or signal from the database for output. Suppose the caller intends "Yo-yo Ma" together with an identification number (such as a registration form number): the speech data actually input is (ㄇㄚˇ ㄧㄡˊ ㄧㄡˇ), whereas the database stores "Yo-yo Ma" with the pronunciation (ㄇㄚˇ ㄧㄡˇ ㄧㄡˇ). The input speech data (ㄇㄚˇ ㄧㄡˊ ㄧㄡˇ) therefore cannot readily be recognized so as to output the correct Chinese word, phrase or signal, which hampers the operation of the downstream system: the hospital cannot recognize the name correctly, so registration by voice fails. (Tone changes may also occur in the long digit string of the registration form number.)
(3) A voice switching system for hotel guest room extensions, and the like. Such a system is equipped with a speech recognition system, and speech data spoken by the caller, such as a name or a guest room extension, enters the speech recognition system through a microphone. The hotel's speech recognition system has generally had a Chinese database entered by pronunciation beforehand, for example each guest room number and the name of the guest who has checked in. In use, the system must recognize the caller's speech data in order to retrieve the correct Chinese word, phrase or signal from the database for output. Suppose the caller intends "Yo-yo Ma" together with an identification number (such as a guest room number): the speech data actually input is (ㄇㄚˇ ㄧㄡˊ ㄧㄡˇ), whereas the database stores "Yo-yo Ma" with the pronunciation (ㄇㄚˇ ㄧㄡˇ ㄧㄡˇ). The input speech data (ㄇㄚˇ ㄧㄡˊ ㄧㄡˇ) therefore cannot readily be recognized so as to output the correct Chinese word, phrase or signal, which hampers the operation of the downstream system: because the hotel cannot correctly recognize the guest's name in the caller's speech, the call cannot be forwarded automatically to the guest room. In a typical hotel telephone system, a caller who wants to be transferred to a guest room must state both the guest room extension and the guest's correct name, and the call is forwarded automatically only if the two match, in order to guard against improper calls.
Summary of the invention
The purpose of the present invention is to provide a method for recognizing tone changes in a Chinese speech recognition system, so as to increase the recognition accuracy of the system.
The method of the present invention comprises the following steps:
Using a keyboard or other input device, Chinese data is input into a data buffer;
A word-and-phrase judgment and decomposition processing unit processes the Chinese data in the data buffer and, according to word boundary rules, identifies consecutive third tones or reduplicated forms of address;
For the reduplicated forms of address in the Chinese data, reduplicated-address data is generated and stored in a storage medium;
For the consecutive third tones in the Chinese data, consecutive-third-tone data is generated and stored in the storage medium;
The storage medium is connected to a signal fusing processor;
Chinese speech data from voice input is received after signal encoding;
The signal fusing processor recognizes the Chinese speech data of the voice input according to the data in the storage medium;
The comparison result is then output.
The word-and-phrase judgment and decomposition processing unit decomposes the input Chinese data according to word boundary rules into general comparison, consecutive-third-tone comparison and reduplicated-address comparison.
The word-and-phrase judgment and decomposition processing unit can store the general comparison data in the storage medium to form a database.
The word-and-phrase judgment and decomposition processing unit can classify consecutive third tones by character count, into 2, 3, 4, 5, 6, 7 consecutive third tones and so on, and build different comparison data for each count: for 2 consecutive third tones, second-tone comparison data is set up for the 1st character; for 3, for the 1st and 2nd characters; for 4, for the 1st and 3rd characters; for 5, for the 1st, 3rd and 4th characters; for 6, for the 1st, 3rd and 5th characters; for 7, for the 1st, 3rd, 5th and 6th characters. All of the tone-changed comparison data is stored in the storage medium to form a database.
The word-and-phrase judgment and decomposition processing unit can set up neutral-tone comparison data for the 2nd character of each reduplicated form of address in the input Chinese data, and store this tone-changed comparison data in the storage medium to form a database.
The storage medium includes a hard disk or memory.
When the signal fusing processor recognizes the Chinese speech data of the voice input according to the data in the storage medium, a general speech comparison can be carried out first; if that comparison fails, recognition is carried out again using the converted (tone-changed) speech, according to the character string relationships and word boundaries.
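The two-stage comparison described above (general comparison first, converted speech as a fallback) might look like the following Python sketch. It makes several simplifying assumptions that are not in the source: the input speech is treated as already decoded into (syllable, tone) pairs, the two stores are plain dictionaries, and the names `recognize`, `general_db` and `converted_db` are hypothetical.

```python
def recognize(heard, general_db, converted_db):
    """Try the notated readings first; fall back to tone-changed readings.

    heard is a tuple of (syllable, tone) pairs. Returns (entry, stage),
    where entry is None when neither comparison succeeds.
    """
    if heard in general_db:
        return general_db[heard], "general"
    if heard in converted_db:
        return converted_db[heard], "converted"
    return None, "no match"


# Notated reading stored for the entry (3-3-3) and its spoken form (3-2-3).
MA_NOTATED = (("ㄇㄚ", 3), ("ㄧㄡ", 3), ("ㄧㄡ", 3))
MA_SPOKEN = (("ㄇㄚ", 3), ("ㄧㄡ", 2), ("ㄧㄡ", 3))

general_db = {MA_NOTATED: "Yo-yo Ma"}
converted_db = {MA_SPOKEN: "Yo-yo Ma"}

print(recognize(MA_SPOKEN, general_db, converted_db))
```

Keeping the converted readings in a separate store mirrors the order of the text: the general comparison runs unchanged, and the tone-changed data is consulted only when it fails.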
Description of drawings
Fig. 1 is a block diagram of the process of the present invention.
Fig. 2 is a schematic diagram of how tone changes arise naturally in Chinese speech when consecutive third-tone characters or identical reduplicated characters are pronounced together.
Fig. 3 is a block diagram of the actual operation of an embodiment of the present invention.
Fig. 4 is a block diagram of the actual operation of another embodiment of the present invention.
Embodiment
Referring to Fig. 1, the method of the present invention comprises the following steps:
To build the Chinese database of the speech recognition system, Chinese data is input into the data buffer 2 using a keyboard or other input device 1, and the word-and-phrase judgment and decomposition processing unit 3 processes and judges the Chinese data in the data buffer 2. If the Chinese data is judged to be a reduplicated form of address, reduplicated-address data 4 is generated and stored in a storage medium 6 such as a hard disk or memory. If the Chinese data is judged to contain consecutive third tones, consecutive-third-tone data 5 is generated and likewise stored in the storage medium 6, so that a Chinese database is built in the storage medium 6. The storage medium 6 is then connected with the signal fusing processor 7 to form an operating system.
In use, Chinese speech data is input through a microphone or other speech input device 8. The speech data first passes through signal encoding 9 and is then recognized by the signal fusing processor 7, and the comparison result is output 10, so that the downstream system receives the correct signal and carries out subsequent operations.
Referring to Fig. 2, which is a schematic diagram of how tone changes arise naturally in Chinese speech when consecutive third-tone characters or identical reduplicated characters are pronounced together: after comparison data is input, it first enters the decomposition program 11, which works according to word boundary rules and divides the comparison data among the general comparison program 12, the consecutive-third-tone comparison program 13 and the reduplicated-address comparison program 14. For the general comparison program 12, the comparison data is stored directly in the database 15. For the consecutive-third-tone comparison program 13, the comparison data enters the classification program 16, which sorts it by the number of consecutive third-tone characters into 2 consecutive third tones 17, 3 consecutive third tones 18, 4 consecutive third tones 19, 5 consecutive third tones 20, 6 consecutive third tones 21, 7 consecutive third tones 22 and so on, and builds different comparison data for each count: for 2 consecutive third tones 17, second-tone comparison data 23 is set up for the 1st character; for 3 consecutive third tones 18, second-tone comparison data 24 for the 1st and 2nd characters; for 4 consecutive third tones 19, second-tone comparison data 25 for the 1st and 3rd characters; for 5 consecutive third tones 20, second-tone comparison data 26 for the 1st, 3rd and 4th characters; for 6 consecutive third tones 21, second-tone comparison data 27 for the 1st, 3rd and 5th characters; for 7 consecutive third tones 22, second-tone comparison data 28 for the 1st, 3rd, 5th and 6th characters. All of the tone-changed comparison data 23, 24, 25, 26, 27, 28 is stored in the database 15. For the reduplicated-address comparison program 14, neutral-tone comparison data 29 is set up for the 2nd character, and this tone-changed comparison data 29 is stored in the database 15. The database 15 thus holds complete comparison data for the general comparison program 12, the consecutive-third-tone comparison program 13 and the reduplicated-address comparison program 14, in particular covering the tone changes most likely to arise between speech and pronunciation. When a speech signal 30 is input for recognition, speech comparison 31 is carried out against the complete comparison data stored in the database 15, and the comparison result is output 32, so that the downstream system receives the correct signal and can carry out subsequent operations.
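The three-way split performed before the database is built can be sketched as a small classifier. This Python fragment is a simplified assumption, not the decomposition program itself: the function name `classify` and its return labels are hypothetical, the `is_address` flag is supplied by the caller, syllables stand in for characters, and word boundaries inside the string are ignored for brevity.

```python
def classify(syllables, tones, is_address=False):
    """Decide which comparison data an entry needs.

    Returns "reduplicated address", "<n> consecutive third tones",
    or "general".
    """
    pairs = zip(syllables, syllables[1:])
    if is_address and any(a == b for a, b in pairs):
        return "reduplicated address"
    longest = run = 0
    for tone in tones:
        run = run + 1 if tone == 3 else 0  # length of the current run of 3s
        longest = max(longest, run)
    if longest >= 2:
        return f"{longest} consecutive third tones"
    return "general"


print(classify(["ㄗㄨㄥ", "ㄊㄨㄥ", "ㄐㄧㄤ"], [3, 3, 3]))
print(classify(["ㄇㄟ", "ㄇㄟ"], [4, 4], is_address=True))
print(classify(["ㄔㄣ"], [2]))
```

An entry labelled with a run length would then receive second-tone comparison data at the positions listed for that count, and a reduplicated address would receive neutral-tone data for its second character.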
So that those skilled in the art can better understand the features of the present invention, the actual use of the invention is illustrated with an embodiment of a voice-forwarding telephone system:
Referring to Fig. 3, the user first places a call 40. The system answers automatically and plays a greeting 41, such as: "Hello, this is OO Company. Whom would you like to speak to?" The user then says (by voice) the person wanted 42, such as: "President Chen (ㄔㄣˊ ㄗㄨㄥˊ ㄊㄨㄥˇ)". The general comparison 43 follows, in which the speech for President Chen (ㄔㄣˊ ㄗㄨㄥˊ ㄊㄨㄥˇ) is compared with the stored pronunciation (ㄔㄣˊ ㄗㄨㄥˇ ㄊㄨㄥˇ). If this comparison fails (because speech and pronunciation differ), the consecutive-third-tone comparison 44 follows, in which the speech for President Chen (ㄔㄣˊ ㄗㄨㄥˊ ㄊㄨㄥˇ) is compared with the converted speech (ㄔㄣˊ ㄗㄨㄥˊ ㄊㄨㄥˇ). This comparison succeeds (accuracy is improved over the general comparison 43), and the system then reads out the comparison result and handles the call 45, such as: "The person you want is President Chen (ㄔㄣˊ ㄗㄨㄥˊ ㄊㄨㄥˇ). I will transfer you at once." The call is transferred and the speech recognition is completed successfully.
Referring to Fig. 4, the user first places a call 47. The system answers automatically and plays a greeting 48, such as: "This is the Dotey family. Whom would you like to speak to?" The user then says (by voice) the person wanted 49, such as: "Little Miss Chen (ㄔㄣˊ ㄒㄧㄠˇ ㄇㄟˋ ˙ㄇㄟ)". The general comparison 50 follows, in which the speech for Little Miss Chen (ㄔㄣˊ ㄒㄧㄠˇ ㄇㄟˋ ˙ㄇㄟ) is compared with the stored pronunciation (ㄔㄣˊ ㄒㄧㄠˇ ㄇㄟˋ ㄇㄟˋ). If this comparison fails (because speech and pronunciation differ), the reduplicated-address comparison 51 follows, in which the speech for Little Miss Chen (ㄔㄣˊ ㄒㄧㄠˇ ㄇㄟˋ ˙ㄇㄟ) is compared with the converted speech (ㄔㄣˊ ㄒㄧㄠˇ ㄇㄟˋ ˙ㄇㄟ). This comparison succeeds (accuracy is improved over the general comparison 50), and the system then reads out the comparison result and handles the call 52, such as: "Please wait a moment. I will ask Little Miss Chen (ㄔㄣˊ ㄒㄧㄠˇ ㄇㄟˋ ˙ㄇㄟ) to answer the call." The call is transferred and the speech recognition is completed successfully 53.
In summary, for any Chinese input, the Chinese speech recognition system of the present invention can not only carry out the general comparison, as in 43 and 50, but can also, when that comparison fails, automatically use character string relationships and word boundaries to recognize tone-changed words and phrases from the converted speech, as in the consecutive-third-tone comparison 44 or the reduplicated-address comparison 51. This effectively reduces the recognition difficulty caused by consecutive third-tone characters or identical reduplicated characters in speech, and increases the recognition accuracy of the system.
In conclusion, the method for recognizing tone changes in a Chinese speech recognition system of the present invention does achieve the described effect by the method disclosed above.

Claims (7)

1, the discrimination method of breaking of voice in a kind of Chinese speech identification system may further comprise the steps:
Utilize keyboard or other input media, Chinese material is input into the data buffer zone; Utilize words and phrases to judge the resolution process unit, handle the Chinese material of data buffer zone, and distinguish continuously " three " sound or appellation reduplicated word according to speech circle rule;
At the appellation reduplicated word of Chinese material, produce appellation reduplicated word data, and be stored in Storage Media;
At continuous " three " in Chinese material sound, produce continuously " three " sound data, and be stored in Storage Media;
Make Storage Media be connected in the signal fusing processor;
Reception one sees through the Chinese speech data of the phonetic entry of signal encoding;
Utilize the signal fusing processor, according to the data of Storage Media and the Chinese speech data of phonetic entry is carried out identification;
Again comparison result is exported.
2, according to the discrimination method of breaking of voice in the described a kind of Chinese speech identification system of claim 1, it is characterized in that: described words and phrases judgement resolution process unit is to decompose the Chinese material of input according to speech circle rule, makes and divides into general comparison, continuous " three " acoustic ratio to reaching the comparison of appellation reduplicated word.
3, according to the discrimination method of breaking of voice in the described a kind of Chinese speech identification system of claim 2, it is characterized in that: described words and phrases judgement resolution process unit can be built generally comparison data and put at Storage Media to form database.
4, discrimination method according to breaking of voice in claim 1 or the 2 described a kind of Chinese speech identification systems, it is characterized in that: described words and phrases judge that the resolution process unit can be with " three " sound classification continuously, divide into 2 continuous " three " sound, 3 continuous " three " sound, 4 continuous " three " sound, 5 continuous " three " sound, 6 continuous " three " sound, 7 different numbers of words such as continuous " three " sound, and comply with different numbers of words and build the different comparison data of putting, comprise: then the 1st word wherein set up " two " acoustic ratio to data at 2 continuous " three " sound, at 3 continuous " three " sound then to wherein the 1st, 2 words are set up " two " acoustic ratio to data, at 4 continuous " three " sound then to wherein the 1st, 3 words are set up " two " acoustic ratio to data, at 5 continuous " three " sound then to wherein the 1st, 3,4 words are set up " two " acoustic ratio to data, at 6 " three " sound then to wherein the 1st, 3,5 words are set up " two " acoustic ratio to data, at 7 continuous " three " sound then to wherein the 1st, 3,5,6 words are set up " two " acoustic ratio to data, and use all to build through the comparison data of changing voice and put at Storage Media to form database.
5, according to the discrimination method of breaking of voice in claim 1 or the 2 described a kind of Chinese speech identification systems, it is characterized in that: described words and phrases judge that the resolution process unit can be with in the Chinese material of importing, the 2nd word in the appellation reduplicated word set up compare data softly, and the comparison data that will change voice is built and put at Storage Media to form database.
6, according to the discrimination method of breaking of voice in claim 1 or the 2 described a kind of Chinese speech identification systems, it is characterized in that: when utilizing the signal fusing processor and carrying out identification according to the data of Storage Media and to the Chinese speech data of phonetic entry, can carry out general voice comparison earlier, if concern and speech circle according to character string again during the comparison failure, and utilize the comparison data of changing voice to carry out identification.
7. The method for recognizing tone changes in a Chinese speech recognition system according to claim 1, characterized in that: the storage medium includes a hard disk and internal memory.
CN021447640A 2002-12-10 2002-12-10 Method for recognizing voice modified tone in system of recognizing voice of the Chinese language Expired - Fee Related CN1217313C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN021447640A CN1217313C (en) 2002-12-10 2002-12-10 Method for recognizing voice modified tone in system of recognizing voice of the Chinese language

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN021447640A CN1217313C (en) 2002-12-10 2002-12-10 Method for recognizing voice modified tone in system of recognizing voice of the Chinese language

Publications (2)

Publication Number Publication Date
CN1416111A CN1416111A (en) 2003-05-07
CN1217313C CN1217313C (en) 2005-08-31

Family

ID=4750660

Family Applications (1)

Application Number Title Priority Date Filing Date
CN021447640A Expired - Fee Related CN1217313C (en) 2002-12-10 2002-12-10 Method for recognizing voice modified tone in system of recognizing voice of the Chinese language

Country Status (1)

Country Link
CN (1) CN1217313C (en)

Also Published As

Publication number Publication date
CN1416111A (en) 2003-05-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee