CN101656069A

CN101656069A - Chinese voice information communication system and communication method thereof

Info

Publication number: CN101656069A
Application number: CN200910183303A
Authority: CN
Inventors: 陈拙夫
Original assignee: Individual
Current assignee: Individual
Priority date: 2009-09-17
Filing date: 2009-09-17
Publication date: 2010-02-24

Abstract

The invention discloses a Chinese voice information communication system and a communication method thereof. The system comprises terminals and an information server, and its communication process is as follows: first, the sender speaks into the terminal; the terminal records the voice and recognizes it as Pinyin; the Pinyin information is sent to the information server over a fixed or mobile network; the information server forwards the Pinyin information to the receiver's terminal; and the receiver's terminal synthesizes the Pinyin information into speech and reads it out to the receiver. The invention uses manual assistance to record the voice segment by segment, which greatly reduces the difficulty of speech recognition on the terminal device, keeps the amount of data transmitted for voice information no greater than that for text information, makes the information easy to reuse, greatly reduces the cost of transmitting and relaying information, and increases the speed of information transfer. The receiving terminal uses speech synthesis, so that sending and receiving information is fast, convenient and cheap.

Description

A Chinese voice information communication system and communication method thereof

Technical field

The invention belongs to the field of communications, and specifically relates to a Chinese voice information communication system and a communication method thereof. The method simplifies information input and reduces communication cost by incorporating a novel voice technology.

Background technology

At present there are many ways to achieve real-time voice communication, mainly telephone, voice intercom, VoIP and the like, but whichever is used, the amount of data transmitted is large, the demands on signal quality are high, and the cost is correspondingly high. Current voice-message schemes have corresponding problems: the amount of data transmitted is large, transmission is slow, the server needs a large buffer, the cost is high, and the messages are not easy to reuse. Text messages, meanwhile, are inconvenient to input. Server-side speech recognition suffers from a large amount of transmitted data, difficulty in rejecting noise, a heavy computational load on the server, and high cost. Current terminal-side speech recognition suffers from a heavy computational load, difficulty in rejecting noise, immature technology for fast and accurate recognition, and high cost.

Summary of the invention

The technical problem to be solved by the invention is to provide a Chinese voice information system that uses a specific voice input method, together with its implementation. This voice input method uses manual assistance to record the voice segment by segment, which greatly reduces the difficulty of speech recognition on the terminal device, keeps the amount of data transmitted for voice information no greater than that for text information, greatly reduces the cost of transmitting and relaying information, and increases the speed of information transfer. The receiving terminal uses speech synthesis, making it fast, convenient and cheap to send and receive information.
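
The data-volume claim can be illustrated with a rough calculation. The sketch below compares uncompressed audio with the same message written as ASCII Pinyin. The sampling rate, sample width, speaking pace and the tone-digit Pinyin notation are all assumptions made for illustration; none of them is specified in this document.

```python
from __future__ import annotations

# Rough size comparison for one short message; every figure here is an assumption.
SAMPLE_RATE_HZ = 8000        # assumed telephone-quality sampling rate
BYTES_PER_SAMPLE = 2         # assumed 16-bit mono PCM
MILLISECONDS_PER_SYLLABLE = 300  # assumed speaking pace


def raw_audio_bytes(syllable_count: int) -> int:
    """Size of uncompressed audio for the given number of spoken syllables."""
    milliseconds = syllable_count * MILLISECONDS_PER_SYLLABLE
    return milliseconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE // 1000


def pinyin_bytes(syllables: list[str]) -> int:
    """Size of the same message as space-separated ASCII Pinyin."""
    return len(" ".join(syllables).encode("ascii"))


# Four syllables written with tone digits (an assumed notation).
message = ["ni3", "hao3", "shi4", "jie4"]
audio_size = raw_audio_bytes(len(message))
text_size = pinyin_bytes(message)
print(audio_size, text_size)
```

Under these assumptions the Pinyin form is about three orders of magnitude smaller than the raw audio, which is consistent with the stated aim of keeping voice messages no larger than text messages. Compressed audio would narrow the gap but not close it.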

The Chinese voice information communication system of the invention comprises a terminal and an information server, wherein:

The terminal comprises a speech recognition module, a speech synthesis module, an information transmission module, and at least one device with which the user inputs a reciprocating motion;

The information server is connected to the information transmission module of the terminal through a public network, and provides information relay, organization and query services.

In addition, the invention provides a Chinese voice information communication method, the hardware of which comprises a terminal and an information server,

The terminal comprises a speech recognition module, a speech synthesis module, an information transmission module, and at least one device with which the user inputs a reciprocating motion;

The communication process is:

1) First, the sender speaks into the terminal;

2) The terminal records the voice and recognizes it as Pinyin by means of the speech recognition module;

3) The information transmission module of the terminal sends the Pinyin information to the information server over a fixed or mobile network;

4) The information server forwards the Pinyin information to the receiver's terminal over the mobile network;

5) The speech synthesis module of the receiver's terminal synthesizes the Pinyin information into speech and reads it out to the receiver.
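
The five steps above can be sketched as a small in-memory relay. This is only an illustration of the message flow: the class and function names (`PinyinMessage`, `InformationServer`, `recognize`, `synthesize`) are hypothetical, the recognizer and synthesizer are placeholders, and no real network transport is modelled.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class PinyinMessage:
    sender: str
    receiver: str
    syllables: list  # e.g. ["ni3", "hao3"]


class InformationServer:
    """Hypothetical in-memory relay that holds messages until they are fetched."""

    def __init__(self):
        self._inboxes = defaultdict(deque)

    def submit(self, message):
        # Step 3: the sending terminal uploads the Pinyin message.
        self._inboxes[message.receiver].append(message)

    def fetch(self, receiver):
        # Step 4: the receiver's terminal collects everything addressed to it.
        inbox = self._inboxes[receiver]
        messages = list(inbox)
        inbox.clear()
        return messages


def recognize(spoken_segments):
    """Step 2 placeholder: pretends each segment has already been recognized."""
    return [segment.strip().lower() for segment in spoken_segments]


def synthesize(message):
    """Step 5 placeholder: returns the text a speech synthesizer would read out."""
    return " ".join(message.syllables)


server = InformationServer()
server.submit(PinyinMessage("alice", "bob", recognize(["Ni3", "hao3 "])))
delivered = server.fetch("bob")
spoken = [synthesize(message) for message in delivered]
print(spoken)
```

Because only Pinyin text crosses the server, the stored messages stay small and can be searched or reused as ordinary text.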

In step 2) above, the steps for recognizing the voice as Pinyin are:

6) The recorded voice segment is filtered, and points of relatively high energy are sought in the first audio section; the vowel information of the user's pronunciation is contained at or near these points;

7) The points above are compared, in order from high to low, against a feature library of vowel pronunciation formants, to determine whether each point is a vowel, the mouth-shape state of the vowel, and part of the speaker's voiceprint features;

8) After the first vowel pronunciation point is found, the energy variation of the vowel formants over the whole pronunciation is tracked backward and forward along the time axis of the spectrogram, based on the formant positions at that point, which yields the end points of the vowel pronunciation;

9) Key points are extracted over the whole vowel pronunciation and their formants are compared to determine how the mouth shape changes during the vowel pronunciation, thereby obtaining accurate vowel information;

10) Consonant feature signals that may be present near the start of the vowel pronunciation are tested and compared more intensively, with prediction and judgment based on the partial voiceprint features already obtained, so as to obtain accurate consonant information;

11) Accurate syllable information is obtained from the above judgments.
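
The idea of locating the vowel from an energy peak and then tracking outward to its end points (steps 6 and 8) can be sketched as follows. This is a simplified stand-in: it tracks total frame energy rather than formant energy in a spectrogram, and the frame length and the `drop_ratio` threshold are assumed values, not taken from this document.

```python
def frame_energies(samples, frame_len):
    """Sum of squared amplitudes for each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def vowel_span(energies, drop_ratio=0.25):
    """Return (start, end) frame indices around the highest-energy frame.

    The span grows outward from the peak while the energy stays at or above
    drop_ratio * peak. This threshold rule is an assumed simplification of
    the formant tracking described in step 8.
    """
    peak = max(range(len(energies)), key=energies.__getitem__)
    threshold = energies[peak] * drop_ratio
    start = peak
    while start > 0 and energies[start - 1] >= threshold:
        start -= 1
    end = peak
    while end < len(energies) - 1 and energies[end + 1] >= threshold:
        end += 1
    return start, end


# Synthetic amplitudes: quiet onset, weak consonant, loud vowel, decay.
samples = [1] * 4 + [3] * 4 + [10] * 4 + [8] * 4 + [2] * 4
energies = frame_energies(samples, frame_len=4)
span = vowel_span(energies)
print(energies, span)
```

The frames just before the returned span are where a consonant, if present, would be looked for, in the spirit of step 10.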

The device with which the user inputs a reciprocating motion may be a computer mouse, a touch pad, a touch screen, a scroll wheel with a press function, or a contact-sensing displacement sensor. A touch pad or touch screen works best. The touch pad or touch screen is divided into two input zones, and the input process of step 1) is:

The touch pad or touch screen is pressed, and the system enters the state of readiness for voice recording;

When the finger slides from one zone of the touch pad or touch screen to the other, the user says one character, which is recorded by the speech recognition module through the microphone; when the finger slides back from the other zone, the user says the next character, which is again recorded by the speech recognition module through the microphone; and so on;

The touch pad or touch screen is released; the speech recognition module ends the voice recording state and automatically recognizes the recording as the pronunciations of several Chinese characters, automatically composing phrases during recognition.
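
The press, slide and release sequence amounts to a small state machine in which each zone crossing marks the boundary between two characters. The sketch below is a hypothetical illustration: the class `SegmentedRecorder`, its method names and the zone labels "A" and "B" are invented here, and audio is represented by placeholder strings.

```python
class SegmentedRecorder:
    """Hypothetical recorder in which each zone crossing closes one segment."""

    def __init__(self):
        self.recording = False
        self.zone = None
        self.current = []    # audio chunks of the character being spoken
        self.segments = []   # finished per-character segments

    def press(self, zone):
        # Pressing the surface enters the voice recording state.
        self.recording = True
        self.zone = zone
        self.current = []
        self.segments = []

    def feed_audio(self, chunk):
        # Audio is kept only while the surface is held down.
        if self.recording:
            self.current.append(chunk)

    def move(self, zone):
        # Arriving in the other zone marks the end of one character.
        if self.recording and zone != self.zone:
            self._close_segment()
            self.zone = zone

    def release(self):
        # Releasing ends recording and hands the segments to recognition.
        self._close_segment()
        self.recording = False
        return self.segments

    def _close_segment(self):
        if self.current:
            self.segments.append(self.current)
            self.current = []


recorder = SegmentedRecorder()
recorder.press("A")
recorder.feed_audio("ni-1")
recorder.feed_audio("ni-2")
recorder.move("B")            # first character finished
recorder.feed_audio("hao-1")
recorder.move("A")            # second character finished
segments = recorder.release()
print(segments)
```

The same structure would apply to a mouse or scroll wheel if a change of direction is reported as a change of zone.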

When the device with which the user inputs a reciprocating motion is a computer mouse, the input process of step 1) is:

The mouse button is pressed, and the system enters the state of readiness for voice recording;

When the mouse is slid in one direction, the user says one character, which is recorded by the speech recognition module through the microphone; when the mouse is slid in the opposite direction, the user says the next character, which is again recorded by the speech recognition module through the microphone; and so on;

The mouse button is released; the speech recognition module ends the voice recording state and automatically recognizes the recording as the pronunciations of several Chinese characters, automatically composing phrases during recognition.

When the device with which the user inputs a reciprocating motion is a scroll wheel with a press function or a contact-sensing displacement sensor, the input process of step 1) is:

The finger presses down, and the system enters the state of readiness for voice recording;

When the finger slides in one direction, the user says one character, which is recorded by the speech recognition module through the microphone; when the finger slides in the opposite direction, the user says the next character, which is again recorded by the speech recognition module through the microphone; and so on;

The finger is released; the speech recognition module ends the voice recording state and automatically recognizes the recording as the pronunciations of several Chinese characters, automatically composing phrases during recognition.

The invention simplifies information input and increases input speed; the input information is easy to browse and reuse; the amount of data transmitted is reduced, so the system also works well where signal quality is poor; and the communication cost is greatly reduced and communication efficiency improved.

Description of drawings

Fig. 1 shows the flow of the whole system from sending to receiving;

Fig. 2 shows the features and method of operation-assisted voice input;

Fig. 3 shows a spectrogram; the invention obtains the consonant and vowel ranges from the spectrogram by means of energy peaks, and then inserts check points in each range for comparison and discrimination.

Embodiment

The Chinese voice information communication system of the invention mainly comprises a terminal and an information server, wherein:

The terminal comprises a speech recognition module, a speech synthesis module and an information transmission module;

The information server is connected to the information transmission module of the terminal through a public network, and provides information relay, organization and query services.

The terminal device of the Chinese voice information communication system (a mobile phone or computer) should include at least one device with which the user inputs a reciprocating motion, such as a mouse, scroll wheel, slider or touch screen, of which a touch screen works best. Taking the touch screen as an example, the touch screen is divided into two zones, and the specific input process is (see Fig. 2):

1. The touch screen is pressed, and the system enters the state of readiness for voice recording.

2. The user says one character as the finger slides from one zone to the other, says the next character as the finger slides back, and so on.

3. The touch screen is released; the voice recording state ends and the recording is automatically recognized as the pronunciations of several Chinese characters.

The communication process is (see Fig. 1):

1) First, the sender speaks into the terminal;

Because the Chinese characters and words are segmented manually, the range of signal features that must be distinguished during the system's speech recognition is greatly narrowed, so the recognition process is simplified and the recognition result is more accurate.

In step 2) above, the steps for recognizing the voice as Pinyin are (see Fig. 3):

7) The points above are compared, in order from high to low, against a feature library of vowel pronunciation formants, to determine whether each point is noise and how the mouth shape changes;

8) After the first non-noise point is found, the energy variation of the vowel formants over the whole pronunciation is tracked backward and forward along the time axis of the spectrogram, based on the formant positions at that point, which yields the end points of the vowel pronunciation;

9) Formant comparison is performed at the inserted check points over the whole vowel pronunciation (drafting query in the original: were these points introduced earlier, and what do they refer to here?), to determine how the mouth shape changes during the vowel pronunciation, thereby obtaining accurate vowel information;

10) Consonant feature signals that may be present near the start of the vowel pronunciation are tested and compared on an exploratory basis, so as to determine the consonant information accurately;

11) Accurate syllable information is obtained from the above judgments.
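
Comparing check points against a formant feature library (steps 7 and 9) can be pictured as a nearest-neighbour lookup. The sketch below is an assumption-laden illustration: the three-entry table `VOWEL_FORMANTS` holds rough placeholder values rather than measured data, the Euclidean distance and the `max_distance` cut-off for rejecting noise are choices made here, and formant extraction itself is not shown.

```python
import math

# Illustrative (F1, F2) values in Hz; placeholders, not measured data.
VOWEL_FORMANTS = {
    "a": (800.0, 1200.0),
    "i": (300.0, 2300.0),
    "u": (350.0, 800.0),
}


def classify_vowel(f1, f2, max_distance=400.0):
    """Return the nearest vowel, or None when no entry is close (treated as noise)."""
    best, best_dist = None, math.inf
    for vowel, (ref1, ref2) in VOWEL_FORMANTS.items():
        dist = math.hypot(f1 - ref1, f2 - ref2)
        if dist < best_dist:
            best, best_dist = vowel, dist
    return best if best_dist <= max_distance else None


def mouth_shape_track(checkpoints):
    """Classify each check point and collapse repeats, e.g. a, a, i gives ['a', 'i']."""
    track = []
    for f1, f2 in checkpoints:
        vowel = classify_vowel(f1, f2)
        if vowel is not None and (not track or track[-1] != vowel):
            track.append(vowel)
    return track


# Three check points along one pronunciation that glides from "a" towards "i".
track = mouth_shape_track([(780, 1250), (760, 1300), (320, 2200)])
print(track)
```

A sequence such as `['a', 'i']` would correspond to a compound final like "ai", which is how the change of mouth shape over the vowel feeds into the syllable decision in step 11.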

The above method can be embodied in many ways:

1. Person-to-person communication by mobile phone voice

The sending terminal is a mobile phone with a touch screen. The phone removes noise using dual microphones; after speech recognition, the result is sent to the server over a GPRS or 3G network, and the server relays it to the receiving phone.

2. Person-to-person communication between mobile phone voice and computer voice

The computer side uses a mouse and a microphone for voice input, and the mobile phone and the computer exchange information relayed by the server.

3. Communication between a person's mobile phone or computer and an automated query-response service

The user side sends the voice input content to the server; the server analyzes the content accurately and sends back the corresponding information, for example automated voice services, voice news, voice community forums and voice search.

Claims

1. A Chinese voice information communication system, characterized in that it comprises a terminal and an information server,

2. The Chinese voice information communication system according to claim 1, characterized in that the device with which the user inputs a reciprocating motion is a computer mouse, a touch pad, a touch screen, a scroll wheel with a press function, or a contact-sensing displacement sensor.

3. A Chinese voice information communication method, characterized in that the hardware comprises a terminal and an information server,

The information server is connected to the information transmission module of the terminal through a public network, and provides information relay, organization and query services;

The communication process is:

1) First, the sender speaks into the terminal;

4. The Chinese voice information communication method according to claim 3, characterized in that the steps for recognizing the voice as Pinyin in step 2) are:

11) Accurate syllable information is obtained from the above judgments.

5. The Chinese voice information communication method according to claim 3, characterized in that the device with which the user inputs a reciprocating motion is a touch pad or touch screen, the touch pad or touch screen is divided into two input zones, and the input process of step 1) is:

6. The Chinese voice information communication method according to claim 3, characterized in that the device with which the user inputs a reciprocating motion is a computer mouse, and the input process of step 1) is:

The mouse button is pressed, and the system enters the state of readiness for voice recording;

7. The Chinese voice information communication method according to claim 3, characterized in that the device with which the user inputs a reciprocating motion comprises a scroll wheel with a press function or a contact-sensing displacement sensor; the input process of step 1) is:

The finger presses down, and the system enters the state of readiness for voice recording;