CN101067780B - Character inputting system and method for intelligent equipment - Google Patents

Character inputting system and method for intelligent equipment

Info

Publication number
CN101067780B
CN101067780B CN2007101124124A
Authority
CN
China
Prior art keywords
voice
candidate
pinyin
module
candidate character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2007101124124A
Other languages
Chinese (zh)
Other versions
CN101067780A (en)
Inventor
张会鹏
Current Assignee
Shenzhen Shiji Guangsu Information Technology Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN2007101124124A
Publication of CN101067780A
Application granted
Publication of CN101067780B

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

This invention discloses a character input system for an intelligent device, comprising: a voice receiving module for receiving voice; a voice parameter library for storing the correspondences between voice and pinyin; a conversion module for converting the voice signals received by the voice receiving module into corresponding pinyin according to those correspondences; and a character generating module for generating characters from the pinyin produced by the conversion module. The invention also discloses a character input method for an intelligent device: the correspondences between voice and pinyin are stored in advance, received voice is converted into corresponding pinyin according to these correspondences, and characters are then generated from the converted pinyin.

Description

Character input system and method of intelligent equipment
Technical Field
The present invention relates to a word processing technology of an intelligent device, and in particular, to a word input system and a word input method of an intelligent device.
Background
Computer users typically use Chinese input software to input Chinese characters into smart devices. Chinese character input software is tool software running on the computer's operating system that converts keyboard input codes, or data from other non-keyboard input media, into Chinese characters. Chinese input software can be divided into keyboard input software and non-keyboard input software.
At present, the most mature and widely used software is keyboard-based Chinese input software, which inputs Chinese characters through the keyboard according to a certain encoding rule.
English has only 26 letters, which correspond directly to the 26 letter keys on the keyboard, so English can be typed without any input software. Chinese characters, however, number in the tens of thousands and have no such correspondence with the keyboard. To input Chinese characters into a computer, the characters must first be encoded and the codes associated with keys on the keyboard; a character's code can then be typed on the keyboard and converted back into the character.
At present, there are hundreds of Chinese character encoding schemes, of which dozens are in common use. A Chinese character, as a graphic character, is commonly characterized by its sound, shape, and meaning, and Chinese input encoding methods basically work by associating sound, shape, or meaning with specific keys and then combining them, differently for each character, to complete the input.
Non-keyboard Chinese input software includes handwriting input software, optical character recognition technology (OCR) input software, voice input software, and the like.
Handwriting input software recognizes Chinese characters handwritten in a pen-based environment, which suits the habit of writing Chinese with a pen: characters written on a tablet in the usual way can be recognized and displayed by the computer. However, handwriting input software requires a matching hardware tablet, on which the characters are written with a stylus (any type of hard pen); this is convenient and quick, with a low error rate. Alternatively, characters can be written in a designated area with the mouse and converted by the handwriting software, but this requires very skilled mouse operation.
Optical character recognition (OCR) input software requires that the document to be input first be converted into an image by a scanner, after which the image is converted into text. This input method therefore requires a scanner, and the higher the print quality of the original, the higher the recognition accuracy; printed matter such as books and magazines works best. If the original's paper is thin, patterns and characters on the back may show through during scanning and interfere with recognition.
The speech input method, also called the voice-controlled input method, has the computer recognize an operator's speech as Chinese characters. The pronunciation of the characters is input through a microphone connected to the computer, a speech recognition system analyzes and identifies the characters or phrases, the recognized characters are displayed in an editing area, and a "send" function then transfers them into other documents being edited on the computer.
The advantage of the speech input method is that it frees the hands: no typing is needed, only reading out the pronunciation of the characters, which makes it simple and quick to use.
However, current speech input methods mainly store the correspondence between voice signals and Chinese characters in the computer in advance; input speech is converted into a voice signal, the converted signal is compared with the stored signals, and the corresponding Chinese characters are selected for input. Because the number of Chinese characters is very large (as many as 80,000), each character corresponds to its own voice signal, and pronunciation varies greatly from person to person, directly converting speech into Chinese characters is difficult to process, produces a high rate of wrong characters, and greatly harms input accuracy.
Disclosure of Invention
In view of the above, the main objective of the present invention is to provide a text input system for an intelligent device, which can not only increase the input speed, but also reduce the difficulty of converting speech into text, and increase the accuracy of text input.
Another objective of the present invention is to provide a text input method for an intelligent device, which can also increase the input speed, reduce the difficulty of converting speech into text, and increase the accuracy of text input.
In order to achieve the purpose of the invention, the main technical scheme of the invention is as follows:
a text input system for a smart device, the system comprising:
the voice receiving module is used for receiving voice;
the voice parameter library is used for storing the corresponding relation between the voice and the pinyin;
the voice type judging module is used for pre-storing the voice command and judging whether the voice received by the voice receiving module is the stored voice command or not, if so, the voice command is sent to the character generating module, otherwise, the voice is sent to the converting module;
the conversion module is used for converting the received voice signals into corresponding pinyin according to the corresponding relation stored in the voice parameter library;
the character generation module is used for generating candidate characters according to the pinyin converted by the conversion module; and selecting the finally input characters from the candidate characters according to the voice command input by the voice type judging module.
Preferably, the correspondence between the voice and the pinyin is as follows: the corresponding relation between the phonetic elements and the syllables; and the text input system further comprises:
the voice library is used for recording a voice sequence;
and the syllable establishing module is used for establishing syllables corresponding to each voice element of each voice sequence recorded in the voice database and storing the corresponding relation between each voice element and the corresponding syllable into the voice parameter database.
Preferably, the system further comprises:
and the training probability parameter module is used for counting and generating the training probability parameters of all syllables according to the voice sequence, the voice elements and the syllables corresponding to the voice elements in the voice database and storing the training probability parameters into the voice parameter database.
Preferably, the conversion module specifically includes:
the decomposition module is used for decomposing the voice signal into at least one voice element;
a candidate pinyin generation module, configured to select a syllable from the syllables corresponding to each of the voice elements in sequence from the decomposed first voice element to form a candidate pinyin string;
the occurrence probability calculation module is used for calculating the occurrence probability of each candidate pinyin string according to the training probability parameters;
the selection unit is used for selecting a candidate pinyin string with the maximum occurrence probability as the pinyin converted by the voice signal; or, the method is used for selecting more than one candidate pinyin strings with relatively high occurrence probability to be output, and determining the pinyin finally converted by the voice signal according to a selection instruction input from the outside.
Preferably, the conversion module specifically includes:
the decomposition module is used for decomposing the voice signal into at least one voice element;
a candidate pinyin generation module, configured to sequentially search all syllables corresponding to each voice element from the decomposed first voice element to form candidate pinyins of phrases or words;
the occurrence probability calculation module is used for calculating the occurrence probability of each candidate pinyin according to the training probability parameters;
and the selection unit is used for sequentially outputting the candidate pinyins according to the occurrence probability and determining the final converted pinyin of the voice signal according to a selection instruction input from the outside.
Preferably, the text generation module specifically includes:
the candidate character generating module is used for generating a candidate character list at least comprising one candidate character according to the pinyin converted by the converting module;
the result generation module is used for outputting the generated candidate character list, detecting whether a voice instruction input by the voice type judgment module is received or not, selecting characters from the candidate character list according to the received voice instruction when the voice instruction is received, and outputting the selected characters;
correspondingly, the voice type judging module sends the voice command to the result generating module when judging that the received voice is the stored voice command.
Preferably, the result generating module specifically includes: and the voice instruction matching module is used for storing the corresponding matching relation between the voice instruction and the candidate character position in the candidate character list, matching the received voice instruction with the candidate character position in the candidate character list according to the corresponding matching relation, and selecting the candidate character from the matched candidate character position to be used as the character finally input by the character input system when the matching is correct.
Preferably, the result generation module is connected with an external keyboard to receive a keyboard instruction;
the result generation module further comprises: and the physical contact instruction matching module is used for storing the corresponding matching relation between the physical contact instruction and the candidate character position in the candidate character list, matching the received keyboard instruction with the candidate character position in the candidate character list according to the corresponding matching relation, and selecting the candidate character from the matched candidate character position as the character finally input by the character input system when the matching is correct.
A character input method of intelligent equipment pre-stores the corresponding relation between voice and pinyin and voice instructions; the method further comprises the following steps:
A. receiving voice;
B. judging whether the received voice is a stored voice instruction or not, and if so, executing the step C; otherwise, converting the received voice into corresponding pinyin according to the corresponding relation between the stored voice and the pinyin;
C. and generating candidate characters according to the converted pinyin, receiving and recognizing a voice instruction, and selecting the finally input characters from the candidate characters according to the voice instruction.
Preferably, the correspondence between the voice and the pinyin is as follows: the corresponding relation between the phonetic elements and the syllables;
the specific method for pre-storing the corresponding relation between the voice and the pinyin comprises the following steps:
recording voice sequences, and storing the recorded voice sequences into a voice library;
establishing syllables corresponding to each voice element of each voice sequence in a voice library;
storing the corresponding relation between each phonetic element and the corresponding syllable.
Preferably, the method further comprises: according to the voice sequence, the voice elements and the corresponding syllables in the voice library, carrying out statistics to generate training probability parameters of each syllable;
the method for converting the voice into the pinyin in the step B comprises the following steps:
b1, decomposing the voice into at least one voice element, and searching all syllables corresponding to each voice element;
b2, selecting a syllable from the syllables corresponding to each phonetic element in turn from the first phonetic element to form a candidate pinyin string;
b3, calculating the occurrence probability of each candidate pinyin string according to the training probability parameters;
b4, selecting a candidate pinyin string with the maximum occurrence probability as the pinyin after the voice signal conversion; or selecting more than one candidate pinyin strings with relatively high occurrence probability to output, and determining the pinyin finally converted by the voice signal according to a selection instruction input from the outside.
Preferably, the method further comprises: according to the voice sequence, the voice elements and the corresponding syllables in the voice library, carrying out statistics to generate training probability parameters of each syllable;
the method for converting the voice into the pinyin in the step B comprises the following steps:
b1, decomposing the voice into at least one voice element, and searching all syllables corresponding to each voice element;
b2, starting from the decomposed first voice element, sequentially forming syllables corresponding to each voice element into candidate pinyin of a phrase or a single character;
b3, calculating the occurrence probability of each candidate pinyin according to the training probability parameters;
b4, outputting the candidate pinyins in sequence according to the occurrence probability, and determining the final converted pinyins of the voice signals according to the selection instruction of the user.
Preferably, the training probability parameters of the syllables comprise an initial probability parameter, a transition probability parameter and a transmission probability parameter; wherein,
generating an initial probability parameter according to M/N, wherein M is the frequency of the specific syllable appearing at the head of the pinyin string corresponding to one voice sequence, and N is the total number of all the voice sequences recorded in the voice library;
the transition probability parameter is generated according to O/P, wherein O is the number of times the two syllables co-occur in order in the voice library, and P is the total number of occurrences of the first of the two syllables in the voice library;
generating the emission probability parameter according to Q/R, wherein Q is the total number of the voice elements corresponding to a specific syllable in the voice library, and R is the total number of the specific syllable in the voice library;
the step B3 specifically includes: multiplying the initial probability parameter, the transfer probability parameter and the emission probability parameter of the syllable in the candidate pinyin string to obtain the value of the occurrence probability of the candidate pinyin string.
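As a rough illustration (not part of the patent text), the multiplication in step B3 can be sketched as follows; all parameter values, syllable choices, and voice-element names below are hypothetical stand-ins for the trained M/N, O/P, and Q/R parameters:

```python
# Sketch: score a candidate pinyin string as the product of its initial,
# transition, and emission probability parameters (step B3).

def string_probability(syllables, elements, initial_p, transition_p, emission_p):
    """Occurrence probability of one candidate pinyin string.

    syllables    -- candidate syllable sequence, e.g. ["zhong", "guo"]
    elements     -- the decomposed voice elements aligned to the syllables
    initial_p    -- {syllable: P(syllable starts the string)}      (M/N)
    transition_p -- {(prev, cur): P(cur follows prev)}             (O/P)
    emission_p   -- {(syllable, element): P(element | syllable)}   (Q/R)
    """
    p = initial_p.get(syllables[0], 0.0)
    for prev, cur in zip(syllables, syllables[1:]):
        p *= transition_p.get((prev, cur), 0.0)
    for syl, el in zip(syllables, elements):
        p *= emission_p.get((syl, el), 0.0)
    return p

# Hypothetical toy parameters:
initial_p = {"zhong": 0.3, "chong": 0.1}
transition_p = {("zhong", "guo"): 0.5, ("chong", "guo"): 0.05}
emission_p = {("zhong", "e1"): 0.8, ("chong", "e1"): 0.2, ("guo", "e2"): 0.9}

cand_a = string_probability(["zhong", "guo"], ["e1", "e2"],
                            initial_p, transition_p, emission_p)
cand_b = string_probability(["chong", "guo"], ["e1", "e2"],
                            initial_p, transition_p, emission_p)
assert cand_a > cand_b  # "zhong'guo" is the more probable candidate string
```

With these toy values the candidate "zhong'guo" scores 0.3 × 0.5 × 0.8 × 0.9 and would be selected in step B4.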
Preferably, the step C of generating the candidate characters according to the converted pinyin is: generating and displaying a candidate character list including at least one candidate character according to the converted pinyin;
and C, selecting the finally input characters as follows: and C1, selecting words from the candidate word list according to the voice command, and inputting the selected words into the intelligent equipment.
Preferably, the method further comprises: pre-storing a corresponding matching relation between the voice command and the position of the candidate character in the candidate character list;
the step C1 specifically includes: and matching the received voice command with the candidate character positions in the candidate character list according to the corresponding matching relation between the voice command and the candidate character positions in the candidate character list, and taking the candidate characters at the matched candidate character positions as the finally input characters when the matching is correct.
Preferably, the method further comprises: pre-storing a corresponding matching relation between a keyboard instruction and a candidate character position in a candidate character list;
the method further comprises: after a keyboard instruction is detected, matching the detected keyboard instruction with the candidate character positions in the candidate character list according to the corresponding matching relation between the keyboard instruction and the candidate character positions in the candidate character list, and taking the candidate characters at the matched candidate character positions as the finally input characters when the matching is correct.
The invention first converts the voice signal into pinyin, then processes the pinyin and converts it into characters. Compared with existing keyboard input methods, input is therefore simple and quick, which improves character input speed and thus working efficiency. Compared with existing voice input methods, the voice is converted into pinyin and the pinyin into characters, with the correspondence between voice and pinyin stored in the intelligent device; since the number of pinyin is much smaller than the number of Chinese characters, the amount of voice that must be stored and recognized is greatly reduced.
The invention further specifies the correspondence between voice and pinyin as a correspondence between voice elements and syllables. Chinese has only 403 syllables, far fewer than the number of pinyin strings, so the amount of stored voice can be further reduced and character input made simpler and quicker.
The invention also provides a voice library in which voice can be pre-recorded; training probability parameters for syllables are generated from the recorded voice, the pinyin converted from the voice is re-selected using these parameters, and the pinyin with the greatest probability is converted into Chinese characters. This largely avoids the low input accuracy caused by the many pronunciations of Chinese characters and by nonstandard pronunciation, further improving the accuracy of Chinese character input.
In addition, in the process of converting pinyin into characters, the method firstly generates candidate characters, and then selects the characters to be input by using a voice instruction or a physical contact instruction (such as a keyboard instruction, a touch instruction of a touch screen and the like), so that the operation process of inputting the characters is further simplified; and the user can also freely select whether to select the characters by a voice input mode, directly input and select the characters by physical contact, or combine the two modes, so that the user has greater flexibility in the character input process.
Drawings
FIG. 1 is a schematic structural diagram of a text input system according to the present invention;
FIG. 2 is a schematic diagram of a conversion module in the text input system according to the present invention;
FIG. 3 is a schematic diagram illustrating the display of a candidate text generated by the text input system according to the present invention;
FIG. 4 is a block diagram of a candidate text generation module of the text input system of the present invention;
FIG. 5 is a schematic structural diagram of a result generation module of the text input system according to the present invention;
FIG. 6 is a flow chart of a text input method of the intelligent device according to the present invention;
FIG. 7 is a diagram illustrating two candidate Pinyin strings with relatively high occurrence probabilities;
FIG. 8 is a diagram illustrating sequential output of Pinyin corresponding to phrases or words according to occurrence probability;
FIG. 9 is a diagram illustrating a candidate list of an example Pinyin string;
FIG. 10 is a simplified candidate list diagram of FIG. 9;
FIG. 11 is a schematic diagram illustrating the display of the candidate word list generated in FIG. 10.
Detailed Description
The invention is explained in more detail below with reference to specific embodiments and the drawings.
The core idea of the invention is as follows: pre-storing the corresponding relation between the voice and the pinyin; when inputting characters, the speech input is used, firstly the speech signal is received, the received speech signal is converted into corresponding pinyin according to the stored correspondence between the speech and the pinyin, and then the characters are generated according to the converted pinyin.
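The core idea above can be sketched in a few lines; the lookup tables and the voice-signal token below are hypothetical placeholders for the modules and stored correspondences the patent describes:

```python
# Sketch of the core idea: voice -> pinyin -> characters, using a stored
# voice/pinyin correspondence and then a pinyin/character correspondence.

def input_text(voice_signal, pinyin_lookup, char_lookup):
    # 1. Convert the received voice into pinyin via the stored correspondence.
    pinyin = pinyin_lookup[voice_signal]
    # 2. Generate characters from the converted pinyin.
    return char_lookup[pinyin]

pinyin_lookup = {"<voice:zhong>": "zhong"}   # hypothetical stored mapping
char_lookup = {"zhong": "中"}                # hypothetical stored mapping
assert input_text("<voice:zhong>", pinyin_lookup, char_lookup) == "中"
```

The point of the two-stage lookup is that the first table only needs one entry per pinyin syllable, not one per Chinese character.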
The intelligent device of the invention may be any device with intelligent information-processing capability, such as a computer, smartphone, or handheld computer. The invention is described here in the context of a computer.
The characters may be Chinese characters, with the pinyin being Chinese pinyin; they may also be other characters whose pronunciation is based on a phonetic system, such as Korean, in which case the pinyin is the phonetic transcription of those characters. The embodiments of the present invention are described taking Chinese characters and pinyin as an example.
Fig. 1 is a schematic structural diagram of a text input system according to the present invention. Referring to fig. 1, the text input system mainly includes:
the voice receiving module 101 is connected to an external microphone of the computer, for example, connected to an earphone with a microphone in the computer, and is configured to receive a voice signal. The voice receiving module 101 can adopt the existing voice receiving technology, and a user can input a voice signal of a Chinese character to the character input system through a microphone, and the voice receiving module 101 receives and completes digital conversion.
And the voice parameter library 102 is used for storing the corresponding relation between the voice and the pinyin. The corresponding relation can be the corresponding relation between the phonetic elements and syllables, or the corresponding relation between a specific phonetic and a specific pinyin. The phonetic elements are the pronunciations of the individual Chinese characters.
The conversion module 103 may be directly connected to the speech receiving module 101 and the speech parameter library 102, and is configured to convert the speech signal received by the speech receiving module 101 into a corresponding pinyin according to the correspondence stored in the speech parameter library 102.
And the character generating module 104 is configured to generate characters according to the pinyin converted by the converting module 103, and further input the generated characters to a display device and/or a storage device of the intelligent device for display and/or storage processing.
The text input system of the present invention may further comprise:
a speech library 105 for recording speech sequences.
A syllable establishing module 106, configured to establish a syllable corresponding to each voice element of each voice sequence recorded in the voice database 105, and store the corresponding relationship between each voice element and its corresponding syllable in the voice parameter database 102.
The present invention can utilize the voice library 105 and the syllable establishing module 106 to set the corresponding relationship between the voice elements and the syllables.
In order to improve the recognition accuracy of the input speech, the invention can also generate training probability parameters for syllables from the speech recorded in the speech library 105; the pinyin converted from the speech is then re-selected and recognized using these parameters and converted into the corresponding pinyin string. To this end, the text input system of the present invention further comprises:
a training probability parameter module 107, configured to statistically generate a training probability parameter of each syllable according to the voice sequence, the voice element, and the corresponding syllable in the voice database 105, and store the training probability parameter in the voice parameter database 102.
Fig. 2 is a schematic structural diagram of the conversion module 103 in the text input system according to the present invention. Referring to fig. 2, the conversion module 103 includes:
a decomposition module 201, configured to decompose the received voice signal into at least one voice element.
And a candidate pinyin generation module 202, configured to select a syllable from the syllables corresponding to each voice element in sequence from the decomposed first voice element to form a candidate pinyin string.
And the occurrence probability calculation module 203 is configured to calculate an occurrence probability of each candidate pinyin string according to the training probability parameter.
A selecting unit 204, configured to select a candidate pinyin string with the highest occurrence probability as the pinyin after the voice signal conversion; or, the method is used for selecting more than one candidate pinyin strings with relatively high occurrence probability to be output, and determining the pinyin finally converted by the voice signal according to a selection instruction input from the outside.
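The interaction of modules 202-204 can be sketched as a brute-force enumeration: one syllable choice per voice element forms a candidate pinyin string, each string is scored, and the most probable one is kept. The scoring function and syllable options below are hypothetical; a real system would use the trained probability parameters.

```python
# Sketch: form candidate pinyin strings from per-element syllable options
# (module 202), score them (module 203), and select the best (module 204).
from itertools import product

def best_pinyin(element_syllables, score):
    """element_syllables: one list of candidate syllables per voice element.
    score: callable mapping a pinyin string to its occurrence probability."""
    candidates = ["'".join(c) for c in product(*element_syllables)]
    return max(candidates, key=score)

options = [["zhong", "chong"], ["guo"]]              # hypothetical decomposition
scores = {"zhong'guo": 0.9, "chong'guo": 0.1}        # hypothetical probabilities
assert best_pinyin(options, lambda c: scores.get(c, 0.0)) == "zhong'guo"
```

Enumerating every combination is exponential in the number of voice elements; this is only meant to make the candidate/score/select division of labor concrete.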
As another embodiment, the specific modules in the conversion module 103 may further have the following functions:
the decomposition module is used for decomposing the voice signal into at least one voice element;
a candidate pinyin generation module, configured to sequentially search all syllables corresponding to each voice element from the decomposed first voice element to form candidate pinyins of phrases or words;
the occurrence probability calculation module is used for calculating the occurrence probability of each candidate pinyin according to the training probability parameters;
and the selection unit is used for sequentially outputting the candidate pinyins according to the occurrence probability and determining the final converted pinyin of the voice signal according to a selection instruction input from the outside.
The text generation module 104 specifically includes:
a candidate character generating module 108, configured to generate a candidate character list including at least one candidate character according to the pinyin string converted by the converting module 103.
And the result generating module 109 is configured to display the generated candidate character list, detect whether a selection instruction input from the outside is received, select a character from the candidate character list according to the input selection instruction when the selection instruction is received, and input the selected character to the intelligent device.
For example: the voice for "Chinese person" is input from the microphone and received by the voice receiving module 101, then passed to the conversion module 103, which converts it into the pinyin string "zhong'guo'ren". The pinyin string is input to the character generating module 104, and the candidate character generating module 108 generates candidate characters, as shown in fig. 3; the user then inputs a selection instruction, and the result generation module 109 selects the first candidate word according to that instruction, completing the input.
FIG. 4 is a block diagram of the candidate text generation module 108 of the text input system according to the present invention. Referring to fig. 4, the candidate text generation module 108 specifically includes:
a candidate word generating module 401, configured to generate a candidate word according to the pinyin string converted by the converting module 103.
And a complete sentence generating module 402, configured to generate a candidate complete sentence according to the candidate word by using a complete sentence generating algorithm.
The selection instruction input to the result generation module 109 may be a voice instruction or a physical contact instruction, where the physical contact instruction may be a keyboard instruction, a touch instruction on a touch screen, or another instruction generated by physical contact, and the keyboard instruction is described as an example herein.
As an embodiment, in order to receive voice instructions, a voice type distinguishing module 110 may further be included between the voice receiving module 101 and the converting module 103. The voice receiving module 101 inputs the received voice signal to the voice type distinguishing module 110, in which voice instructions are pre-stored; the module judges whether the voice signal received by the voice receiving module 101 matches a stored voice instruction. If so, the signal is judged to be a voice instruction and is sent to the result generating module 109; otherwise, the voice signal is sent to the converting module 103.
To receive the keyboard command, the result generation module 109 needs to connect with the keyboard of the smart device to receive the keyboard command.
The selection instruction may be input through the keyboard only, through voice only, or through either the keyboard or voice, as the user prefers.
Fig. 5 is a schematic structural diagram of the result generation module 109 of the text input system according to the present invention. Referring to fig. 5, the result generation module 109 further includes:
the detection module 501: the method is used for detecting the type of an input instruction, inputting the instruction into a voice instruction matching module 502 if the input instruction is a voice instruction, and inputting the instruction into a physical contact instruction matching module 503 if the input instruction is a keyboard instruction.
The voice instruction matching module 502 is configured to store a corresponding matching relationship between a voice instruction and a candidate character position in a candidate character list, match the received voice instruction with the candidate character position in the candidate character list according to the corresponding matching relationship, and select a candidate character from the matched candidate character position as a character finally input by the text input system if the matching is correct.
And the physical contact instruction matching module 503 is configured to store a corresponding matching relationship between a keyboard instruction and a candidate character position in the candidate character list, match the received keyboard instruction with the candidate character position in the candidate character list according to the corresponding matching relationship, and select a candidate character from the matched candidate character position as a character finally input by the text input system if the matching is correct.
Fig. 5 shows the structure of the result generation module 109 when the selection command can be a voice command or a keyboard command. When the text input system only inputs a selection instruction through voice, the result generation module 109 may only include a voice instruction matching module 502; when the text input system only inputs a selection instruction through a keyboard, the result generation module 109 may only include the physical contact instruction matching module 503.
Fig. 6 is a flowchart of a text input method of the intelligent device according to the present invention. Referring to fig. 6, the method includes:
step 601, pre-storing the corresponding relation between the voice and the pinyin.
The corresponding relationship may be stored in the speech parameter library 102, and may be either a correspondence between speech elements and syllables, or a correspondence between a specific utterance and a specific pinyin string. For example: the syllable corresponding to the speech "我" (I) is "wo", that of "们" is "men", and that of "是" (is) is "shi", where "我", "们", and "是" are all speech elements; alternatively, the specific utterance "我们是" (we are) and the pinyin "wo'men'shi" may be stored as one correspondence. The voice and the pinyin are stored as digital signals that the intelligent device can recognize.
Step 602, receiving a voice signal. Specifically, the voice may be received from a voice input device of the smart device, such as a microphone, and converted into a digital signal that can be processed by the smart device.
Step 603, converting the received voice signal into corresponding pinyin according to the stored correspondence between the voice and the pinyin. For example, when receiving the voice signal of "me", the pinyin "wo" corresponding to the voice signal is searched in the stored correspondence.
Step 604, generating characters according to the converted pinyin. For example: the pinyin "wo" is converted into the character "我" (I); this conversion can be implemented with an existing pinyin input method.
In the invention, the conversion from speech to pinyin is realized using the hidden Markov model (HMM) method. The HMM is an important statistical natural language model, widely used in fields such as speech recognition and phonetic-to-word conversion. It is essentially a probability function of a Markov process.
In a hidden Markov model, the observed events are random functions of states. The model is thus a doubly stochastic process: the state transition process of the model is not observable, i.e. hidden, while the sequence of observable events is a stochastic function of the hidden state transition process. It can be formally described as a quintuple HMM = <S, O, A, B, π>, where S is the set of states, O the set of observations, A the state transition probabilities, B the emission probabilities, and π the initial state probabilities. The processing procedure can be described simply as follows: first, a statistical method is used to train on existing data, for example, statistics are collected over the voice library and the corresponding pinyin library to obtain the parameter relationship between them, i.e. the parameter library. Then, when a new voice arrives, the information in the parameter library is used to determine the pinyin string that is closest to the voice, i.e. has the highest probability, and that pinyin string is taken as the result corresponding to the voice.
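As an illustration only, the quintuple HMM = <S, O, A, B, π> described above can be sketched in Python with hidden states as syllables and observations as speech elements; all names and probability values below are hypothetical toy data, not the patent's parameters:

```python
# Minimal, illustrative sketch (not the patent's implementation) of the
# HMM quintuple <S, O, A, B, pi>: hidden states = syllables, observations
# = speech elements.  All numbers are hypothetical toy values.

S = ["wo", "men"]              # hidden states: syllables
O = ["我", "们"]                # observations: speech elements
pi = {"wo": 0.9, "men": 0.1}   # initial state probabilities
A = {("wo", "men"): 0.8,       # transition probabilities P(next | prev)
     ("wo", "wo"): 0.2,
     ("men", "wo"): 0.5,
     ("men", "men"): 0.5}
B = {("wo", "我"): 0.7,         # emission probabilities P(observation | state)
     ("wo", "们"): 0.3,
     ("men", "们"): 0.9,
     ("men", "我"): 0.1}

# Probability that the hidden syllable sequence ["wo", "men"] generates
# the observed speech ["我", "们"]:
p = pi["wo"] * B[("wo", "我")] * A[("wo", "men")] * B[("men", "们")]
print(round(p, 4))
```

The same multiplication of initial, transition and emission terms underlies the probability calculations in the steps that follow.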
The specific method of converting speech to pinyin using hidden markov models of the present invention is described below.
The correspondence between voice and pinyin is stored using a voice training method, which comprises the following steps:
Step 701, recording voice sequences and storing the recorded voice sequences into a voice library.
For example, recording a large number of speech sequences, which may be sentences or articles spoken by different people, etc.
Step 702, establishing syllables corresponding to each voice element of each voice sequence in a voice library; storing the corresponding relation between each phonetic element and the corresponding syllable.
For example, a speech sequence "我们都是平凡人" (we are all ordinary people) read by one person is decomposed into the speech elements "我", "们", "都", "是", "平", "凡", and "人", and the corresponding syllables "wo", "men", "dou", "shi", "ping", "fan", and "ren" are established for each element. The same sentence read by another person is likewise decomposed into speech elements, and the same syllables "wo", "men", "dou", "shi", "ping", "fan", "ren" are established. In this way, the same syllable can correspond to speech elements of various different accents through voice training, so that the influence of the speaker's accent is avoided and the accuracy of speech recognition is improved.
Then, the invention can also further generate the training probability parameter of each syllable by statistics according to the voice sequence, the voice elements and the corresponding syllables in the voice library.
The training probability parameters of the syllables comprise an initial probability parameter, a transition probability parameter and an emission probability parameter.
The initial probability parameter is the probability that a syllable appears at the head of the pinyin corresponding to a voice sequence, and is generated according to the formula M/N, where M is the number of times the specific syllable appears at the head of the pinyin string corresponding to a voice sequence, and N is the total number of voice sequences recorded in the voice library.
The transition probability parameter is the probability that one syllable co-occurs with another, i.e. that the two syllables appear adjacently in order; for example, the two syllables "wo" and "men" will usually co-occur as "wo'men". The transition probability parameter is generated according to the formula O/P, where O is the number of co-occurrences of the two syllables in the voice library, and P is the total number of occurrences of the first of the two syllables established in the voice library.
The emission probability parameter is the probability that a syllable co-occurs with a speech element. For example, because of accent differences, the speech "我" (I) may be uttered as the sound represented by the syllable "wo", "e", or "huo", so the speech "我" may co-occur with "wo", "e", or "huo". The emission probability parameter is generated according to the formula Q/R, where Q is the total number of occurrences of the speech element together with a particular syllable in the voice library, and R is the total number of occurrences of that particular syllable in the voice library.
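The three formulas above (M/N, O/P, Q/R) can be sketched as simple counting over a toy voice library; the library contents below are hypothetical, and real training would use the recorded sequences of step 701:

```python
from collections import Counter

# Illustrative sketch of the training statistics described above.  The
# toy "voice library" is hypothetical: each recorded sequence is a list
# of (speech_element, syllable) pairs.
library = [
    [("我", "wo"), ("们", "men")],
    [("我", "wo"), ("们", "men")],
    [("我", "e"),  ("们", "men")],   # a different accent for "我"
]

# Initial probability M/N: times a syllable heads a sequence / total sequences
heads = Counter(seq[0][1] for seq in library)
N = len(library)
initial = {syl: m / N for syl, m in heads.items()}

# Transition probability O/P: adjacent co-occurrences / occurrences of first syllable
pairs = Counter((a[1], b[1]) for seq in library for a, b in zip(seq, seq[1:]))
counts = Counter(s for seq in library for _, s in seq)
transition = {pair: o / counts[pair[0]] for pair, o in pairs.items()}

# Emission probability Q/R: (element, syllable) co-occurrences / syllable occurrences
emits = Counter((el, syl) for seq in library for el, syl in seq)
emission = {pair: q / counts[pair[1]] for pair, q in emits.items()}

print(initial["wo"])              # M/N for "wo" as a head syllable
print(transition[("wo", "men")])  # O/P for the pair ("wo", "men")
print(emission[("我", "wo")])      # Q/R for "我" emitted as "wo"
```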
Using the hidden Markov model, the method for converting speech into pinyin in step 603 is as follows:
step 6031, decompose the voice into at least one voice element, and find all syllables corresponding to each voice element.
For example, the input voice "我们都是平凡人" (we are all ordinary people) is decomposed into the seven speech elements "我", "们", "都", "是", "平", "凡", and "人", and the corresponding pinyin syllables are looked up in the prestored correspondence between voice and pinyin, for example:
"我" corresponds to the syllable "wo".
"们" corresponds to the syllables "men" and "meng".
"都" corresponds to the syllable "dou".
"是" corresponds to the syllables "shi" and "si".
"平" corresponds to the syllable "ping".
"凡" corresponds to the syllable "fan".
"人" corresponds to the syllable "ren".
Step 6032, starting from the first phonetic element, selecting a syllable from the syllables corresponding to each phonetic element in turn to form a candidate pinyin string.
For example, the candidate pinyin strings corresponding to the above voice "我们都是平凡人" are:
1、“wo’men’dou’shi’ping’fan’ren”。
2、“wo’men’dou’si’ping’fan’ren”。
3、“wo’meng’dou’shi’ping’fan’ren”。
4、“wo’meng’dou’si’ping’fan’ren”。
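A minimal sketch of step 6032, assuming the candidate strings are formed as the Cartesian product of the per-element syllable lists (an implementation detail the text leaves open):

```python
from itertools import product

# Sketch of step 6032: pick one syllable per speech element, in order,
# to form every candidate pinyin string.  The syllable lists mirror the
# example above ("我们都是平凡人").
syllables = [["wo"], ["men", "meng"], ["dou"], ["shi", "si"],
             ["ping"], ["fan"], ["ren"]]

candidates = ["'".join(choice) for choice in product(*syllables)]
for c in candidates:
    print(c)
```

This yields the four candidate strings listed above, from "wo'men'dou'shi'ping'fan'ren" to "wo'meng'dou'si'ping'fan'ren".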
Step 6033, calculating the occurrence probability of each candidate pinyin string according to the training probability parameters. Specifically, the initial probability parameter, the transition probability parameters and the emission probability parameters of the syllables in the candidate pinyin string are multiplied together; the resulting value is the occurrence probability of the candidate pinyin string.
Step 6034, select a candidate pinyin string with the highest probability of occurrence as the pinyin after the voice signal conversion.
For example, by calculation, the occurrence probability of the pinyin string "wo 'men' dou 'shi' ping 'fan' ren" is the highest, and the pinyin string can be selected as the converted pinyin.
Or, more than one candidate pinyin strings with relatively high occurrence probability can be selected and output to be displayed to the user, the user selects the candidate pinyin strings, and the candidate pinyin strings selected by the user are used as the pinyin after the voice signal conversion.
For example, the pinyin strings "wo'men'dou'shi'ping'fan'ren" and "wo'men'dou'si'ping'fan'ren" have the two highest occurrence probabilities, so both may be selected as converted pinyins. In this case the two candidate pinyin strings are output and displayed to the user, as shown in fig. 7: each candidate pinyin string is preceded by a label, and the user selects by label; if the user selects 1, the pinyin string "wo'men'dou'shi'ping'fan'ren" is taken as the pinyin after voice signal conversion.
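Steps 6033 and 6034 can be sketched as follows; the parameter values are hypothetical, and the scoring simply multiplies the initial, transition and emission parameters as described above:

```python
# Sketch of steps 6033-6034: score each candidate pinyin string by
# multiplying initial, transition, and emission parameters, then pick
# the highest.  All probability values are hypothetical toy numbers.

initial = {"wo": 0.6}
transition = {("wo", "men"): 0.7, ("wo", "meng"): 0.1,
              ("men", "shi"): 0.8, ("men", "si"): 0.2,
              ("meng", "shi"): 0.5, ("meng", "si"): 0.5}
emission = {"wo": 0.9, "men": 0.8, "meng": 0.2, "shi": 0.7, "si": 0.3}

def score(candidate):
    """Occurrence probability of a candidate pinyin string (a syllable list)."""
    p = initial[candidate[0]] * emission[candidate[0]]
    for prev, cur in zip(candidate, candidate[1:]):
        p *= transition[(prev, cur)] * emission[cur]
    return p

candidates = [["wo", "men", "shi"], ["wo", "men", "si"],
              ["wo", "meng", "shi"], ["wo", "meng", "si"]]
best = max(candidates, key=score)
print("'".join(best))
```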
In addition, the following alternative to steps 6032 to 6034 may be provided, consisting of steps 6032', 6033' and 6034'.
Step 6032', starting from the first decomposed speech element, combining the syllables corresponding to consecutive speech elements in turn into candidate pinyins of phrases or single characters.
For example, the candidate pinyins corresponding to the above voice "我们都是平凡人" are:
the first two speech elements form the candidate phrase pinyins "wo'men" and "wo'meng";
the third and fourth speech elements form the candidate phrase pinyins "dou'shi" and "dou'si";
the last three speech elements form the candidate phrase pinyin "ping'fan'ren".
Step 6033', calculating the occurrence probability of each candidate pinyin according to the training probability parameters. Specifically, the initial probability parameter, the transition probability parameters and the emission probability parameters of the syllables in the candidate pinyin are multiplied together; the resulting value is the occurrence probability of the candidate pinyin.
Step 6034', outputting the candidate pinyins in order of occurrence probability, and determining the finally converted pinyin of the voice signal according to the user's selection instruction.
For example, fig. 8 is a schematic diagram of sequentially outputting the pinyins of phrases or characters according to occurrence probability. In the first step 801 of fig. 8, "1: wo'men" and "2: wo'meng" are displayed for the user to select; if the user selects 1, the subsequent phrases are displayed according to occurrence probability. In the second step 802, "1: dou'shi" and "2: dou'si" are displayed for further selection; if the user selects 1, the subsequent phrase is displayed. In the third step 803, "ping'fan'ren" is displayed, and this last pinyin may be selected by the user or by the system by default. Finally, "wo'men'dou'shi'ping'fan'ren" is taken as the converted pinyin of the voice signal.
Of course, in the above process, all syllables (i.e. the pinyin of a single character) corresponding to each speech element may also be displayed in sequence, and the user selects the syllable of each speech element in sequence, thereby determining the pinyin finally converted by the speech signal.
After the pinyin string is obtained, text is generated using step 604. Step 604 may specifically include:
step 6041, generate a candidate word list including at least one candidate word according to the converted pinyin, and display the candidate word list on the intelligent device.
Step 6042, detecting whether the intelligent equipment inputs a selection instruction, and if the selection instruction is detected, executing step 6043; otherwise, this step 6042 is repeated.
Step 6043, selecting characters from the candidate character list according to the selection instruction, and inputting the selected characters into the intelligent device.
The candidate text list in step 6041 may be a candidate word or a candidate whole sentence. The specific generation method comprises the following two steps:
and I, generating a candidate word. The invention needs to set a mapping table from the pinyin string to the candidate word sequence, namely a pinyin dictionary. The candidate words corresponding to each pinyin string in the pinyin dictionary are sorted according to the word frequency of the candidate words from large to small, the method for generating the candidate words is simple, namely the candidate words are searched in the pinyin dictionary according to the pinyin string, after the matched pinyin string is found, the first n candidate words corresponding to the pinyin string are output, and n is the number of the candidate words which can be displayed on the input method output interface.
II. Generating the whole sentence. To realize whole-sentence input, the invention predicts the whole sentence by the maximum probability method: a pinyin string input by the user admits several combinations of candidate words. First, all candidate words appearing in the pinyin string are found; then the combination with the maximum probability is taken as the final whole-sentence generation result.
Fig. 9 is a schematic diagram of the candidate word list of the pinyin string "wo'men'dou'shi'ping'fan'ren". As shown in fig. 9, each arc corresponds to one or more candidate words, sorted from top to bottom by word frequency from high to low. Each arc also carries word frequency information (not marked in the figure), namely the frequency of the most frequent word among all candidate words corresponding to that pinyin string. Because only one candidate whole sentence is provided to the user, only the word with the highest frequency is effective; that is, words ranked second or later by frequency, such as the homophones "窝" (nest), "门" (gate), "斗士" (fighter) and the like, cannot appear in the final candidate whole-sentence result.
Fig. 10 is a simplified version of the candidate word list of fig. 9. As shown in fig. 10, the path with the highest probability is obtained using a shortest path algorithm between two points, such as the Dijkstra algorithm or the Viterbi algorithm; this path is the dashed path shown in fig. 10 and corresponds to one word combination scheme. The highest-probability path is displayed as the final whole-sentence prediction result in the first position of the candidate word window. The candidate word list window is shown in fig. 11: only one whole-sentence candidate result, "我们都是平凡人" (we are all ordinary people), is shown at the first candidate position, and the individual candidate-word results follow from the second position onward.
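The maximum-probability word combination can be sketched as a dynamic program over the arc lattice of fig. 10, in the spirit of the Viterbi/shortest-path search mentioned above; the arcs and frequencies below are hypothetical toy values:

```python
# Sketch of step II: whole-sentence prediction as a highest-probability
# path through the candidate-word lattice.  Each arc spans syllable
# positions and carries the frequency of its most frequent word; all
# numbers are hypothetical toy values.
# arcs: (start, end, word, frequency) over syllable indices 0..3 ("wo men dou shi")
arcs = [
    (0, 2, "我们", 0.05), (0, 1, "我", 0.04), (1, 2, "们", 0.01),
    (2, 4, "都是", 0.03), (2, 3, "都", 0.02), (3, 4, "是", 0.05),
]

def best_sentence(arcs, length):
    """Dynamic program: best[i] = (probability, words) for the prefix ending at i."""
    best = {0: (1.0, [])}
    for i in range(1, length + 1):
        options = [(best[s][0] * f, best[s][1] + [w])
                   for s, e, w, f in arcs if e == i and s in best]
        if options:
            best[i] = max(options)
    return best[length][1]

print("".join(best_sentence(arcs, 4)))
```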
After the candidate character list is generated, a user needs to select one of the candidate character lists as a final input result. In the present invention, the final input result can be determined in two ways, one is keyboard selection and the other is voice selection. That is, in step 6042, the selection instruction may be a keyboard instruction or a voice instruction.
When a user inputs a selection instruction through a keyboard, the method needs to pre-store the corresponding matching relation between the keyboard instruction and the position of the candidate character in the candidate character list; step 6043 specifically includes: after a keyboard instruction is detected, matching the detected keyboard instruction with the candidate character positions in the candidate character list according to the corresponding matching relation between the keyboard instruction and the candidate character positions in the candidate character list, and if the matching is correct, taking the candidate characters at the matched candidate character positions as the finally input characters.
When a user inputs a voice instruction through a microphone, the method needs to store in advance the voice instructions and the matching relationship between each voice instruction and the candidate character positions in the candidate character list. Each selection instruction is represented by the speech of one word, establishing a correspondence from speech to selection instruction. For example, the speech "1" selects the first candidate character, the speech "up" turns to the previous page of candidates, and the speech "down" turns to the next page. The user can also modify the voice instructions as needed, representing each operation by a different voice instruction; for example, the user can define some unusual utterances as voice instructions, which greatly reduces conflicts between voice instructions and voice input.
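A minimal sketch of such a voice-command table, assuming recognized speech is matched against stored commands before falling through to pinyin conversion (the command names and actions are illustrative, not from the patent):

```python
# Sketch of a user-configurable voice-command table.
voice_commands = {
    "1": ("select", 0),       # select the first candidate character
    "up": ("page", -1),       # previous page of candidates
    "down": ("page", +1),     # next page of candidates
}

def handle_speech(recognized, commands):
    """Return the matched command, or None so the speech is sent on to
    pinyin conversion instead (the type-distinguishing step above)."""
    return commands.get(recognized)

print(handle_speech("up", voice_commands))
print(handle_speech("你好", voice_commands))  # not a command: treat as text input
```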
Further, after step 602 and before step 603, the method further includes: judging whether the received voice is a pre-stored voice instruction. If so, the received voice instruction is matched against the candidate character positions in the candidate character list according to the stored matching relationship, and if the matching is correct, the candidate character at the matched position is taken as the finally input character. If not, step 603 is performed.
The above description is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.

Claims (16)

1. A text input system for a smart device, the system comprising:
the voice receiving module is used for receiving voice;
the voice parameter library is used for storing the corresponding relation between the voice and the pinyin;
the voice type judging module is used for pre-storing the voice command and judging whether the voice received by the voice receiving module is the stored voice command; if so, sending the voice command to the character generating module, otherwise sending the voice to the converting module;
the conversion module is used for converting the received voice signals into corresponding pinyin according to the corresponding relation stored in the voice parameter library;
the character generation module is used for generating candidate characters according to the pinyin converted by the conversion module; and selecting the finally input characters from the candidate characters according to the voice command input by the voice type judging module.
2. The text input system of claim 1, wherein the correspondence between the speech and pinyin is: the corresponding relation between the phonetic elements and the syllables; and the text input system further comprises:
the voice library is used for recording a voice sequence;
and the syllable establishing module is used for establishing syllables corresponding to each voice element of each voice sequence recorded in the voice database and storing the corresponding relation between each voice element and the corresponding syllable into the voice parameter database.
3. The text entry system of claim 2, further comprising:
and the training probability parameter module is used for counting and generating the training probability parameters of all syllables according to the voice sequence, the voice elements and the syllables corresponding to the voice elements in the voice database and storing the training probability parameters into the voice parameter database.
4. The text input system of claim 3, wherein the conversion module specifically comprises:
the decomposition module is used for decomposing the voice signal into at least one voice element;
a candidate pinyin generation module, configured to select a syllable from the syllables corresponding to each of the voice elements in sequence from the decomposed first voice element to form a candidate pinyin string;
the occurrence probability calculation module is used for calculating the occurrence probability of each candidate pinyin string according to the training probability parameters;
the selection unit is used for selecting a candidate pinyin string with the maximum occurrence probability as the pinyin converted by the voice signal; or, the method is used for selecting more than one candidate pinyin strings with relatively high occurrence probability to be output, and determining the pinyin finally converted by the voice signal according to a selection instruction input from the outside.
5. The text input system of claim 3, wherein the conversion module specifically comprises:
the decomposition module is used for decomposing the voice signal into at least one voice element;
a candidate pinyin generation module, configured to sequentially search all syllables corresponding to each voice element from the decomposed first voice element to form candidate pinyins of phrases or words;
the occurrence probability calculation module is used for calculating the occurrence probability of each candidate pinyin according to the training probability parameters;
and the selection unit is used for sequentially outputting the candidate pinyins according to the occurrence probability and determining the final converted pinyin of the voice signal according to a selection instruction input from the outside.
6. The text input system of claim 1, wherein the text generation module specifically comprises:
the candidate character generating module is used for generating a candidate character list at least comprising one candidate character according to the pinyin converted by the converting module;
the result generation module is used for outputting the generated candidate character list, detecting whether a voice instruction input by the voice type judgment module is received or not, selecting characters from the candidate character list according to the received voice instruction when the voice instruction is received, and outputting the selected characters;
correspondingly, the voice type judging module sends the voice command to the result generating module when judging that the received voice is the stored voice command.
7. The text input system of claim 6, wherein the result generation module specifically comprises: and the voice instruction matching module is used for storing the corresponding matching relation between the voice instruction and the candidate character position in the candidate character list, matching the received voice instruction with the candidate character position in the candidate character list according to the corresponding matching relation, and selecting the candidate character from the matched candidate character position to be used as the character finally input by the character input system when the matching is correct.
8. The text entry system of claim 6, wherein the result generation module is connected to an external keyboard to receive keyboard commands;
the result generation module further comprises: and the physical contact instruction matching module is used for storing the corresponding matching relation between the physical contact instruction and the candidate character position in the candidate character list, matching the received keyboard instruction with the candidate character position in the candidate character list according to the corresponding matching relation, and selecting the candidate character from the matched candidate character position as the character finally input by the character input system when the matching is correct.
9. A character input method of intelligent equipment is characterized in that the corresponding relation between voice and pinyin and a voice instruction are stored in advance; the method further comprises the following steps:
A. receiving voice;
B. judging whether the received voice is a stored voice instruction or not, and if so, executing the step C; otherwise, converting the received voice into corresponding pinyin according to the corresponding relation between the stored voice and the pinyin;
C. and generating candidate characters according to the converted pinyin, receiving and recognizing a voice instruction, and selecting the finally input characters from the candidate characters according to the voice instruction.
10. The text input method of claim 9, wherein the correspondence between the speech and the pinyin is: the corresponding relation between the phonetic elements and the syllables;
the specific method for pre-storing the corresponding relation between the voice and the pinyin comprises the following steps:
recording voice sequences, and storing the recorded voice sequences into a voice library;
establishing syllables corresponding to each voice element of each voice sequence in a voice library;
storing the corresponding relation between each phonetic element and the corresponding syllable.
11. The text entry method of claim 10, further comprising: according to the voice sequence, the voice elements and the corresponding syllables in the voice library, carrying out statistics to generate training probability parameters of each syllable;
the method for converting the voice into the pinyin in the step B comprises the following steps:
b1, decomposing the voice into at least one voice element, and searching all syllables corresponding to each voice element;
b2, selecting a syllable from the syllables corresponding to each phonetic element in turn from the first phonetic element to form a candidate pinyin string;
b3, calculating the occurrence probability of each candidate pinyin string according to the training probability parameters;
b4, selecting a candidate pinyin string with the maximum occurrence probability as the pinyin after the voice signal conversion; or selecting more than one candidate pinyin strings with relatively high occurrence probability to output, and determining the pinyin finally converted by the voice signal according to a selection instruction input from the outside.
12. The text entry method of claim 10, further comprising: according to the voice sequence, the voice elements and the corresponding syllables in the voice library, carrying out statistics to generate training probability parameters of each syllable;
the method for converting the voice into the pinyin in the step B comprises the following steps:
b1, decomposing the voice into at least one voice element, and searching all syllables corresponding to each voice element;
b2, starting from the decomposed first voice element, sequentially forming syllables corresponding to each voice element into candidate pinyin of a phrase or a single character;
b3, calculating the occurrence probability of each candidate pinyin according to the training probability parameters;
b4, outputting the candidate pinyins in sequence according to the occurrence probability, and determining the final converted pinyins of the voice signals according to the selection instruction of the user.
13. The text input method of claim 11, wherein the training probability parameters of the syllables include an initial probability parameter, a transition probability parameter, and an emission probability parameter; wherein,
the initial probability parameter is generated according to M/N, wherein M is the number of times the specific syllable appears at the head of the pinyin string corresponding to one voice sequence, and N is the total number of all the voice sequences recorded in the voice library;
the transition probability parameter is generated according to O/P, wherein O is the number of co-occurrences of the two syllables in the voice library, and P is the total number of occurrences of the first of the two syllables established in the voice library;
the emission probability parameter is generated according to Q/R, wherein Q is the total number of occurrences of the voice element together with a specific syllable in the voice library, and R is the total number of occurrences of the specific syllable in the voice library;
the step B3 specifically includes: multiplying the initial probability parameter, the transfer probability parameter and the emission probability parameter of the syllable in the candidate pinyin string to obtain the value of the occurrence probability of the candidate pinyin string.
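The M/N, O/P, and Q/R ratios of claim 13 are plain frequency estimates. A sketch of how they could be counted from a toy voice library — the syllables and voice elements below are hypothetical placeholders, not the patent's actual data:

```python
from collections import Counter

# Toy "voice library": each entry pairs a pinyin string (syllable sequence)
# with the voice elements observed for it.
library = [
    (["ni", "hao"], ["e1", "e2"]),
    (["ni", "men"], ["e1", "e3"]),
    (["li", "hao"], ["e4", "e2"]),
]

# P and R: total occurrences of each syllable in the voice library.
syllable_total = Counter(s for syllables, _ in library for s in syllables)

# Initial probability M/N: times a syllable heads a pinyin string,
# divided by the total number of voice sequences.
N = len(library)
initial_p = {s: m / N for s, m in
             Counter(syllables[0] for syllables, _ in library).items()}

# Transition probability O/P: co-occurrences of a syllable pair,
# divided by the total occurrences of the pair's first syllable.
pair_counts = Counter()
for syllables, _ in library:
    pair_counts.update(zip(syllables, syllables[1:]))
transition_p = {(a, b): o / syllable_total[a] for (a, b), o in pair_counts.items()}

# Emission probability Q/R: times a voice element is observed for a
# syllable, divided by the total occurrences of that syllable.
emit_counts = Counter()
for syllables, elements in library:
    emit_counts.update(zip(syllables, elements))
emission_p = {(s, e): q / syllable_total[s] for (s, e), q in emit_counts.items()}

print(initial_p["ni"], transition_p[("ni", "hao")], emission_p[("ni", "e1")])
```

With these three tables, the step b3 product of claim 13 can be evaluated for any candidate pinyin string.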
14. The text entry method of claim 9, wherein the step C of generating candidate characters based on the converted pinyin is: generating and displaying a candidate character list including at least one candidate character according to the converted pinyin;
and the step C of selecting the finally input character is:
C1, selecting a character from the candidate character list according to a voice command, and inputting the selected character into the intelligent equipment.
15. The text input method of claim 14,
the method further comprises: pre-storing a matching relation between voice commands and candidate character positions in the candidate character list;
the step C1 specifically includes: matching the received voice command against the candidate character positions in the candidate character list according to the pre-stored matching relation, and, when the matching succeeds, taking the candidate character at the matched candidate character position as the finally input character.
16. The text entry method of claim 14, further comprising: pre-storing a matching relation between keyboard instructions and candidate character positions in the candidate character list;
the method further comprises: after a keyboard instruction is detected, matching the detected keyboard instruction against the candidate character positions in the candidate character list according to the pre-stored matching relation, and, when the matching succeeds, taking the candidate character at the matched candidate character position as the finally input character.
CN2007101124124A 2007-06-21 2007-06-21 Character inputting system and method for intelligent equipment Active CN101067780B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2007101124124A CN101067780B (en) 2007-06-21 2007-06-21 Character inputting system and method for intelligent equipment

Publications (2)

Publication Number Publication Date
CN101067780A CN101067780A (en) 2007-11-07
CN101067780B true CN101067780B (en) 2010-06-02

Family

ID=38880346

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2007101124124A Active CN101067780B (en) 2007-06-21 2007-06-21 Character inputting system and method for intelligent equipment

Country Status (1)

Country Link
CN (1) CN101067780B (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110096706A (en) * 2011-09-01 2019-08-06 上海触乐信息科技有限公司 The system and method for candidate word is adjusted on portable device based on geographical location dynamic
CN102591477A (en) * 2012-01-18 2012-07-18 邓晓波 Character selection method and character selection device for typing in short sentence
CN103294370A (en) * 2012-03-05 2013-09-11 北京千橡网景科技发展有限公司 Method and equipment for triggering keystroke operation
CN102779508B (en) * 2012-03-31 2016-11-09 科大讯飞股份有限公司 Sound bank generates Apparatus for () and method therefor, speech synthesis system and method thereof
CN103915095B (en) 2013-01-06 2017-05-31 华为技术有限公司 The method of speech recognition, interactive device, server and system
CN104238991B (en) * 2013-06-21 2018-05-25 腾讯科技(深圳)有限公司 Phonetic entry matching process and device
CN103578464B (en) * 2013-10-18 2017-01-11 威盛电子股份有限公司 Language model establishing method, speech recognition method and electronic device
CN103559880B (en) * 2013-11-08 2015-12-30 百度在线网络技术(北京)有限公司 Voice entry system and method
CN103903615B (en) * 2014-03-10 2018-11-09 联想(北京)有限公司 A kind of information processing method and electronic equipment
CN104598937B (en) * 2015-01-22 2019-03-12 百度在线网络技术(北京)有限公司 The recognition methods of text information and device
CN105511641A (en) * 2016-01-09 2016-04-20 温州智之粹知识产权有限公司 Voice control keyboard
CN107016994B (en) * 2016-01-27 2020-05-08 阿里巴巴集团控股有限公司 Voice recognition method and device
CN105913846B (en) * 2016-05-25 2019-12-06 北京云知声信息技术有限公司 voice registration realization method, device and system
CN106653007B (en) * 2016-12-05 2019-07-16 苏州奇梦者网络科技有限公司 A kind of speech recognition system
CN106843523B (en) * 2016-12-12 2020-09-22 百度在线网络技术(北京)有限公司 Character input method and device based on artificial intelligence
CN108573706B (en) * 2017-03-10 2021-06-08 北京搜狗科技发展有限公司 Voice recognition method, device and equipment
CN107347111A (en) * 2017-05-16 2017-11-14 上海与德科技有限公司 The control method and terminal of terminal
CN109785842B (en) * 2017-11-14 2023-09-05 蔚来(安徽)控股有限公司 Speech recognition error correction method and speech recognition error correction system
CN108520743B (en) * 2018-02-02 2021-01-22 百度在线网络技术(北京)有限公司 Voice control method of intelligent device, intelligent device and computer readable medium
TW202011384A (en) * 2018-09-13 2020-03-16 廣達電腦股份有限公司 Speech correction system and speech correction method
CN109767763B (en) * 2018-12-25 2021-01-26 苏州思必驰信息科技有限公司 Method and device for determining user-defined awakening words
CN111739514B (en) * 2019-07-31 2023-11-14 北京京东尚科信息技术有限公司 Voice recognition method, device, equipment and medium
CN110503958A (en) * 2019-08-30 2019-11-26 厦门快商通科技股份有限公司 Audio recognition method, system, mobile terminal and storage medium
CN110992959A (en) * 2019-12-06 2020-04-10 北京市科学技术情报研究所 Voice recognition method and system
CN111144096B (en) * 2019-12-11 2023-09-29 心医国际数字医疗系统(大连)有限公司 Pinyin completion training method, completion model, completion method and completion input method based on HMM
CN111090338B (en) * 2019-12-11 2021-08-27 心医国际数字医疗系统(大连)有限公司 Training method of HMM (hidden Markov model) input method model of medical document, input method model and input method
CN115346531B (en) * 2022-08-02 2024-08-09 启迪万众网络科技(北京)有限公司 Voice-to-text recognition system for voice media processing

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1357821A (en) * 2000-12-15 2002-07-10 丽台科技股份有限公司 Phonetic input method
US20060177017A1 (en) * 2005-02-08 2006-08-10 Denso Corporation Device for converting voice to numeral
CN1896947A (en) * 2005-07-14 2007-01-17 光宝科技股份有限公司 Character inputting method and computer system therewith
CN1901041A (en) * 2005-07-22 2007-01-24 康佳集团股份有限公司 Voice dictionary forming method and voice identifying system and its method



Similar Documents

Publication Publication Date Title
CN101067780B (en) Character inputting system and method for intelligent equipment
KR100656736B1 (en) System and method for disambiguating phonetic input
JP4829901B2 (en) Method and apparatus for confirming manually entered indeterminate text input using speech input
US8311829B2 (en) Multimodal disambiguation of speech recognition
TWI266280B (en) Multimodal disambiguation of speech recognition
US7848926B2 (en) System, method, and program for correcting misrecognized spoken words by selecting appropriate correction word from one or more competitive words
CN100472411C (en) Method for cancelling character string in inputting method and word inputting system
CN102272827B (en) Method and apparatus utilizing voice input to resolve ambiguous manually entered text input
US20080180283A1 (en) System and method of cross media input for chinese character input in electronic equipment
CN101334704B (en) Multichannel Chinese input method facing to mobile equipment
CA2487614A1 (en) Method for entering text
JP3476007B2 (en) Recognition word registration method, speech recognition method, speech recognition device, storage medium storing software product for registration of recognition word, storage medium storing software product for speech recognition
CA2613154A1 (en) Dictionary lookup for mobile devices using spelling recognition
US20020069058A1 (en) Multimodal data input device
JPWO2008018274A1 (en) Character conversion device and method for controlling character conversion device
US20080002885A1 (en) Method of learning a context of a segment of text, and associated handheld electronic device
JP4230142B2 (en) Hybrid oriental character recognition technology using keypad / speech in adverse environment
KR101250897B1 (en) Apparatus for word entry searching in a portable electronic dictionary and method thereof
CN1965349A (en) Multimodal disambiguation of speech recognition
CN111429886B (en) Voice recognition method and system
JP2002189490A (en) Method of pinyin speech input
JP2007535692A (en) System and method for computer recognition and interpretation of arbitrarily spoken characters
KR101777141B1 (en) Apparatus and method for inputting chinese and foreign languages based on hun min jeong eum using korean input keyboard
CN1357821A (en) Phonetic input method
KR20100062831A (en) Apparatus and method for generating n-best hypothesis based on confusion matrix and confidence measure in speech recognition of connected digits

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: SHENZHEN SHIJI LIGHT SPEED INFORMATION TECHNOLOGY

Free format text: FORMER OWNER: TENGXUN SCI-TECH (SHENZHEN) CO., LTD.

Effective date: 20131015

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20131015

Address after: 16th floor, Tencent Building, science and technology street, Nanshan District, Shenzhen, Guangdong 518057

Patentee after: Shenzhen Shiji Guangsu Information Technology Co., Ltd.

Address before: Floor 7, Fiyta High-tech Building, High-tech South Road, High-tech Park, Shenzhen, Guangdong 518057

Patentee before: Tencent Technology (Shenzhen) Co., Ltd.