CN111475129A - Method and equipment for displaying candidate homophones through voice recognition

Publication number
CN111475129A
Authority
CN
China
Prior art keywords
words
candidate
word
main
displaying
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910067927.XA
Other languages
Chinese (zh)
Inventor
周末
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Abstract

The invention discloses a method for displaying candidate homophones for voice recognition, comprising the following steps: receiving voice-recognized data from a server; parsing the data and determining whether candidate words exist in the data; and, if candidate words exist, displaying the word with the highest recognition probability as the main word in hyperlink form, so that the main word can be clicked. The application also provides a corresponding electronic device and a computer-readable storage medium. The disclosed technical solution makes smart devices more intelligent in voice recognition and spares the user from manually retyping text after voice recognition.

Description

Method and equipment for displaying candidate homophones through voice recognition
Technical Field
The invention relates to the technical field of voice recognition, in particular to a method and equipment for displaying candidate homophones for voice recognition.
Background
With the development of technology, people often use the voice recognition function of applications on smart devices (such as mobile devices and handheld devices). However, because of the breadth and complexity of the Chinese language, the accuracy of existing speech recognition cannot reach 99.99%. The specific reasons are as follows:
1. The Chinese character encoding used in software is generally GB2312, which covers 6763 Chinese characters and excludes archaic characters, whereas Chinese has 21 initials, 35 finals, four tones and 400 syllables. The number of syllables is therefore far smaller than the number of characters; that is, Chinese contains a large number of homophonic characters and homophonic words.
2. Automatic speech recognition (ASR) is a technology that enables machines to "understand" human speech. The main flow of speech recognition is shown in FIG. 1:
first, signal processing such as noise reduction and framing is performed on an input segment of speech;
then, features are extracted from the result of signal processing, and acoustic pattern matching is performed based on an acoustic model;
finally, language processing is performed based on a language model to obtain the text result corresponding to the speech.
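The framing part of the first stage can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `frame_signal`, the 16 kHz sample rate, and the 25 ms frame with a 10 ms hop are all assumptions chosen only because they are common values in speech front ends.

```python
from typing import List, Sequence


def frame_signal(samples: Sequence[float], frame_len: int, hop_len: int) -> List[List[float]]:
    """Split a 1-D signal into overlapping frames; a trailing partial frame is dropped."""
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame_len and hop_len must be positive")
    frames = []
    # Each frame starts hop_len samples after the previous one.
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(list(samples[start:start + frame_len]))
    return frames


# Assumed values: 16 kHz audio, 25 ms frames, 10 ms hop.
SAMPLE_RATE = 16_000
FRAME_LEN = SAMPLE_RATE * 25 // 1000   # 400 samples per frame
HOP_LEN = SAMPLE_RATE * 10 // 1000     # 160 samples between frame starts

one_second_of_silence = [0.0] * SAMPLE_RATE
frames = frame_signal(one_second_of_silence, FRAME_LEN, HOP_LEN)
```

Feature extraction and acoustic matching would then operate on each frame rather than on the raw signal.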
According to the flow shown in FIG. 1, speech is converted into text in the language processing stage. The main principle of language processing is: receive an acoustic sequence (which can be loosely understood as pinyin) and, using language models trained on large amounts of text together with context semantics and statistical rules, output the result with the highest recognition probability for that sequence; this result is the finally recognized text.
The above process is illustrated below by a simple example:
Step 1: Voice input: "yue fu". Because a patent document must be expressed in text, pinyin is used here to stand for what is actually an input sound signal.
Step 2: The first syllable, "yue", is recognized; it can correspond to many characters, for example: month, about, over, happy, etc. Because more context is still to come, no result is returned yet.
Step 3: The second syllable is recognized, giving "yue fu". Combined with the preceding syllable, the recognition result changes significantly: combinations that do not form words in everyday use are excluded, and the candidate homophones identified may be: monthly payment, Yue father, Yufu, Lefu, etc. According to the judgment of the language model, the word with the highest recognition probability among these homophones is selected and returned as the recognition result.
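The selection in step 3 amounts to taking the candidate with the highest score. The sketch below illustrates this under stated assumptions: the `LEXICON` table and `recognize` function are hypothetical, a real language model would compute scores from context rather than look them up, and the three words and their probabilities are borrowed from the example given in the first embodiment.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical toy lexicon: syllable sequence -> candidate homophones with scores.
LEXICON: Dict[Tuple[str, ...], Dict[str, float]] = {
    ("yue", "fu"): {"monthly payment": 0.87, "Yue father": 0.67, "Yufu": 0.32},
}


def recognize(syllables: Sequence[str]) -> Optional[str]:
    """Return the highest-probability word for the syllables, or None if unknown."""
    candidates = LEXICON.get(tuple(syllables), {})
    if not candidates:
        return None
    # Conventional recognition keeps only the top-scoring candidate.
    return max(candidates, key=candidates.get)


result = recognize(["yue", "fu"])
```

Because only the top-scoring word is returned, the other homophones are lost at this point, which is the limitation the rest of the document addresses.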
The probability algorithm relies on a language model trained on a large amount of text; the more text it is trained on, the more likely it is to recognize speech accurately. However, this prior art performs poorly on context semantics that occur with low probability and in other special cases.
For these reasons, when a smart device is used for voice recognition in daily life, the spoken word often has homophones, and the text displayed after recognition is not the word the user intended. In the prior art, the user then usually has to re-enter the text manually with an input method to change it to the intended word.
Therefore, under current mainstream voice recognition schemes, when a word that has homophones is spoken, only the more frequently used word is recognized, and the word the user wants to express cannot be recognized correctly. In the example above, the word recognized with the highest probability is "monthly payment", but the speaker's intent is "Yue father". To correct it, the user can only delete the original text and type it again manually, and when a large amount of text has been entered, the part to be modified must be searched for line by line. These problems seriously reduce the intelligence of the smart device.
Disclosure of Invention
The embodiments of the invention provide a method and a device for displaying candidate homophones for voice recognition, and a computer-readable storage medium, so that the user does not have to retype text manually after voice recognition.
An embodiment of the application discloses a method for displaying candidate homophones for voice recognition, comprising the following steps:
receiving voice-recognized data from a server;
parsing the data and determining whether candidate words exist in the data;
and, if candidate words exist, displaying the word with the highest recognition probability as the main word in hyperlink form, so that the main word can be clicked.
Preferably, the method further comprises:
underlining the main word;
or displaying the main word in a color different from that of other words;
or underlining the main word and displaying it in a color different from that of other words.
Preferably, the method further comprises:
when a click operation on the main word is detected, displaying the candidate words of the main word in a candidate word display box.
Preferably, the candidate words of the main word are displayed in descending order of recognition probability.
Preferably, the method further comprises:
when a selection operation on any candidate word is detected, displaying the selected candidate word in the main text and hiding the candidate word display box.
Preferably, displaying the selected candidate word in the main text comprises:
displaying the selected candidate word in hyperlink form, so that it can be clicked.
Preferably, the candidate word displayed in the main text is underlined;
or it is displayed in a color different from that of other words;
or it is underlined and displayed in a color different from that of other words.
Preferably, the method further comprises:
when a click operation on the candidate word displayed in the main text is detected, displaying the main word and the other candidate words in the candidate word display box.
The embodiment of the application also discloses an electronic device, which comprises a memory, a processor and a computer program which is stored on the memory and can be run on the processor, wherein the processor executes the program to realize the following steps:
receiving the voice-recognized data from the server;
analyzing the data and judging whether candidate words exist in the data or not;
and if the candidate words exist, displaying the words with the highest recognition probability as main words in a hyperlink mode, wherein the main words can be clicked.
Preferably, the processor executes the program to further implement the following steps:
underlining the main word;
or, the main word is presented in a different color from the other words;
alternatively, the main word is underlined below the main word and is presented in a different color from the other words.
Preferably, the processor executes the program to further implement the following steps:
and when the clicking operation on the main word is detected, displaying the candidate words of the main word in sequence by using a candidate word display frame according to the sequence from high to low of the recognition probability.
Preferably, the processor executes the program to further implement the following steps:
and when the selection operation of any candidate word is detected, displaying the selected candidate word in the main text in a hyperlink mode, wherein the candidate word can be clicked, and hiding a candidate word display box.
Preferably, the processor executes the program to further implement the following steps:
underlining below the candidate word presented in the main text;
or displaying the candidate words shown in the main text in a color different from other words;
or, underlining the candidate word displayed in the main text and displaying the candidate word displayed in the main text in a color different from other words.
Preferably, the processor executes the program to further implement the following steps:
and when the clicking operation on the candidate words displayed in the main text is detected, displaying the main words and other candidate words by using the candidate word display box.
Embodiments of the application also provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the method steps of any of claims 1-8.
The method and device for displaying candidate homophones for voice recognition provided by the embodiments of the invention improve the voice recognition function of existing smart terminals: for a recognition result that has homophones, the recognized candidate homophones are displayed so that the user can choose among them by clicking. This improves the intelligence of the smart device in voice recognition and spares the user from retyping text manually.
Drawings
FIG. 1 is a schematic diagram of a main flow of conventional speech recognition;
FIG. 2 is a flowchart illustrating a method for displaying candidate homophones for speech recognition according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a conventional JSON data format;
FIG. 4 is a schematic diagram of an exemplary interface showing a word with the highest recognition probability according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating an interactive selection method for candidate homophones in speech recognition according to a second embodiment of the present invention;
FIG. 6 is a schematic diagram of an exemplary interface for displaying candidate words according to a second embodiment of the present invention;
FIG. 7 is a schematic diagram of an interface showing a candidate word selected by a user according to a second embodiment of the present invention;
FIG. 8 is a schematic interface diagram illustrating a candidate word display box for displaying candidate words according to a second embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and examples.
The embodiments of the invention provide a technical solution that improves the voice recognition function of a smart terminal: for a voice recognition result that has homophones, the recognized candidate homophones are displayed together so that the user can choose among them by clicking. This improves the intelligence of the smart device in voice recognition and spares the user from retyping text manually.
The method for displaying the candidate homophones for voice recognition provided by the embodiment of the invention comprises the following steps:
firstly, receiving data after voice recognition from a server;
then, analyzing the data and judging whether candidate words exist in the data or not;
and if the candidate words exist, displaying the words with the highest recognition probability as main words in a hyperlink mode, wherein the main words can be clicked.
The main word may be displayed in a manner of underlining the lower part of the main word, displaying the main word in a color different from that of other words, or underlining the lower part of the main word and displaying the main word in a color different from that of other words, so as to distinguish the main word from other words.
As described above, a main word may be clicked, and when a click operation on the main word is detected, a candidate word of the main word is displayed by using a candidate word display box, so that a candidate word to be selected is displayed to a user. When the candidate words are presented, the candidate words can be presented in sequence from high to low in recognition probability.
In addition, when a selection operation on any candidate word is detected, this indicates that the user wishes to use that word as the new main word; the selected candidate word is therefore displayed in the main text, and the previously displayed candidate word display box is hidden. The selected candidate word can itself be displayed in hyperlink form, as described above, so that it can be clicked.
Similarly, for the candidate words displayed in the main text, underlining may be performed below the candidate words, or displaying the candidate words in a color different from that of other words, or underlining and displaying the candidate words in a color different from that of other words.
And when the clicking operation on the candidate words displayed in the main text is detected, displaying the main words and other candidate words by using the candidate word display box, so that other candidate words which can be selected are displayed to the user.
The technical solution of the present application is described in further detail below through three preferred embodiments:
Embodiment one:
A flowchart of the method for displaying candidate homophones for speech recognition according to the first embodiment of the invention is shown in FIG. 2 and includes the following steps:
Step 1: The client receives the voice-recognized data from the server.
In this embodiment, the client refers to an application client providing a voice recognition function in the smart device.
During speech recognition, the server extracts 1 to N homophones according to the existing speech model algorithm and returns them to the client. A common JSON data format is shown in FIG. 3.
Still taking "yue fu" from the Background as an example, the server returns three homophones to the client, namely "monthly payment", "Yue father" and "Yufu", together with their respective recognition probabilities: 0.87, 0.67 and 0.32.
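FIG. 3 is not reproduced in this text, so the exact JSON layout is unknown. The sketch below assumes a hypothetical payload with a `candidates` array of `text` and `probability` fields, and shows how a client might parse it; the field names, the `Candidate` class and `parse_candidates` are illustrative and are not the format defined by the patent.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    probability: float


# Hypothetical payload; the real field names in FIG. 3 may differ.
SAMPLE_PAYLOAD = """
{
  "candidates": [
    {"text": "monthly payment", "probability": 0.87},
    {"text": "Yue father", "probability": 0.67},
    {"text": "Yufu", "probability": 0.32}
  ]
}
"""


def parse_candidates(payload: str) -> List[Candidate]:
    """Parse the server response into a list of candidates (empty if none are given)."""
    data = json.loads(payload)
    return [
        Candidate(item["text"], float(item["probability"]))
        for item in data.get("candidates", [])
    ]


candidates = parse_candidates(SAMPLE_PAYLOAD)
```

Returning an empty list when the `candidates` key is absent lets the client fall back to conventional display without special-casing the response.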
Step 2: The client parses the data returned by the server.
Step 3: The client determines whether the returned data contains candidate words. If so, step 4 is executed; otherwise, the text is displayed in the conventional manner and the flow ends.
Step 4: The word with the highest recognition probability is displayed in a hyperlink-like manner: it is underlined and can be clicked.
Preferably, the word may also be presented in a color different from that of the other words. In this embodiment, the word with the highest recognition probability is referred to as the "main word", a term defined relative to the "candidate words".
An exemplary interface presenting the word with the highest recognition probability is shown in FIG. 4. According to step 1 of this embodiment, "monthly payment" has the highest recognition probability, so in the interface shown in FIG. 4 it is displayed in hyperlink form: it is clickable, set in a blue font and underlined.
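Steps 3 and 4 can be sketched as a small rendering function. The patent does not say how the hyperlink-like display is implemented, so the HTML anchor markup, the `main-word` class name and the function `render_word` are assumptions; the sketch also assumes that "candidate words exist" means the server returned more than one word.

```python
from html import escape
from typing import List, Tuple


def render_word(words: List[Tuple[str, float]]) -> str:
    """Render one recognized word given (text, probability) pairs from the server."""
    if not words:
        return ""
    # The main word is the one with the highest recognition probability.
    main_text, _ = max(words, key=lambda w: w[1])
    if len(words) == 1:
        # No candidate words: display in the conventional way (plain text).
        return escape(main_text)
    # Candidate words exist: clickable, blue and underlined, as in FIG. 4.
    return (
        '<a href="#" class="main-word" '
        'style="color: blue; text-decoration: underline">'
        f"{escape(main_text)}</a>"
    )


markup = render_word([("monthly payment", 0.87), ("Yue father", 0.67), ("Yufu", 0.32)])
```

On a native mobile client the same decision would drive a clickable text span rather than HTML; only the branch on whether candidates exist is essential.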
Embodiment two:
After the word with the highest recognition probability has been displayed according to the first embodiment, candidate homophones can be selected according to the interaction method of the second embodiment of the invention, shown in FIG. 5, which includes:
Step 1: A click operation by the user is detected.
Step 2: If the word the user clicked is clickable, it has candidate words and step 3 is executed; otherwise, the flow ends.
Step 3: The candidate words of that word are displayed in a candidate word display box.
An exemplary interface presenting candidate words is shown in FIG. 6, which presents two candidate words for "monthly payment": "Yue father" and "Yufu". Preferably, the candidate words are displayed in descending order of recognition probability, and at most N candidate words are displayed at a time, for example N = 3. If there are more than N candidate words, the user can slide the display box to view the rest.
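The ordering and the limit of N visible candidates can be sketched as follows. The function `candidate_box` and its `offset` parameter, which stands in for how far the user has slid the box, are hypothetical; the patent does not specify how sliding is implemented.

```python
from typing import List, Tuple


def candidate_box(
    words: List[Tuple[str, float]], main: str, n: int = 3, offset: int = 0
) -> List[str]:
    """Return the candidate words visible in the display box.

    Candidates exclude the current main word, are sorted by descending
    recognition probability, and at most n are shown starting at offset.
    """
    others = sorted(
        (w for w in words if w[0] != main), key=lambda w: w[1], reverse=True
    )
    return [text for text, _ in others[offset:offset + n]]


words = [("Yufu", 0.32), ("monthly payment", 0.87), ("Yue father", 0.67)]
visible = candidate_box(words, main="monthly payment")
```

With the three example words, both remaining candidates fit in the box, so no sliding is needed; a longer list would be viewed by increasing `offset`.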
Step 4: When it is detected that the user has selected a candidate word, step 5 is executed.
Step 5: The candidate word selected by the user becomes the new main word and is displayed in the main text, and the candidate word display box is hidden.
Preferably, the candidate word selected by the user is also presented in a hyperlink-like manner: underlined and clickable. Assuming the user selects "Yue father", the resulting interface is as shown in FIG. 7.
When the user clicks the current main word "Yue father" again, the original main word "monthly payment" is displayed in the candidate word display box together with the other candidate word "Yufu", as shown in FIG. 8. For the same word, the user can switch between candidate words by clicking repeatedly.
Embodiment three:
In the second embodiment, the original main word is replaced with a candidate word selected by the user. The original main word, now shown in the candidate word display box, can be restored as the main word by selecting it again, as described in this embodiment.
Continuing from FIG. 8 of the second embodiment, the steps are as follows:
Step 1: When it is detected that the user has clicked the current main word "Yue father", the original main word "monthly payment" and the other candidate word "Yufu" are displayed together in the candidate word display box.
Step 2: When it is detected that the user has selected "monthly payment" in the candidate word display box, step 3 is executed.
Step 3: "Monthly payment", selected by the user, is displayed in the main text as the new main word, and the candidate word display box is hidden.
At this point, the original main word "monthly payment" is the main word again and is displayed in the main text, and the other candidate words are hidden.
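The interaction in the second and third embodiments can be modeled as a small piece of state: one current main word, the full set of homophones, and whether the display box is visible. The class below is a hypothetical sketch, not the patent's implementation; the names `HomophoneSlot`, `click_main` and `select` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HomophoneSlot:
    """State of one recognized word that has candidate homophones."""

    probabilities: Dict[str, float]
    main: str = ""
    box_visible: bool = False

    def __post_init__(self) -> None:
        if not self.main:
            # Initially the main word is the highest-probability homophone.
            self.main = max(self.probabilities, key=self.probabilities.get)

    def click_main(self) -> List[str]:
        """Show the box: every homophone except the current main word."""
        self.box_visible = True
        others = [w for w in self.probabilities if w != self.main]
        return sorted(others, key=self.probabilities.get, reverse=True)

    def select(self, word: str) -> None:
        """Make the chosen candidate the new main word and hide the box."""
        if word not in self.probabilities or word == self.main:
            raise ValueError(f"{word!r} is not a selectable candidate")
        self.main = word
        self.box_visible = False


slot = HomophoneSlot({"monthly payment": 0.87, "Yue father": 0.67, "Yufu": 0.32})
first_box = slot.click_main()     # candidates offered for "monthly payment"
slot.select("Yue father")         # embodiment two: swap in a candidate
second_box = slot.click_main()    # original main word is now a candidate
slot.select("monthly payment")    # embodiment three: restore the original
```

Keeping all homophones in one mapping and tracking only which one is current means that restoring the original main word needs no special case: it is simply another selection.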
Corresponding to the above method, an embodiment of the present application further provides an electronic device, whose constituent structure is shown in fig. 9, and includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the following steps:
receiving the voice-recognized data from the server;
analyzing the data and judging whether candidate words exist in the data or not;
and if the candidate words exist, displaying the words with the highest recognition probability as main words in a hyperlink mode, wherein the main words can be clicked.
Preferably, the processor executes the program to further implement the following steps:
underlining the main word;
or, the main word is presented in a different color from the other words;
alternatively, the main word is underlined below the main word and is presented in a different color from the other words.
Preferably, the processor executes the program to further implement the following steps:
and when the clicking operation on the main word is detected, displaying the candidate words of the main word in sequence by using a candidate word display frame according to the sequence from high to low of the recognition probability.
Preferably, the processor executes the program to further implement the following steps:
and when the selection operation of any candidate word is detected, displaying the selected candidate word in the main text in a hyperlink mode, wherein the candidate word can be clicked, and hiding a candidate word display box.
Preferably, the processor executes the program to further implement the following steps:
underlining below the candidate word presented in the main text;
or displaying the candidate words shown in the main text in a color different from other words;
or, underlining the candidate word displayed in the main text and displaying the candidate word displayed in the main text in a color different from other words.
Preferably, the processor executes the program to further implement the following steps:
and when the clicking operation on the candidate words displayed in the main text is detected, displaying the main words and other candidate words by using the candidate word display box.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method for presenting the candidate homophones for speech recognition according to the embodiment of the present application.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (15)

1. A method for displaying candidate homophones for speech recognition is characterized by comprising the following steps:
receiving the voice-recognized data from the server;
analyzing the data and judging whether candidate words exist in the data or not;
and if the candidate words exist, displaying the words with the highest recognition probability as main words in a hyperlink mode, wherein the main words can be clicked.
2. The method of claim 1, further comprising:
underlining the main word;
or, the main word is presented in a different color from the other words;
alternatively, the main word is underlined below the main word and is presented in a different color from the other words.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
and when the clicking operation on the main word is detected, displaying the candidate word of the main word by using a candidate word display frame.
4. The method of claim 3, wherein:
and displaying the candidate words of the main word in sequence according to the sequence of the recognition probability from high to low.
5. The method of claim 3, further comprising:
and when the selection operation of any candidate word is detected, displaying the selected candidate word in the main text and hiding the candidate word display box.
6. The method of claim 5, wherein presenting the selected candidate word in a main text comprises:
and displaying the selected candidate words in a hyperlink mode, wherein the candidate words can be clicked.
7. The method of claim 6, wherein:
underlining below the candidate word presented in the main text;
or displaying the candidate words shown in the main text in a color different from other words;
or, underlining the candidate word displayed in the main text and displaying the candidate word displayed in the main text in a color different from other words.
8. The method of claim 6, further comprising:
and when the clicking operation on the candidate words displayed in the main text is detected, displaying the main words and other candidate words by using the candidate word display box.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of:
receiving the voice-recognized data from the server;
analyzing the data and judging whether candidate words exist in the data or not;
and if the candidate words exist, displaying the words with the highest recognition probability as main words in a hyperlink mode, wherein the main words can be clicked.
10. The electronic device of claim 9, wherein the processor, when executing the program, further performs the steps of:
underlining the main word;
or, the main word is presented in a different color from the other words;
alternatively, the main word is underlined below the main word and is presented in a different color from the other words.
11. The electronic device according to claim 9 or 10, wherein the processor when executing the program further performs the steps of:
and when the clicking operation on the main word is detected, displaying the candidate words of the main word in sequence by using a candidate word display frame according to the sequence from high to low of the recognition probability.
12. The electronic device of claim 11, wherein the processor, when executing the program, further performs the steps of:
and when the selection operation of any candidate word is detected, displaying the selected candidate word in the main text in a hyperlink mode, wherein the candidate word can be clicked, and hiding a candidate word display box.
13. The electronic device of claim 12, wherein the processor, when executing the program, further performs the steps of:
underlining below the candidate word presented in the main text;
or displaying the candidate words shown in the main text in a color different from other words;
or, underlining the candidate word displayed in the main text and displaying the candidate word displayed in the main text in a color different from other words.
14. The electronic device of claim 13, wherein the processor, when executing the program, further performs the steps of:
and when the clicking operation on the candidate words displayed in the main text is detected, displaying the main words and other candidate words by using the candidate word display box.
15. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method steps of any one of claims 1 to 8.
CN201910067927.XA 2019-01-24 2019-01-24 Method and equipment for displaying candidate homophones through voice recognition Pending CN111475129A (en)

Publications (1)

Publication Number Publication Date
CN111475129A true CN111475129A (en) 2020-07-31

Family

ID=71743484

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112181167A (en) * 2020-10-27 2021-01-05 维沃移动通信有限公司 Input method candidate word processing method and electronic equipment
CN112530421A (en) * 2020-11-03 2021-03-19 科大讯飞股份有限公司 Voice recognition method, electronic equipment and storage device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103645876A (en) * 2013-12-06 2014-03-19 百度在线网络技术(北京)有限公司 Voice inputting method and device
US20140358533A1 (en) * 2013-05-30 2014-12-04 International Business Machines Corporation Pronunciation accuracy in speech recognition
CN107316639A (en) * 2017-05-19 2017-11-03 北京新美互通科技有限公司 Data input method and device based on speech recognition, and electronic equipment
CN108711422A (en) * 2018-05-14 2018-10-26 腾讯科技(深圳)有限公司 Audio recognition method, device, computer readable storage medium and computer equipment
CN109215630A (en) * 2018-11-14 2019-01-15 北京羽扇智信息科技有限公司 Real-time speech recognition method, apparatus, equipment and storage medium
CN110808049A (en) * 2018-07-18 2020-02-18 深圳市北科瑞声科技股份有限公司 Voice annotation text correction method, computer device and storage medium

Similar Documents

Publication Title
CN108255290B (en) Modal learning on mobile devices
TWI443551B (en) Method and system for an input method editor and computer program product
TWI437449B (en) Multi-mode input method and input method editor system
US20160103812A1 (en) Typing assistance for editing
AU2005229636B2 (en) Generic spelling mnemonics
CN111462740A (en) Voice command matching for voice-assisted application prototyping for non-speech alphabetic languages
CN110147549A (en) Method and system for performing text error correction
JP7400112B2 (en) Biasing alphanumeric strings for automatic speech recognition
CN111475129A (en) Method and equipment for displaying candidate homophones through voice recognition
CN111209367A (en) Information searching method, information searching device, electronic equipment and storage medium
CN113743102B (en) Method and device for recognizing characters and electronic equipment
CN114550693A (en) Multilingual voice translation method and system
CN113268981A (en) Information processing method and device and electronic equipment
JP5318030B2 (en) Input support apparatus, extraction method, program, and information processing apparatus
CN109144284B (en) Information display method and device
CN113722467B (en) Processing method, system, device and storage medium for user search intention
CN100565553C (en) Method and system for handwriting input of Asian languages
JP2005149042A (en) Voice input translation system and translation program
CN118051593A (en) Data processing method and device and electronic equipment
KR20240063576A (en) Apparatus and method for user interface for pronunciation analysis
CN114911896A (en) Voice-based searching method and related equipment
CN116595137A (en) Question answering method and device, electronic equipment and storage medium
CN115877997A (en) Interactive element-oriented voice interaction method, system and storage medium
WO2022104297A1 (en) Multimodal input-based data selection and command execution
CN115270769A (en) Text error correction method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination