WO2006002219A2 - Systems and methods for spell correction of non-roman characters and words - Google Patents
Systems and methods for spell correction of non-roman characters and words
- Publication number
- WO2006002219A2 (application PCT/US2005/022027)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- input
- entry
- language
- questionable
- user input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/232—Orthographic correction, e.g. spell checking or vowelisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
- G06F40/129—Handling non-Latin characters, e.g. kana-to-kanji conversion
Definitions
- the present invention relates generally to processing non-Roman based languages. More specifically, systems and methods to process and correct spelling errors for non-Roman based words such as in Chinese, Japanese, and Korean languages using a rule-based classifier and a hidden Markov model are disclosed.
- Description of Related Art Spell correction generally includes detecting erroneous words and determining appropriate replacements for the erroneous words.
- an effective spell checker for a non-Roman based language should make use of contextual information to determine which characters and/or words in context are not suitable.
- Spell correction for non-Roman languages such as CJK languages is also complex and challenging in that there are no standard dictionaries in such languages because the definitions of CJK words are not clear-cut. For example, some may regard "Beijing city" in Chinese as one word while others may regard it as two words.
- the English dictionary/wordlist lookup is a key feature in English spell correction and thus English spell correction methods cannot be easily adapted for use in CJK languages.
- the systems and methods use transformation rules, hidden Markov models and similarity matrix of confusing characters.
- the similarity between a pair of confusing characters may be a positive number if the characters have the same pronunciation and/or share some input keystrokes in simplified or traditional Chinese. Otherwise, the value is zero.
- the similarity may have a Boolean value, e.g., 1 for a pair of confusing characters and 0 for a pair of non-confusing characters.
- the systems and methods are particularly applicable to web-based search engines and downloadable applications at client sites, e.g., implemented in a toolbar or deskbar, but are applicable to various other applications.
- the present invention can be implemented in numerous ways, including as a process, an apparatus, a system, a device, a method, or a computer readable medium such as a computer readable storage medium or a computer network wherein program instructions are sent over optical or electronic communication lines.
- the term computer generally refers to any device with computing power such as personal digital assistants (PDAs), cellular telephones, and network switches.
- the method generally includes converting an input entry in a first language such as Chinese to at least one intermediate entry in an intermediate representation, such as pinyin, different from the first language, converting the intermediate entry to at least one possible alternative spelling of the input in the first language, and determining that the input entry is either a correct or questionable input entry when a match between the input entry and all possible alternative spellings to the input entry is or is not located, respectively.
- pinyin refers to all phonetic notations for Chinese, simplified or traditional, including zhuyin fuhao (Bopomofo), i.e., "The Notation of Annotated Sounds." Similarity between pairs of confusing characters in the first language can be defined according to common tokens in the intermediate representation.
- the questionable input entry may be classified using, for example, a transformation rule based classifier based on transformation rules generated by a transformation rules generator.
- Various other classifiers such as decision tree and neural network classifiers may be similarly employed.
- the converting may include converting multiple input entries, such as user queries in a query log.
- the method may further include classifying, e.g., by a transformation rule based classifier, the questionable entry as a correctly spelled or an incorrectly spelled entry based on a set of rules such as spell correction transformation rules. Users' votes, e.g., query logs and/or webpages, are preferably utilized to generate the transformation rules.
- the method may also include generating and training the spell correction transformation rules using a transformation rules generator using the questionable input entry and the possible alternative spellings.
- the method may further include receiving a user input in the first language, determining whether any of the rules apply to the user input, generating at least one alternate spelling in the first language corresponding to the user input upon determining that at least one rule applies to the user input, comparing a likelihood of the user input with a likelihood of at least one alternate spelling of the user input, and making a spell correction suggestion and/or a spell correction with at least one alternate spelling of the user input that has a higher likelihood than the user input.
- a system generally includes a first converter configured to convert an input in a first language to at least one intermediate representation of the input entry, the intermediate representation being different from the first language, a second converter configured to convert the intermediate representation to at least one possible alternative spelling of the input in the first language, locating a match by comparing the possible alternative spelling to the input entry, and determining that the input entry is a questionable input entry if a match is not located from all the possible alternative spellings and that the input entry is a correct input entry if a match is located.
- a computer program product for use in conjunction with a computer system having a computer readable storage medium on which are stored instructions executable on a computer processor, the instructions generally including receiving an input entry in a first language, converting the input entry to at least one intermediate representation of the input entry, the intermediate representation being different from the first language, converting the intermediate representation to at least one possible alternative spelling in the first language, locating a match by comparing at least one possible alternative spelling to the input entry, and determining that the input entry is a questionable input entry if a match is not located from all the possible alternative spellings and that the input entry is a correct input entry if a match is located.
- An application implementing the system and method may be implemented on a server site such as on a search engine or may be implemented on a client site such as a user's computer, e.g., downloaded, to provide spell corrections for text inputting into a document or to interface with a remote server such as a search engine.
- the client site application may optionally include a user-editable table of stop rule patterns that allows the user to customize the application by specifying that certain spell corrections are disallowed, e.g., never replace X and Y except when X precedes or follows Z.
- FIG. 1 is a block diagram of an illustrative system and method for performing forward and reverse conversions to and from an intermediate form of the non-Roman based language to determine possible alternate spellings for questionable original inputs.
- FIG. 2 is a block diagram of an illustrative system and method for generating spell correction transformation rules from a set of entries.
- FIG. 3 is a flowchart illustrating a process for automatically generating spell correction transformation rules.
- FIG. 4 is a flowchart illustrating a process utilizing the transformation rules for processing an entry to determine spell correction suggestions, if any.
- alternate spelling or alternate form of an input is used herein to refer to an alternate set of characters and/or words different from the input but in the same language as the input, whether the input is a single character or word, a series or collection of characters and/or words, a phrase, a sentence, etc.
- the questionable input entries are identified from input entries and possible alternate spellings are generated by the questionable input entry detector illustrated in FIG. 1.
- the spell correction transformation rules are then generated and trained and the questionable entries are classified as correct or incorrect by the transformation rules generator and classifier as shown in FIG. 2.
- FIG. 1 is a block diagram of an illustrative questionable input entry detector 100 for performing forward and reverse conversions to and from an intermediate form, e.g., pinyin, of simplified Chinese to identify questionable original inputs and to determine possible alternate spellings for questionable original inputs.
- the questionable input entry detector 100 illustrated in FIG. 1 makes use of the convenient fact that pinyin is a commonly-used input method for simplified Chinese. However, any other intermediate form, Roman-based or non-Roman based, may be implemented and utilized. Similarly, the questionable input entry detector 100 may be adapted for use with various other non-Roman based languages. As shown in FIG.
- a word-pinyin converter 104 converts each original entry 102 in Chinese characters into one or more pronunciations or pinyins 106 corresponding to the original entry 102.
- a pinyin-word converter 108 then converts the pinyins 106 to possible spellings 110 in Chinese characters.
- Other suitable converters 104, 108 for converting text in a first language to an intermediate representation and then back to the first language may be employed. Pinyin is merely a convenient intermediate representation for Chinese or simplified Chinese.
- a comparer 112 compares the original entry 102 with the possible spellings 110, both in the first language, to determine if there is a match.
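The forward and reverse conversions of FIG. 1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the tables `CHAR_PINYIN` and `LEXICON`, their toy contents, and all function names are hypothetical, and a plain lexicon lookup stands in for the pinyin-word converter 108.

```python
from itertools import product

# Hypothetical toy data: toneless pinyin readings per character and a tiny
# lexicon of known words with made-up counts.
CHAR_PINYIN = {
    "北": ["bei"], "京": ["jing"], "背": ["bei"],
    "景": ["jing"], "经": ["jing"],
}
LEXICON = {"北京": 900, "背景": 300}


def word_to_pinyin(entry):
    """Forward conversion: every pinyin reading of the entry."""
    readings = product(*(CHAR_PINYIN[ch] for ch in entry))
    return [" ".join(r) for r in readings]


def pinyin_to_words(pinyin):
    """Reverse conversion: lexicon words that share the given reading."""
    return [w for w in LEXICON if pinyin in word_to_pinyin(w)]


def detect(entry):
    """Return (label, possible spellings) for one original entry."""
    spellings = []
    for pinyin in word_to_pinyin(entry):
        for word in pinyin_to_words(pinyin):
            if word not in spellings:
                spellings.append(word)
    # The entry is "correct" only if the round trip reproduces it.
    label = "correct" if entry in spellings else "questionable"
    return label, spellings
```

With this toy data, `detect("北经")` labels the entry questionable because neither round-trip spelling matches it, while `detect("北京")` finds a match.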
- Pinyin is a phonetic input method used mainly for inputting simplified Chinese characters. As referred to herein, pinyin generally refers to a phonetic representation of Chinese characters, with or without representation of the tones associated with the Chinese characters.
- pinyin refers to all phonetic notations for Chinese, simplified or traditional, including zhuyin fuhao (Bopomofo), i.e., "The Notation of Annotated Sounds." Pinyin uses Roman characters and has a vocabulary listed in the form of multiple syllable words. Because Chinese has numerous homographs and homophones, each original entry 102 may be converted into multiple pinyins 106 by the word-pinyin converter 104 and, similarly, each pinyin 106 may be converted into multiple possible spellings in Chinese characters 110 by the pinyin-word converter 108.
- one phonetic syllable may correspond to many different Hanzi.
- the pronunciation of "yi" in Mandarin can correspond to over 100 Hanzi.
- the processes implemented by the word-pinyin converter 104 and the pinyin-word converter 108 of converting each original entry 102 to pinyin 106 and then back to Chinese characters 110 may be non-trivial given the large proportion of Chinese words that are homographs and/or homophones.
- the systems and methods as described herein use transformation rules, hidden Markov models and similarity matrix of confusing characters.
- the similarity between a pair of confusing characters may be a positive number if the characters have similar pronunciation, share similar input keystrokes, and/or are similarly spelled, i.e., visually similar. Otherwise, the value is zero.
- the similarity may have a Boolean value, e.g., 1 for a pair of confusing characters and 0 for a pair of non-confusing characters.
- the similarity between a pair of confusing characters in the first language can be defined according to common tokens in the intermediate representation.
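One way to read the similarity definitions above is as a count of shared intermediate tokens, with the Boolean variant derived from it. The sketch below assumes toneless pinyin as the intermediate representation; the table `CHAR_PINYIN` and both function names are hypothetical, and keystroke or visual similarity is not modeled.

```python
# Hypothetical toy table of toneless pinyin readings per character.
CHAR_PINYIN = {
    "京": {"jing"}, "经": {"jing"}, "景": {"jing"},
    "北": {"bei"}, "长": {"chang", "zhang"}, "张": {"zhang"},
}


def similarity(a, b):
    """Number of pinyin tokens the two characters have in common."""
    if a == b:
        return 0  # a character is not treated as confusable with itself
    return len(CHAR_PINYIN.get(a, set()) & CHAR_PINYIN.get(b, set()))


def is_confusing(a, b):
    """Boolean variant: 1 for a confusing pair, 0 otherwise."""
    return 1 if similarity(a, b) > 0 else 0
```

A full similarity matrix would simply tabulate `similarity(a, b)` for every character pair of interest.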
- Various suitable mechanisms for converting Chinese words to pinyins and for converting pinyins to Chinese words may be implemented.
- various decoders are suitable for translating pinyin to Hanzi (Chinese characters).
- a Viterbi decoder using hidden Markov models may be implemented.
- the training for the hidden Markov models may be achieved, for example, by collecting empirical counts or by computing an expectation and performing an iterative maximization process.
- the Viterbi algorithm is a useful and efficient algorithm to decode the source input according to the output observations of a Markov communication channel.
- the Viterbi algorithm has been successfully implemented in various applications for natural language processing, such as speech recognition, optical character recognition, machine translation, speech tagging, parsing and spell checking.
- various other suitable assumptions may be made in implementing the decoding algorithm.
- the Viterbi algorithm is merely one suitable decoding algorithm that may be implemented by the decoder and various other suitable decoding algorithms such as a finite state machine, a Bayesian network, a decision plane algorithm (a high dimension Viterbi algorithm) or a Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm (a two pass forward/backward Viterbi algorithm) may be implemented.
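As a rough illustration of Viterbi decoding from pinyin to Hanzi, the sketch below treats each pinyin syllable as an observation and each candidate character as a hidden state. All probabilities and table names (`CANDIDATES`, `START`, `TRANS`) are made-up assumptions, the emission probability of a syllable given a candidate character is taken to be 1, and unseen transitions receive a small floor value.

```python
# Hypothetical toy model: candidate characters per syllable, start
# probabilities, and character-to-character transition probabilities.
CANDIDATES = {"bei": ["北", "背"], "jing": ["京", "景", "经"]}
START = {"北": 0.6, "背": 0.4}
TRANS = {("北", "京"): 0.9, ("背", "景"): 0.8, ("北", "经"): 0.05}


def viterbi(observations, candidates, start_p, trans_p, floor=1e-6):
    """Return the most likely character sequence and its probability."""
    # best[state] = (probability of the best path ending in state, path)
    best = {s: (start_p.get(s, floor), [s]) for s in candidates[observations[0]]}
    for obs in observations[1:]:
        step = {}
        for s in candidates[obs]:
            # Keep only the best predecessor for each state.
            prob, path = max(
                ((p * trans_p.get((prev, s), floor), pth)
                 for prev, (p, pth) in best.items()),
                key=lambda t: t[0],
            )
            step[s] = (prob, path + [s])
        best = step
    prob, path = max(best.values(), key=lambda t: t[0])
    return "".join(path), prob
```

For longer inputs, log probabilities would usually replace the raw products to avoid underflow.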
- the questionable entries detected by the questionable input entry detector 100 generally include nearly all spelling errors. However, the questionable entries also generally exhibit a relatively high false-alarm/false-positive rate, i.e., the ratio of the number of correct queries marked as incorrect to the number of incorrect queries.
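The false-alarm rate as defined above is a simple ratio; note that its denominator is the number of incorrect queries rather than the number of correct ones. The function name and the sample counts below are hypothetical.

```python
def false_alarm_rate(correct_marked_incorrect, incorrect_total):
    """Ratio of correct queries flagged as incorrect to incorrect queries."""
    if incorrect_total <= 0:
        raise ValueError("incorrect_total must be positive")
    return correct_marked_incorrect / incorrect_total


# Made-up example: 30 correct queries flagged, 100 truly incorrect queries.
example_rate = false_alarm_rate(30, 100)
```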
- the questionable queries 116 as determined by the questionable entry detector 100 may then be classified as correct or incorrect.
- the classifier may be a Transformation Rule Based classifier, as is preferred, or may be a decision tree classifier, a neural network classifier, and the like.
- FIG. 2 is a block diagram of an illustrative system and method 120 for generating spell correction transformation rules from a set of original entries 102 as processed by the questionable entry detector 100.
- the set of original entries 102 may include user input entries such as query logs for a web search engine and/or entries derived from documents such as those available on the Internet, for example.
- the set of original inputs 102 may include a collection of user queries from the past three weeks or two months, for example.
- Examples of documents may include web content and various publications such as newspaper, books, magazines, webpages, and the like.
- the set of original inputs 102 may be derived from a set, collection or repository of documents, for example, documents written in simplified and/or traditional Chinese available on the Internet. It is noted that the illustrative systems and methods as described herein are particularly applicable in the context of a web search engine and to a search engine for a database containing organized data. However, it is to be understood that the systems and methods may be adapted and employed for various other applications for spelling error detection and correction, particularly for entries in a non-Romanized language.
- the system and method may be adapted for a CJK text input application, e.g., word processing application, that detects and corrects spelling errors.
- the transformation rules generator and classifier 120 implements a transformation based learning algorithm, introduced by Eric Brill, that, during the training process, automatically extracts (learns) and ranks transformation rules according to confidence measurements from training data, e.g., human annotated incorrect spellings. These transformation rules are used by the annotator/voter 124. Note that transformation rules are different from grammar rules used in linguistics in that the transformation rules are based on statistics rather than linguistic knowledge. Thus, for example, if most of the entries incorrectly spell certain words in the same incorrect way, the incorrect spelling would be classified as correct.
- Transformation Rule Based methods are presented in US Pat. No. 6,684,201 issued on Jan. 27, 2004 to Eric Brill and entitled "Linguistic Disambiguation System and Method Using String-Based Pattern Training to Learn to Resolve Ambiguity Sites," the entirety of which is incorporated by reference herein.
- the transformation rules generator 120 generates rules automatically, i.e., unsupervised, by utilizing the users' votes. In other words, the correctness of a pattern of characters is determined according to the majority of votes in the database, e.g., the query logs, rather than human annotated data.
- Each transformation rule is associated with a confidence measurement such that rules with higher confidence measurements are applied later than rules with lower confidence measurements.
- a first transformation rule may specify replacing X with Y if B precedes X.
- a second transformation rule with a higher confidence measurement may specify replacing Y with X if E follows Y.
- the first transformation rule would first be applied to an entry BXE to generate BYE.
- the second transformation rule would then be applied to the resulting entry BYE to convert the entry back to BXE.
- the order that the transformation rules are applied can affect the outcome.
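The BXE example above can be reproduced with a small sketch in which rules are applied in order of increasing confidence, so that higher-confidence rules run later and can override earlier substitutions. The `Rule` class, its field names, and the confidence values are assumptions made for illustration; only single-character targets are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    target: str            # character to replace
    replacement: str       # character to substitute
    confidence: float      # higher-confidence rules are applied later
    preceded_by: str = ""  # required left context (empty means any)
    followed_by: str = ""  # required right context (empty means any)


def apply_rule(rule, entry):
    """Apply one rule at every position whose context matches."""
    out = []
    for i, ch in enumerate(entry):
        matches = (
            ch == rule.target
            and entry[:i].endswith(rule.preceded_by)
            and entry[i + 1:].startswith(rule.followed_by)
        )
        out.append(rule.replacement if matches else ch)
    return "".join(out)


def apply_rules(rules, entry):
    """Apply rules from lowest to highest confidence, recording each step."""
    trace = []
    for rule in sorted(rules, key=lambda r: r.confidence):
        entry = apply_rule(rule, entry)
        trace.append(entry)
    return entry, trace


first = Rule("X", "Y", confidence=0.6, preceded_by="B")
second = Rule("Y", "X", confidence=0.9, followed_by="E")
result, steps = apply_rules([second, first], "BXE")
```

Here `steps` records the intermediate entry BYE followed by the final entry BXE, matching the order-dependence described above.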
- the characters being replaced and the replacement characters may be any component of the entry and need not necessarily be words.
- the condition may be based on any context, part-of-speech tags or grammatical non-terminal labels (e.g., NP for noun phrase).
- each questionable entry 116 and its corresponding possible alternate spellings 110 output by the questionable entry detector 100 is received by the annotator 124 of the spell correction transformation rules generator 120.
- the annotator 124 classifies entries 128 based initially on the initial transformation rules 126 and eventually on the extracted and ranked transformation rules 130.
- the learning phase may be supervised, i.e., by human personnel, and/or unsupervised.
- an initial set of a few common manually created transformation rules is used to automatically annotate a small set of questionable entries, with some human monitoring or without any human monitoring by utilizing users' votes.
- additional transformation rules are generated, preferably also with some human monitoring, and additional questionable entries are annotated.
- the resulting rules, which, for example, govern a significant amount of user traffic with relatively few rules, may be regarded as very reliable and thus correspond to a high confidence measurement. Note that since rules with higher confidence typically have less coverage than those with lower confidence, both rules with high confidence and rules with comparatively lower confidence are used.
- rules for the relatively large number of remaining questionable entries that account for a relatively small proportion of user traffic may be automatically generated without human monitoring, for purposes of cost efficiency.
- One illustrative process 150 for automatically generating such rules is shown in the flowchart of FIG. 3.
- a comparison of Q and the alternate spelling Q' is made at block 156 to determine the characters C in Q that are possibly improper and their substitutions C'.
- a window of width 2N+1 is opened with N preceding characters and N succeeding characters of C.
- any suitable length of context, e.g., 2N+1, may be implemented, and the length of context before and after the character in question may but need not be equal.
- the frequencies F(pre-C, C, post-C) of all subsequences (pre-C, C, post-C) from C_{-N}, ..., C, ..., C_{N} are counted to ensure that the rule is significant, i.e., that the rule covers a reasonably large portion of the spelling errors in the questionable entries.
- Decision block 162 determines whether the rule is reliable, e.g., by using query logs and webpages, i.e., users' voting. If the rule is determined to be reliable, the transformation rule, i.e., substitute C' for C given pre-C, post-C, is extracted. Specifically, the rule is deemed to be reliable if: F(pre-C, C', post-C) > T1 and F(pre-C, C', post-C) / F(pre-C, C, post-C) > T2, where T1 is a minimum significance threshold and T2 is a minimum confidence threshold.
- the process 150 implemented by the transformation rules generator generates rules automatically, i.e., unsupervised, by utilizing the users' votes such that the correctness of a pattern of characters is determined according to the majority of votes in the database, e.g., the query logs, rather than human annotated data. Because the most frequent transformation rules will govern a very large portion of the error patterns, the size of the rule set preferably does not increase rapidly with the number of questionable entries. A minimum occurrence of each rule may also be set to limit the size of the transformation rule set.
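The vote-based rule extraction of process 150 might be sketched as follows. This is a simplified reading under several assumptions: Q and Q' are taken to have equal length so that characters align by position, only the full window of N characters on each side is counted (rather than every subsequence within it), and the query log is a hypothetical mapping from query string to frequency. The names `extract_rules`, `count_pattern`, `t1`, and `t2` are illustrative, and C' denotes the substitution for the questionable character C.

```python
def count_pattern(log, pattern):
    """Frequency of a character pattern across a {query: count} log."""
    return sum(count * query.count(pattern) for query, count in log.items())


def extract_rules(pairs, log, n=1, t1=5, t2=2.0):
    """Extract (pre, C, C', post) rules supported by the users' votes."""
    rules = set()
    for q, q_alt in pairs:
        if len(q) != len(q_alt):
            continue  # this sketch only handles position-aligned pairs
        for i, (c, c_alt) in enumerate(zip(q, q_alt)):
            if c == c_alt:
                continue
            pre, post = q[max(0, i - n):i], q[i + 1:i + 1 + n]
            f_orig = count_pattern(log, pre + c + post)
            f_alt = count_pattern(log, pre + c_alt + post)
            # Significance: F(pre, C', post) > T1.
            # Confidence: F(pre, C', post) / F(pre, C, post) > T2, written
            # as a product so that a zero denominator needs no special case.
            if f_alt > t1 and f_alt > t2 * f_orig:
                rules.add((pre, c, c_alt, post))
    return rules


# Made-up query log and one questionable entry with its alternate spelling.
LOG = {"北京大学": 50, "北经大学": 3}
PAIRS = [("北经大学", "北京大学")]
```

With these toy counts, `extract_rules(PAIRS, LOG)` yields a single rule replacing 经 with 京 between 北 and 大; raising `t2` to 20 rejects it.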
- An application implementing the systems and methods described herein may be implemented on a server site such as on a search engine or may be implemented on a client site such as an end user's computer, e.g., downloaded, to provide spell corrections for text inputting into a word processing document or to interface with a remote server such as a search engine.
- the client site application may be implemented, for example, in a toolbar, and may optionally include a user-editable table of stop rule patterns that allows the user to customize the application by specifying that certain spell corrections are disallowed, e.g., never replace X and Y except when X precedes or follows Z.
- FIG. 4 is a flowchart illustrating a process 200 utilizing the transformation rules for processing an entry to determine spell correction suggestions, if any.
- Decision block 202 determines if any spell correction rule applies to the user input.
- a hash table of the spell correction transformation rules may be examined to determine if any transformation rule applies to the user input. For example, for a given Chinese user input ABCDE, if a transformation rule dictates that character C be replaced with C' if the preceding characters to C are AB, then this particular rule is applicable to the user input. If no rules are applicable to the user input, no spell correction suggestion is made for the user input. Alternatively, for each spell correction transformation rule that is applicable to the user input, alternate spellings for the user input corresponding to the applicable spell correction transformation rule are generated at block 204. In the example above, an alternate spelling ABC'DE is generated for the user input ABCDE corresponding to the applicable spell correction transformation rule.
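The hash table lookup and the generation of alternate spellings can be sketched as below, writing the replacement character as C' (an assumed notation). The key layout, a pair of preceding-context tuple and target character, as well as the names `RULES` and `alternate_spellings`, are hypothetical choices; only left context is modeled.

```python
# Hypothetical rule table: (preceding characters, target) -> replacement.
RULES = {(("A", "B"), "C"): "C'"}


def alternate_spellings(tokens, rules, max_context=2):
    """Return one alternate spelling per applicable rule."""
    alternates = []
    for i, tok in enumerate(tokens):
        # Try every left-context length from 0 up to max_context.
        for k in range(min(i, max_context) + 1):
            key = (tuple(tokens[i - k:i]), tok)
            if key in rules:
                alternates.append(tokens[:i] + [rules[key]] + tokens[i + 1:])
    return alternates
```

An empty result means that no rule applies and no spell correction suggestion would be made.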
- At decision block 206, the likelihood of each alternate spelling is determined and compared to the likelihood of the user input.
- decision block 206 may utilize the hidden Markov model and the Viterbi decoder to compute the likelihood.
- the relative output probabilities of ABCDE and ABC'DE are determined and compared.
- the alternate spelling has a higher likelihood than the user input and is thus regarded as a valid correction if: P(ABC'DE) * P(transformation rule) > P(ABCDE),
- where P(transformation rule) may be defined as the ratio of the number of successful corrections to the total number of corrections.
- P(ABCDE) should take into account the ambiguity in segmentation. For example, if ABCDE has two possible segmentations AB-CDE and ABC-DE, then the probability is a sum of products of Bayesian probabilities:
- P(ABCDE) = P(AB | input-start) P(CDE | AB) P(input-end | CDE) + P(ABC | input-start) P(DE | ABC) P(input-end | DE)
- the equation above is a Bayesian probability derived from the original Bayesian probability by applying the Markov assumption which determines the current word by the preceding word rather than by the entire history. The determination of P(ABC'DE) may be similarly made.
- if a given alternate spelling is not more likely than the user input, the particular spell correction suggestion is not made. However, if the given alternate spelling is more likely than the user input as determined at decision block 206, the corresponding alternate spelling for the user's input is suggested and/or automatically made at block 208.
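The likelihood comparison at decision block 206 can be illustrated with a word-bigram model that sums over every segmentation of the input. Everything here is an assumption made for the sketch: the vocabulary, the bigram probabilities, the boundary markers `<s>` and `</s>`, the floor value for unseen bigrams, and the function names. ABC'DE denotes the alternate spelling of the user input ABCDE.

```python
def segmentations(tokens, vocab):
    """Yield every way to split tokens into words found in vocab."""
    if not tokens:
        yield []
        return
    for i in range(1, len(tokens) + 1):
        word = "".join(tokens[:i])
        if word in vocab:
            for rest in segmentations(tokens[i:], vocab):
                yield [word] + rest


def sequence_probability(tokens, vocab, bigram, floor=1e-9):
    """Sum over segmentations of the product of word-bigram probabilities."""
    total = 0.0
    for seg in segmentations(tokens, vocab):
        words = ["<s>"] + seg + ["</s>"]
        p = 1.0
        for prev, cur in zip(words, words[1:]):
            p *= bigram.get((prev, cur), floor)
        total += p
    return total


def should_correct(user, alt, rule_prob, vocab, bigram):
    """True if P(alt) * P(rule) exceeds P(user input)."""
    p_user = sequence_probability(user, vocab, bigram)
    p_alt = sequence_probability(alt, vocab, bigram)
    return p_alt * rule_prob > p_user


# Made-up vocabulary and bigram probabilities.
VOCAB = {"AB", "CDE", "ABC", "DE", "C'DE"}
BIGRAM = {
    ("<s>", "AB"): 0.5, ("AB", "CDE"): 0.01, ("CDE", "</s>"): 0.5,
    ("<s>", "ABC"): 0.1, ("ABC", "DE"): 0.01, ("DE", "</s>"): 0.5,
    ("AB", "C'DE"): 0.4, ("C'DE", "</s>"): 0.5,
}
USER = ["A", "B", "C", "D", "E"]
ALT = ["A", "B", "C'", "D", "E"]
```

With these toy numbers the user input has two segmentations (AB-CDE and ABC-DE) whose probabilities are summed, while the alternate spelling has one, and `should_correct(USER, ALT, 0.8, VOCAB, BIGRAM)` favors the alternate spelling.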
- the systems and methods for spell correction as described herein are particularly well suited for use with non-Roman based languages and can be highly effective in both detecting spelling errors and in generating alternate spelling suggestions or corrections.
- the systems and methods for spell correction are also particularly applicable in the context of a web search engine and to a search engine for a database containing organized data in performing spell correction of various user inputs or queries.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Document Processing Apparatus (AREA)
- Machine Translation (AREA)
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020077001543A KR101146539B1 (ko) | 2004-06-23 | 2005-06-21 | System and method for spell correction of non-Roman characters and words |
| JP2007518226A JP2008504605A (ja) | 2004-06-23 | 2005-06-21 | System and method for spell correction of non-Roman characters and words |
| CN2005800263504A CN101002198B (zh) | 2004-06-23 | 2005-06-21 | Spell correction system and method for non-Roman characters and words |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US10/875,449 US20050289463A1 (en) | 2004-06-23 | 2004-06-23 | Systems and methods for spell correction of non-roman characters and words |
| US10/875,449 | 2004-06-23 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2006002219A2 | 2006-01-05 |
| WO2006002219A3 | 2006-08-03 |
Family
ID=35427493
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2005/022027 Ceased WO2006002219A2 (en) | 2004-06-23 | 2005-06-21 | Systems and methods for spell correction of non-roman characters and words |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US20050289463A1 |
| JP (2) | JP2008504605A |
| KR (1) | KR101146539B1 |
| CN (1) | CN101002198B |
| WO (1) | WO2006002219A2 |
Families Citing this family (156)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
| US8650187B2 (en) * | 2003-07-25 | 2014-02-11 | Palo Alto Research Center Incorporated | Systems and methods for linked event detection |
| US7260780B2 (en) * | 2005-01-03 | 2007-08-21 | Microsoft Corporation | Method and apparatus for providing foreign language text display when encoding is not available |
| US8438142B2 (en) * | 2005-05-04 | 2013-05-07 | Google Inc. | Suggesting and refining user input based on original user input |
| US7321892B2 (en) * | 2005-08-11 | 2008-01-22 | Amazon Technologies, Inc. | Identifying alternative spellings of search strings by analyzing self-corrective searching behaviors of users |
| US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
| US7895223B2 (en) | 2005-11-29 | 2011-02-22 | Cisco Technology, Inc. | Generating search results based on determined relationships between data objects and user connections to identified destinations |
| US8006180B2 (en) * | 2006-01-10 | 2011-08-23 | Mircrosoft Corporation | Spell checking in network browser based applications |
| US7849144B2 (en) | 2006-01-13 | 2010-12-07 | Cisco Technology, Inc. | Server-initiated language translation of an instant message based on identifying language attributes of sending and receiving users |
| US8732314B2 (en) * | 2006-08-21 | 2014-05-20 | Cisco Technology, Inc. | Generation of contact information based on associating browsed content to user actions |
| US9552349B2 (en) * | 2006-08-31 | 2017-01-24 | International Business Machines Corporation | Methods and apparatus for performing spelling corrections using one or more variant hash tables |
| US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
| US8014996B1 (en) | 2006-09-11 | 2011-09-06 | WordRake Holdings, LLC | Computer processes for analyzing and improving document readability by identifying passive voice |
| US8024319B2 (en) * | 2007-01-25 | 2011-09-20 | Microsoft Corporation | Finite-state model for processing web queries |
| US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
| WO2008151466A1 (en) * | 2007-06-14 | 2008-12-18 | Google Inc. | Dictionary word and phrase determination |
| KR101465770B1 (ko) * | 2007-06-25 | 2014-11-27 | 구글 인코포레이티드 | Word probability determination |
| US8019748B1 (en) | 2007-11-14 | 2011-09-13 | Google Inc. | Web search refinement |
| US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
| US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
| US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
| US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
| US8589149B2 (en) | 2008-08-05 | 2013-11-19 | Nuance Communications, Inc. | Probability-based approach to recognition of user-entered data |
| US9959870B2 (en) | 2008-12-11 | 2018-05-01 | Apple Inc. | Speech recognition involving a mobile device |
| US9026426B2 (en) * | 2009-03-19 | 2015-05-05 | Google Inc. | Input method editor |
| US10255566B2 (en) | 2011-06-03 | 2019-04-09 | Apple Inc. | Generating and processing task items that represent tasks to perform |
| US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
| US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
| US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
| US9431006B2 (en) | 2009-07-02 | 2016-08-30 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
| KR101083540B1 (ko) * | 2009-07-08 | 2011-11-14 | 엔에이치엔(주) | System and method for converting Chinese characters into native-language pronunciation sequences using a statistical method |
| US9183834B2 (en) * | 2009-07-22 | 2015-11-10 | Cisco Technology, Inc. | Speech recognition tuning tool |
| US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
| US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
| US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
| US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
| CN101777124A (zh) * | 2010-01-29 | 2010-07-14 | 北京新岸线网络技术有限公司 | Method and device for extracting text information from video |
| US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
| CN102541837A (zh) * | 2010-12-22 | 2012-07-04 | 张家港市赫图阿拉信息技术有限公司 | Method for correcting the spelling of input Chinese |
| US10762293B2 (en) | 2010-12-22 | 2020-09-01 | Apple Inc. | Using parts-of-speech tagging and named entity recognition for spelling correction |
| US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
| US10672399B2 (en) | 2011-06-03 | 2020-06-02 | Apple Inc. | Switching between text data and audio data based on a mapping |
| US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
| US8712931B1 (en) * | 2011-06-29 | 2014-04-29 | Amazon Technologies, Inc. | Adaptive input interface |
| US8706472B2 (en) * | 2011-08-11 | 2014-04-22 | Apple Inc. | Method for disambiguating multiple readings in language conversion |
| US8994660B2 (en) | 2011-08-29 | 2015-03-31 | Apple Inc. | Text correction processing |
| US8976118B2 (en) | 2012-01-20 | 2015-03-10 | International Business Machines Corporation | Method for character correction |
| US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
| US9483461B2 (en) | 2012-03-06 | 2016-11-01 | Apple Inc. | Handling speech synthesis of content for multiple languages |
| US9280610B2 (en) | 2012-05-14 | 2016-03-08 | Apple Inc. | Crowd sourcing information to fulfill user requests |
| US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
| US9495129B2 (en) | 2012-06-29 | 2016-11-15 | Apple Inc. | Device, method, and user interface for voice-activated navigation and browsing of a document |
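| TW201403354A (zh) * | 2012-07-03 | 2014-01-16 | Univ Nat Taiwan Normal | System and method for constructing a mathematical model of Chinese text readability using data dimensionality reduction and nonlinear algorithms |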
| US9576574B2 (en) | 2012-09-10 | 2017-02-21 | Apple Inc. | Context-sensitive handling of interruptions by intelligent digital assistant |
| US9547647B2 (en) | 2012-09-19 | 2017-01-17 | Apple Inc. | Voice-based media searching |
| DE112014000709B4 (de) | 2013-02-07 | 2021-12-30 | Apple Inc. | Method and device for operating a voice trigger for a digital assistant |
| US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
| US9368114B2 (en) | 2013-03-14 | 2016-06-14 | Apple Inc. | Context-sensitive handling of interruptions |
| WO2014144949A2 (en) | 2013-03-15 | 2014-09-18 | Apple Inc. | Training an at least partial voice command system |
| WO2014144579A1 (en) | 2013-03-15 | 2014-09-18 | Apple Inc. | System and method for updating an adaptive speech recognition model |
| US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
| WO2014197336A1 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
| WO2014197334A2 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
| WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
| US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
| KR101959188B1 (ko) | 2013-06-09 | 2019-07-02 | 애플 인크. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
| KR101809808B1 (ko) | 2013-06-13 | 2017-12-15 | 애플 인크. | System and method for emergency calls initiated by voice command |
| KR102069697B1 (ko) * | 2013-07-29 | 2020-02-24 | 한국전자통신연구원 | Automatic interpretation apparatus and method |
| KR101749009B1 (ko) | 2013-08-06 | 2017-06-19 | 애플 인크. | Automatic activation of smart responses based on activity from a remote device |
| WO2015109468A1 (en) * | 2014-01-23 | 2015-07-30 | Microsoft Corporation | Functionality to reduce the amount of time it takes a device to receive and process input |
| CN104808806B (zh) * | 2014-01-28 | 2019-10-25 | 北京三星通信技术研究有限公司 | Method and device for Chinese character input based on uncertainty information |
| US9620105B2 (en) | 2014-05-15 | 2017-04-11 | Apple Inc. | Analyzing audio input for efficient speech and music recognition |
| US10592095B2 (en) | 2014-05-23 | 2020-03-17 | Apple Inc. | Instantaneous speaking of content on touch devices |
| US9502031B2 (en) | 2014-05-27 | 2016-11-22 | Apple Inc. | Method for supporting dynamic grammars in WFST-based ASR |
| US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
| US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
| US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
| US9842101B2 (en) | 2014-05-30 | 2017-12-12 | Apple Inc. | Predictive conversion of language input |
| US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
| US9734193B2 (en) | 2014-05-30 | 2017-08-15 | Apple Inc. | Determining domain salience ranking from ambiguous words in natural speech |
| US9760559B2 (en) | 2014-05-30 | 2017-09-12 | Apple Inc. | Predictive text input |
| US9785630B2 (en) | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
| WO2015184186A1 (en) | 2014-05-30 | 2015-12-03 | Apple Inc. | Multi-command single utterance input method |
| US10078631B2 (en) | 2014-05-30 | 2018-09-18 | Apple Inc. | Entropy-guided text prediction using combined word and character n-gram language models |
| US10289433B2 (en) | 2014-05-30 | 2019-05-14 | Apple Inc. | Domain specific language for encoding assistant dialog |
| US10659851B2 (en) | 2014-06-30 | 2020-05-19 | Apple Inc. | Real-time digital assistant knowledge updates |
| US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
| US9377871B2 (en) | 2014-08-01 | 2016-06-28 | Nuance Communications, Inc. | System and methods for determining keyboard input in the presence of multiple contact points |
| US10446141B2 (en) | 2014-08-28 | 2019-10-15 | Apple Inc. | Automatic speech recognition based on user feedback |
| US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
| US10789041B2 (en) | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
| US9886432B2 (en) | 2014-09-30 | 2018-02-06 | Apple Inc. | Parsimonious handling of word inflection via categorical stem + suffix N-gram language models |
| US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
| US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
| US9646609B2 (en) | 2014-09-30 | 2017-05-09 | Apple Inc. | Caching apparatus for serving phonetic pronunciations |
| US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
| US10552013B2 (en) | 2014-12-02 | 2020-02-04 | Apple Inc. | Data detection |
| US9711141B2 (en) | 2014-12-09 | 2017-07-18 | Apple Inc. | Disambiguating heteronyms in speech synthesis |
| US9865280B2 (en) | 2015-03-06 | 2018-01-09 | Apple Inc. | Structured dictation using intelligent automated assistants |
| US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
| US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
| US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
| US9899019B2 (en) | 2015-03-18 | 2018-02-20 | Apple Inc. | Systems and methods for structured stem and suffix language models |
| US9842105B2 (en) | 2015-04-16 | 2017-12-12 | Apple Inc. | Parsimonious continuous-space phrase representations for natural language processing |
| US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
| US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
| US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
| US9578173B2 (en) | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
| US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
| US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
| US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
| US9753915B2 (en) | 2015-08-06 | 2017-09-05 | Disney Enterprises, Inc. | Linguistic analysis and correction |
| US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
| US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
| US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
| US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
| US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
| US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
| US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
| US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
| US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
| EP3398080A4 (en) | 2015-12-29 | 2019-07-31 | Microsoft Technology Licensing, LLC | FORMAT DOCUMENT OBJECTS BY VISUAL SUGGESTIONS |
| US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
| US10430485B2 (en) | 2016-05-10 | 2019-10-01 | Go Daddy Operating Company, LLC | Verifying character sets in domain name requests |
| US10180930B2 (en) | 2016-05-10 | 2019-01-15 | Go Daddy Operating Company, Inc. | Auto completing domain names comprising multiple languages |
| US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
| US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
| US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
| US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
| DK179588B1 (en) | 2016-06-09 | 2019-02-22 | Apple Inc. | INTELLIGENT AUTOMATED ASSISTANT IN A HOME ENVIRONMENT |
| US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
| US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
| US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
| US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
| US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
| DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
| DK179049B1 (en) | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
| DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
| DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
| TWI614618B (zh) * | 2016-06-17 | 2018-02-11 | National Central University | Word correction method |
| US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
| US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
| US10269352B2 (en) * | 2016-12-23 | 2019-04-23 | Nice Ltd. | System and method for detecting phonetically similar imposter phrases |
| DK201770439A1 (en) | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
| DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
| DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | User-specific acoustic models |
| DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
| DK201770432A1 (en) | 2017-05-15 | 2018-12-21 | Apple Inc. | Hierarchical belief states for digital assistants |
| DK179560B1 (en) | 2017-05-16 | 2019-02-18 | Apple Inc. | FAR-FIELD EXTENSION FOR DIGITAL ASSISTANT SERVICES |
| CN109844743B (zh) * | 2017-06-26 | 2023-10-17 | 微软技术许可有限责任公司 | Generating responses in automated chatting |
| CN112445953B (zh) * | 2019-08-14 | 2024-07-19 | 阿里巴巴集团控股有限公司 | Search error correction method for information, computing device, and storage medium |
| US11443734B2 (en) | 2019-08-26 | 2022-09-13 | Nice Ltd. | System and method for combining phonetic and automatic speech recognition search |
| US11675920B2 (en) * | 2019-12-03 | 2023-06-13 | Sonicwall Inc. | Call location based access control of query to database |
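| CN112232062A (zh) * | 2020-12-11 | 2021-01-15 | 北京百度网讯科技有限公司 | Text error correction method, apparatus, electronic device, and storage medium |
| JP7626451B2 (ja) * | 2021-09-09 | 2025-02-07 | Lineヤフー株式会社 | Information processing device, information processing method, and information processing program |
| CN118133813B (zh) * | 2024-05-08 | 2024-08-09 | 北京澜舟科技有限公司 | Training method for a Chinese spelling error correction model, and storage medium |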
Family Cites Families (22)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4972349A (en) * | 1986-12-04 | 1990-11-20 | Kleinberger Paul J | Information retrieval system and method |
| JP2795058B2 (ja) * | 1992-06-03 | 1998-09-10 | 松下電器産業株式会社 | Time-series signal processing device |
| US6014615A (en) * | 1994-08-16 | 2000-01-11 | International Business Machines Corporation | System and method for processing morphological and syntactical analyses of inputted Chinese language phrases |
| US5893133A (en) * | 1995-08-16 | 1999-04-06 | International Business Machines Corporation | Keyboard for a system and method for processing Chinese language text |
| US5903861A (en) * | 1995-12-12 | 1999-05-11 | Chan; Kun C. | Method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer |
| US5706502A (en) * | 1996-03-25 | 1998-01-06 | Sun Microsystems, Inc. | Internet-enabled portfolio manager system and method |
| US5956739A (en) * | 1996-06-25 | 1999-09-21 | Mitsubishi Electric Information Technology Center America, Inc. | System for text correction adaptive to the text being corrected |
| US5963893A (en) * | 1996-06-28 | 1999-10-05 | Microsoft Corporation | Identification of words in Japanese text by a computer system |
| JPH10269204A (ja) * | 1997-03-28 | 1998-10-09 | Matsushita Electric Ind Co Ltd | Method and device for automatic proofreading of Chinese documents |
| US6167367A (en) * | 1997-08-09 | 2000-12-26 | National Tsing Hua University | Method and device for automatic error detection and correction for computerized text files |
| CN1311881A (zh) * | 1998-06-04 | 2001-09-05 | 松下电器产业株式会社 | Language conversion rule generating device, language conversion device, and program recording medium |
| US6035269A (en) * | 1998-06-23 | 2000-03-07 | Microsoft Corporation | Method for detecting stylistic errors and generating replacement strings in a document containing Japanese text |
| US6401060B1 (en) * | 1998-06-25 | 2002-06-04 | Microsoft Corporation | Method for typographical detection and replacement in Japanese text |
| US6490563B2 (en) * | 1998-08-17 | 2002-12-03 | Microsoft Corporation | Proofreading with text to speech feedback |
| US6649222B1 (en) * | 1998-09-07 | 2003-11-18 | The Procter & Gamble Company | Modulated plasma glow discharge treatments for making superhydrophobic substrates |
| US7403888B1 (en) * | 1999-11-05 | 2008-07-22 | Microsoft Corporation | Language input user interface |
| US6848080B1 (en) * | 1999-11-05 | 2005-01-25 | Microsoft Corporation | Language input architecture for converting one text form to another text form with tolerance to spelling, typographical, and conversion errors |
| US6684201B1 (en) * | 2000-03-31 | 2004-01-27 | Microsoft Corporation | Linguistic disambiguation system and method using string-based pattern training to learn to resolve ambiguity sites |
| US7613601B2 (en) * | 2001-12-26 | 2009-11-03 | National Institute Of Information And Communications Technology | Method for predicting negative example, system for detecting incorrect wording using negative example prediction |
| US7031911B2 (en) * | 2002-06-28 | 2006-04-18 | Microsoft Corporation | System and method for automatic detection of collocation mistakes in documents |
| US7024360B2 (en) * | 2003-03-17 | 2006-04-04 | Rensselaer Polytechnic Institute | System for reconstruction of symbols in a sequence |
| US20050177358A1 (en) * | 2004-02-10 | 2005-08-11 | Edward Melomed | Multilingual database interaction system and method |
2004
- 2004-06-23: US application US10/875,449, published as US20050289463A1 (status: Abandoned)

2005
- 2005-06-21: JP application JP2007518226A, published as JP2008504605A (status: Withdrawn)
- 2005-06-21: KR application KR1020077001543A, published as KR101146539B1 (status: Expired - Fee Related)
- 2005-06-21: WO application PCT/US2005/022027, published as WO2006002219A2 (status: Ceased)
- 2005-06-21: CN application CN2005800263504A, published as CN101002198B (status: Expired - Fee Related)

2011
- 2011-11-04: JP application JP2011242872A, published as JP5444308B2 (status: Expired - Fee Related)
Also Published As
| Publication number | Publication date |
|---|---|
| KR101146539B1 (ko) | 2012-05-25 |
| JP2008504605A (ja) | 2008-02-14 |
| JP2012069142A (ja) | 2012-04-05 |
| CN101002198B (zh) | 2013-10-23 |
| JP5444308B2 (ja) | 2014-03-19 |
| WO2006002219A3 (en) | 2006-08-03 |
| KR20070027726A (ko) | 2007-03-09 |
| US20050289463A1 (en) | 2005-12-29 |
| CN101002198A (zh) | 2007-07-18 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20050289463A1 (en) | | Systems and methods for spell correction of non-roman characters and words |
| Bassil et al. | | OCR post-processing error correction algorithm using Google online spelling suggestion |
| US11023680B2 (en) | | Method and system for detecting semantic errors in a text using artificial neural networks |
| US9069753B2 (en) | | Determining proximity measurements indicating respective intended inputs |
| Azmi et al. | | Real-word errors in Arabic texts: A better algorithm for detection and correction |
| Mishra et al. | | A survey of spelling error detection and correction techniques |
| Loftsson | | Correcting a POS-tagged corpus using three complementary methods |
| Tufiş et al. | | DIAC+: A professional diacritics recovering system |
| Uthayamoorthy et al. | | Ddspell-a data driven spell checker and suggestion generator for the Tamil language |
| Chaudhuri | | Reversed word dictionary and phonetically similar word grouping based spell-checker to Bangla text |
| Sen et al. | | Bangla natural language processing: A comprehensive review of classical machine learning and deep learning based methods |
| Jain et al. | | Detection and correction of non word spelling errors in Hindi language |
| Kaur et al. | | Spell checker for Punjabi language using deep neural network |
| Huang | | Multilingual named entity extraction and translation from text and speech |
| Yang et al. | | Spell Checking for Chinese. |
| Tukur et al. | | Tagging part of speech in Hausa sentences |
| Lyashevskaya et al. | | An HMM-based PoS Tagger for Old Church Slavonic |
| Lu et al. | | An automatic spelling correction method for classical Mongolian |
| Mittra et al. | | A Bangla spell checking technique to facilitate error correction in text entry environment |
| Kapočiūtė-Dzikienė et al. | | Character-based machine learning vs. language modeling for diacritics restoration |
| Ratnam et al. | | Phonogram-based automatic typo correction in Malayalam social media comments |
| Terner et al. | | Transliteration of Judeo-Arabic texts into Arabic script using recurrent neural networks |
| Sonnadara et al. | | Sinhala spell correction: A novel benchmark with neural spell correction |
| Reeha et al. | | Bi-directional GRU-Based Approach for Multi-Class Text Error Identification System |
| Kundaikar et al. | | Automatic Hindi OCR error correction using MLM-BERT |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A2; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A2; Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | WWE | WIPO information: entry into national phase | Ref document number: 2007518226; Country of ref document: JP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWW | WIPO information: withdrawn in national office | Country of ref document: DE |
| | WWE | WIPO information: entry into national phase | Ref document number: 1020077001543; Country of ref document: KR |
| | WWE | WIPO information: entry into national phase | Ref document number: 200580026350.4; Country of ref document: CN |
| | WWP | WIPO information: published in national office | Ref document number: 1020077001543; Country of ref document: KR |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 05762558; Country of ref document: EP; Kind code of ref document: A2 |