MX2008014865A - Method and apparatus for multilingual spelling corrections - Google Patents
- Publication number
- MX2008014865A
- Authority
- MX
- Mexico
- Prior art keywords
- words
- user
- lexicon
- lexicon file
- spelling
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/232—Orthographic correction, e.g. spell checking or vowelisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3322—Query formulation using system suggestions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
Abstract
A system and method for multilingual spelling correction employ a lexicon builder and a spell-checker algorithm. The lexicon builder uses a metadata-building process that extracts every word, together with its frequency, from the data source the user will work with, and builds a lexicon file from that data source. The spell-checker algorithm determines the correct spelling of words entered as input to a search of the data source: it calculates a score for the words in the lexicon file according to a formula that measures the similarity between the word entered in the user's search and the words contained in the lexicon file, and then ranks the frequency of the input word against the words contained in the lexicon file. When a user enters a word, the words in the lexicon file are compared with that word to determine the correct spelling or another spelling variant for the user to select.
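The two components described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not give the scoring formula, so `difflib.SequenceMatcher` stands in as an assumed similarity measure, and the names `build_lexicon`, `suggest`, `min_score`, and `limit` are hypothetical. An in-memory `Counter` plays the role of the lexicon file.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def build_lexicon(documents):
    """Count every word in the data source (hypothetical helper)."""
    lexicon = Counter()
    for text in documents:
        # \w+ is Unicode-aware for str patterns, so accented words are kept whole.
        lexicon.update(re.findall(r"\w+", text.lower()))
    return lexicon


def suggest(word, lexicon, min_score=0.7, limit=5):
    """Return lexicon words similar to `word`, best candidates first."""
    word = word.lower()
    scored = []
    for candidate, freq in lexicon.items():
        # Assumed stand-in similarity in [0, 1]; NOT the patent's formula.
        score = SequenceMatcher(None, word, candidate).ratio()
        if score >= min_score:
            scored.append((candidate, score, freq))
    # Rank by similarity first, then by frequency in the data source.
    scored.sort(key=lambda item: (-item[1], -item[2], item[0]))
    return [candidate for candidate, _, _ in scored[:limit]]


documents = [
    "La ortografía correcta importa.",
    "Spelling correction uses a lexicon; spelling matters.",
]
lexicon = build_lexicon(documents)
print(suggest("speling", lexicon))
```

Because the lexicon is built only from the data source being searched, the suggestions are limited to words that actually occur there, in whatever language the source uses; no per-language dictionary is assumed.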
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/438,289 US7558725B2 (en) | 2006-05-23 | 2006-05-23 | Method and apparatus for multilingual spelling corrections |
PCT/US2007/012212 WO2007139798A2 (en) | 2006-05-23 | 2007-05-23 | Method and apparatus for multilingual spelling corrections |
Publications (1)
Publication Number | Publication Date |
---|---|
MX2008014865A (es) | 2009-06-30 |
Family
ID=38750615
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MX2008014865A (es) | 2006-05-23 | 2007-05-23 | Method and apparatus for multilingual spelling corrections |
Country Status (6)
Country | Link |
---|---|
US (1) | US7558725B2 (es) |
EP (1) | EP2033120A4 (es) |
AU (1) | AU2007268059B2 (es) |
CA (1) | CA2653090C (es) |
MX (1) | MX2008014865A (es) |
WO (1) | WO2007139798A2 (es) |
Families Citing this family (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7030863B2 (en) | 2000-05-26 | 2006-04-18 | America Online, Incorporated | Virtual keyboard system with automatic correction |
US7750891B2 (en) * | 2003-04-09 | 2010-07-06 | Tegic Communications, Inc. | Selective input system based on tracking of motion parameters of an input device |
US7286115B2 (en) | 2000-05-26 | 2007-10-23 | Tegic Communications, Inc. | Directional input system with automatic correction |
US7821503B2 (en) | 2003-04-09 | 2010-10-26 | Tegic Communications, Inc. | Touch screen and graphical user interface |
US7562811B2 (en) | 2007-01-18 | 2009-07-21 | Varcode Ltd. | System and method for improved quality management in a product logistic chain |
EP2024863B1 (en) | 2006-05-07 | 2018-01-10 | Varcode Ltd. | A system and method for improved quality management in a product logistic chain |
US8225203B2 (en) * | 2007-02-01 | 2012-07-17 | Nuance Communications, Inc. | Spell-check for a keyboard system with automatic correction |
US8201087B2 (en) * | 2007-02-01 | 2012-06-12 | Tegic Communications, Inc. | Spell-check for a keyboard system with automatic correction |
JP2010526386A (ja) | 2007-05-06 | 2010-07-29 | Varcode Ltd. | System and method for quality management utilizing barcode indicators |
JP5638948B2 (ja) | 2007-08-01 | 2014-12-10 | Ginger Software, Inc. | Automatic context-sensitive language correction and enhancement using an internet corpus |
US8341520B2 (en) * | 2007-09-24 | 2012-12-25 | Ghotit Ltd. | Method and system for spell checking |
US8077983B2 (en) * | 2007-10-04 | 2011-12-13 | Zi Corporation Of Canada, Inc. | Systems and methods for character correction in communication devices |
WO2009063464A2 (en) | 2007-11-14 | 2009-05-22 | Varcode Ltd. | A system and method for quality management utilizing barcode indicators |
US11704526B2 (en) | 2008-06-10 | 2023-07-18 | Varcode Ltd. | Barcoded indicators for quality management |
US20100153358A1 (en) * | 2008-09-02 | 2010-06-17 | Itkoff David F | Information searching and retrieval system and method |
TWI391832B (zh) * | 2008-09-09 | 2013-04-01 | Inst Information Industry | Chinese article error detection device, Chinese article error detection method, and storage medium |
US8543913B2 (en) * | 2008-10-16 | 2013-09-24 | International Business Machines Corporation | Identifying and using textual widgets |
JP5299011B2 (ja) * | 2009-03-25 | 2013-09-25 | Seiko Epson Corporation | Tape printing apparatus, control method for tape printing apparatus, and program |
US9124431B2 (en) * | 2009-05-14 | 2015-09-01 | Microsoft Technology Licensing, Llc | Evidence-based dynamic scoring to limit guesses in knowledge-based authentication |
US8856879B2 (en) | 2009-05-14 | 2014-10-07 | Microsoft Corporation | Social authentication for account recovery |
US9015036B2 (en) | 2010-02-01 | 2015-04-21 | Ginger Software, Inc. | Automatic context sensitive language correction using an internet corpus particularly for small keyboard devices |
US8301640B2 (en) * | 2010-11-24 | 2012-10-30 | King Abdulaziz City For Science And Technology | System and method for rating a written document |
US20120246133A1 (en) * | 2011-03-23 | 2012-09-27 | Microsoft Corporation | Online spelling correction/phrase completion system |
US8807422B2 (en) | 2012-10-22 | 2014-08-19 | Varcode Ltd. | Tamper-proof quality management barcode indicators |
US20140223295A1 (en) * | 2013-02-07 | 2014-08-07 | Lsi Corporation | Geographic Based Spell Check |
US20140310270A1 (en) * | 2013-04-16 | 2014-10-16 | Wal-Mart Stores, Inc. | Relevance-based cutoff for search results |
CA2985160C (en) | 2015-05-18 | 2023-09-05 | Varcode Ltd. | Thermochromic ink indicia for activatable quality labels |
JP6898298B2 (ja) | 2015-07-07 | 2021-07-07 | Varcode Ltd. | Electronic quality indicator |
US9753915B2 (en) | 2015-08-06 | 2017-09-05 | Disney Enterprises, Inc. | Linguistic analysis and correction |
US20180188823A1 (en) * | 2017-01-04 | 2018-07-05 | International Business Machines Corporation | Autocorrect with weighted group vocabulary |
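CN107665190A (zh) * | 2017-09-29 | 2018-02-06 | 李晓妮 | Method and device for automatically constructing an error lexicon for text proofreading |
CN108062384A (zh) * | 2017-12-13 | 2018-05-22 | Alibaba Group Holding Limited | Method and apparatus for data retrieval |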
US10572586B2 (en) * | 2018-02-27 | 2020-02-25 | International Business Machines Corporation | Technique for automatically splitting words |
US11010553B2 (en) * | 2018-04-18 | 2021-05-18 | International Business Machines Corporation | Recommending authors to expand personal lexicon |
Family Cites Families (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE3870571D1 (de) * | 1987-10-16 | 1992-06-04 | Computer Ges Konstanz | Method for automatic character recognition. |
US5604897A (en) * | 1990-05-18 | 1997-02-18 | Microsoft Corporation | Method and system for correcting the spelling of misspelled words |
US5251316A (en) * | 1991-06-28 | 1993-10-05 | Digital Equipment Corporation | Method and apparatus for integrating a dynamic lexicon into a full-text information retrieval system |
US5774588A (en) * | 1995-06-07 | 1998-06-30 | United Parcel Service Of America, Inc. | Method and system for comparing strings with entries of a lexicon |
US5875443A (en) * | 1996-01-30 | 1999-02-23 | Sun Microsystems, Inc. | Internet-based spelling checker dictionary system with automatic updating |
US5926811A (en) * | 1996-03-15 | 1999-07-20 | Lexis-Nexis | Statistical thesaurus, method of forming same, and use thereof in query expansion in automated text searching |
US6144958A (en) * | 1998-07-15 | 2000-11-07 | Amazon.Com, Inc. | System and method for correcting spelling errors in search queries |
US6601059B1 (en) * | 1998-12-23 | 2003-07-29 | Microsoft Corporation | Computerized searching tool with spell checking |
US6904402B1 (en) * | 1999-11-05 | 2005-06-07 | Microsoft Corporation | System and iterative method for lexicon, segmentation and language model joint optimization |
US6848080B1 (en) * | 1999-11-05 | 2005-01-25 | Microsoft Corporation | Language input architecture for converting one text form to another text form with tolerance to spelling, typographical, and conversion errors |
US6711561B1 (en) * | 2000-05-02 | 2004-03-23 | Iphrase.Com, Inc. | Prose feedback in information access system |
US7254773B2 (en) * | 2000-12-29 | 2007-08-07 | International Business Machines Corporation | Automated spell analysis |
US7032174B2 (en) * | 2001-03-27 | 2006-04-18 | Microsoft Corporation | Automatically adding proper names to a database |
JP2003122743A (ja) * | 2001-10-12 | 2003-04-25 | Seiko Instruments Inc | Phrase display device, spell check device, and electronic dictionary |
US7024624B2 (en) * | 2002-01-07 | 2006-04-04 | Kenneth James Hintz | Lexicon-based new idea detector |
JP2003223437A (ja) * | 2002-01-29 | 2003-08-08 | Internatl Business Mach Corp <Ibm> | Method for displaying candidate correct words, spell check method, computer device, and program |
US7113950B2 (en) * | 2002-06-27 | 2006-09-26 | Microsoft Corporation | Automated error checking system and method |
US7885963B2 (en) * | 2003-03-24 | 2011-02-08 | Microsoft Corporation | Free text and attribute searching of electronic program guide (EPG) data |
US20040250208A1 (en) * | 2003-06-06 | 2004-12-09 | Nelms Robert Nathan | Enhanced spelling checking system and method therefore |
US7254774B2 (en) * | 2004-03-16 | 2007-08-07 | Microsoft Corporation | Systems and methods for improved spell checking |
2006
- 2006-05-23: US application US11/438,289, publication US7558725B2 (en), status: Active
2007
- 2007-05-23: EP application EP07795195A, publication EP2033120A4 (en), status: Ceased
- 2007-05-23: MX application MX2008014865A, publication MX2008014865A (es), status: IP Right Grant
- 2007-05-23: AU application AU2007268059A, publication AU2007268059B2 (en), status: Active
- 2007-05-23: CA application CA2653090A, publication CA2653090C (en), status: Active
- 2007-05-23: WO application PCT/US2007/012212, publication WO2007139798A2 (en), status: Search and Examination
Also Published As
Publication number | Publication date |
---|---|
CA2653090A1 (en) | 2007-12-06 |
US7558725B2 (en) | 2009-07-07 |
EP2033120A2 (en) | 2009-03-11 |
AU2007268059A1 (en) | 2007-12-06 |
AU2007268059B2 (en) | 2012-05-17 |
EP2033120A4 (en) | 2011-06-08 |
CA2653090C (en) | 2014-07-15 |
WO2007139798A2 (en) | 2007-12-06 |
US20070276653A1 (en) | 2007-11-29 |
WO2007139798A3 (en) | 2008-11-27 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
MX2008014865A (es) | Method and apparatus for multilingual spelling corrections | |
US9785630B2 (en) | Text prediction using combined word N-gram and unigram language models | |
CN101714136B (zh) | Method and apparatus for adapting a corpus-based machine translation system to a new domain | |
US8538979B1 (en) | Generating phrase candidates from text string entries | |
WO2012070840A3 (ko) | Consensus search apparatus and method | |
WO2007106858A3 (en) | System, method, and computer program product for data mining and automatically generating hypotheses from data repositories | |
NZ578672A (en) | Information-retrieval systems, methods, and software with concept-based searching and ranking | |
WO2004084178A3 (en) | Natural language processor | |
WO2007026365A3 (en) | Decision-support expert system and methods for real-time exploitation of documents in non-english languages | |
WO2009040790A3 (en) | Method and system for spell checking | |
WO2007035827A3 (en) | System and method for continuous stroke word-based text input | |
WO2012154992A3 (en) | Systems and methods for performing search and retrieval of electronic documents using a big index | |
CN102033879A (zh) | Method and device for recognizing Chinese personal names | |
WO2007066246A3 (en) | Method and system for speech based document history tracking | |
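CN102693279A (zh) | Method, device, and system for quickly calculating comment similarity | |
WO2008114708A1 (ja) | Speech recognition system, speech recognition method, and speech recognition processing program | |
CN104281698A (zh) | An efficient big data query method | |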
US20180260390A1 (en) | Translation assistance system, translation assistance method and translation assistance program | |
WO2004059461A3 (en) | Electronic dictionary with example sentences | |
Hifny | Smoothing techniques for Arabic diacritics restoration | |
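CN103324612A (zh) | Method and device for word segmentation | |
CN103678288A (zh) | Method for automatic translation of proper names | |
CN104077274B (zh) | Method and device for extracting hot-word phrases from a document set | |
WO2008108061A1 (ja) | Language processing system, language processing method, language processing program, and recording medium | |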
Kozielski et al. | Open-lexicon language modeling combining word and character levels |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FG | Grant or registration |