BR0314865A - Method and system for identifying the language of a string of alphabet characters among a plurality of languages based on an automatic language identification system, and electronic device

Method and system for identifying the language of a string of alphabet characters among a plurality of languages based on an automatic language identification system, and electronic device

Info

Publication number
BR0314865A
Authority
BR
Brazil
Prior art keywords
language
series
alphabet characters
identifying
electronic device
Prior art date
Application number
BR0314865-3A
Other languages
English (en)
Inventor
Jilei Tian
Janne Suontausta
Original Assignee
Nokia Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Corp
Publication of BR0314865A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/263 Language identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

"MéTODO E SISTEMA PARA IDENTIFICAR O IDIOMA DE UMA SéRIE DE CARACTERES DO ALFABETO DENTRE UMA PLURALIDADE DE IDIOMAS BASEADA EM UM SISTEMA AUTOMáTICO DE IDENTIFICAçãO DE IDIOMAS, E, DISPOSITIVO ELETRôNICO". Método para identificar o idioma do texto escrito, onde o sistema de identificação baseado na rede neural (20) é usado para identificar o idioma de uma série de caracteres do alfabeto dentre uma pluralidade de idiomas. O grupo padrão dos caracteres do alfabeto (22) é usado para mapear uma série em uma série mapeada dos caracteres (10) do alfabeto, assim como permitir que o sistema NN-LID (20) determine a probabilidade de que a série mapeada seja um dos idiomas baseado no grupo padrão (22). Os caracteres do grupo padrão são selecionados dos caracteres do alfabeto dos grupos dependentes do idioma. O sistema de pontuação (30) é também usado para determinar a probabilidade da série de ser cada uma, um dos idiomas baseados nos grupos dependentes do idioma.
BR0314865-3A 2002-10-22 2003-07-21 Method and system for identifying the language of a string of alphabet characters among a plurality of languages based on an automatic language identification system, and electronic device BR0314865A (pt)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/279,747 US20040078191A1 (en) 2002-10-22 2002-10-22 Scalable neural network-based language identification from written text
PCT/IB2003/002894 WO2004038606A1 (en) 2002-10-22 2003-07-21 Scalable neural network-based language identification from written text

Publications (1)

Publication Number Publication Date
BR0314865A (pt) 2005-08-02

Family

ID=32093450

Family Applications (1)

Application Number Title Priority Date Filing Date
BR0314865-3A BR0314865A (pt) 2002-10-22 2003-07-21 Method and system for identifying the language of a string of alphabet characters among a plurality of languages based on an automatic language identification system, and electronic device

Country Status (9)

Country Link
US (1) US20040078191A1 (pt)
EP (1) EP1554670A4 (pt)
JP (2) JP2006504173A (pt)
KR (1) KR100714769B1 (pt)
CN (1) CN1688999B (pt)
AU (1) AU2003253112A1 (pt)
BR (1) BR0314865A (pt)
CA (1) CA2500467A1 (pt)
WO (1) WO2004038606A1 (pt)

Families Citing this family (55)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE10334400A1 (de) * 2003-07-28 2005-02-24 Siemens Ag Verfahren zur Spracherkennung und Kommunikationsgerät
US7395319B2 (en) * 2003-12-31 2008-07-01 Checkfree Corporation System using contact list to identify network address for accessing electronic commerce application
US7640159B2 (en) * 2004-07-22 2009-12-29 Nuance Communications, Inc. System and method of speech recognition for non-native speakers of a language
DE102004042907A1 (de) * 2004-09-01 2006-03-02 Deutsche Telekom Ag Online Multimedia Kreuzworträtsel
US7840399B2 (en) * 2005-04-07 2010-11-23 Nokia Corporation Method, device, and computer program product for multi-lingual speech recognition
US7548849B2 (en) * 2005-04-29 2009-06-16 Research In Motion Limited Method for generating text that meets specified characteristics in a handheld electronic device and a handheld electronic device incorporating the same
US7552045B2 (en) * 2006-12-18 2009-06-23 Nokia Corporation Method, apparatus and computer program product for providing flexible text based language identification
US8949266B2 (en) 2007-03-07 2015-02-03 Vlingo Corporation Multiple web-based content category searching in mobile search application
US20080221901A1 (en) * 2007-03-07 2008-09-11 Joseph Cerra Mobile general search environment speech processing facility
US20110054899A1 (en) * 2007-03-07 2011-03-03 Phillips Michael S Command and control utilizing content information in a mobile voice-to-speech application
US8635243B2 (en) * 2007-03-07 2014-01-21 Research In Motion Limited Sending a communications header with voice recording to send metadata for use in speech recognition, formatting, and search mobile search application
US20110054896A1 (en) * 2007-03-07 2011-03-03 Phillips Michael S Sending a communications header with voice recording to send metadata for use in speech recognition and formatting in mobile dictation application
US10056077B2 (en) * 2007-03-07 2018-08-21 Nuance Communications, Inc. Using speech recognition results based on an unstructured language model with a music system
US8838457B2 (en) * 2007-03-07 2014-09-16 Vlingo Corporation Using results of unstructured language model based speech recognition to control a system-level function of a mobile communications facility
US20110054897A1 (en) * 2007-03-07 2011-03-03 Phillips Michael S Transmitting signal quality information in mobile dictation application
US8886540B2 (en) * 2007-03-07 2014-11-11 Vlingo Corporation Using speech recognition results based on an unstructured language model in a mobile communication facility application
US20110054898A1 (en) * 2007-03-07 2011-03-03 Phillips Michael S Multiple web-based content search user interface in mobile search application
US20090030697A1 (en) * 2007-03-07 2009-01-29 Cerra Joseph P Using contextual information for delivering results generated from a speech recognition facility using an unstructured language model
US20110054895A1 (en) * 2007-03-07 2011-03-03 Phillips Michael S Utilizing user transmitted text to improve language model in mobile dictation application
US20090030685A1 (en) * 2007-03-07 2009-01-29 Cerra Joseph P Using speech recognition results based on an unstructured language model with a navigation system
US20090030687A1 (en) * 2007-03-07 2009-01-29 Cerra Joseph P Adapting an unstructured language model speech recognition system based on usage
US8949130B2 (en) * 2007-03-07 2015-02-03 Vlingo Corporation Internal and external speech recognition use with a mobile communication facility
US8886545B2 (en) 2007-03-07 2014-11-11 Vlingo Corporation Dealing with switch latency in speech recognition
US20110060587A1 (en) * 2007-03-07 2011-03-10 Phillips Michael S Command and control utilizing ancillary information in a mobile voice-to-speech application
US20090030691A1 (en) * 2007-03-07 2009-01-29 Cerra Joseph P Using an unstructured language model associated with an application of a mobile communication facility
US20090030688A1 (en) * 2007-03-07 2009-01-29 Cerra Joseph P Tagging speech recognition results based on an unstructured language model for use in a mobile communication facility application
US8996379B2 (en) * 2007-03-07 2015-03-31 Vlingo Corporation Speech recognition text entry for software applications
JP5246751B2 (ja) * 2008-03-31 2013-07-24 独立行政法人理化学研究所 情報処理装置、情報処理方法、およびプログラム
US8107671B2 (en) 2008-06-26 2012-01-31 Microsoft Corporation Script detection service
US8019596B2 (en) * 2008-06-26 2011-09-13 Microsoft Corporation Linguistic service platform
US8266514B2 (en) 2008-06-26 2012-09-11 Microsoft Corporation Map service
US8073680B2 (en) 2008-06-26 2011-12-06 Microsoft Corporation Language detection service
US8311824B2 (en) * 2008-10-27 2012-11-13 Nice-Systems Ltd Methods and apparatus for language identification
US8224641B2 (en) 2008-11-19 2012-07-17 Stratify, Inc. Language identification for documents containing multiple languages
US8224642B2 (en) * 2008-11-20 2012-07-17 Stratify, Inc. Automated identification of documents as not belonging to any language
WO2011096015A1 (ja) * 2010-02-05 2011-08-11 三菱電機株式会社 認識辞書作成装置及び音声認識装置
JP5259020B2 (ja) * 2010-10-01 2013-08-07 三菱電機株式会社 音声認識装置
EP2724261A4 (en) * 2011-06-24 2015-07-29 Google Inc DETECTION OF INITIAL LANGUAGES FOR SEARCH QUESTIONS
GB201216640D0 (en) * 2012-09-18 2012-10-31 Touchtype Ltd Formatting module, system and method for formatting an electronic character sequence
CN103578471B (zh) * 2013-10-18 2017-03-01 威盛电子股份有限公司 语音辨识方法及其电子装置
US9195656B2 (en) * 2013-12-30 2015-11-24 Google Inc. Multilingual prosody generation
US20160035344A1 (en) * 2014-08-04 2016-02-04 Google Inc. Identifying the language of a spoken utterance
US9812128B2 (en) * 2014-10-09 2017-11-07 Google Inc. Device leadership negotiation among voice interface devices
US9858484B2 (en) * 2014-12-30 2018-01-02 Facebook, Inc. Systems and methods for determining video feature descriptors based on convolutional neural networks
US10417555B2 (en) 2015-05-29 2019-09-17 Samsung Electronics Co., Ltd. Data-optimized neural network traversal
US10474753B2 (en) * 2016-09-07 2019-11-12 Apple Inc. Language identification using recurrent neural networks
US10282415B2 (en) 2016-11-29 2019-05-07 Ebay Inc. Language identification for text strings
CN108288078B (zh) * 2017-12-07 2020-09-29 腾讯科技(深圳)有限公司 一种图像中字符识别方法、装置和介质
CN108197087B (zh) * 2018-01-18 2021-11-16 奇安信科技集团股份有限公司 字符编码识别方法及装置
KR102123910B1 (ko) * 2018-04-12 2020-06-18 주식회사 푸른기술 머신 러닝을 이용한 지폐 일련번호 인식 장치 및 방법
EP3564949A1 (en) * 2018-04-23 2019-11-06 Spotify AB Activation trigger processing
JP2020056972A (ja) * 2018-10-04 2020-04-09 富士通株式会社 言語識別プログラム、言語識別方法及び言語識別装置
CN117935785A (zh) * 2019-05-03 2024-04-26 谷歌有限责任公司 用于在端到端模型中跨语言语音识别的基于音素的场境化
US11720752B2 (en) * 2020-07-07 2023-08-08 Sap Se Machine learning enabled text analysis with multi-language support
US20220198155A1 (en) * 2020-12-18 2022-06-23 Capital One Services, Llc Systems and methods for translating transaction descriptions

Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5062143A (en) * 1990-02-23 1991-10-29 Harris Corporation Trigram-based method of language identification
US5548507A (en) * 1994-03-14 1996-08-20 International Business Machines Corporation Language identification process using coded language words
IL109268A (en) * 1994-04-10 1999-01-26 Advanced Recognition Tech Method and system for image recognition
US6615168B1 (en) * 1996-07-26 2003-09-02 Sun Microsystems, Inc. Multilingual agent for use in computer systems
US6009382A (en) * 1996-08-19 1999-12-28 International Business Machines Corporation Word storage table for natural language determination
US6216102B1 (en) * 1996-08-19 2001-04-10 International Business Machines Corporation Natural language determination using partial words
US6415250B1 (en) * 1997-06-18 2002-07-02 Novell, Inc. System and method for identifying language using morphologically-based techniques
CA2242065C (en) * 1997-07-03 2004-12-14 Henry C.A. Hyde-Thomson Unified messaging system with automatic language identification for text-to-speech conversion
JPH1139306A (ja) * 1997-07-16 1999-02-12 Sony Corp 多言語情報の処理システムおよび処理方法
US6047251A (en) * 1997-09-15 2000-04-04 Caere Corporation Automatic language identification system for multilingual optical character recognition
CN1111841C (zh) * 1997-09-17 2003-06-18 西门子公司 在语言识别中通过计算机来确定至少两个单词的序列出现概率的方法
US6157905A (en) * 1997-12-11 2000-12-05 Microsoft Corporation Identifying language and character set of data representing text
US6016471A (en) * 1998-04-29 2000-01-18 Matsushita Electric Industrial Co., Ltd. Method and apparatus using decision trees to generate and score multiple pronunciations for a spelled word
KR100509797B1 (ko) * 1998-04-29 2005-08-23 마쯔시다덴기산교 가부시키가이샤 결정 트리에 의한 스펠형 문자의 복합 발음 발생과 스코어를위한 장치 및 방법
JP2000148754A (ja) * 1998-11-13 2000-05-30 Omron Corp マルチリンガル・システム,マルチリンガル処理方法およびマルチリンガル処理のプログラムを記憶した媒体
US6167369A (en) * 1998-12-23 2000-12-26 Xerox Company Automatic language identification using both N-gram and word information
JP2000250905A (ja) * 1999-02-25 2000-09-14 Fujitsu Ltd 言語処理装置及びそのプログラム記憶媒体
US6182148B1 (en) * 1999-03-18 2001-01-30 Walid, Inc. Method and system for internationalizing domain names
DE19963812A1 (de) * 1999-12-30 2001-07-05 Nokia Mobile Phones Ltd Verfahren zum Erkennen einer Sprache und zum Steuern einer Sprachsyntheseeinheit sowie Kommunikationsvorrichtung
CN1144173C (zh) * 2000-08-16 2004-03-31 财团法人工业技术研究院 概率导向的容错式自然语言理解方法
US7277732B2 (en) * 2000-10-13 2007-10-02 Microsoft Corporation Language input system for mobile devices
FI20010644A (fi) * 2001-03-28 2002-09-29 Nokia Corp Merkkisekvenssin kielen määrittäminen
US7191116B2 (en) * 2001-06-19 2007-03-13 Oracle International Corporation Methods and systems for determining a language of a document

Also Published As

Publication number Publication date
JP2006504173A (ja) 2006-02-02
KR20050070073A (ko) 2005-07-05
AU2003253112A1 (en) 2004-05-13
EP1554670A1 (en) 2005-07-20
EP1554670A4 (en) 2008-09-10
CN1688999B (zh) 2010-04-28
JP2009037633A (ja) 2009-02-19
US20040078191A1 (en) 2004-04-22
CA2500467A1 (en) 2004-05-06
WO2004038606A1 (en) 2004-05-06
CN1688999A (zh) 2005-10-26
KR100714769B1 (ko) 2007-05-04

Legal Events

Date Code Title Description
B08F Application dismissed because of non-payment of annual fees [chapter 8.6 patent gazette]

Free format text: Regarding the 8th annuity.

B08K Patent lapsed as no evidence of payment of the annual fee has been furnished to INPI [chapter 8.11 patent gazette]

Free format text: Regarding dispatch 8.6 published in RPI 2160 of 29/05/2012.