KR930014120A - Method and apparatus for word disambiguation - Google Patents

Method and apparatus for word disambiguation

Info

Publication number
KR930014120A
Authority
KR
South Korea
Prior art keywords
word
meaning
pair
words
mean
Prior art date
Application number
KR1019920027574A
Inventor
Kenneth Ward Church
William Arthur Gale
David Eric Yarowsky
Original Assignee
Gordon E. Nelson
American Telephone and Telegraph Company
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gordon E. Nelson, American Telephone and Telegraph Company
Publication of KR930014120A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/44 Statistical methods, e.g. probability models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/45 Example-based machine translation; Alignment

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

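The present invention relates to apparatus and methods for determining whether a word/sense pair is proper for a context. Wide contexts (100 words) are employed for both training and testing, and testing is done by summing the weights of the vocabulary words in the context. The weights, which interpolate between the probability that a vocabulary word occurs in a conditional sample of the training text and the probability that it occurs in the entire training text, are determined using Bayesian techniques. A further refinement in testing exploits the fact that a word is used in only a single sense within a single discourse. Automated training techniques are also described, including training on bilingual bodies of text and training that uses the categories of Roget's Thesaurus.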
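
The testing step summarized in the abstract (take a wide context around the target position, look up a weight for each context word, and sum the weights for each candidate word/sense pair) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the table `WEIGHTS`, the sense labels, the weight values, and the function names are all hypothetical, and only the 100-word default width comes from the text.

```python
from typing import Dict, List

# Hypothetical weight table: word/sense label -> {context word -> weight}.
# The values are invented for illustration only.
WEIGHTS: Dict[str, Dict[str, float]] = {
    "bank/money": {"loan": 2.0, "deposit": 1.5, "river": -1.0},
    "bank/river": {"river": 2.5, "water": 1.0, "loan": -1.5},
}


def context_window(tokens: List[str], position: int, width: int = 100) -> List[str]:
    """Return up to `width` tokens around `position`, excluding the target token."""
    half = width // 2
    start = max(0, position - half)
    end = min(len(tokens), position + half + 1)
    return tokens[start:position] + tokens[position + 1:end]


def score_sense(sense: str, context: List[str]) -> float:
    """Sum the table weights of the context words; unknown words contribute 0."""
    table = WEIGHTS[sense]
    return sum(table.get(word, 0.0) for word in context)


def best_sense(tokens: List[str], position: int, senses: List[str]) -> str:
    """Pick the candidate word/sense pair with the largest summed weight."""
    context = context_window(tokens, position)
    return max(senses, key=lambda sense: score_sense(sense, context))
```

With the hypothetical table above, calling `best_sense` on the tokens of "she walked along the river to the bank near the water" at position 7 selects `"bank/river"`, because the context words "river" and "water" carry positive weights for that sense.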

Description

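Method and apparatus for word disambiguation
This publication discloses only the essential parts, so the full text is not included.
FIG. 1 is a block diagram of an apparatus for determining the probability that a word/sense pair is proper for a context.
FIG. 2 is a table of data from which table 107 of FIG. 1 can be made.
FIG. 3 is an example of part of a conditional sample.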

Claims (18)

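  1. A method of automatically determining whether a word/sense pair has a sense proper to a given position in a text, comprising the steps of: determining a sequence of words in the text that includes the given position and is substantially longer than a single line of the text; and determining whether the word/sense pair has the proper sense by automatically analyzing the sequence.
  2. The method of claim 1, wherein the step of determining the sequence of words determines a sequence whose length is approximately 100 words.
  3. The method of claim 1, further comprising the steps of: determining whether the determination indicates sufficiently strongly that the word/sense pair has the proper sense; and, if it does not, making a final determination of the proper sense by comparing the determination with other determinations of the proper sense at neighboring positions.
  4. The method of claim 1, wherein, in the step of making the determination, the sequence is automatically analyzed by employing a Bayesian discrimination technique involving the sequence and the sense of the word/sense pair, in order to determine the probability that the word/sense pair has a sense proper to the given position.
  5. The method of claim 1, wherein there are a plurality of word/sense pairs, the step of making the determination is performed for each of the word/sense pairs, and the method further comprises the step of selecting the word/sense pair most proper to the given position.
  6. The method of claim 5, wherein the step of selecting the word/sense pair includes the step of comparing the determination with other determinations of the most proper sense at neighboring positions.
  7. The method of claim 1, wherein the step of making the determination comprises the steps of: obtaining, for the words in the sequence, weights from a table that indicates the weights of certain words for determining whether the word/sense pair has the proper sense; and summing the weights to determine the likelihood that the word/sense pair is proper for the given position.
  8. The method of claim 7, wherein there are a plurality of word/sense pairs, the steps of obtaining the weights and summing them are performed for each word/sense pair, and the method further comprises the step of selecting the word/sense pair whose summed weights indicate the greatest likelihood that the word/sense pair is proper for the given position.
  9. The method of claim 8, wherein the step of selecting the word/sense pair includes the step of comparing the selected word/sense pair with other determinations of the most proper sense at neighboring positions.
  10. The method of any one of claims 1 to 9, wherein the word of the word/sense pair occurs at the given position.
  11. An apparatus for determining whether a word/sense pair has a sense proper to a given position in a text, comprising: means for obtaining a sequence of words in the text that is substantially longer than a single line of the text and includes the given position; and means for analyzing the sequence to determine whether the word/sense pair has the proper sense.
  12. The apparatus of claim 11, wherein the means for analyzing the sequence further includes a table indicating the weights of certain words for determining whether the word/sense pair has the proper sense, and the means for analyzing the sequence analyzes the sequence by obtaining the weights of the words in the sequence from the table and summing those weights, thereby determining the likelihood that the word/sense pair is proper for the given position.
  13. A method of making a probability table for use in an apparatus for determining whether a word/sense pair has a sense proper to a given position in a text, comprising the steps of: making a conditional sample of a text corpus that includes contexts semantically related to the sense of the word/sense pair; for each word occurring in the conditional sample, employing a Bayesian technique to determine the weight of the word in the conditional sample with regard to the probability that the word of the word/sense pair has the sense of the word/sense pair; and making, for each word whose weight exceeds a given weight, a table entry containing the weight of that word.
  14. The method of claim 13, wherein, in the step of employing the Bayesian technique, the weight (wt) of each word appearing in the conditional sample is determined by the expression [expression not reproduced in the source], where π is the conditional probability of the word, a is the frequency of the word in the conditional sample, and E is the expectation of π given a.
  15. The method of claim 13, wherein there exists a translation into another language of the text corpus from which the conditional sample is made, and, in the step of making the conditional sample, whether the word of the word/sense pair is used in the sense of the word/sense pair is determined by reference to the word used for it in the translation.
  16. The method of claim 13, wherein there exists a list of categorized words belonging to the same semantic category as the word/sense pair, and, in the step of making the conditional sample, the contexts contain words of the list.
  17. The method of claim 16, wherein, in the step of employing the Bayesian technique, the weight (wt) of each word occurring in the conditional sample is determined by the expression [expression not reproduced in the source], where π is the conditional probability of the word, a is the frequency of the word in the conditional sample, and E is the expectation of π given a.
  18. The method of claim 17, wherein, in the step of employing the Bayesian technique, a is derived by determining, for each word of the list contained in a context, the number of times k that the list word occurs in the corpus, and having every word in the context of such a word contribute a weight of 1/k to a.
    ※ Note: Published according to the contents of the original application.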
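
The table-building steps of claims 13, 16 and 18 can be sketched as below: collect contexts around words from a category list, let every context word contribute 1/k to its frequency a (k being the corpus count of the list word), and keep only words whose weight exceeds a threshold. Because the weight expression of claims 14 and 17 is not reproduced in this record, the sketch substitutes a plain fixed-ratio interpolation between the conditional-sample probability and the whole-corpus probability. That substitute, the parameter `lam`, the default threshold, and all function names are assumptions made for illustration, not the patent's formula.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Set


def fractional_counts(tokens: List[str], category_words: Set[str],
                      width: int = 100) -> Dict[str, float]:
    """Frequency `a` per context word: each context of a list word adds 1/k,
    where k is how often that list word occurs in the corpus (claim 18)."""
    corpus_counts = Counter(tokens)
    half = width // 2
    a: Dict[str, float] = defaultdict(float)
    for i, token in enumerate(tokens):
        if token not in category_words:
            continue
        k = corpus_counts[token]
        context = tokens[max(0, i - half):i] + tokens[i + 1:i + half + 1]
        for word in context:
            a[word] += 1.0 / k
    return dict(a)


def build_weight_table(tokens: List[str], category_words: Set[str],
                       lam: float = 0.5, threshold: float = 0.0,
                       width: int = 100) -> Dict[str, float]:
    """Weight each context word and keep those above `threshold` (claim 13).

    ASSUMPTION: the estimate below is a simple stand-in for the Bayesian
    expectation E of pi given a; the patent's own expression is not
    available in this record.
    """
    a = fractional_counts(tokens, category_words, width)
    total_a = sum(a.values())
    global_counts = Counter(tokens)
    n = len(tokens)
    table: Dict[str, float] = {}
    for word, a_w in a.items():
        p_global = global_counts[word] / n   # probability in the whole corpus
        p_local = a_w / total_a              # probability in the conditional sample
        estimate = lam * p_local + (1.0 - lam) * p_global
        weight = math.log(estimate / p_global)
        if weight > threshold:
            table[word] = weight
    return table
```

A fixed `lam` keeps the sketch short; a Bayesian treatment would presumably let the degree of interpolation depend on how much evidence the conditional sample provides for each word.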
KR1019920027574A 1991-12-30 1992-12-30 Method and apparatus for word disambiguation KR930014120A (ko)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US814,850 1991-12-30
US07/814,850 US5541836A (en) 1991-12-30 1991-12-30 Word disambiguation apparatus and methods

Publications (1)

Publication Number Publication Date
KR930014120A 1993-07-22

Family

ID=25216159

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1019920027574A KR930014120A (ko) 1991-12-30 1992-12-30 Method and apparatus for word disambiguation

Country Status (5)

Country Link
US (1) US5541836A (ko)
EP (1) EP0550160A2 (ko)
JP (1) JPH05242138A (ko)
KR (1) KR930014120A (ko)
CA (1) CA2083733A1 (ko)

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
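JPS6140672A (ja) * 1984-07-31 1986-02-26 Hitachi Ltd Processing method for resolving multiple parts of speech
JPS61105671A (ja) * 1984-10-29 1986-05-23 Hitachi Ltd Natural language processing device
JPH083815B2 (ja) * 1985-10-25 1996-01-17 Hitachi Ltd Method for maintaining a natural language co-occurrence relation dictionary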
US4930077A (en) * 1987-04-06 1990-05-29 Fan David P Information processing expert system for text analysis and predicting public opinion based information available to the public
US4868750A (en) * 1987-10-07 1989-09-19 Houghton Mifflin Company Collocational grammar system
US5146405A (en) * 1988-02-05 1992-09-08 At&T Bell Laboratories Methods for part-of-speech determination and usage
US4914590A (en) * 1988-05-18 1990-04-03 Emhart Industries, Inc. Natural language understanding system
NL8900587A (nl) * 1989-03-10 1990-10-01 Bso Buro Voor Systeemontwikkel Method for determining the semantic relatedness of lexical components in a text.
US5170349A (en) * 1989-03-14 1992-12-08 Canon Kabushiki Kaisha Text processing apparatus using modification relation analysis
JPH02242372A (ja) * 1989-03-15 1990-09-26 Toshiba Corp Sentence generating device
JPH02308370A (ja) * 1989-05-24 1990-12-21 Toshiba Corp Machine translation system
US5056021A (en) * 1989-06-08 1991-10-08 Carolyn Ausborn Method and apparatus for abstracting concepts from natural language
US5243520A (en) * 1990-08-21 1993-09-07 General Electric Company Sense discrimination system and method
EP0494573A1 (en) * 1991-01-08 1992-07-15 International Business Machines Corporation Method for automatically disambiguating the synonymic links in a dictionary for a natural language processing system

Also Published As

Publication number Publication date
EP0550160A2 (en) 1993-07-07
JPH05242138A (ja) 1993-09-21
US5541836A (en) 1996-07-30
EP0550160A3 (ko) 1994-01-12
CA2083733A1 (en) 1993-07-01

Legal Events

Date Code Title Description
WITN Application deemed withdrawn, e.g. because no request for examination was filed or no examination fee was paid