KR0122518B1 - Word separation method and data structure for a data processing system - Google Patents

Word separation method and data structure for a data processing system

Info

Publication number
KR0122518B1
Authority
KR
South Korea
Prior art keywords
words
word
data structure
adjacent
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
KR1019940003001A
Other languages
English (en)
Korean (ko)
Other versions
KR940022314A (ko)
Inventor
Antonio Zamora
Original Assignee
William T. Ellis
International Business Machines Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by William T. Ellis, International Business Machines Corporation
Publication of KR940022314A
Application granted
Publication of KR0122518B1
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/53 Processing of non-Latin text

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)
  • Machine Translation (AREA)
KR1019940003001A 1993-03-03 1994-02-19 Word separation method and data structure for a data processing system Expired - Fee Related KR0122518B1 (ko)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US08/025,464 US5448474A (en) 1993-03-03 1993-03-03 Method for isolation of Chinese words from connected Chinese text
US08/025,464 1993-03-03
US8/025,464 1993-03-03

Publications (2)

Publication Number Publication Date
KR940022314A (ko) 1994-10-20
KR0122518B1 (ko) 1997-11-20

Family

ID=21826213

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1019940003001A Expired - Fee Related KR0122518B1 (ko) 1993-03-03 1994-02-19 Word separation method and data structure for a data processing system

Country Status (5)

Country Link
US (1) US5448474A (en)
JP (1) JP2741835B2 (en)
KR (1) KR0122518B1 (en)
CN (2) CN1168029C (en)
TW (1) TW261677B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110057495A (ko) * 2009-11-24 2011-06-01 Electronics and Telecommunications Research Institute Method and apparatus for Chinese phrase segmentation

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997040452A1 (en) * 1996-04-23 1997-10-30 Language Engineering Corporation Automated natural language translation
US6760695B1 (en) 1992-08-31 2004-07-06 Logovista Corporation Automated natural language processing
US6278967B1 (en) 1992-08-31 2001-08-21 Logovista Corporation Automated system for generating natural language translations that are domain-specific, grammar rule-based, and/or based on part-of-speech analysis
JPH07182465A (ja) * 1993-12-22 1995-07-21 Hitachi Ltd Character recognition method
US5806021A (en) * 1995-10-30 1998-09-08 International Business Machines Corporation Automatic segmentation of continuous text using statistical approaches
US6470306B1 (en) 1996-04-23 2002-10-22 Logovista Corporation Automated translation of annotated text based on the determination of locations for inserting annotation tokens and linked ending, end-of-sentence or language tokens
CN1114165C (zh) * 1998-02-13 2003-07-09 Microsoft Corporation Word segmentation method in Chinese text
US6640006B2 (en) 1998-02-13 2003-10-28 Microsoft Corporation Word segmentation in chinese text
US6175834B1 (en) 1998-06-24 2001-01-16 Microsoft Corporation Consistency checker for documents containing japanese text
US6694055B2 (en) * 1998-07-15 2004-02-17 Microsoft Corporation Proper name identification in chinese
JP2000132560A (ja) 1998-10-23 2000-05-12 Matsushita Electric Ind Co Ltd Chinese teletext processing method and apparatus
CN1143232C (zh) 1998-11-30 2004-03-24 Koninklijke Philips Electronics N.V. Automatic segmentation of text
US7099876B1 (en) 1998-12-15 2006-08-29 International Business Machines Corporation Method, system and computer program product for storing transliteration and/or phonetic spelling information in a text string class
US6389386B1 (en) 1998-12-15 2002-05-14 International Business Machines Corporation Method, system and computer program product for sorting text strings
US6460015B1 (en) 1998-12-15 2002-10-01 International Business Machines Corporation Method, system and computer program product for automatic character transliteration in a text string object
US6496844B1 (en) 1998-12-15 2002-12-17 International Business Machines Corporation Method, system and computer program product for providing a user interface with alternative display language choices
US6185524B1 (en) 1998-12-31 2001-02-06 Lernout & Hauspie Speech Products N.V. Method and apparatus for automatic identification of word boundaries in continuous text and computation of word boundary scores
US6731802B1 (en) 2000-01-14 2004-05-04 Microsoft Corporation Lattice and method for identifying and normalizing orthographic variations in Japanese text
US6968308B1 (en) 1999-11-17 2005-11-22 Microsoft Corporation Method for segmenting non-segmented text using syntactic parse
US6678409B1 (en) * 2000-01-14 2004-01-13 Microsoft Corporation Parameterized word segmentation of unsegmented text
US6513003B1 (en) 2000-02-03 2003-01-28 Fair Disclosure Financial Network, Inc. System and method for integrated delivery of media and synchronized transcription
JP4048169B2 (ja) * 2001-06-11 2008-02-13 博 石倉 System for supporting text input by automatically generating spaces
US20050060150A1 (en) * 2003-09-15 2005-03-17 Microsoft Corporation Unsupervised training for overlapping ambiguity resolution in word segmentation
US20070214189A1 (en) * 2006-03-10 2007-09-13 Motorola, Inc. System and method for consistency checking in documents
US8539349B1 (en) 2006-10-31 2013-09-17 Hewlett-Packard Development Company, L.P. Methods and systems for splitting a chinese character sequence into word segments
US8428932B2 (en) * 2006-12-13 2013-04-23 Nathan S. Ross Connected text data stream comprising coordinate logic to identify and validate segmented words in the connected text
US9767095B2 (en) 2010-05-21 2017-09-19 Western Standard Publishing Company, Inc. Apparatus, system, and method for computer aided translation
JP5372110B2 (ja) * 2011-10-28 2013-12-18 Sharp Corporation Information output device, information output method, and computer program
IL224482B (en) 2013-01-29 2018-08-30 Verint Systems Ltd System and method for keyword spotting using representative dictionary
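CN103679165B (zh) * 2013-12-31 2017-02-08 Beijing Baidu Netcom Science and Technology Co., Ltd. OCR character recognition method and system
JP6476618B2 (ja) * 2014-07-07 2019-03-06 Fujitsu Limited Decompression method, decompression program, and decompression device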
IL242218B (en) 2015-10-22 2020-11-30 Verint Systems Ltd A system and method for maintaining a dynamic dictionary
IL242219B (en) 2015-10-22 2020-11-30 Verint Systems Ltd System and method for keyword searching using both static and dynamic dictionaries
CN107168952B (zh) * 2017-05-15 2021-06-04 Beijing Baidu Netcom Science and Technology Co., Ltd. Artificial intelligence-based information generation method and apparatus

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4327421A (en) * 1976-05-13 1982-04-27 Transtech International Corporation Chinese printing system
US4679951A (en) * 1979-11-06 1987-07-14 Cornell Research Foundation, Inc. Electronic keyboard system and method for reproducing selected symbolic language characters
US4365235A (en) * 1980-12-31 1982-12-21 International Business Machines Corporation Chinese/Kanji on-line recognition system
US4484305A (en) * 1981-12-14 1984-11-20 Paul Ho Phonetic multilingual word processor
JPH0724055B2 (ja) * 1984-07-31 1995-03-15 Hitachi, Ltd. Word segmentation processing method
JPS61105671A (ja) * 1984-10-29 1986-05-23 Hitachi Ltd Natural language processing device
US4742516A (en) * 1985-01-14 1988-05-03 Sumitomo Electric Industries, Ltd. Method for transmitting voice information
KR880001588Y1 (ko) * 1985-02-18 1988-05-04 최영수 Word memorization tool
JPS61255468A (ja) * 1985-05-08 1986-11-13 Toshiba Corp Machine translation processing device
JPS6231467A (ja) * 1985-08-01 1987-02-10 Toshiba Corp Text creation device
US4669901A (en) * 1985-09-03 1987-06-02 Feng I Ming Keyboard device for inputting oriental characters by touch
GB8629908D0 (en) * 1986-12-15 1987-01-28 Kemano Ltd Words & characters computer input device
JPS63284676A (ja) * 1987-05-16 1988-11-21 Ricoh Co Ltd Character string processing device
US5079702A (en) * 1990-03-15 1992-01-07 Paul Ho Phonetic multi-lingual word processor
JPH04299767A (ja) * 1991-03-28 1992-10-22 Ricoh Co Ltd Morphological analysis device
US5161245A (en) * 1991-05-01 1992-11-03 Apple Computer, Inc. Pattern recognition system having inter-pattern spacing correction

Also Published As

Publication number Publication date
JPH06325076A (ja) 1994-11-25
JP2741835B2 (ja) 1998-04-22
CN1095576C (zh) 2002-12-04
US5448474A (en) 1995-09-05
CN1168029C (zh) 2004-09-22
CN1254891A (zh) 2000-05-31
TW261677B (en) 1995-11-01
KR940022314A (ko) 1994-10-20
CN1100542A (zh) 1995-03-22

Similar Documents

Publication Publication Date Title
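KR0122518B1 (ko) Word separation method and data structure for a data processing system
JP4162711B2 (ja) System and method for portable document indexing using n-gram word decomposition
JP3143079B2 (ja) Dictionary index creation device and document retrieval device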
US5704060A (en) Text storage and retrieval system and method
US7197449B2 (en) Method for extracting name entities and jargon terms using a suffix tree data structure
KR101122942B1 (ko) Method and system for collecting new words for use in word breaking
US4991094A (en) Method for language-independent text tokenization using a character categorization
Angell et al. Automatic spelling correction using a trigram similarity measure
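JPH08249354A (ja) Word index, word index creation device, and document retrieval device
JPS63231569A (ja) Method for analyzing compound words
WO2011086637A1 (ja) Requirements extraction system, requirements extraction method, and requirements extraction program
JP2001175661A (ja) Full-text search device and full-text search method
JP3727995B2 (ja) Document processing method and apparatus
JP3489237B2 (ja) Document retrieval method
JP3376996B2 (ja) Full-text search method
JPH07230468A (ja) Automatic keyword extraction device and automatic keyword extraction method
JPH0574858B2 (en)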
Marukawa et al. A High Speed Word Matching Algorithm for Handwritten Chinese Character Recognition.
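JP3241854B2 (ja) Automatic word spelling correction device
JPH11191107A (ja) Document processing method and apparatus therefor
JPH0748218B2 (ja) Information processing device
JPH09138809A (ja) Full-text search method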
Lap et al. Indexing multilingual information on the web
Lucas Spatially aware rapid retrieval system (SPARRS)
Downton et al. Interactive archive card index conversion and verification

Legal Events

Date Code Title Description
A201 Request for examination
PA0109 Patent application

St.27 status event code: A-0-1-A10-A12-nap-PA0109

PA0201 Request for examination

St.27 status event code: A-1-2-D10-D11-exm-PA0201

R17-X000 Change to representative recorded

St.27 status event code: A-3-3-R10-R17-oth-X000

PG1501 Laying open of application

St.27 status event code: A-1-1-Q10-Q12-nap-PG1501

E902 Notification of reason for refusal
PE0902 Notice of grounds for rejection

St.27 status event code: A-1-2-D10-D21-exm-PE0902

R17-X000 Change to representative recorded

St.27 status event code: A-3-3-R10-R17-oth-X000

P11-X000 Amendment of application requested

St.27 status event code: A-2-2-P10-P11-nap-X000

P13-X000 Application amended

St.27 status event code: A-2-2-P10-P13-nap-X000

E701 Decision to grant or registration of patent right
PE0701 Decision of registration

St.27 status event code: A-1-2-D10-D22-exm-PE0701

GRNT Written decision to grant
PR0701 Registration of establishment

St.27 status event code: A-2-4-F10-F11-exm-PR0701

PR1002 Payment of registration fee

St.27 status event code: A-2-2-U10-U11-oth-PR1002

Fee payment year number: 1

PG1601 Publication of registration

St.27 status event code: A-4-4-Q10-Q13-nap-PG1601

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 4

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 5

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 6

FPAY Annual fee payment

Payment date: 20030711

Year of fee payment: 7

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 7

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

LAPS Lapse due to unpaid annual fee
PC1903 Unpaid annual fee

St.27 status event code: A-4-4-U10-U13-oth-PC1903

Not in force date: 20040906

Payment event data comment text: Termination Category : DEFAULT_OF_REGISTRATION_FEE

PC1903 Unpaid annual fee

St.27 status event code: N-4-6-H10-H13-oth-PC1903

Ip right cessation event data comment text: Termination Category : DEFAULT_OF_REGISTRATION_FEE

Not in force date: 20040906

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

P22-X000 Classification modified

St.27 status event code: A-4-4-P10-P22-nap-X000

P22-X000 Classification modified

St.27 status event code: A-4-4-P10-P22-nap-X000