KR0122518B1 - Word separation method and data structure for a data processing system
- Publication number
- KR0122518B1
- Authority
- KR
- South Korea
- Prior art keywords
- words
- word
- data structure
- adjacent
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/53—Processing of non-Latin text
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Document Processing Apparatus (AREA)
- Machine Translation (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US08/025,464 US5448474A (en) | 1993-03-03 | 1993-03-03 | Method for isolation of Chinese words from connected Chinese text |
| US08/025,464 | 1993-03-03 | | |
| US8/025,464 | 1993-03-03 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| KR940022314A (ko) | 1994-10-20 |
| KR0122518B1 (ko) | 1997-11-20 |
Family
ID=21826213
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1019940003001A, Expired - Fee Related, KR0122518B1 (ko) | 1993-03-03 | 1994-02-19 | Word separation method and data structure for a data processing system |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US5448474A |
| JP (1) | JP2741835B2 |
| KR (1) | KR0122518B1 |
| CN (2) | CN1168029C |
| TW (1) | TW261677B |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20110057495A (ko) * | 2009-11-24 | 2011-06-01 | 한국전자통신연구원 | Method and apparatus for Chinese phrase segmentation |
Families Citing this family (34)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO1997040452A1 (en) * | 1996-04-23 | 1997-10-30 | Language Engineering Corporation | Automated natural language translation |
| US6760695B1 (en) | 1992-08-31 | 2004-07-06 | Logovista Corporation | Automated natural language processing |
| US6278967B1 (en) | 1992-08-31 | 2001-08-21 | Logovista Corporation | Automated system for generating natural language translations that are domain-specific, grammar rule-based, and/or based on part-of-speech analysis |
| JPH07182465A (ja) * | 1993-12-22 | 1995-07-21 | Hitachi Ltd | Character recognition method |
| US5806021A (en) * | 1995-10-30 | 1998-09-08 | International Business Machines Corporation | Automatic segmentation of continuous text using statistical approaches |
| US6470306B1 (en) | 1996-04-23 | 2002-10-22 | Logovista Corporation | Automated translation of annotated text based on the determination of locations for inserting annotation tokens and linked ending, end-of-sentence or language tokens |
| CN1114165C (zh) * | 1998-02-13 | 2003-07-09 | 微软公司 | Word segmentation method in Chinese text |
| US6640006B2 (en) | 1998-02-13 | 2003-10-28 | Microsoft Corporation | Word segmentation in chinese text |
| US6175834B1 (en) | 1998-06-24 | 2001-01-16 | Microsoft Corporation | Consistency checker for documents containing japanese text |
| US6694055B2 (en) * | 1998-07-15 | 2004-02-17 | Microsoft Corporation | Proper name identification in chinese |
| JP2000132560A (ja) | 1998-10-23 | 2000-05-12 | Matsushita Electric Ind Co Ltd | Chinese teletext processing method and apparatus |
| CN1143232C (zh) | 1998-11-30 | 2004-03-24 | 皇家菲利浦电子有限公司 | Automatic segmentation of text |
| US7099876B1 (en) | 1998-12-15 | 2006-08-29 | International Business Machines Corporation | Method, system and computer program product for storing transliteration and/or phonetic spelling information in a text string class |
| US6389386B1 (en) | 1998-12-15 | 2002-05-14 | International Business Machines Corporation | Method, system and computer program product for sorting text strings |
| US6460015B1 (en) | 1998-12-15 | 2002-10-01 | International Business Machines Corporation | Method, system and computer program product for automatic character transliteration in a text string object |
| US6496844B1 (en) | 1998-12-15 | 2002-12-17 | International Business Machines Corporation | Method, system and computer program product for providing a user interface with alternative display language choices |
| US6185524B1 (en) | 1998-12-31 | 2001-02-06 | Lernout & Hauspie Speech Products N.V. | Method and apparatus for automatic identification of word boundaries in continuous text and computation of word boundary scores |
| US6731802B1 (en) | 2000-01-14 | 2004-05-04 | Microsoft Corporation | Lattice and method for identifying and normalizing orthographic variations in Japanese text |
| US6968308B1 (en) | 1999-11-17 | 2005-11-22 | Microsoft Corporation | Method for segmenting non-segmented text using syntactic parse |
| US6678409B1 (en) * | 2000-01-14 | 2004-01-13 | Microsoft Corporation | Parameterized word segmentation of unsegmented text |
| US6513003B1 (en) | 2000-02-03 | 2003-01-28 | Fair Disclosure Financial Network, Inc. | System and method for integrated delivery of media and synchronized transcription |
| JP4048169B2 (ja) * | 2001-06-11 | 2008-02-13 | 博 石倉 | System for assisting text input by automatic generation of spaces |
| US20050060150A1 (en) * | 2003-09-15 | 2005-03-17 | Microsoft Corporation | Unsupervised training for overlapping ambiguity resolution in word segmentation |
| US20070214189A1 (en) * | 2006-03-10 | 2007-09-13 | Motorola, Inc. | System and method for consistency checking in documents |
| US8539349B1 (en) | 2006-10-31 | 2013-09-17 | Hewlett-Packard Development Company, L.P. | Methods and systems for splitting a chinese character sequence into word segments |
| US8428932B2 (en) * | 2006-12-13 | 2013-04-23 | Nathan S. Ross | Connected text data stream comprising coordinate logic to identify and validate segmented words in the connected text |
| US9767095B2 (en) | 2010-05-21 | 2017-09-19 | Western Standard Publishing Company, Inc. | Apparatus, system, and method for computer aided translation |
| JP5372110B2 (ja) * | 2011-10-28 | 2013-12-18 | シャープ株式会社 | Information output device, information output method, and computer program |
| IL224482B (en) | 2013-01-29 | 2018-08-30 | Verint Systems Ltd | System and method for keyword spotting using representative dictionary |
| CN103679165B (zh) * | 2013-12-31 | 2017-02-08 | 北京百度网讯科技有限公司 | OCR character recognition method and system |
| JP6476618B2 (ja) * | 2014-07-07 | 2019-03-06 | 富士通株式会社 | Decompression method, decompression program, and decompression device |
| IL242218B (en) | 2015-10-22 | 2020-11-30 | Verint Systems Ltd | A system and method for maintaining a dynamic dictionary |
| IL242219B (en) | 2015-10-22 | 2020-11-30 | Verint Systems Ltd | System and method for keyword searching using both static and dynamic dictionaries |
| CN107168952B (zh) * | 2017-05-15 | 2021-06-04 | 北京百度网讯科技有限公司 | Artificial intelligence-based information generation method and apparatus |
Family Cites Families (16)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4327421A (en) * | 1976-05-13 | 1982-04-27 | Transtech International Corporation | Chinese printing system |
| US4679951A (en) * | 1979-11-06 | 1987-07-14 | Cornell Research Foundation, Inc. | Electronic keyboard system and method for reproducing selected symbolic language characters |
| US4365235A (en) * | 1980-12-31 | 1982-12-21 | International Business Machines Corporation | Chinese/Kanji on-line recognition system |
| US4484305A (en) * | 1981-12-14 | 1984-11-20 | Paul Ho | Phonetic multilingual word processor |
| JPH0724055B2 (ja) * | 1984-07-31 | 1995-03-15 | 株式会社日立製作所 | Word segmentation processing method |
| JPS61105671A (ja) * | 1984-10-29 | 1986-05-23 | Hitachi Ltd | Natural language processing apparatus |
| US4742516A (en) * | 1985-01-14 | 1988-05-03 | Sumitomo Electric Industries, Ltd. | Method for transmitting voice information |
| KR880001588Y1 (ko) * | 1985-02-18 | 1988-05-04 | 최영수 | Word memorization tool |
| JPS61255468A (ja) * | 1985-05-08 | 1986-11-13 | Toshiba Corp | Machine translation processing apparatus |
| JPS6231467A (ja) * | 1985-08-01 | 1987-02-10 | Toshiba Corp | Document creation apparatus |
| US4669901A (en) * | 1985-09-03 | 1987-06-02 | Feng I Ming | Keyboard device for inputting oriental characters by touch |
| GB8629908D0 (en) * | 1986-12-15 | 1987-01-28 | Kemano Ltd | Words & characters computer input device |
| JPS63284676A (ja) * | 1987-05-16 | 1988-11-21 | Ricoh Co Ltd | Character string processing apparatus |
| US5079702A (en) * | 1990-03-15 | 1992-01-07 | Paul Ho | Phonetic multi-lingual word processor |
| JPH04299767A (ja) * | 1991-03-28 | 1992-10-22 | Ricoh Co Ltd | Morphological analysis apparatus |
| US5161245A (en) * | 1991-05-01 | 1992-11-03 | Apple Computer, Inc. | Pattern recognition system having inter-pattern spacing correction |
1993
- 1993-03-03: US application US08/025,464, patent US5448474A, not active (Expired - Fee Related)
1994
- 1994-01-25: JP application JP6006143A, patent JP2741835B2, not active (Expired - Fee Related)
- 1994-02-18: CN application CNB991231104A, patent CN1168029C, not active (Expired - Fee Related)
- 1994-02-18: CN application CN94101382A, patent CN1095576C, not active (Expired - Fee Related)
- 1994-02-19: KR application KR1019940003001A, patent KR0122518B1, not active (Expired - Fee Related)
- 1994-03-03: TW application TW083101864A, patent TW261677B, active
Also Published As
| Publication number | Publication date |
|---|---|
| JPH06325076A (ja) | 1994-11-25 |
| JP2741835B2 (ja) | 1998-04-22 |
| CN1095576C (zh) | 2002-12-04 |
| US5448474A (en) | 1995-09-05 |
| CN1168029C (zh) | 2004-09-22 |
| CN1254891A (zh) | 2000-05-31 |
| TW261677B | 1995-11-01 |
| KR940022314A (ko) | 1994-10-20 |
| CN1100542A (zh) | 1995-03-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| KR0122518B1 (ko) | | Word separation method and data structure for a data processing system |
| JP4162711B2 (ja) | | System and method for portable document indexing using n-gram word decomposition |
| JP3143079B2 (ja) | | Dictionary index creation apparatus and document retrieval apparatus |
| US5704060A (en) | | Text storage and retrieval system and method |
| US7197449B2 (en) | | Method for extracting name entities and jargon terms using a suffix tree data structure |
| KR101122942B1 (ko) | | Method and system for collecting new words for use in word breaking |
| US4991094A (en) | | Method for language-independent text tokenization using a character categorization |
| Angell et al. | | Automatic spelling correction using a trigram similarity measure |
| JPH08249354A (ja) | | Word index, word index creation apparatus, and document retrieval apparatus |
| JPS63231569A (ja) | | Compound word analysis method |
| WO2011086637A1 (ja) | | Requirement extraction system, requirement extraction method, and requirement extraction program |
| JP2001175661A (ja) | | Full-text search apparatus and full-text search method |
| JP3727995B2 (ja) | | Document processing method and apparatus |
| JP3489237B2 (ja) | | Document retrieval method |
| JP3376996B2 (ja) | | Full-text search method |
| JPH07230468A (ja) | | Automatic keyword extraction apparatus and automatic keyword extraction method |
| JPH0574858B2 | | |
| Marukawa et al. | | A High Speed Word Matching Algorithm for Handwritten Chinese Character Recognition. |
| JP3241854B2 (ja) | | Automatic word spelling correction apparatus |
| JPH11191107A (ja) | | Document processing method and apparatus therefor |
| JPH0748218B2 (ja) | | Information processing apparatus |
| JPH09138809A (ja) | | Full-text search method |
| Lap et al. | | Indexing multilingual information on the web |
| Lucas | | Spatially aware rapid retrieval system (SPARRS) |
| Downton et al. | | Interactive archive card index conversion and verification |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A201 | Request for examination | |
| | PA0109 | Patent application | St.27 status event code: A-0-1-A10-A12-nap-PA0109 |
| | PA0201 | Request for examination | St.27 status event code: A-1-2-D10-D11-exm-PA0201 |
| | R17-X000 | Change to representative recorded | St.27 status event code: A-3-3-R10-R17-oth-X000 |
| | PG1501 | Laying open of application | St.27 status event code: A-1-1-Q10-Q12-nap-PG1501 |
| | E902 | Notification of reason for refusal | |
| | PE0902 | Notice of grounds for rejection | St.27 status event code: A-1-2-D10-D21-exm-PE0902 |
| | R17-X000 | Change to representative recorded | St.27 status event code: A-3-3-R10-R17-oth-X000 |
| | P11-X000 | Amendment of application requested | St.27 status event code: A-2-2-P10-P11-nap-X000 |
| | P13-X000 | Application amended | St.27 status event code: A-2-2-P10-P13-nap-X000 |
| | E701 | Decision to grant or registration of patent right | |
| | PE0701 | Decision of registration | St.27 status event code: A-1-2-D10-D22-exm-PE0701 |
| | GRNT | Written decision to grant | |
| | PR0701 | Registration of establishment | St.27 status event code: A-2-4-F10-F11-exm-PR0701 |
| | PR1002 | Payment of registration fee | St.27 status event code: A-2-2-U10-U11-oth-PR1002. Fee payment year number: 1 |
| | PG1601 | Publication of registration | St.27 status event code: A-4-4-Q10-Q13-nap-PG1601 |
| | R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| | R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| | R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| | PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001. Fee payment year number: 4 |
| | PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001. Fee payment year number: 5 |
| | PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001. Fee payment year number: 6 |
| | FPAY | Annual fee payment | Payment date: 20030711. Year of fee payment: 7 |
| | PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001. Fee payment year number: 7 |
| | R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| | R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| | LAPS | Lapse due to unpaid annual fee | |
| | PC1903 | Unpaid annual fee | St.27 status event code: A-4-4-U10-U13-oth-PC1903. Not in force date: 20040906. Payment event data comment text: Termination Category: DEFAULT_OF_REGISTRATION_FEE |
| | PC1903 | Unpaid annual fee | St.27 status event code: N-4-6-H10-H13-oth-PC1903. Ip right cessation event data comment text: Termination Category: DEFAULT_OF_REGISTRATION_FEE. Not in force date: 20040906 |
| | R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| | R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| | R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| | R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| | P22-X000 | Classification modified | St.27 status event code: A-4-4-P10-P22-nap-X000 |
| | P22-X000 | Classification modified | St.27 status event code: A-4-4-P10-P22-nap-X000 |