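JP2018513453A - Systems and methods for asymmetrical formatting of word spaces according to the uncertainty between words - Google Patents
Systems and methods for asymmetrical formatting of word spaces according to the uncertainty between words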
- Publication number
- JP2018513453A
- Authority
- JP
- Japan
- Prior art keywords
- word
- space
- text
- key
- vocabulary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/163—Handling of whitespace
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
- G06F40/106—Display of layout of documents; Previewing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
- G06F40/114—Pagination
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
- G06F40/117—Tagging; Marking up; Designating a block; Setting of attributes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/14—Tree-structured documents
- G06F40/143—Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Document Processing Apparatus (AREA)
- Machine Translation (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Circuits Of Receivers In General (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201562131187P | 2015-03-10 | 2015-03-10 | |
| US62/131,187 | 2015-03-10 | | |
| PCT/US2016/021381 WO2016144963A1 (en) | 2015-03-10 | 2016-03-08 | Systems and methods for asymmetrical formatting of word spaces according to the uncertainty between words |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JP2018513453A (ja) | 2018-05-24 |
| JP2018513453A5 | 2019-04-18 |
Family
ID=56879374
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2017545541A (Pending), published as JP2018513453A (ja) | 2015-03-10 | 2016-03-08 | Systems and methods for asymmetrical formatting of word spaces according to the uncertainty between words |
Country Status (9)
| Country | Link |
|---|---|
| US (2) | US10599748B2 |
| EP (1) | EP3268872A4 |
| JP (1) | JP2018513453A |
| KR (1) | KR20170140808A |
| CN (1) | CN107615268B |
| AU (1) | AU2016229923B2 |
| BR (1) | BR112017017612A2 |
| MX (1) | MX2017011452A |
| WO (1) | WO2016144963A1 |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
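| CN109190124B (zh) * | 2018-09-14 | 2019-11-26 | 北京字节跳动网络技术有限公司 | Method and apparatus for word segmentation |
| CN111261162B (zh) * | 2020-03-09 | 2023-04-18 | 北京达佳互联信息技术有限公司 | Speech recognition method, speech recognition apparatus, and storage medium |
| KR102209133B1 (ko) * | 2020-04-27 | 2021-01-28 | 주식회사 뉴로라인즈 | Reading and processing system for material safety data sheets, and operating method therefor |
| CN112016322B (zh) * | 2020-08-28 | 2023-06-27 | 沈阳雅译网络技术有限公司 | Method for restoring run-together word errors in English text |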
| US12086542B2 (en) * | 2021-04-06 | 2024-09-10 | Talent Unlimited Online Services Private Limited | System and method for generating contextualized text using a character-based convolutional neural network architecture |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130047078A1 (en) * | 2007-09-28 | 2013-02-21 | Thomas G. Bever | System, plug-in, and method for improving text composition by modifying character prominence according to assigned character information measures |
Family Cites Families (51)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5146405A (en) * | 1988-02-05 | 1992-09-08 | At&T Bell Laboratories | Methods for part-of-speech determination and usage |
| US20020052903A1 (en) * | 1993-05-31 | 2002-05-02 | Mitsuhiro Aida | Text input method |
| US5579466A (en) * | 1994-09-01 | 1996-11-26 | Microsoft Corporation | Method and system for editing and formatting data in a dialog window |
| AU5969896A (en) * | 1995-06-07 | 1996-12-30 | International Language Engineering Corporation | Machine assisted translation tools |
| US5857212A (en) * | 1995-07-06 | 1999-01-05 | Sun Microsystems, Inc. | System and method for horizontal alignment of tokens in a structural representation program editor |
| US5801679A (en) * | 1996-11-26 | 1998-09-01 | Novell, Inc. | Method and system for determining a cursor location with respect to a plurality of character locations |
| US6240430B1 (en) * | 1996-12-13 | 2001-05-29 | International Business Machines Corporation | Method of multiple text selection and manipulation |
| US20020116196A1 (en) * | 1998-11-12 | 2002-08-22 | Tran Bao Q. | Speech recognizer |
| AU5451800A (en) * | 1999-05-28 | 2000-12-18 | Sehda, Inc. | Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces |
| US7346489B1 (en) * | 1999-07-16 | 2008-03-18 | Language Technologies, Inc. | System and method of determining phrasing in text |
| US7069508B1 (en) | 2000-07-13 | 2006-06-27 | Language Technologies, Inc. | System and method for formatting text according to linguistic, visual and psychological variables |
| US6282327B1 (en) * | 1999-07-30 | 2001-08-28 | Microsoft Corporation | Maintaining advance widths of existing characters that have been resolution enhanced |
| US6477488B1 (en) * | 2000-03-10 | 2002-11-05 | Apple Computer, Inc. | Method for dynamic context scope selection in hybrid n-gram+LSA language modeling |
| US7093240B1 (en) * | 2001-12-20 | 2006-08-15 | Unisys Corporation | Efficient timing chart creation and manipulation |
| US7385606B2 (en) * | 2002-12-18 | 2008-06-10 | Microsoft Corporation | International font measurement system and method |
| US7516404B1 (en) * | 2003-06-02 | 2009-04-07 | Colby Steven M | Text correction |
| US20040253568A1 (en) * | 2003-06-16 | 2004-12-16 | Shaver-Troup Bonnie S. | Method of improving reading of a text |
| US7773248B2 (en) * | 2003-09-30 | 2010-08-10 | Brother Kogyo Kabushiki Kaisha | Device information management system |
| US7292244B2 (en) * | 2004-10-18 | 2007-11-06 | Microsoft Corporation | System and method for automatic label placement on charts |
| US7865525B1 (en) * | 2007-08-02 | 2011-01-04 | Amazon Technologies, Inc. | High efficiency binary encoding |
| US8996682B2 (en) * | 2007-10-12 | 2015-03-31 | Microsoft Technology Licensing, Llc | Automatically instrumenting a set of web documents |
| US8417713B1 (en) * | 2007-12-05 | 2013-04-09 | Google Inc. | Sentiment detection as a ranking signal for reviewable entities |
| US9529974B2 (en) * | 2008-02-25 | 2016-12-27 | Georgetown University | System and method for detecting, collecting, analyzing, and communicating event-related information |
| US20110231755A1 (en) * | 2008-07-14 | 2011-09-22 | Daniel Herzner | Method of formatting text in an electronic document to increase reading speed |
| JP5226425B2 (ja) * | 2008-08-13 | 2013-07-03 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Information processing apparatus, information processing method, and program |
| US20100146444A1 (en) * | 2008-12-05 | 2010-06-10 | Microsoft Corporation | Motion Adaptive User Interface Service |
| US8819541B2 (en) * | 2009-02-13 | 2014-08-26 | Language Technologies, Inc. | System and method for converting the digital typesetting documents used in publishing to a device-specific format for electronic publishing |
| US8306819B2 (en) * | 2009-03-09 | 2012-11-06 | Microsoft Corporation | Enhanced automatic speech recognition using mapping between unsupervised and supervised speech model parameters trained on same acoustic training data |
| US8712774B2 (en) * | 2009-03-30 | 2014-04-29 | Nuance Communications, Inc. | Systems and methods for generating a hybrid text string from two or more text strings generated by multiple automated speech recognition systems |
| US8543914B2 (en) * | 2009-05-22 | 2013-09-24 | Blackberry Limited | Method and device for proportional setting of font attributes |
| EP2517156A4 (en) * | 2009-12-24 | 2018-02-14 | Moodwire, Inc. | System and method for determining sentiment expressed in documents |
| US9026907B2 (en) * | 2010-02-12 | 2015-05-05 | Nicholas Lum | Indicators of text continuity |
| US8959427B1 (en) * | 2011-08-05 | 2015-02-17 | Google Inc. | System and method for JavaScript based HTML website layouts |
| US8862602B1 (en) * | 2011-10-25 | 2014-10-14 | Google Inc. | Systems and methods for improved readability of URLs |
| US9116654B1 (en) * | 2011-12-01 | 2015-08-25 | Amazon Technologies, Inc. | Controlling the rendering of supplemental content related to electronic books |
| CN103106227A (zh) * | 2012-08-03 | 2013-05-15 | 人民搜索网络股份公司 | System and method for finding new words based on web page text |
| JP2016035607A (ja) * | 2012-12-27 | 2016-03-17 | パナソニック株式会社 | Apparatus, method, and program for generating a digest |
| JP2014130445A (ja) * | 2012-12-28 | 2014-07-10 | Toshiba Corp | Information extraction server, information extraction client, information extraction method, and information extraction program |
| IN2013CH00469A * | 2013-01-21 | 2015-07-31 | Keypoint Technologies India Pvt Ltd | |
| CN104063387B (zh) * | 2013-03-19 | 2017-07-28 | 三星电子(中国)研发中心 | Apparatus and method for extracting keywords from text |
| JP6136568B2 (ja) * | 2013-05-23 | 2017-05-31 | 富士通株式会社 | Information processing apparatus and input control program |
| EP2824586A1 (en) * | 2013-07-09 | 2015-01-14 | Universiteit Twente | Method and computer server system for receiving and presenting information to a user in a computer network |
| US20150371120A1 (en) * | 2014-06-18 | 2015-12-24 | Sarfaraz K. Niazi | Visual axis optimization for enhanced readability and comprehension |
| US20160301828A1 (en) * | 2014-06-18 | 2016-10-13 | Sarfaraz K. Niazi | Visual axis optimization for enhanced readability and comprehension |
| US20180018305A1 (en) * | 2015-02-05 | 2018-01-18 | Hewlett-Packard Development Company, L.P. | Character spacing adjustment of text columns |
| US10891699B2 (en) * | 2015-02-09 | 2021-01-12 | Legalogic Ltd. | System and method in support of digital document analysis |
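| CN104915446B (zh) * | 2015-06-29 | 2019-01-29 | 华南理工大学 | Method and system for automatically extracting news-based event evolution relationships |
| CN105373614B (zh) * | 2015-11-24 | 2018-09-28 | 中国科学院深圳先进技术研究院 | Method and system for identifying sub-users based on a user account |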
| US10235348B2 (en) * | 2016-04-12 | 2019-03-19 | Microsoft Technology Licensing, Llc | Assistive graphical user interface for preserving document layout while improving the document readability |
| US10552217B2 (en) * | 2016-08-15 | 2020-02-04 | International Business Machines Corporation | Workload placement in a hybrid cloud environment |
| US10467241B2 (en) * | 2017-03-24 | 2019-11-05 | Ca, Inc. | Dynamically provisioning instances of a single-tenant application for multi-tenant use |
-
2016
- 2016-03-08: US, application US15/549,509, patent US10599748B2 (en), status: Active
- 2016-03-08: EP, application EP16762347.9A, patent EP3268872A4 (en), status: Withdrawn
- 2016-03-08: KR, application KR1020177028553A, patent KR20170140808A (ko), status: Ceased
- 2016-03-08: CN, application CN201680027497.3A, patent CN107615268B (zh), status: Expired - Fee Related
- 2016-03-08: MX, application MX2017011452A, patent MX2017011452A (es), status: unknown
- 2016-03-08: US, application US15/063,794, patent US10157168B2 (en), status: Active
- 2016-03-08: WO, application PCT/US2016/021381, patent WO2016144963A1 (en), status: Ceased
- 2016-03-08: JP, application JP2017545541A, patent JP2018513453A (ja), status: Pending
- 2016-03-08: BR, application BR112017017612A, patent BR112017017612A2 (pt), status: Application Discontinuation
- 2016-03-08: AU, application AU2016229923A, patent AU2016229923B2 (en), status: Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| KR20170140808A (ko) | 2017-12-21 |
| US20170185566A1 (en) | 2017-06-29 |
| EP3268872A4 (en) | 2018-11-21 |
| EP3268872A1 (en) | 2018-01-17 |
| AU2016229923A1 (en) | 2017-09-07 |
| AU2016229923B2 (en) | 2021-01-21 |
| CN107615268A (zh) | 2018-01-19 |
| BR112017017612A2 (pt) | 2018-05-08 |
| US20180039617A1 (en) | 2018-02-08 |
| US10599748B2 (en) | 2020-03-24 |
| US10157168B2 (en) | 2018-12-18 |
| WO2016144963A1 (en) | 2016-09-15 |
| CN107615268B (zh) | 2021-08-24 |
| MX2017011452A (es) | 2018-06-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Pasha et al. | | Madamira: A fast, comprehensive tool for morphological analysis and disambiguation of Arabic. |
| Trujillo | | Translation engines: techniques for machine translation |
| Hardie | | Modest XML for Corpora: Not a standard, but a suggestion |
| US20120290288A1 | | Parsing of text using linguistic and non-linguistic list properties |
| Wilcock | | Introduction to linguistic annotation and text analytics |
| US10599748B2 | | Systems and methods for asymmetrical formatting of word spaces according to the uncertainty between words |
| WO2002021324A1 | | Method and apparatus for summarizing multiple documents using a subsumption model |
| WO2014000764A1 | | A system and method for automatic generation of a reference utility |
| Alotaiby et al. | | Arabic vs. English: Comparative statistical study |
| JP2018513453A5 | | |
| JP4904184B2 | | Learning support apparatus, learning support method, and program therefor |
| Luo et al. | | Web article extraction for web printing: a DOM+ visual based approach |
| KR101926669B1 | | Apparatus and method for generating multiple-choice fill-in-the-blank quizzes using a text embedding model |
| Park et al. | | Automatic analysis of thematic structure in written English |
| Berg et al. | | Are some morphological units more prone to spelling variation than others? A case study using spontaneous handwritten data |
| Rakholia et al. | | The design and implementation of diacritic extraction technique for Gujarati written script using Unicode Transformation Format |
| Benko | | The Aranea Corpora Family: Ten+ Years of Processing Web-Crawled Data |
| JP2009265770A | | Important sentence presentation system |
| Kulkarni et al. | | Machine Learning Techniques for Fake News Detection in Low-Resource Hindi Language: A Comparative Study |
| Kouroupetroglou et al. | | DocEmoX: a system for the typography-derived emotional annotation of documents |
| CN115017885A | | Method for extracting entity relations from text in the electric power domain |
| Ball | | Enhancing digital text collections with detailed metadata to improve retrieval |
| US20150019208A1 | | Method for identifying a set of sentences in a digital document, method for generating a digital document, and associated device |
| JP5676552B2 | | Daily word extraction apparatus, method, and program |
| Browne et al. | | Mondeca for thesaurus, autotagging and ontology management in a public health statistics platform |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20170911 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20190304 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20190304 |
| | A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 20200127 |
| | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20200204 |
| | A601 | Written request for extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A601; Effective date: 20200501 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20200702 |
| | A02 | Decision of refusal | Free format text: JAPANESE INTERMEDIATE CODE: A02; Effective date: 20201117 |