DE60139323D1 - Apparatus and method for text segmentation based on coherent units

Info

Publication number
DE60139323D1
Authority
DE
Germany
Prior art keywords
segmentation based
text segmentation
coherent units
coherent
units
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
DE60139323T
Other languages
German (de)
English (en)
Inventor
Hiroyuki Shimizu
Shinya Nakagawa
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hewlett Packard Development Co LP
Original Assignee
Hewlett Packard Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2000-10-02
Filing date
2001-10-02
Publication date
2009-09-03
Application filed by Hewlett Packard Co
Application granted
Publication of DE60139323D1
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/137 Hierarchical processing, e.g. outlines
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G06F40/143 Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99943 Generating database or data structure, e.g. via user interface
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99944 Object-oriented database structure
    • Y10S707/99945 Object-oriented database structure processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)
  • Facsimiles In General (AREA)
DE60139323T 2000-10-02 2001-10-02 Apparatus and method for text segmentation based on coherent units Expired - Lifetime DE60139323D1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2000302321A JP4299963B2 (ja) 2000-10-02 2000-10-02 Apparatus and method for segmenting a document based on semantic cohesion
PCT/US2001/030734 WO2002029547A1 (en) 2000-10-02 2001-10-02 Apparatus and method for text segmentation based on coherent units

Publications (1)

Publication Number Publication Date
DE60139323D1 (de) 2009-09-03

Family

ID=18783693

Family Applications (1)

Application Number Title Priority Date Filing Date
DE60139323T Expired - Lifetime DE60139323D1 (de) 2000-10-02 2001-10-02 Apparatus and method for text segmentation based on coherent units

Country Status (5)

Country Link
US (1) US7113897B2
EP (1) EP1301853B1
JP (1) JP4299963B2
DE (1) DE60139323D1
WO (1) WO2002029547A1

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050120011A1 (en) * 2003-11-26 2005-06-02 Word Data Corp. Code, method, and system for manipulating texts
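JP2007241902A (ja) * 2006-03-10 2007-09-20 Univ Of Tsukuba Text data segmentation system, and text data segmentation and hierarchization method
JP5084297B2 (ja) * 2007-02-21 2012-11-28 株式会社野村総合研究所 Conversation analysis apparatus and conversation analysis program
JP4646078B2 (ja) * 2007-03-08 2011-03-09 日本電信電話株式会社 Apparatus and method for extracting sets of mutually related named entities
JP5256654B2 (ja) * 2007-06-29 2013-08-07 富士通株式会社 Text segmentation program, text segmentation apparatus, and text segmentation method
KR101472844B1 (ko) 2007-10-23 2014-12-16 삼성전자 주식회사 Adaptive document display apparatus and method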
US8977539B2 (en) 2009-03-30 2015-03-10 Nec Corporation Language analysis apparatus, language analysis method, and language analysis program
US8434001B2 (en) 2010-06-03 2013-04-30 Rhonda Enterprises, Llc Systems and methods for presenting a content summary of a media item to a user based on a position within the media item
US9326116B2 (en) 2010-08-24 2016-04-26 Rhonda Enterprises, Llc Systems and methods for suggesting a pause position within electronic text
US9069754B2 (en) 2010-09-29 2015-06-30 Rhonda Enterprises, Llc Method, system, and computer readable medium for detecting related subgroups of text in an electronic document
CN104468319B (zh) * 2013-09-18 2018-11-16 阿里巴巴集团控股有限公司 Method and system for merging conversation content
CN104090918B (zh) * 2014-06-16 2017-02-22 北京理工大学 Sentence similarity calculation method based on information content
US10402473B2 (en) * 2016-10-16 2019-09-03 Richard Salisbury Comparing, and generating revision markings with respect to, an arbitrary number of text segments
JP6815184B2 (ja) * 2016-12-13 2021-01-20 株式会社東芝 Information processing apparatus, information processing method, and information processing program
EP3616090A1 (en) * 2017-04-26 2020-03-04 Piksel, Inc. Multimedia stream analysis and retrieval
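JP6564811B2 (ja) * 2017-05-18 2019-08-21 日本電信電話株式会社 Passage presentation control apparatus, passage presentation method, and passage presentation program
CN109492659B (zh) * 2018-09-25 2021-10-01 维灵(杭州)信息技术有限公司 Method for calculating curve similarity for comparing ECG and EEG waveforms
JP7148077B2 (ja) * 2019-02-28 2022-10-05 日本電信電話株式会社 Tree structure analysis apparatus, method, and program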
US11748571B1 (en) * 2019-05-21 2023-09-05 Educational Testing Service Text segmentation with two-level transformer and auxiliary coherence modeling
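CN111797634B (zh) * 2020-06-04 2023-09-08 语联网(武汉)信息技术有限公司 Document segmentation method and apparatus
CN112597422A (zh) * 2020-12-30 2021-04-02 深圳市世强元件网络有限公司 PDF file splitting method and method for loading PDF files in a web page
CN118446213B (zh) * 2024-04-29 2025-01-14 北京医二科技有限公司 Text segmentation method and apparatus, computer program product, and electronic device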

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ATE260486T1 (de) * 1992-07-31 2004-03-15 Ibm Finding character strings in a database of character strings
US5778397A (en) * 1995-06-28 1998-07-07 Xerox Corporation Automatic method of generating feature probabilities for automatic extracting summarization
US5761191A (en) * 1995-11-28 1998-06-02 Telecommunications Techniques Corporation Statistics collection for ATM networks
US6052657A (en) * 1997-09-09 2000-04-18 Dragon Systems, Inc. Text segmentation and identification of topic using language models
JPH11235574A (ja) 1998-02-24 1999-08-31 Hitachi Kasei Techno Plant Kk Recycling apparatus and recycling apparatus for waste film cartridges
JP3578618B2 (ja) 1998-02-26 2004-10-20 株式会社リコー Document segmentation apparatus
JP3597697B2 (ja) * 1998-03-20 2004-12-08 富士通株式会社 Document summarization apparatus and method
US6714909B1 (en) * 1998-08-13 2004-03-30 At&T Corp. System and method for automated multimedia content indexing and retrieval
US6185524B1 (en) * 1998-12-31 2001-02-06 Lernout & Hauspie Speech Products N.V. Method and apparatus for automatic identification of word boundaries in continuous text and computation of word boundary scores
US6317708B1 (en) * 1999-01-07 2001-11-13 Justsystem Corporation Method for producing summaries of text document
JP2000235574A (ja) 1999-02-16 2000-08-29 Ricoh Co Ltd Document processing apparatus
US6611825B1 (en) * 1999-06-09 2003-08-26 The Boeing Company Method and system for text mining using multidimensional subspaces
US6411962B1 (en) * 1999-11-29 2002-06-25 Xerox Corporation Systems and methods for organizing text
US6675174B1 (en) * 2000-02-02 2004-01-06 International Business Machines Corp. System and method for measuring similarity between a set of known temporal media segments and a one or more temporal media streams

Also Published As

Publication number Publication date
US7113897B2 (en) 2006-09-26
JP2002117019A (ja) 2002-04-19
JP4299963B2 (ja) 2009-07-22
WO2002029547A1 (en) 2002-04-11
EP1301853A1 (en) 2003-04-16
US20030081811A1 (en) 2003-05-01
EP1301853B1 (en) 2009-07-22
WO2002029547A9 (en) 2005-03-17
EP1301853A4 (en) 2007-03-14

Similar Documents

Publication Publication Date Title
DE60139323D1 (de) Apparatus and method for text segmentation based on coherent units
DE69739437D1 (de) Apparatus and method for text input
DE69736655D1 (de) Apparatus and method for text input
DE60138564D1 (de) Verfahren und vorrichtung zur austenitischen schwe
DE50012668D1 (de) Method and apparatus for evaluating driving style
DE69927457D1 (de) Method and apparatus for caching information in the network
DE50100861D1 (de) Method and apparatus for laser microdissection
DE60026253D1 (de) Method and apparatus for encrypting data content
DE60133865D1 (de) Method for driving an electro-optical device, electro-optical device, and electronic apparatus
DE60030658D1 (de) Method and apparatus for inspecting objects
DE60036939D1 (de) Method and apparatus for marking defects
DE69938403D1 (de) Method and apparatus for route calculation
DE60040985D1 (de) Method and apparatus for Internet display
DE69931004D1 (de) Method and apparatus for data processing
DE10196012T1 (de) Method and apparatus for viscosity measurement
DE69923346D1 (de) Apparatus and method for IP communication with speech-generated text
DE50100784D1 (de) Method and apparatus for laser microdissection
DE60033330D1 (de) Method and apparatus for block noise detection
DE69904764D1 (de) Method and apparatus for pattern recognition
DE60029716D1 (de) Method and apparatus for recording information in units
DE69942295D1 (de) Apparatus and method for information processing
DE60016639D1 (de) Method and apparatus for path search
DE10084702T1 (de) Method and apparatus for environmental monitoring
DE60201620D1 (de) Apparatus and method for data communication based on OFDM
DE60039778D1 (de) Verfahren und vorrichtung zur automatisierung der

Legal Events

Date Code Title Description
8327 Change in the person/name/address of the patent owner

Owner name: HEWLETT-PACKARD DEVELOPMENT CO., L.P., HOUSTON, US

8328 Change in the person/name/address of the agent

Representative's name: SCHOPPE, ZIMMERMANN, STOECKELER & ZINKLER, 82049 P

8364 No opposition during term of opposition