CA2917153A1 - Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus
- Publication number
- CA2917153A1
- Authority
- CA
- Canada
- Prior art keywords
- computer
- text
- corpus
- relation
- discourse
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
Abstract
The present invention relates to a method and a system for predicting implicit rhetorical relations between two spans of a text, for example in a large annotated corpus such as the Penn Discourse Treebank ("PDTB"), the Rhetorical Structure Theory corpus, and the Discourse Graph Bank, and in particular for determining a rhetorical relation in the absence of an explicit discourse marker. Surface-level features may be used to capture pragmatic information encoded in the absent marker. In one approach, a simplified feature determined from raw text and semantic functions alone is used to improve performance across all relations. By using surface-level features to predict implicit rhetorical relations for the large annotated corpus, the invention approaches a theoretical maximum performance, suggesting that more data will not necessarily improve performance based on these and similar features.
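To make the idea of surface-level features concrete, here is a minimal Python sketch that turns two raw text spans into a sparse feature bag. The abstract does not enumerate the features, so the choices below (cross-span word pairs plus the first and last token of each span) are illustrative assumptions, and the names `tokenize` and `surface_features` are hypothetical rather than taken from the patent.

```python
import re
from collections import Counter
from itertools import product


def tokenize(text):
    """Lowercase the text and split it into word tokens (no external NLP library)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def surface_features(arg1, arg2):
    """Build a sparse bag of surface-level features for one pair of text spans.

    Assumption: these feature templates are illustrative only; they are not
    the feature set claimed in the patent.
    """
    t1, t2 = tokenize(arg1), tokenize(arg2)
    feats = Counter()
    # Cross-product word pairs between the two spans.
    for w1, w2 in product(t1, t2):
        feats[f"pair={w1}|{w2}"] += 1
    # First and last token of each span, positions where a connective often sits.
    for name, toks in (("arg1", t1), ("arg2", t2)):
        if toks:
            feats[f"{name}_first={toks[0]}"] += 1
            feats[f"{name}_last={toks[-1]}"] += 1
    return feats


example = surface_features("The market fell sharply.", "Investors were nervous.")
print(example["arg2_first=investors"], example["pair=fell|nervous"])
```

A feature bag like this could be fed to any standard classifier trained on relation labels from an annotated corpus; the choice of classifier is outside what the abstract states.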
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361842635P | 2013-07-03 | 2013-07-03 | |
US61/842,635 | 2013-07-03 | ||
PCT/US2014/045432 WO2015003143A2 (fr) | 2013-07-03 | 2014-07-03 | Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus |
Publications (2)
Publication Number | Publication Date |
---|---|
CA2917153A1 (fr) | 2015-01-08 |
CA2917153C (fr) | 2022-05-17 |
Family
ID=52144292
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA2917153A Active CA2917153C (fr) | 2013-07-03 | 2014-07-03 | Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus |
Country Status (3)
Country | Link |
---|---|
AU (1) | AU2014285073B9 (fr) |
CA (1) | CA2917153C (fr) |
WO (1) | WO2015003143A2 (fr) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
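CN111209366A (zh) * | 2019-10-10 | 2020-05-29 | 天津大学 | Implicit discourse relation recognition method based on a TransS-driven mutual excitation neural network |
CN112257460A (zh) * | 2020-09-25 | 2021-01-22 | 昆明理工大学 | Pivot-based Chinese-Vietnamese joint training neural machine translation method |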
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7187545B2 (ja) * | 2017-09-28 | 2022-12-12 | オラクル・インターナショナル・コーポレイション | Determining cross-document rhetorical relationships based on parsing and identification of named entities |
US11809825B2 (en) | 2017-09-28 | 2023-11-07 | Oracle International Corporation | Management of a focused information sharing dialogue based on discourse trees |
EP3791292A1 (fr) | 2018-05-09 | 2021-03-17 | Oracle International Corporation | Constructing imaginary discourse trees to improve answering convergent questions |
US11580298B2 (en) | 2019-11-14 | 2023-02-14 | Oracle International Corporation | Detecting hypocrisy in text |
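CN113407713B (zh) * | 2020-10-22 | 2024-04-05 | 腾讯科技(深圳)有限公司 | Corpus mining method and apparatus based on active learning, and electronic device |
CN113535973B (zh) * | 2021-06-07 | 2023-06-23 | 中国科学院软件研究所 | Event relation extraction and discourse relation analysis method and apparatus based on knowledge mapping |
CN113377915B (zh) * | 2021-06-22 | 2022-07-19 | 厦门大学 | Dialogue discourse parsing method |
CN113553830B (zh) * | 2021-08-11 | 2023-01-03 | 桂林电子科技大学 | Graph-based discourse coherence analysis method for English text sentences |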
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5659766A (en) * | 1994-09-16 | 1997-08-19 | Xerox Corporation | Method and apparatus for inferring the topical content of a document based upon its lexical content without supervision |
AU2001261506A1 (en) * | 2000-05-11 | 2001-11-20 | University Of Southern California | Discourse parsing and summarization |
US7062561B1 (en) * | 2000-05-23 | 2006-06-13 | Richard Reisman | Method and apparatus for utilizing the social usage learned from multi-user feedback to improve resource identity signifier mapping |
US7127208B2 (en) * | 2002-01-23 | 2006-10-24 | Educational Testing Service | Automated annotation |
US7305336B2 (en) * | 2002-08-30 | 2007-12-04 | Fuji Xerox Co., Ltd. | System and method for summarization combining natural language generation with structural analysis |
-
2014
- 2014-07-03 CA CA2917153A patent/CA2917153C/fr active Active
- 2014-07-03 WO PCT/US2014/045432 patent/WO2015003143A2/fr active Application Filing
- 2014-07-03 AU AU2014285073A patent/AU2014285073B9/en active Active
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
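CN111209366A (zh) * | 2019-10-10 | 2020-05-29 | 天津大学 | Implicit discourse relation recognition method based on a TransS-driven mutual excitation neural network |
CN111209366B (zh) * | 2019-10-10 | 2023-04-21 | 天津大学 | Implicit discourse relation recognition method based on a TransS-driven mutual excitation neural network |
CN112257460A (zh) * | 2020-09-25 | 2021-01-22 | 昆明理工大学 | Pivot-based Chinese-Vietnamese joint training neural machine translation method |
CN112257460B (zh) * | 2020-09-25 | 2022-06-21 | 昆明理工大学 | Pivot-based Chinese-Vietnamese joint training neural machine translation method |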
Also Published As
Publication number | Publication date |
---|---|
CA2917153C (fr) | 2022-05-17 |
WO2015003143A3 (fr) | 2015-05-14 |
WO2015003143A2 (fr) | 2015-01-08 |
AU2014285073B2 (en) | 2016-11-03 |
AU2014285073B9 (en) | 2017-04-06 |
AU2014285073A1 (en) | 2016-02-04 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US9355372B2 (en) | Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus | |
CA2917153C (fr) | Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus | |
US9317498B2 (en) | Systems and methods for generating summaries of documents | |
Yi et al. | Sentiment mining in WebFountain | |
Sarawagi | Information extraction | |
Chen et al. | Towards robust unsupervised personal name disambiguation | |
Chali et al. | Query-focused multi-document summarization: Automatic data annotations and supervised learning approaches | |
Khan et al. | EnSWF: effective features extraction and selection in conjunction with ensemble learning methods for document sentiment classification | |
Zhang et al. | Enhancing keyphrase extraction from academic articles with their reference information | |
Laddha et al. | Aspect opinion expression and rating prediction via LDA–CRF hybrid | |
Fagan et al. | An introduction to textual econometrics | |
Zheng et al. | A review on authorship attribution in text mining | |
Sharma et al. | Diverse feature set based Keyphrase extraction and indexing techniques | |
Rajman et al. | From text to knowledge: Document processing and visualization: A text mining approach | |
You et al. | Joint learning-based heterogeneous graph attention network for timeline summarization | |
Zhou et al. | Semantic Smoothing of Document Models for Agglomerative Clustering. | |
Mason | An n-gram based approach to the automatic classification of web pages by genre | |
Tahmasebi | Models and algorithms for automatic detection of language evolution: towards finding and interpreting of content in long-term archives | |
Sizov | Extraction-based automatic summarization: Theoretical and empirical investigation of summarization techniques | |
Dalton | Entity-based enrichment for information extraction and retrieval | |
Brand et al. | N-gram representations for comment filtering | |
Ceylan | Investigating the extractive summarization of literary novels | |
Machova et al. | Selecting the Most Probable Author of Asocial Posting in Online Media | |
Uddin et al. | Short text classification using semantically enriched topic model | |
Gupta et al. | Machine learning-based authorship attribution using token n-grams and other time tested features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | EEER | Examination request | Effective date: 20190627 |