CN113196277A - System for searching natural language documents - Google Patents
System for searching natural language documents
- Publication number
- CN113196277A (application number CN201980082810.7A)
- Authority
- CN
- China
- Prior art keywords
- graph
- natural language
- graphics
- node
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/418—Document matching, e.g. of document images
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/414—Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Multimedia (AREA)
- Geometry (AREA)
- Medical Informatics (AREA)
- Computer Graphics (AREA)
- Fuzzy Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
- Devices For Executing Special Programs (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| FI20185863 | 2018-10-13 | | |
| FI20185863A FI20185863A1 (fi) | 2018-10-13 | 2018-10-13 | System for searching natural language documents |
| PCT/FI2019/050731 WO2020074786A1 (en) | 2018-10-13 | 2019-10-13 | System for searching natural language documents |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN113196277A (zh) | 2021-07-30 |
Family
ID=68583451
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201980082810.7A (Pending), published as CN113196277A (zh) | 2018-10-13 | 2019-10-13 | System for searching natural language documents |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US20210350125A1 |
| EP (1) | EP3864564A1 |
| JP (1) | JP7801892B2 |
| CN (1) | CN113196277A |
| FI (1) | FI20185863A1 |
| WO (1) | WO2020074786A1 |
Families Citing this family (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7172612B2 (ja) * | 2019-01-11 | 2022-11-16 | Fujitsu Limited | Data augmentation program, data augmentation method, and data augmentation device |
| US20200372019A1 (en) * | 2019-05-21 | 2020-11-26 | Sisense Ltd. | System and method for automatic completion of queries using natural language processing and an organizational memory |
| US12430335B2 (en) | 2019-05-21 | 2025-09-30 | Sisense Ltd. | System and method for improved cache utilization using an organizational memory to generate a dashboard |
| KR20210046178A (ko) * | 2019-10-18 | 2021-04-28 | Samsung Electronics Co., Ltd. | Electronic device and control method thereof |
| US11403488B2 (en) * | 2020-03-19 | 2022-08-02 | Hong Kong Applied Science and Technology Research Institute Company Limited | Apparatus and method for recognizing image-based content presented in a structured layout |
| US11990214B2 (en) * | 2020-07-21 | 2024-05-21 | International Business Machines Corporation | Handling form data errors arising from natural language processing |
| US11605187B1 (en) * | 2020-08-18 | 2023-03-14 | Corel Corporation | Drawing function identification in graphics applications |
| US12541976B2 (en) * | 2021-12-07 | 2026-02-03 | Insight Direct Usa, Inc. | Relationship modeling and anomaly detection based on video data |
| EP4542463A4 (en) | 2022-06-15 | 2025-07-23 | Fujitsu Ltd | LEARNING PROGRAM, LEARNING METHOD AND INFORMATION PROCESSING DEVICE |
| US20230419045A1 (en) * | 2022-06-24 | 2023-12-28 | International Business Machines Corporation | Generating goal-oriented dialogues from documents |
| US12086557B1 (en) | 2023-10-06 | 2024-09-10 | Armada Systems, Inc. | Natural language statistical model with alerts |
| US12067041B1 (en) * | 2023-10-06 | 2024-08-20 | Armada Systems, Inc. | Time series data to statistical natural language interaction |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5963940A (en) * | 1995-08-16 | 1999-10-05 | Syracuse University | Natural language information retrieval system and method |
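| CN1265209A (zh) * | 1997-07-22 | 2000-08-30 | Microsoft Corporation | System for processing text input using natural language processing techniques |
| CN101685455A (zh) * | 2008-09-28 | 2010-03-31 | Huawei Technologies Co., Ltd. | Method and system for data retrieval |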
| US20150142704A1 (en) * | 2013-11-20 | 2015-05-21 | Justin London | Adaptive Virtual Intelligent Agent |
| CN105900081A (zh) * | 2013-02-19 | 2016-08-24 | Google Inc. | Search based on natural language processing |
| US9830315B1 (en) * | 2016-07-13 | 2017-11-28 | Xerox Corporation | Sequence-based structured prediction for semantic parsing |
| CN107844608A (zh) * | 2017-12-06 | 2018-03-27 | Hunan University | Sentence similarity comparison method based on word vectors |
| US20180189269A1 (en) * | 2016-12-30 | 2018-07-05 | Microsoft Technology Licensing, Llc | Graph long short term memory for syntactic relationship discovery |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2003223466A (ja) | 2002-01-31 | 2003-08-08 | Seiko Epson Corp | Patent search device, control method for patent search device, control program, and recording medium |
| US10810193B1 (en) * | 2013-03-13 | 2020-10-20 | Google Llc | Querying a data graph using natural language queries |
| JP2016110256A (ja) | 2014-12-03 | 2016-06-20 | Fuji Xerox Co., Ltd. | Information processing device and information processing program |
| US10095689B2 (en) * | 2014-12-29 | 2018-10-09 | International Business Machines Corporation | Automated ontology building |
| US20170075877A1 (en) * | 2015-09-16 | 2017-03-16 | Marie-Therese LEPELTIER | Methods and systems of handling patent claims |
| US10078634B2 (en) * | 2015-12-30 | 2018-09-18 | International Business Machines Corporation | Visualizing and exploring natural-language text |
| US10891321B2 (en) * | 2018-08-28 | 2021-01-12 | American Chemical Society | Systems and methods for performing a computer-implemented prior art search |
2018
- 2018-10-13: FI, application FI20185863A, publication FI20185863A1 (fi), status unknown
2019
- 2019-10-13: WO, application PCT/FI2019/050731, publication WO2020074786A1 (en), not active (Ceased)
- 2019-10-13: EP, application EP19805356.3A, publication EP3864564A1 (en), not active (Ceased)
- 2019-10-13: US, application US17/284,796, publication US20210350125A1 (en), active (Pending)
- 2019-10-13: CN, application CN201980082810.7A, publication CN113196277A (zh), active (Pending)
- 2019-10-13: JP, application JP2021545331A, publication JP7801892B2 (ja), active (Active)
Non-Patent Citations (1)
| Title |
|---|
| Kai Sheng Tai et al.: "Improved semantic representations from tree-structured long short-term memory networks", arXiv, vol. 1, 30 May 2015 (2015-05-30), pages 1556-1566, XP055442054, DOI: 10.3115/v1/P15-1150 * |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2020074786A1 (en) | 2020-04-16 |
| EP3864564A1 (en) | 2021-08-18 |
| JP7801892B2 (ja) | 2026-01-19 |
| JP2022508737A (ja) | 2022-01-19 |
| FI20185863A1 (fi) | 2020-04-14 |
| US20210350125A1 (en) | 2021-11-11 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20240370649A1 (en) | | Method of training a natural language search system, search system and corresponding use |
| CN113196277A (zh) | | System for searching natural language documents |
| CN113168499A (zh) | | Method for searching patent documents |
| Al-Hroob et al. | | The use of artificial neural networks for extracting actions and actors from requirements document |
| Zubrinic et al. | | The automatic creation of concept maps from documents written using morphologically rich languages |
| US20230138014A1 (en) | | System and method for performing a search in a vector space based search engine |
| US12124802B2 (en) | | System and method for analyzing similarity of natural language data |
| CN114840685A (zh) | | Method for constructing an emergency plan knowledge graph |
| CN118551046A (zh) | | Method for enhancing a document processing workflow based on a large language model |
| CN119396997B (zh) | | Method and system for real-time data analysis and visualization in a big data environment |
| CN116108191B (zh) | | Deep learning model recommendation method based on a knowledge graph |
| CN111831624A (zh) | | Data table creation method and apparatus, computer device, and storage medium |
| Sun | | A natural language interface for querying graph databases |
| CN117251567B (zh) | | Multi-domain knowledge extraction method |
| CN117829140A (zh) | | Automatic comparison method and system for rules and regulations |
| Frasconi et al. | | Text categorization for multi-page documents: A hybrid naive Bayes HMM approach |
| CN113392183A (zh) | | Representation and computation method for children's category graph knowledge |
| Smrz et al. | | Information extraction in semantic wikis |
| Li et al. | | Single Document Viewpoint Summarization based on Triangle Identification in Dependency Graph |
| Jiang et al. | | Effective use of phrases in language modeling to improve information retrieval |
| CN119886139A (zh) | | Multi-level, broad-category named entity recognition method for the CO2 catalysis field |
| Jakubowski et al. | | Extending FrameNet to Machine Learning Domain. |
| CN118838997A (zh) | | Intelligent question answering method and system for an intelligent recruitment platform |
| CN120821821A (zh) | | Input text processing method, apparatus, device, storage medium, and program product |
| CN119272869A (zh) | | Knowledge extraction method and apparatus for the cigarette cut-tobacco processing field |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |