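JP7193000B2 - Similar document search method, similar document search program, similar document search device, index information creation method, index information creation program, and index information creation device - Google Patents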
- Publication number
- JP7193000B2 (application JP2021541969A)
- Authority
- JP
- Japan
- Prior art keywords
- document
- word
- search target
- words
- hash function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/38—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/383—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/93—Document management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Software Systems (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
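| PCT/JP2019/034306 WO2021038887A1 (ja) | 2019-08-30 | 2019-08-30 | Similar document search method, similar document search program, similar document search device, index information creation method, index information creation program, and index information creation device |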
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JPWO2021038887A1 | 2021-03-04 |
| JPWO2021038887A5 | 2022-01-25 |
| JP7193000B2 (ja) | 2022-12-20 |
Family
ID=74683387
Family Applications (1)
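| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021541969A (Active), JP7193000B2 (ja) | 2019-08-30 | 2019-08-30 | Similar document search method, similar document search program, similar document search device, index information creation method, index information creation program, and index information creation device |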
Country Status (2)
| Country | Link |
|---|---|
| JP (1) | JP7193000B2 |
| WO (1) | WO2021038887A1 |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12413413B2 (en) | 2021-05-13 | 2025-09-09 | Nec Corporation | Similarity degree derivation system and similarity degree derivation method |
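| JP2024063280A (ja) * | 2022-10-26 | 2024-05-13 | 株式会社LegalOn Technologies | Information processing method, information processing program, information processing system, |
| CN116302074B (zh) * | 2023-05-12 | 2023-07-28 | 卓望数码技术(深圳)有限公司 | Third-party component identification method, apparatus, device, and storage medium |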
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2010267108A (ja) | 2009-05-15 | 2010-11-25 | Nippon Telegr & Teleph Corp <Ntt> | Document signature generation device, document signature generation method, and document signature generation program for detecting similar documents |
| US20110087669A1 (en) | 2009-10-09 | 2011-04-14 | Stratify, Inc. | Composite locality sensitive hash based processing of documents |
| JP2015201042A (ja) | 2014-04-08 | 2015-11-12 | 日本電信電話株式会社 | Hash function generation method, hash value generation method, device, and program |
| CN106156154A (zh) | 2015-04-14 | 2016-11-23 | 阿里巴巴集团控股有限公司 | Similar text retrieval method and device thereof |
| CN107784110A (zh) | 2017-11-03 | 2018-03-09 | 北京锐安科技有限公司 | Index establishment method and device |
2019
- 2019-08-30 WO PCT/JP2019/034306 (WO2021038887A1): Ceased
- 2019-08-30 JP JP2021541969A (JP7193000B2): Active
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2021038887A1 | 2021-03-04 |
| WO2021038887A1 (ja) | 2021-03-04 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
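| JP5257071B2 (ja) | | Similarity calculation device and information retrieval device |
| CN116932730B (zh) | | Document question-answering method based on multi-way tree and large-scale language model, and related equipment |
| CN109241243B (zh) | | Candidate document ranking method and device |
| CN103425727B (zh) | | Contextual voice query expansion method and system |
| JP6019604B2 (ja) | | Speech recognition device, speech recognition method, and program |
| CN105393248A (zh) | | Non-factoid question answering system and method |
| JP7193000B2 (ja) | | Similar document search method, similar document search program, similar document search device, index information creation method, index information creation program, and index information creation device |
| JP7139728B2 (ja) | | Classification method, device, and program |
| CN110096697B (zh) | | Word vector matrix compression method and device, and method and device for obtaining word vectors |
| CN114021541A (zh) | | Presentation generation method, apparatus, device, and storage medium |
| CN113672804B (zh) | | Recommendation information generation method, system, computer device, and storage medium |
| CN111373386A (zh) | | Similarity index value calculation device, similarity search device, and similarity index value calculation program |
| CN114706966A (zh) | | Artificial intelligence-based voice interaction method, apparatus, device, and storage medium |
| CN116127066A (zh) | | Text clustering method, text clustering apparatus, electronic device, and storage medium |
| CN114742062B (zh) | | Text keyword extraction processing method and system |
| WO2014118978A1 (ja) | | Learning method, information processing device, and learning program |
| CN114722188A (zh) | | Advertisement generation method, apparatus, device, and storage medium based on operational data |
| WO2008062822A1 (en) | | Text mining device, text mining method and text mining program |
| CN114328860A (zh) | | Interactive consultation method and apparatus based on multi-model matching, and electronic device |
| JP4325370B2 (ja) | | Document-related vocabulary acquisition device and program |
| CN113302601B (zh) | | Semantic relation learning device, semantic relation learning method, and recording medium storing semantic relation learning program |
| JP6973733B2 (ja) | | Patent information processing device, patent information processing method, and program |
| JP2023021946A (ja) | | Data search method and system |
| CN115730589A (zh) | | Word vector-based news propagation path generation method and related apparatus |
| JP7032650B2 (ja) | | Similar text search method, similar text search device, and similar text search program |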
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523. Effective date: 2021-10-22 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621. Effective date: 2021-10-22 |
| | TRDD | Decision of grant or rejection written | |
| | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01. Effective date: 2022-11-08 |
| | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61. Effective date: 2022-11-21 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7193000. Country of ref document: JP. Free format text: JAPANESE INTERMEDIATE CODE: R150 |