CN111144112B - Text similarity analysis method, device and storage medium - Google Patents
Text similarity analysis method, device and storage medium
- Publication number
- CN111144112B (application CN201911394188.1A)
- Authority
- CN
- China
- Prior art keywords
- text
- keyword set
- sentences
- topic
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911394188.1A (CN111144112B, zh) | 2019-12-30 | 2019-12-30 | Text similarity analysis method, device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911394188.1A (CN111144112B, zh) | 2019-12-30 | 2019-12-30 | Text similarity analysis method, device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111144112A (zh) | 2020-05-12 |
CN111144112B (zh) | 2023-07-14 |
Family
ID=70521761
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911394188.1A (CN111144112B, zh; Active) | Text similarity analysis method, device and storage medium | 2019-12-30 | 2019-12-30 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111144112B (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
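CN111831804B (zh) * | 2020-06-29 | 2024-04-26 | 深圳价值在线信息科技股份有限公司 | Key phrase extraction method, device, terminal equipment and storage medium |
CN112712866A (zh) * | 2020-12-25 | 2021-04-27 | 医渡云(北京)技术有限公司 | Method and device for determining similarity of text information |
CN113011153B (zh) * | 2021-03-15 | 2022-03-29 | 平安科技(深圳)有限公司 | Text relevance detection method, device, equipment and storage medium |
CN113051903A (zh) * | 2021-04-21 | 2021-06-29 | 哈尔滨工业大学 | Consistency comparison method for sentences, case facts, sentencing circumstances and judicial documents |
CN113392184A (zh) * | 2021-06-09 | 2021-09-14 | 平安科技(深圳)有限公司 | Method, device, terminal equipment and storage medium for determining similar texts |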
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
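CN109446332A (zh) * | 2018-12-25 | 2019-03-08 | 银江股份有限公司 | People's mediation case classification system and method based on feature transfer and adaptive learning |
WO2019149200A1 (zh) * | 2018-02-01 | 2019-08-08 | 腾讯科技(深圳)有限公司 | Text classification method, computer device and storage medium |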
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106484664B (zh) * | 2016-10-21 | 2019-03-01 | 竹间智能科技(上海)有限公司 | Method for calculating similarity between short texts |
CA3055379C (en) * | 2017-03-10 | 2023-02-21 | Eduworks Corporation | Automated tool for question generation |
US10891943B2 (en) * | 2018-01-18 | 2021-01-12 | Citrix Systems, Inc. | Intelligent short text information retrieve based on deep learning |
CN108595425A (zh) * | 2018-04-20 | 2018-09-28 | 昆明理工大学 | Keyword extraction method for dialogue corpora based on topic and semantics |
CN108804641B (zh) * | 2018-06-05 | 2021-11-09 | 鼎易创展咨询(北京)有限公司 | Text similarity calculation method, device, equipment and storage medium |
CN109918660B (zh) * | 2019-03-04 | 2021-03-02 | 北京邮电大学 | TextRank-based keyword extraction method and device |
- 2019-12-30: CN application CN201911394188.1A filed (patent CN111144112B, zh); status: Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
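WO2019149200A1 (zh) * | 2018-02-01 | 2019-08-08 | 腾讯科技(深圳)有限公司 | Text classification method, computer device and storage medium |
CN109446332A (zh) * | 2018-12-25 | 2019-03-08 | 银江股份有限公司 | People's mediation case classification system and method based on feature transfer and adaptive learning |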
Also Published As
Publication number | Publication date |
---|---|
CN111144112A (zh) | 2020-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
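CN111144112B (zh) | | Text similarity analysis method, device and storage medium |
CN107436922B (zh) | | Text label generation method and device |
CN107329949B (zh) | | Semantic matching method and system |
CN106528845B (zh) | | Artificial-intelligence-based retrieval error correction method and device |
Däubler et al. | | Natural sentences as valid units for coded political texts |
CN111104488B (zh) | | Method, device and storage medium for integrated retrieval and similarity analysis |
CN108776901B (zh) | | Search-term-based advertisement recommendation method and system |
CN107436864A (zh) | | Word2Vec-based method for calculating semantic similarity of Chinese questions and answers |
US10824816B2 (en) | | Semantic parsing method and apparatus |
CN110297893B (zh) | | Natural language question answering method, device, computer device and storage medium |
CN108846138B (zh) | | Method, device and medium for constructing a question classification model that incorporates answer information |
CN112329824A (zh) | | Multi-model fusion training method, text classification method and device |
CN110210022B (zh) | | Title recognition method and device |
CN112732910B (zh) | | Cross-task text emotional state assessment method, system, device and medium |
CN110222654A (zh) | | Text segmentation method, device, equipment and storage medium |
CN109472022A (zh) | | Machine-learning-based new word recognition method and terminal equipment |
CN112613315B (zh) | | Method, device, equipment and storage medium for automatic extraction of text knowledge |
Braz et al. | | Document classification using a Bi-LSTM to unclog Brazil's supreme court |
US20140289260A1 (en) | | Keyword Determination |
Castañeda‐Jiménez et al. | | Exploring lexical diversity in second language Spanish |
JP2014219872A (ja) | | Utterance selection device, method, and program, and dialogue device and method |
CN109657043B (zh) | | Method, device, equipment and storage medium for automatically generating articles |
CN113240322B (zh) | | Climate risk disclosure quality method, device, electronic equipment and storage medium |
CN113177061B (zh) | | Search method, device and electronic equipment |
Johansson Falck et al. | | Procedure for identifying metaphorical scenes (PIMS): The case of spatial and abstract relations |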
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CP03 | Change of name, title or address | Address after: 510663 research institute office building, No.9, Kelin Road, Science City, Guangzhou high tech Industrial Development Zone, Guangzhou City, Guangdong Province; Patentee after: GRG BANKING IT Co.,Ltd.; Country or region after: China; Patentee after: Guangdian Yuntong Group Co.,Ltd.; Address before: 510663 research institute office building, No.9, Kelin Road, Science City, Guangzhou high tech Industrial Development Zone, Guangzhou City, Guangdong Province; Patentee before: GRG BANKING IT Co.,Ltd.; Country or region before: China; Patentee before: GRG BANKING EQUIPMENT Co.,Ltd. |
| | TR01 | Transfer of patent right | Effective date of registration: 2024-06-25; Address after: Room 701, No. 11, Kelin Road, Science City, Huangpu District, Guangzhou City, Guangdong Province, 510663; Patentee after: GRG BANKING IT Co.,Ltd.; Country or region after: China; Address before: 510663 research institute office building, No.9, Kelin Road, Science City, Guangzhou high tech Industrial Development Zone, Guangzhou City, Guangdong Province; Patentee before: GRG BANKING IT Co.,Ltd.; Country or region before: China; Patentee before: Guangdian Yuntong Group Co.,Ltd. |