WO2021142968A1 - Multilingual-oriented semantic similarity calculation method for general place names, and application thereof - Google Patents
Multilingual-oriented semantic similarity calculation method for general place names, and application thereof
- Publication number
- WO2021142968A1 (application PCT/CN2020/085814)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- place
- names
- similarity
- name
- category
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90344—Query processing by using string matching techniques
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Definitions
- the invention belongs to the field of geographic information science, and relates to a multilingual-oriented method for calculating the semantic similarity of general place names and its application in multilingual database place name queries.
- Geographical names are linguistic signs agreed upon by humans for geographic objects and geographic phenomena that have specific locations, ranges, and morphological characteristics of the geographic environment. Semantics is the meaning of the concepts represented by data (symbols) and the relationship between these meanings.
- With the development of computer technology and the popularization of the mobile Internet, different countries, institutions and enterprises have established various place-name information databases, most of which contain information such as place-name category, latitude and longitude. However, these databases differ considerably in coverage, data format, language and data content. Therefore, how to quickly and accurately calculate the similarity of place names across different place-name information databases has become an important topic in place-name research.
- The calculation methods of place-name similarity are mainly divided into three categories.
- The first category is based on the place-name string, that is, the similarity is calculated by comparing the character strings of place names.
- Smart et al. combined a rule model with a hidden Markov model, which can effectively resolve inconsistencies in place-name spelling, format, character set, etc.
- Zhan Binbin et al. used a generic-name dictionary and a structural rule database for geographical names to determine the type of a geographical name, and then obtained the best matching result through string similarity matching; this was well verified in the experimental area of Dezhou City. Ye Peng et al.
- The second category is based on geographic elements, that is, the similarity of place names is calculated using geometric information such as the spatial location, area, and shape of the named features.
- Egenhofer and Clementini proposed a standard for measuring the inconsistency of spatial geometric data structures and of topological relations across multiple representations, which can judge the consistency of spatial geometric data well; Van et al. used K-center-point (k-medoids) clustering and naive Bayesian classification to process geotagged photos for consistency in place names.
- The third category is based on the semantics of geographical names. For example, Chen Jiali noted that multiple representations of spatial data may be inconsistent in spatial relations, semantics, and geometry, so these inconsistencies must be evaluated and corrected; the approach introduces ontology into geographic information modeling and, combined with semantic consistency, achieves data matching with an object-matching-based method.
- The present invention provides a multilingual-oriented method for calculating the semantic similarity of general place names, which aims to solve the low accuracy and weak versatility of existing place-name similarity calculation methods.
- The multilingual-oriented method for calculating the semantic similarity of general place names includes the following steps:
- Calculating the similarity of place-name categories according to the place-name classification system and the place-name category similarity model includes:
- The category similarity model under the same subcategory is expressed as:
- S_c(i, j) represents the category similarity of place names i and j
- l represents the distance from the nearest common parent of the categories of place names i and j to the root node
- d_i represents the distance from the nearest common parent of the categories of place names i and j to the category of i
- d_j represents the distance from the nearest common parent of the categories of place names i and j to the category of j
- ⁇(i,j) represents the sum of the distances from the nearest common parent to the categories of i and j
- The category similarity model under different subcategories is expressed as:
- S_c(i, j) represents the category similarity of place names i and j
- d'_i represents the distance from the nearest common parent category of the categories of i and j to the category of i
- d'_j represents the distance from the nearest common parent category of the categories of i and j to the category of j
- ⁇'(i,j) represents the sum of the distances from the nearest common parent category to the categories of i and j.
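The tree quantities that both category models rely on (the nearest common parent, its distance l to the root, and the distances d_i and d_j down to each category) can be computed from a parent-pointer table. The sketch below is only an illustration under stated assumptions: the toy `PARENT` tree and the function names are hypothetical, a category is treated as its own ancestor, and the similarity formula itself is not reproduced because it does not appear in this text.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy classification tree: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "hydrography": "root",
    "settlement": "root",
    "river": "hydrography",
    "lake": "hydrography",
    "village": "settlement",
}


def path_to_root(category: str) -> List[str]:
    """Return the chain category -> parent -> ... -> root."""
    path = [category]
    parent = PARENT[category]
    while parent is not None:
        path.append(parent)
        parent = PARENT[parent]
    return path


def tree_quantities(cat_i: str, cat_j: str) -> Tuple[str, int, int, int]:
    """Return (nearest common parent, l, d_i, d_j), all counted in edges."""
    path_i = path_to_root(cat_i)
    # Map every ancestor of cat_j to its edge distance from cat_j.
    dist_from_j = {c: k for k, c in enumerate(path_to_root(cat_j))}
    for d_i, ancestor in enumerate(path_i):
        if ancestor in dist_from_j:
            d_j = dist_from_j[ancestor]
            l = len(path_to_root(ancestor)) - 1  # edges from ancestor to root
            return ancestor, l, d_i, d_j
    raise ValueError("categories do not belong to the same tree")
```

The sum d_i + d_j used in the definitions above follows directly from the returned tuple.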
- The similarity model of place-name strings is expressed as:
- A(i,j) represents the string similarity of place names i and j
- d[i,j] represents the edit distance between place names i and j
- ML represents the maximum string length of place names i and j
- Len represents the minimum matching length
- L(i) represents the string length of place name i
- L(j) represents the string length of place name j
- a and b represent the weights.
- the spatial proximity model of place names is used to calculate the spatial proximity.
- the spatial proximity model of place names is expressed as:
- S_E(i, j) represents the spatial proximity of place names i and j
- lon_i, lon_j, lat_i and lat_j are the longitudes and latitudes of place names i and j, respectively.
- the calculation model for the semantic similarity of geographical names is:
- F(i,j) represents the semantic similarity of place names i and j.
- the application of the geographical name semantic similarity calculation method in multilingual geographical name data query mainly includes the following steps:
- For phonographic scripts, letter similarity is taken as the benchmark and combined with language features such as the total number of letters, the number of letter radicals, the total number of words and the code of the first letter of each word; the phonographic place-name index is constructed using an index organization based on multi-dimensional feature statistical vectors. For ideographic scripts, the local similarity of characters is taken as the benchmark and combined with language features such as shared characters, number of characters and character positions in place names; the ideographic place-name index is constructed using an organization based on single-character place-name indexes;
- According to the determined place-name string, the place-name string similarity model is used for calculation; when the result is higher than the set threshold the filter condition is met, otherwise the place name is filtered out; if the string is empty, the filter condition is met directly. According to the determined place-name category, the category similarity model is used for calculation; when the result is higher than the set threshold the filter condition is met, otherwise the place name is filtered out; if the category is empty, the filter condition is met directly. According to the determined latitude and longitude, the place-name spatial proximity model is used for calculation; when the result is higher than the set threshold the filter condition is met, otherwise the place name is filtered out; if the latitude and longitude are empty, the filter condition is met directly;
- The place name to be queried and all candidate place names are evaluated according to the multilingual-oriented general place-name semantic similarity calculation method.
- The present invention constructs a place-name category similarity model, a place-name string similarity model and a place-name spatial proximity model according to the word-formation characteristics, categories and location characteristics of place names, and proposes a general method for calculating the semantic similarity of geographical names based on these three models.
- The beneficial effects of the present invention are as follows. The edit distance algorithm is improved so that the influence of the generic name and the proper name can be taken into account at the same time. The characteristics of place-name categories are introduced, and a place-name category similarity model is constructed based on the place-name classification system.
- The spatial characteristics of geographical names are considered to construct a spatial proximity model; finally, a method for calculating the semantic similarity of geographical names is proposed by comprehensively considering the string, location and category characteristics of geographical names. It therefore has higher accuracy and wider applicability than place-name similarity calculations based on a single feature.
- Fig. 1 is a flowchart of a method according to an embodiment of the present invention.
- Figure 2 is a schematic diagram of the structure of place name categories in an embodiment of the present invention.
- the method for calculating the semantic similarity of universal place names for multilingualism disclosed in the embodiment of the present invention mainly includes the following steps:
- Step 1 Identify the languages of place names i and j according to the coding interval of place names, and normalize place names i and j into romanized place names based on document information.
- place names need to be preprocessed to find the corresponding place name category and other information in the place name information database.
- the geographical name coding interval refers to the different coding intervals corresponding to each language, that is, the Unicode hexadecimal coding interval of each language is unique, so the geographical name language can be determined according to the geographical name coding interval.
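A minimal sketch of coding-interval detection is given below. It is an illustration, not the filing's implementation: the `SCRIPT_RANGES` table is a small hand-picked subset of Unicode blocks and `detect_script` is a hypothetical helper. Note that a code-point range identifies a script rather than a single language, so mapping the returned label to a language (for example, deciding between languages that share Han characters) would need further information.

```python
from collections import Counter

# Illustrative subset of Unicode blocks as inclusive code-point ranges.
# The selection and the labels are assumptions made for this sketch.
SCRIPT_RANGES = [
    ("latin", 0x0041, 0x024F),
    ("cyrillic", 0x0400, 0x04FF),
    ("arabic", 0x0600, 0x06FF),
    ("kana", 0x3040, 0x30FF),
    ("cjk", 0x4E00, 0x9FFF),
    ("hangul", 0xAC00, 0xD7AF),
]


def detect_script(name: str) -> str:
    """Return the label of the script that most letters of `name` fall into."""
    counts = Counter()
    for ch in name:
        if not ch.isalpha():  # skip digits, spaces and punctuation
            continue
        code_point = ord(ch)
        for label, low, high in SCRIPT_RANGES:
            if low <= code_point <= high:
                counts[label] += 1
                break
    if not counts:
        return "unknown"
    return counts.most_common(1)[0][0]
```

Counting letters per range and taking the majority keeps the result stable when a name mixes in a few characters from another script.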
- Romanized place names refer to the Roman place names corresponding to the place names contained in the latest official gazetteers, place-name dictionaries, and local chronicles of each country.
- Step 2 Obtain the categories of place names i and j from the place name information database, and calculate the category similarity of place names i and j according to the place name category similarity model.
- place name category similarity refers to the degree of correlation between the categories of two place name data in the same classification system.
- category of place-name data refers to the classification of data according to thematic elements.
- the classification system can use a hierarchical tree structure to describe the logical relationship between classes. Place name categories are based on the place name classification system, and the classification comparison table is shown in Table 1.
- the GNIS data source directly provides the full name of the category. You can refer to the above classification standards to summarize the geographical name element categories included in each category, and design the GNIS category and standard classification mapping table, as shown in Table 2. Through the mapping relationship in the table, add the GNIS feature category code attribute. Table 3 is a part of the geographical name category code table.
- ⁇' represents the degree of correlation between the subcategories of the categories of i and j; its value lies in [0,1] and can be given by domain experts according to the practical application.
- d'_i represents the distance (the number of edges) from the nearest common parent category of the categories of i and j to the category of i;
- d'_j represents the distance (the number of edges) from the nearest common parent category of the categories of i and j to the category of j;
- ⁇'(i,j) represents the sum of the distances from the nearest common parent category to the categories of i and j.
- Step 3 Calculate the name similarity of the romanized place names i and j according to the place-name string similarity model.
- Edit distance, also known as Levenshtein distance, is a distance measurement function used to measure the similarity of two sequences. In natural language processing, it is the minimum number of insertion, deletion, and replacement operations required to convert the original string into the target string.
- The distance d[i,j] is the minimum number of operations needed to edit the string S_j into the string T_j; d[i,j] thus indicates the edit distance between place names i and j, which effectively reflects the character-level similarity between place names.
- the formula is as follows:
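The textbook dynamic-programming form of this distance can be sketched in Python as follows. It is the standard Levenshtein recurrence, offered as an illustration rather than a transcription of the notation used in the filing; the function name `edit_distance` is chosen for this sketch.

```python
def edit_distance(source: str, target: str) -> int:
    """Minimum number of insertions, deletions and replacements
    needed to turn `source` into `target` (Levenshtein distance)."""
    # prev[b] holds the distance between the processed prefix of `source`
    # and the first b characters of `target`.
    prev = list(range(len(target) + 1))
    for a, ch_s in enumerate(source, start=1):
        curr = [a]  # turning a characters into the empty string costs a deletions
        for b, ch_t in enumerate(target, start=1):
            cost = 0 if ch_s == ch_t else 1
            curr.append(min(
                prev[b] + 1,         # delete ch_s
                curr[b - 1] + 1,     # insert ch_t
                prev[b - 1] + cost,  # replace ch_s with ch_t (or keep it)
            ))
        prev = curr
    return prev[-1]
```

Keeping only two rows of the table reduces memory to the length of the target string, which is convenient when many candidate names are compared.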
- Edit distance is commonly used to calculate the similarity of place-name strings. However, the algorithm cannot effectively reduce the impact of generic names, so it has been improved here.
- The improved model is as follows:
- d[i,j] represents the edit distance between place names i and j
- ML represents the maximum string length of place names i and j
- Len represents the minimum matching length (Len ⁇ 1)
- L(i) represents the length of the string of i
- L(j) represents the length of the string of j
- a and b represent the weights, which are 0.6 and 0.4, respectively.
- Table 4 shows the comparison between the improved model and the existing model name similarity calculation results.
- Step 4 Obtain the latitude and longitude of place names i and j from the place name information database, and calculate the spatial proximity of place names according to the place name spatial proximity model.
- A place name can be a point element (such as the name of a small village), a line element (such as the name of a highway), or a polygon element (such as the name of an administrative district). Therefore, the geometric similarity of place-name data includes the position similarity of point elements, the similarity of line elements and the geometric similarity of area elements. The global place-name data studied in the present invention are all place names of point elements.
- the measurement of the place names of point elements usually adopts the method of calculating the distance.
- the basic idea is to extract a set of feature vectors from the place names of two point elements, and calculate the distance of these two sets of vectors in a certain distance space. The smaller the distance, the more similar the two place names; conversely, the larger the distance, the greater the difference between the two place names. Euclidean distance is often used to represent the distance between two points.
- Euclidean Distance is the ordinary straight-line distance between two points in Euclidean space, which measures the absolute distance between points in a multidimensional space. Among them, if the Euclidean distance between place names is larger, the similarity of the place names described is lower.
- Let i and j denote two place names whose longitudes and latitudes are lon_i, lon_j, lat_i and lat_j.
- The Euclidean spatial distance between the two place names is denoted dis_ij.
- the spatial distance similarity model designed by the present invention for the spatial characteristics of place name data is as follows.
- S E (i, j) represents the degree of similarity in the spatial range of two place names. If the two are the same, the value is 1; if the distance between the two is farther, the consistency of the spatial range is closer to 0.
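The behaviour described here (a value of 1 for identical positions, approaching 0 as the distance grows) can be illustrated with a short sketch. The Euclidean distance on longitude and latitude follows the text; the mapping from distance to proximity, however, is an assumption: the reciprocal form `1 / (1 + dis / scale)` and the `scale` parameter are stand-ins chosen only because they have the stated limits, and they should not be read as the model of the filing.

```python
import math


def euclidean_distance(lon_i: float, lat_i: float,
                       lon_j: float, lat_j: float) -> float:
    """Straight-line distance dis_ij in coordinate (degree) space."""
    return math.hypot(lon_i - lon_j, lat_i - lat_j)


def spatial_proximity(lon_i: float, lat_i: float,
                      lon_j: float, lat_j: float,
                      scale: float = 1.0) -> float:
    """ASSUMED mapping of distance to [0, 1]: 1 when the points coincide,
    decreasing toward 0 as the distance grows. `scale` is hypothetical."""
    dis_ij = euclidean_distance(lon_i, lat_i, lon_j, lat_j)
    return 1.0 / (1.0 + dis_ij / scale)
```

Any monotonically decreasing function with the same end points would serve the illustration equally well; the choice affects only how quickly the proximity falls off.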
- Step 5 Calculate the semantic similarity of geographical names according to the semantic similarity model of geographical names.
- F(i,j) represents the semantic similarity of geographical names.
- The three variables A(i,j), S_E(i,j) and S_c(i,j) respectively represent the place-name string similarity, the place-name spatial proximity and the place-name category similarity, each normalized to the range [0,1].
- the experimental results show that the universal semantic similarity calculation method for multilingual geographical names not only maintains an accuracy rate of 98% or more, but also achieves more than 97% of actual geographical name data matching.
- Step 1 Extract the character string, category, latitude and longitude attributes of all place names from the place name information database, determine the language of each place name from its language coding interval, and normalize the place names. According to the characteristics of the place-name language, indexing is divided into a phonographic method and an ideographic method. For phonographic scripts, letter similarity is taken as the benchmark and combined with language features such as the total number of letters, the number of letter radicals, the total number of words and the code of the first letter of each word; the phonographic place-name index is constructed using an index organization based on multi-dimensional feature statistical vectors. For ideographic scripts, the local similarity of characters is taken as the benchmark and combined with language features such as shared characters, number of characters and character positions in place names; the ideographic place-name index is constructed using an organization based on single-character place-name indexes.
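The two index styles in Step 1 can be sketched as follows. This is a simplified illustration under stated assumptions: "number of letter radicals" is interpreted here as the number of distinct letters, the feature tuple layout is hypothetical, and the ideographic index is reduced to a plain inverted index from single characters to place-name identifiers.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def phonographic_features(name: str) -> Tuple[int, int, int, Tuple[int, ...]]:
    """Feature statistics for a phonographic name: total letters,
    distinct letters (an ASSUMED reading of 'letter radicals'),
    word count, and the code point of each word's first character."""
    words = name.split()
    letters = [ch.lower() for ch in name if ch.isalpha()]
    first_codes = tuple(ord(word[0].lower()) for word in words)
    return len(letters), len(set(letters)), len(words), first_codes


def build_ideographic_index(names: List[str]) -> Dict[str, Set[int]]:
    """Inverted index: single character -> ids of names containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for name_id, name in enumerate(names):
        for ch in name:
            if not ch.isspace():
                index[ch].add(name_id)
    return index


def ideographic_candidates(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Ids of all names sharing at least one character with the query."""
    hits: Set[int] = set()
    for ch in query:
        hits |= index.get(ch, set())
    return hits
```

Names whose feature tuples differ strongly, or which share no character with the query, can be skipped before any similarity model is evaluated.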
- Step 2 Determine all or part of the attributes such as the character string, category, latitude and longitude of the place name to be queried, and perform normalization processing.
- Step 3 According to the attributes determined for the place name to be queried (character string, category, latitude and longitude), filter all entries in the index successively. For the determined place-name string, the place-name string similarity model is used for calculation; when the result is higher than the set threshold the filter condition is met, otherwise the place name is filtered out; if the string is empty, the filter condition is met directly. For the determined place-name category, the category similarity model is used for calculation; when the result is higher than the set threshold the filter condition is met, otherwise the place name is filtered out; if the category is empty, the filter condition is met directly. For the determined longitude and latitude, the place-name spatial proximity model is used for calculation; when the result is higher than the set threshold the filter condition is met, otherwise the place name is filtered out; if the latitude and longitude are empty, the filter condition is met directly.
- Step 4 The place name to be queried and all candidate place names are calculated in turn using a multilingual universal place name semantic similarity calculation method.
- Step 5 Sort the calculation results in descending order; the higher a candidate ranks, the more similar it is to the place name to be queried.
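Steps 3 to 5 amount to a filter-then-rank loop, sketched below. The function `query_place_names`, the record layout (plain dictionaries) and the parameter names are hypothetical; the per-attribute similarity functions and the final `combine` function are supplied by the caller and merely stand in for the models described above. The sketch assumes that candidates carry every attribute and that only the query may have empty (None) attributes.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]


def query_place_names(
    query: Record,
    candidates: List[Record],
    sims: Dict[str, Callable[[object, object], float]],
    thresholds: Dict[str, float],
    combine: Callable[[Record, Record], float],
) -> List[Tuple[float, Record]]:
    """Filter candidates attribute by attribute, then rank the survivors."""
    survivors = []
    for cand in candidates:
        keep = True
        for attr, sim in sims.items():
            if query.get(attr) is None:
                continue  # empty query attribute: filter condition met directly
            if sim(query[attr], cand[attr]) <= thresholds[attr]:
                keep = False  # not higher than the threshold: filtered out
                break
        if keep:
            survivors.append(cand)
    # Score the remaining candidates and sort from most to least similar.
    scored = [(combine(query, cand), cand) for cand in survivors]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Applying the cheap per-attribute filters first means the combined score is computed only for candidates that already pass every threshold.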
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Remote Sensing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010058317.6 | 2020-01-19 | ||
CN202010058317.6A CN111325235B (zh) | 2020-01-19 | 2020-01-19 | 面向多语种的通用地名语义相似度计算方法及其应用 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021142968A1 true WO2021142968A1 (fr) | 2021-07-22 |
Family
ID=71170946
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/085814 WO2021142968A1 (fr) | 2020-01-19 | 2020-04-21 | Procédé de calcul de similarité sémantique à orientation multilingue pour des noms de lieu généraux, et application associée |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN111325235B (fr) |
AU (1) | AU2020101024A4 (fr) |
WO (1) | WO2021142968A1 (fr) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113076734B (zh) * | 2021-04-15 | 2023-01-20 | 云南电网有限责任公司电力科学研究院 | 一种项目文本的相似度检测方法及装置 |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140229501A1 (en) * | 2011-10-20 | 2014-08-14 | Deutsche Post Ag | Comparing positional information |
CN107239442A (zh) * | 2017-05-09 | 2017-10-10 | 北京京东金融科技控股有限公司 | 一种计算地址相似度的方法和装置 |
CN107861947A (zh) * | 2017-11-07 | 2018-03-30 | 昆明理工大学 | 一种基于跨语言资源的柬语命名实体识别的方法 |
CN108171529A (zh) * | 2017-12-04 | 2018-06-15 | 昆明理工大学 | 一种地址相似度评估方法 |
CN108572960A (zh) * | 2017-03-08 | 2018-09-25 | 富士通株式会社 | 地名消岐方法和地名消岐装置 |
CN108804398A (zh) * | 2017-05-03 | 2018-11-13 | 阿里巴巴集团控股有限公司 | 地址文本的相似度计算方法及装置 |
CN110276021A (zh) * | 2019-04-29 | 2019-09-24 | 小轮(上海)网络科技有限公司 | 基于语义相似度的地名匹配方法及装置 |
CN110598791A (zh) * | 2019-09-12 | 2019-12-20 | 深圳前海微众银行股份有限公司 | 地址相似度评价方法、装置、设备及介质 |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2158540A4 (fr) * | 2007-06-18 | 2010-10-20 | Geographic Services Inc | Système de recherche de nom de caractéristiques géographiques |
CN103605752A (zh) * | 2013-11-21 | 2014-02-26 | 武大吉奥信息技术有限公司 | 一种基于语义识别的地址匹配方法 |
-
2020
- 2020-01-19 CN CN202010058317.6A patent/CN111325235B/zh active Active
- 2020-04-21 WO PCT/CN2020/085814 patent/WO2021142968A1/fr active Application Filing
- 2020-04-21 AU AU2020101024A patent/AU2020101024A4/en not_active Ceased
Also Published As
Publication number | Publication date |
---|---|
AU2020101024A4 (en) | 2020-07-23 |
CN111325235B (zh) | 2023-04-25 |
CN111325235A (zh) | 2020-06-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20913178 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20913178 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.01.2023) |