CN106611038A - Ontology concept-based lexical semantic similarity solving method - Google Patents
- Publication number
- CN106611038A (application CN201610833103.5A)
- Authority
- CN
- China
- Prior art keywords
- similarity
- ontological
- concept
- depth
- compared
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a method for computing lexical semantic similarity based on ontology concepts. The words to be compared are input into a statistical method module and mapped to ontology concepts; for each word, the ontology concept with the greatest depth is selected from the ontology concept module; the distance between the two selected concepts and the depth of their nearest common ancestor are calculated; finally, the similarity between the two words is calculated. The quantified similarity obtained in this way is closer to expert empirical values. Because the method takes into account the distance between the deepest ontology concepts corresponding to the words to be compared (c1, c2), their depths, and related factors, the accuracy of the semantic similarity result is greatly improved, and the effect of ontology reasoning improves with it.
Description
Technical field
The present invention relates to the field of Semantic Web technology, and in particular to a method for computing lexical semantic similarity based on ontology concepts.
Background
Many researchers are currently studying methods for computing the similarity of ontology concepts, and the similarity problem has been studied and analyzed in depth in several disciplines, including philosophy and semantics. Earlier work assessed concept similarity mainly in terms of concept names, attributes, and structure. One earlier approach divides concept similarity into two layers: an "initial similarity" and a "similarity reflected by non-hyponymy relations". The former is computed mainly from the distance between concepts; the latter builds on the former and is computed from the non-hyponymy relations of the concepts. Combining the two gives the actual similarity of concepts in a domain ontology. Other work computes the semantic similarity between concepts within a domain mainly from the hyponymy relations between concepts together with other factors. For example, one proposed comprehensive method first selects the most relevant concepts by filtering on the similarity of the two concept names, then computes concept similarity separately from concept instances, concept attributes, and concept relations, and finally combines the results.
Many applications can mask this problem to some extent by using massive data. In many situations, however, a massive-data approach is not applicable, and it ignores semantics, so the computed results often differ greatly from people's subjective judgment and lead to large errors. Semantic similarity computation is therefore particularly important in such cases. If the similar words of each word can be obtained, then querying by similar words can improve the sharing of user information. To meet this need, the present invention proposes a method for computing lexical semantic similarity based on ontology concepts.
Summary of the invention
To address the problem of how to obtain the similar terms of each term, the invention provides a method for computing lexical semantic similarity based on ontology concepts.
To solve this problem, the present invention is implemented through the following technical solution:
Step 1: Initialize the statistical method module.
Step 2: Input the words to be compared (c1, c2) into the initialized statistical method module.
Step 3: Map the words to be compared (c1, c2) to the ontology concept module.
Step 4: For each of the words to be compared (c1, c2), select the corresponding ontology concept with the greatest depth, g1 and g2.
Step 5: Calculate the distance dis(g1, g2) between the two deepest ontology concepts corresponding to the words to be compared (c1, c2).
Step 6: Following the above steps, calculate the depth D(c1, c2) of the nearest common ancestor of the two words to be compared (c1, c2).
Step 7: Calculate the similarity sim(c1, c2) of the two words to be compared (c1, c2).
The beneficial effects of the present invention are:
1. The quantified lexical similarity produced by this method is closer to expert empirical values.
2. The method considers more fully the distance between the deepest ontology concepts corresponding to the words to be compared (c1, c2), their depths, and related factors, which greatly improves the accuracy of the semantic similarity result.
3. It improves the effect of ontology reasoning.
Description of the drawings
Fig. 1 is a structural flow chart of the ontology concept-based lexical semantic similarity solving method.
Specific embodiment
To solve the problem of how to obtain the similar terms of each term, the present invention is described in detail below with reference to Fig. 1. The specific implementation steps are as follows:
Step 1: Initialize the statistical method module.
Step 2: Input the words to be compared (c1, c2) into the initialized statistical method module.
Step 3: Map the words to be compared (c1, c2) to the ontology concept module.
Step 4: For each of the words to be compared (c1, c2), select the corresponding ontology concept with the greatest depth, g1 and g2, as follows:
The relation between a word to be compared C ∈ (c1, c2) and its concepts is one-to-many. The deeper the chosen concept, the more specific the meaning of C, and the easier it is to compute the semantic similarity of the words to be compared. This depth can easily be looked up in the statistical module; for example, the ontology concepts corresponding to a word can be found in HowNet (《知网》).
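Step 4 can be illustrated with a minimal sketch. The word-to-concept table below is hypothetical toy data (the patent obtains concepts and depths from an ontology such as HowNet), and the function name `deepest_concept` is an assumption introduced only for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: each word maps to several (concept, depth) pairs.
# A real system would read these from the ontology concept module.
WORD_TO_CONCEPTS: Dict[str, List[Tuple[str, int]]] = {
    "apple": [("entity", 1), ("fruit", 4), ("food", 3)],
    "pear": [("fruit", 4), ("plant", 2)],
}


def deepest_concept(
    word: str,
    mapping: Dict[str, List[Tuple[str, int]]] = WORD_TO_CONCEPTS,
) -> Tuple[str, int]:
    """Return the (concept, depth) pair with the greatest depth for `word`."""
    candidates = mapping.get(word)
    if not candidates:
        raise KeyError(f"no ontology concept found for {word!r}")
    # max() returns the first maximal element, so ties keep list order.
    return max(candidates, key=lambda pair: pair[1])


g1 = deepest_concept("apple")
g2 = deepest_concept("pear")
```

The source does not say how ties between equally deep concepts are resolved; keeping the first candidate is an arbitrary choice made here.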
Step 5: Calculate the distance dis(g1, g2) between the two deepest ontology concepts corresponding to the words to be compared (c1, c2). First obtain the sememe-item similarity sim(g1, g2) between the two ontology concepts, then calculate the relative depth deepth(g1, g2) between them. The calculation proceeds as follows:
5.1) Sememe-item similarity sim(g1, g2) between the two ontology concepts
Let the deepest ontology concept g1 corresponding to c1 contain n sememes, i.e. g1 ∈ (y1, y2, ..., yn), and let the deepest ontology concept g2 corresponding to c2 contain m sememes, i.e. g2 ∈ (y1', y2', ..., ym').
Compute the similarity of every pair of sememes from g1 and g2, i.e. sim(yi, yj'), i ∈ (1, 2, ..., n), j ∈ (1, 2, ..., m). This gives the sememe-item similarity matrix J(g1, g2) of g1 and g2:
$$J(g_1, g_2) = \begin{pmatrix} \mathrm{sim}(y_1, y_1') & \mathrm{sim}(y_1, y_2') & \cdots & \mathrm{sim}(y_1, y_m') \\ \mathrm{sim}(y_2, y_1') & \mathrm{sim}(y_2, y_2') & \cdots & \mathrm{sim}(y_2, y_m') \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{sim}(y_n, y_1') & \mathrm{sim}(y_n, y_2') & \cdots & \mathrm{sim}(y_n, y_m') \end{pmatrix}$$
From this matrix, find the maximum sememe similarity S_i in each row vector, i.e.
$$S_i = \max_{1 \le j \le m} \mathrm{sim}(y_i, y_j'), \quad i = 1, 2, \ldots, n$$
Finally, the sememe-item similarity sim(g1, g2) between the two ontology concepts is obtained as follows:
sim(g1, g2) = max(S1, S2, ..., Sn)
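The computation in 5.1 can be sketched as follows: build the pairwise sememe similarity matrix, take the maximum of each row, then take the maximum of those row maxima. The pairwise sememe similarity itself is not defined in this text, so it is passed in as a function; `toy_sim` is a hypothetical stand-in used only to make the example runnable.

```python
from typing import Callable, List, Sequence


def sememe_item_similarity(
    g1: Sequence[str],
    g2: Sequence[str],
    sim: Callable[[str, str], float],
) -> float:
    """Return max over rows of the row-wise maxima of the matrix J(g1, g2)."""
    if not g1 or not g2:
        raise ValueError("both concepts must contain at least one sememe")
    # J[i][j] = sim(y_i, y_j') for sememe y_i of g1 and sememe y_j' of g2.
    matrix: List[List[float]] = [[sim(y, y_prime) for y_prime in g2] for y in g1]
    # S_i is the largest similarity in row i.
    row_max = [max(row) for row in matrix]
    # sim(g1, g2) = max(S_1, ..., S_n)
    return max(row_max)


def toy_sim(a: str, b: str) -> float:
    """Hypothetical sememe similarity: 1.0 for identical sememes, else 0.2."""
    return 1.0 if a == b else 0.2


score = sememe_item_similarity(
    ["human", "animal"], ["plant", "animal", "thing"], toy_sim
)
```

With the toy function, the two concepts share the sememe "animal", so the second row of the matrix contains 1.0 and that value becomes the result.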
5.2) Calculate the relative depth deepth(g1, g2) between the two ontology concepts
deepth(g1, g2) = d1 - d2
Here d1 is the depth value in the module of g1, the deepest ontology concept corresponding to c1, and likewise d2 is the depth value in the module of g2, the deepest ontology concept corresponding to c2. Both values can be read directly from the module.
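As a small illustration of the relative depth, the sketch below stores a toy ontology as a child-to-parent map and derives d1 and d2 by walking up to the root. The hierarchy, the function names, and the convention that the root has depth 0 are all assumptions; the source only says that the depth values are read from the module.

```python
from typing import Dict, Optional

# Hypothetical toy hierarchy: child -> parent, with None marking the root.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "thing": "entity",
    "food": "thing",
    "fruit": "food",
    "plant": "thing",
}


def depth(concept: str, parent: Dict[str, Optional[str]] = PARENT) -> int:
    """Number of edges from `concept` up to the root (assumed: root depth 0)."""
    d = 0
    node = parent[concept]
    while node is not None:
        d += 1
        node = parent[node]
    return d


def relative_depth(
    g1: str, g2: str, parent: Dict[str, Optional[str]] = PARENT
) -> int:
    """deepth(g1, g2) = d1 - d2, a signed difference as written in the text."""
    return depth(g1, parent) - depth(g2, parent)
```

Note that the difference is signed: swapping g1 and g2 flips the sign, so `relative_depth("fruit", "plant")` and `relative_depth("plant", "fruit")` are negatives of each other.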
5.3) Calculate the distance dis(g1, g2) between the two deepest ontology concepts corresponding to the words to be compared (c1, c2)
In the formula for dis(g1, g2), which is not reproduced in this text, α is a smoothing factor; its value depends on the specific situation and is given by an expert.
Step 6: Following the above steps, calculate the depth D(c1, c2) of the nearest common ancestor of the two words to be compared (c1, c2), as follows:
The depth D(c1, c2) of the nearest common ancestor of the two words to be compared (c1, c2) can be found from the module. The closer this nearest common ancestor lies to the bottom of the hierarchy, the more similar the two words to be compared (c1, c2) are.
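One way to find the nearest common ancestor and its depth is sketched below, assuming the ontology is a tree stored as a child-to-parent map. The toy hierarchy, the function names, the root-depth-0 convention, and the choice to treat a concept as its own ancestor are assumptions made for illustration; the source only states that the depth is found from the module.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy hierarchy: child -> parent, with None marking the root.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "thing": "entity",
    "food": "thing",
    "fruit": "food",
    "plant": "thing",
}


def ancestors(concept: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Chain from `concept` (inclusive) up to the root."""
    chain = [concept]
    while parent[chain[-1]] is not None:
        chain.append(parent[chain[-1]])
    return chain


def nearest_common_ancestor(
    g1: str, g2: str, parent: Dict[str, Optional[str]] = PARENT
) -> Tuple[str, int]:
    """Return (ancestor, depth) for the deepest node shared by both chains."""
    seen = set(ancestors(g1, parent))
    # Walking up from g2, the first node already seen is the nearest one.
    for node in ancestors(g2, parent):
        if node in seen:
            return node, len(ancestors(node, parent)) - 1
    raise ValueError("concepts do not share an ancestor")


node, d = nearest_common_ancestor("fruit", "plant")
```

In the toy hierarchy, "fruit" and "plant" meet at "thing", one edge below the root. A larger depth means the shared ancestor is more specific, which matches the statement that a common ancestor nearer the bottom indicates more similar words.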
Step 7: Calculate the similarity sim(c1, c2) of the two words to be compared (c1, c2). The calculation is as follows:
In the formula for sim(c1, c2), which is not reproduced in this text, β is a weight factor. When β > 0.5, the depth D(c1, c2) of the common ancestor has the larger influence on the similarity sim(c1, c2); otherwise, the distance dis(g1, g2) between the two ontology concepts has the larger influence. Experience suggests that the latter has the greater effect on sim(c1, c2).
Pseudocode of the calculation process of the ontology concept-based lexical semantic similarity solving method:
Input: the initialized module and the words to be compared (c1, c2)
Output: the similarity sim(c1, c2) of the words to be compared (c1, c2).
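The overall flow of Steps 2 to 7 can be sketched as a skeleton. Because the formulas for dis(g1, g2) and sim(c1, c2) are not reproduced in this text, they are left as caller-supplied functions rather than guessed; every name below (`word_similarity`, `concepts_of`, `distance`, `ancestor_depth`, `combine`) is hypothetical and introduced only for this illustration.

```python
from typing import Callable, List, Tuple

Concept = Tuple[str, int]  # (concept name, depth in the ontology)


def word_similarity(
    c1: str,
    c2: str,
    concepts_of: Callable[[str], List[Concept]],
    distance: Callable[[Concept, Concept], float],
    ancestor_depth: Callable[[Concept, Concept], int],
    combine: Callable[[float, int], float],
) -> float:
    """Skeleton of Steps 2-7; the distance and combination rules are plugged in."""
    # Steps 2-3: look up the candidate ontology concepts of each word.
    candidates1 = concepts_of(c1)
    candidates2 = concepts_of(c2)
    if not candidates1 or not candidates2:
        raise ValueError("each word must map to at least one ontology concept")
    # Step 4: keep the deepest concept for each word.
    g1 = max(candidates1, key=lambda concept: concept[1])
    g2 = max(candidates2, key=lambda concept: concept[1])
    # Step 5: distance between the two deepest concepts (formula supplied by caller).
    dis = distance(g1, g2)
    # Step 6: depth of the nearest common ancestor.
    d = ancestor_depth(g1, g2)
    # Step 7: combine distance and ancestor depth into the final similarity.
    return combine(dis, d)
```

Passing the distance and combination rules in as parameters keeps the skeleton faithful to the step order without committing to formulas or to values of α and β that this text does not give.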
Claims (3)
1. A method for computing lexical semantic similarity based on ontology concepts, relating to the field of Semantic Web technology, characterized in that it comprises the following steps:
Step 1: Initialize the statistical method module.
Step 2: Input the words to be compared (c1, c2) into the initialized statistical method module.
Step 3: Map the words to be compared (c1, c2) to the ontology concept module.
Step 4: For each of the words to be compared (c1, c2), select the corresponding ontology concept with the greatest depth, g1 and g2.
Step 5: Calculate the distance dis(g1, g2) between the two deepest ontology concepts corresponding to the words to be compared (c1, c2).
Step 6: Following the above steps, calculate the depth D(c1, c2) of the nearest common ancestor of the two words to be compared (c1, c2).
Step 7: Calculate the similarity sim(c1, c2) of the two words to be compared (c1, c2).
2. The method for computing lexical semantic similarity based on ontology concepts according to claim 1, characterized in that the specific calculation process in step 5 is as follows:
Step 5: Calculate the distance dis(g1, g2) between the two deepest ontology concepts corresponding to the words to be compared (c1, c2). First obtain the sememe-item similarity sim(g1, g2) between the two ontology concepts, then calculate the relative depth deepth(g1, g2) between them. The calculation proceeds as follows:
5.1) Sememe-item similarity sim(g1, g2) between the two ontology concepts
Let the deepest ontology concept g1 corresponding to c1 contain n sememes, i.e. g1 ∈ (y1, y2, ..., yn), and let the deepest ontology concept g2 corresponding to c2 contain m sememes, i.e. g2 ∈ (y1', y2', ..., ym').
Compute the similarity of every pair of sememes from g1 and g2, i.e. sim(yi, yj'), i ∈ (1, 2, ..., n), j ∈ (1, 2, ..., m). This gives the sememe-item similarity matrix J(g1, g2) of g1 and g2:
$$J(g_1, g_2) = \begin{pmatrix} \mathrm{sim}(y_1, y_1') & \mathrm{sim}(y_1, y_2') & \cdots & \mathrm{sim}(y_1, y_m') \\ \mathrm{sim}(y_2, y_1') & \mathrm{sim}(y_2, y_2') & \cdots & \mathrm{sim}(y_2, y_m') \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{sim}(y_n, y_1') & \mathrm{sim}(y_n, y_2') & \cdots & \mathrm{sim}(y_n, y_m') \end{pmatrix}$$
From this matrix, find the maximum sememe similarity S_i in each row vector, i.e.
$$S_i = \max_{1 \le j \le m} \mathrm{sim}(y_i, y_j'), \quad i = 1, 2, \ldots, n$$
Finally, the sememe-item similarity sim(g1, g2) between the two ontology concepts is obtained as follows:
sim(g1, g2) = max(S1, S2, ..., Sn)
5.2) Calculate the relative depth deepth(g1, g2) between the two ontology concepts
deepth(g1, g2) = d1 - d2
Here d1 is the depth value in the module of g1, the deepest ontology concept corresponding to c1, and likewise d2 is the depth value in the module of g2, the deepest ontology concept corresponding to c2. Both values can be read directly from the module.
5.3) Calculate the distance dis(g1, g2) between the two deepest ontology concepts corresponding to the words to be compared (c1, c2)
In the formula for dis(g1, g2), which is not reproduced in this text, α is a smoothing factor; its value depends on the specific situation and is given by an expert.
3. The method for computing lexical semantic similarity based on ontology concepts according to claim 1, characterized in that the specific calculation process in step 7 is as follows:
Step 7: Calculate the similarity sim(c1, c2) of the two words to be compared (c1, c2). The calculation is as follows:
In the formula for sim(c1, c2), which is not reproduced in this text, β is a weight factor. When β > 0.5, the depth D(c1, c2) of the common ancestor has the larger influence on the similarity sim(c1, c2); otherwise, the distance dis(g1, g2) between the two ontology concepts has the larger influence. Experience suggests that the latter has the greater effect on sim(c1, c2).
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2016106059718 | 2016-07-28 | ||
CN201610605971 | 2016-07-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106611038A (en) | 2017-05-03 |
Family
ID=58615030
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610833103.5A Pending CN106611038A (en) | 2016-07-28 | 2016-09-20 | Ontology concept-based lexical semantic similarity solving method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106611038A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107895012A (en) * | 2017-11-10 | 2018-04-10 | 上海电机学院 | A kind of body constructing method based on Topic Model |
CN109308352A (en) * | 2018-08-01 | 2019-02-05 | 昆明理工大学 | A kind of word correlation prediction method based on shortest path |
CN110580338A (en) * | 2019-06-11 | 2019-12-17 | 福建奇点时空数字科技有限公司 | Context relation algorithm of clustered entity based on semantic iteration extraction technology |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101114275A (en) * | 2006-07-24 | 2008-01-30 | 同济大学 | Main body complexity analyzing evaluation method based on concepts model |
CN102779288A (en) * | 2012-06-26 | 2012-11-14 | 中国矿业大学 | Ontology analysis method based on field theory |
Non-Patent Citations (4)
Title |
---|
曾建勋: "《知识链接及其服务研究》", 30 August 2012, 北京:科学技术文献出版社 * |
王颖 等: "一种基于RDF图的本体匹配方法", 《计算机应用》 * |
葛斌 等: "基于知网的词汇语义相似度计算方法研究", 《计算机应用研究》 * |
郑旭东: "《军队信息化建设加速发展策略 中国电子学会电子系统工程分会 第十九届军队信息化理论学术研讨会论文集》", 31 October 2012 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107895012A (en) * | 2017-11-10 | 2018-04-10 | 上海电机学院 | A kind of body constructing method based on Topic Model |
CN109308352A (en) * | 2018-08-01 | 2019-02-05 | 昆明理工大学 | A kind of word correlation prediction method based on shortest path |
CN109308352B (en) * | 2018-08-01 | 2021-10-22 | 昆明理工大学 | Word correlation determination method based on shortest path |
CN110580338A (en) * | 2019-06-11 | 2019-12-17 | 福建奇点时空数字科技有限公司 | Context relation algorithm of clustered entity based on semantic iteration extraction technology |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Teng et al. | Context-sensitive lexicon features for neural sentiment analysis | |
WO2020062770A1 (en) | Method and apparatus for constructing domain dictionary, and device and storage medium | |
CN109325229B (en) | Method for calculating text similarity by utilizing semantic information | |
Ren et al. | Deep attention-based neural networks for explainable heart sound classification | |
Li et al. | Improving convolutional neural network for text classification by recursive data pruning | |
CN111858940B (en) | Multi-head attention-based legal case similarity calculation method and system | |
Jiang et al. | Transformer based memory network for sentiment analysis of web comments | |
CN107358315A (en) | A kind of information forecasting method and terminal | |
CN107305539A (en) | A kind of text tendency analysis method based on Word2Vec network sentiment new word discoveries | |
Zhang et al. | Speech emotion recognition using an enhanced kernel isomap for human-robot interaction | |
CN106611038A (en) | Ontology concept-based lexical semantic similarity solving method | |
CN106489148A (en) | A kind of intention scene recognition method that is drawn a portrait based on user and system | |
Kang et al. | ConnotationWordNet: Learning connotation over the word+ sense network | |
CN109726391A (en) | The method, apparatus and terminal of emotional semantic classification are carried out to text | |
CN106844587A (en) | A kind of data processing method and device for talking with interactive system | |
WO2023116572A1 (en) | Word or sentence generation method and related device | |
CN109033318A (en) | Intelligent answer method and device | |
CN103729431B (en) | Massive microblog data distributed classification device and method with increment and decrement function | |
Jurgovsky et al. | Evaluating memory efficiency and robustness of word embeddings | |
CN103164394A (en) | Text similarity calculation method based on universal gravitation | |
Linders et al. | Zipf's Law in Human-Machine Dialog | |
CN113806545B (en) | Comment text emotion classification method based on label description generation | |
Boran et al. | Understanding Customers' Affective Needs with Linguistic Summarization | |
CN106610939A (en) | Improved lexical semantic similarity solution method of ontology concept | |
Sun et al. | Chinese microblog sentiment classification based on deep belief nets with extended multi-modality features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170503 |