CN106611038A - Ontology concept-based lexical semantic similarity solving method - Google Patents

Ontology concept-based lexical semantic similarity solving method

Publication number
CN106611038A
Authority
CN
China
Prior art keywords
similarity
ontological
concept
depth
compared
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610833103.5A
Other languages
Chinese (zh)
Inventor
金平艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Yonglian Information Technology Co Ltd
Original Assignee
Sichuan Yonglian Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Yonglian Information Technology Co Ltd
Publication of CN106611038A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an ontology concept-based lexical semantic similarity solving method. Words to be compared are input into a statistical method module and mapped to ontology concepts; for each word, the corresponding ontology concept of maximum depth is selected from an ontology concept module; the distance between these ontology concepts and the depth of their most recent common ancestor are computed; and finally the similarity between the two words is computed. The results of the method are numerically closer to the empirical values given by experts; factors such as the distance between, and the depths of, the maximum-depth ontology concepts corresponding to the words to be compared (c1, c2) are considered more fully and comprehensively, so that the accuracy of the semantic similarity result is greatly improved; and the effect of ontology reasoning is improved.

Description

Ontology concept-based lexical semantic similarity solving method
Technical field
The present invention relates to the field of Semantic Web technology, and in particular to a lexical semantic similarity solving method based on ontology concepts.
Background
At present, many scholars are studying methods for computing the similarity of ontology concepts, and the similarity problem has been studied and analyzed in depth in several disciplines, including philosophy and semantics. Earlier work mainly considers concept similarity in terms of the names, attributes, and structure of concepts. One earlier approach divides the computation of concept similarity into two layers: an "initial similarity" and a "similarity reflected by non-hyponymy relations". The former is computed mainly from the distance between concepts; the latter builds on that result and is computed from the non-hyponymy relations of the concepts. Combining the two gives the actual similarity of concepts in a domain ontology. In addition, the semantic similarity between concepts within a domain is also computed mainly from the hyponymy relations between concepts together with other factors. For example, a comprehensive similarity calculation method has been proposed that first filters out the most relevant concepts according to the similarity of the two concept names, then computes concept similarity separately from concept instances, concept attributes, and concept relations, and combines the results. Although many applications can now mask this problem to some extent by using massive amounts of data, in many cases the massive-data approach does not apply, and it neglects semantic analysis, so the computed results often differ from people's subjective judgment and lead to large errors. Semantic similarity computation is therefore particularly important in such cases. If the similar words of each word can be obtained, querying by similar words will improve the sharing of user information. To meet this need, the present invention proposes a lexical semantic similarity solving method based on ontology concepts.
Summary of the invention
To address the problem of how to obtain the similar terms of each term, the invention provides a lexical semantic similarity solving method based on ontology concepts.
To solve the above problem, the present invention is implemented by the following technical solution:
Step 1: Initialize the statistical method module.
Step 2: Input the words to be compared (c1, c2) into the initialized statistical method module.
Step 3: Map the words to be compared (c1, c2) to the ontology concept module.
Step 4: For each of the words to be compared (c1, c2), select the corresponding ontology concept of maximum depth, g1 and g2 respectively.
Step 5: Compute the distance dis(g1, g2) between the two maximum-depth ontology concepts corresponding to the words to be compared (c1, c2).
Step 6: Following the above steps, compute the depth D(c1, c2) of the most recent common ancestor of the two words to be compared (c1, c2).
Step 7: Compute the similarity sim(c1, c2) of the two words to be compared (c1, c2).
The beneficial effects of the present invention are:
1. The lexical similarity computed by this method is numerically closer to the empirical values given by experts.
2. The method considers more fully and comprehensively factors such as the distance between, and the depths of, the maximum-depth ontology concepts corresponding to the words to be compared (c1, c2), which greatly improves the accuracy of the semantic similarity result.
3. The effect of ontology reasoning is improved.
Description of the drawings
Fig. 1 is a flow chart of the ontology concept-based lexical semantic similarity solving method.
Detailed description
To solve the problem of how to obtain the similar terms of each term, the present invention is described in detail with reference to Fig. 1. The specific implementation steps are as follows:
Step 1: Initialize the statistical method module.
Step 2: Input the words to be compared (c1, c2) into the initialized statistical method module.
Step 3: Map the words to be compared (c1, c2) to the ontology concept module.
Step 4: For each of the words to be compared (c1, c2), select the corresponding ontology concept of maximum depth, g1 and g2 respectively. This is described in detail as follows:
The relation between a word to be compared C ∈ (c1, c2) and ontology concepts is one-to-many. The deeper the chosen concept, the more specific the meaning of C, and the easier it is to compute the semantic similarity of the words to be compared. The depth is easy to look up in the statistical module; for example, the ontology concepts corresponding to a word can be found in HowNet (《知网》).
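The selection in Step 4 can be sketched in a few lines of Python. This is only an illustration under assumptions: the `lexicon` dictionary, the `(concept_id, depth)` pair layout, and the toy entries are hypothetical stand-ins for whatever a resource such as HowNet would return, and ties between equally deep concepts are resolved by list order, which the text does not specify.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: each word maps to a list of (concept_id, depth) pairs.
Concept = Tuple[str, int]


def deepest_concept(word: str, lexicon: Dict[str, List[Concept]]) -> Concept:
    """Return the candidate concept with the greatest depth for `word`."""
    candidates = lexicon.get(word)
    if not candidates:
        raise KeyError(f"no ontology concept found for {word!r}")
    # max() returns the first maximal element, so ties follow list order.
    return max(candidates, key=lambda concept: concept[1])


# Toy data, invented for illustration only.
toy_lexicon = {
    "apple": [("company", 2), ("fruit", 3)],
    "pear": [("fruit", 3)],
}
g1 = deepest_concept("apple", toy_lexicon)
g2 = deepest_concept("pear", toy_lexicon)
```

Because the word-to-concept relation is one-to-many, the function takes the whole candidate list and keeps only the deepest entry, which mirrors the reasoning that a deeper concept is more specific.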
Step 5: Compute the distance dis(g1, g2) between the two maximum-depth ontology concepts corresponding to the words to be compared (c1, c2). First compute the similarity sim(g1, g2) of the sememe items between the two ontology concepts, then compute the relative depth deepth(g1, g2) between the two ontology concepts. The calculation proceeds as follows:
5.1) Similarity sim(g1, g2) of the sememe items between the two ontology concepts
Suppose the maximum-depth ontology concept g1 corresponding to c1 contains n sememes, i.e. g1 ∈ (y1, y2, ..., yn), and the maximum-depth ontology concept g2 corresponding to c2 contains m sememes, i.e. g2 ∈ (y1', y2', ..., ym').
Compute the similarity of each pair of sememes taken from g1 and g2, i.e. sim(yi, yj'), i ∈ (1, 2, ..., n), j ∈ (1, 2, ..., m). This gives the sememe-item similarity matrix J(g1, g2) of g1 and g2:
$$J(g_1, g_2) = \begin{pmatrix} \mathrm{sim}(y_1, y_1') & \mathrm{sim}(y_1, y_2') & \cdots & \mathrm{sim}(y_1, y_m') \\ \mathrm{sim}(y_2, y_1') & \mathrm{sim}(y_2, y_2') & \cdots & \mathrm{sim}(y_2, y_m') \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{sim}(y_n, y_1') & \mathrm{sim}(y_n, y_2') & \cdots & \mathrm{sim}(y_n, y_m') \end{pmatrix}$$
From this matrix, find the maximum sememe similarity S_i in each row vector, i.e.
$$S_i = \max_{1 \le j \le m} \mathrm{sim}(y_i, y_j'), \quad i = 1, 2, \ldots, n$$
Finally, the similarity sim(g1, g2) of the sememe items between the two ontology concepts is obtained as follows:
$$\mathrm{sim}(g_1, g_2) = \max(S_1, S_2, \ldots, S_n)$$
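The matrix, row-maximum, and overall-maximum computation of step 5.1 can be sketched as follows. The pairwise sememe similarity itself is not defined in this text, so it is passed in as a function; `toy_sim` and the sememe names are hypothetical placeholders, not the measure used in the patent.

```python
from typing import Callable, List, Sequence

SememeSim = Callable[[str, str], float]


def similarity_matrix(g1: Sequence[str], g2: Sequence[str],
                      sim: SememeSim) -> List[List[float]]:
    """Build J(g1, g2): row i, column j holds sim(y_i, y_j')."""
    return [[sim(y, y_prime) for y_prime in g2] for y in g1]


def sememe_item_similarity(g1: Sequence[str], g2: Sequence[str],
                           sim: SememeSim) -> float:
    """Take the maximum of each row (S_i), then the maximum over all S_i."""
    if not g1 or not g2:
        raise ValueError("both concepts need at least one sememe")
    matrix = similarity_matrix(g1, g2, sim)
    row_maxima = [max(row) for row in matrix]
    return max(row_maxima)


def toy_sim(a: str, b: str) -> float:
    """Placeholder pairwise measure: identical sememes score highest."""
    return 1.0 if a == b else 0.25


matrix = similarity_matrix(["human", "occupation"],
                           ["animal", "occupation", "place"], toy_sim)
score = sememe_item_similarity(["human", "occupation"],
                               ["animal", "occupation", "place"], toy_sim)
```

Taking the maximum of the row maxima is the same as taking the largest entry of the whole matrix; the two-stage form is kept here only to follow the steps as they are stated.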
5.2) Compute the relative depth deepth(g1, g2) between the two ontology concepts
deepth(g1, g2) = d1 - d2
In the above formula, d1 is the depth value in the module of g1, the maximum-depth ontology concept corresponding to c1; likewise, d2 is the depth value in the module of g2, the maximum-depth ontology concept corresponding to c2. Both values can be read directly from the module.
5.3) Compute the distance dis(g1, g2) between the two maximum-depth ontology concepts corresponding to the words to be compared (c1, c2)
In the formula for dis(g1, g2), α is a smoothing factor; its value depends on the specific case and is set by an expert.
Step 6: Following the above steps, compute the depth D(c1, c2) of the most recent common ancestor of the two words to be compared (c1, c2). This is described in detail as follows:
From the module, the depth D(c1, c2) of the most recent common ancestor of the two words to be compared (c1, c2) can be found. The closer this most recent common ancestor lies to the bottom of the hierarchy, the more similar the two words (c1, c2) are.
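One way to look up the depth of the most recent common ancestor is sketched below. It assumes the ontology is a tree stored as a child-to-parent dictionary, that the root has depth 0, and that a node counts as its own ancestor; none of these conventions are stated in the text, and the toy hierarchy is invented for illustration.

```python
from typing import Dict, List, Optional


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return [node, parent, grandparent, ..., root]."""
    path = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = parent[current]
    return path


def common_ancestor_depth(a: str, b: str,
                          parent: Dict[str, Optional[str]]) -> int:
    """Depth of the deepest node lying on both root paths (root depth = 0)."""
    ancestors_of_b = set(path_to_root(b, parent))
    # Walk upward from a; the first shared node is the most recent ancestor.
    for node in path_to_root(a, parent):
        if node in ancestors_of_b:
            return len(path_to_root(node, parent)) - 1
    raise ValueError("the two nodes share no ancestor")


# Toy hierarchy, invented for illustration only.
toy_parent = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "oak": "plant",
}
depth_dog_cat = common_ancestor_depth("dog", "cat", toy_parent)
depth_dog_oak = common_ancestor_depth("dog", "oak", toy_parent)
```

In the toy hierarchy the pair sharing "animal" gets a larger depth than the pair that only shares the root, which matches the statement that a common ancestor nearer the bottom indicates more similar words.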
Step 7: Compute the similarity sim(c1, c2) of the two words to be compared (c1, c2). The calculation proceeds as follows:
In the formula for sim(c1, c2), β is a weight factor. When β > 0.5, the common-ancestor depth D(c1, c2) has the larger influence on the similarity sim(c1, c2); otherwise, the distance dis(g1, g2) between the two ontology concepts has the larger influence on sim(c1, c2). Experience indicates that the latter has the larger influence on sim(c1, c2).
The pseudocode calculation process of the ontology concept-based lexical semantic similarity solving method is as follows:
Input: the initialized module and the words to be compared (c1, c2).
Output: the similarity sim(c1, c2) of the words to be compared (c1, c2).
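The data flow of Steps 5 to 7 can be sketched as a small skeleton. The formulas for dis(g1, g2) and sim(c1, c2) are not present in this text, so the sketch takes them as caller-supplied functions; `placeholder_distance` and `placeholder_combine` are invented stand-ins used only to make the example run and are not the patent's formulas. The factors α and β would live inside those two functions.

```python
from typing import Callable, NamedTuple


class Concept(NamedTuple):
    name: str
    depth: int


def relative_depth(g1: Concept, g2: Concept) -> int:
    """deepth(g1, g2) = d1 - d2, as stated in step 5.2."""
    return g1.depth - g2.depth


def word_similarity(g1: Concept, g2: Concept,
                    sememe_sim: float, ancestor_depth: int,
                    distance_fn: Callable[[float, int], float],
                    combine_fn: Callable[[int, float], float]) -> float:
    """Chain the steps: relative depth -> distance -> final similarity."""
    rel = relative_depth(g1, g2)
    dis = distance_fn(sememe_sim, rel)       # Step 5.3 (formula not in text)
    return combine_fn(ancestor_depth, dis)   # Step 7 (formula not in text)


# Placeholders only: NOT the formulas of the patent.
def placeholder_distance(sememe_sim: float, rel: int) -> float:
    return (1.0 - sememe_sim) + 0.5 * abs(rel)


def placeholder_combine(ancestor_depth: int, dis: float) -> float:
    return ancestor_depth / (ancestor_depth + dis)


result = word_similarity(Concept("g1", 4), Concept("g2", 3),
                         sememe_sim=0.75, ancestor_depth=3,
                         distance_fn=placeholder_distance,
                         combine_fn=placeholder_combine)
```

Passing the two formulas in as functions keeps the skeleton honest about what the text does and does not specify, and lets an expert-chosen α and β be swapped in without touching the control flow.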

Claims (3)

1. An ontology concept-based lexical semantic similarity solving method, relating to the field of Semantic Web technology, characterized in that it comprises the following steps:
Step 1: Initialize the statistical method module.
Step 2: Input the words to be compared (c1, c2) into the initialized statistical method module.
Step 3: Map the words to be compared (c1, c2) to the ontology concept module.
Step 4: For each of the words to be compared (c1, c2), select the corresponding ontology concept of maximum depth, g1 and g2 respectively.
Step 5: Compute the distance dis(g1, g2) between the two maximum-depth ontology concepts corresponding to the words to be compared (c1, c2).
Step 6: Following the above steps, compute the depth D(c1, c2) of the most recent common ancestor of the two words to be compared (c1, c2).
Step 7: Compute the similarity sim(c1, c2) of the two words to be compared (c1, c2).
2. The ontology concept-based lexical semantic similarity solving method according to claim 1, characterized in that the calculation in step 5 proceeds as follows:
Step 5: Compute the distance dis(g1, g2) between the two maximum-depth ontology concepts corresponding to the words to be compared (c1, c2). First compute the similarity sim(g1, g2) of the sememe items between the two ontology concepts, then compute the relative depth deepth(g1, g2) between the two ontology concepts. The calculation proceeds as follows:
5.1) Similarity sim(g1, g2) of the sememe items between the two ontology concepts
Suppose the maximum-depth ontology concept g1 corresponding to c1 contains n sememes, i.e. g1 ∈ (y1, y2, ..., yn), and the maximum-depth ontology concept g2 corresponding to c2 contains m sememes, i.e. g2 ∈ (y1', y2', ..., ym').
Compute the similarity of each pair of sememes taken from g1 and g2, i.e. sim(yi, yj'), i ∈ (1, 2, ..., n), j ∈ (1, 2, ..., m).
This gives the sememe-item similarity matrix J(g1, g2) of g1 and g2:
$$J(g_1, g_2) = \begin{pmatrix} \mathrm{sim}(y_1, y_1') & \mathrm{sim}(y_1, y_2') & \cdots & \mathrm{sim}(y_1, y_m') \\ \mathrm{sim}(y_2, y_1') & \mathrm{sim}(y_2, y_2') & \cdots & \mathrm{sim}(y_2, y_m') \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{sim}(y_n, y_1') & \mathrm{sim}(y_n, y_2') & \cdots & \mathrm{sim}(y_n, y_m') \end{pmatrix}$$
From this matrix, find the maximum sememe similarity S_i in each row vector, i.e.
$$S_i = \max_{1 \le j \le m} \mathrm{sim}(y_i, y_j'), \quad i = 1, 2, \ldots, n$$
Finally, the similarity sim(g1, g2) of the sememe items between the two ontology concepts is obtained as follows:
$$\mathrm{sim}(g_1, g_2) = \max(S_1, S_2, \ldots, S_n)$$
5.2) Compute the relative depth deepth(g1, g2) between the two ontology concepts
deepth(g1, g2) = d1 - d2
In the above formula, d1 is the depth value in the module of g1, the maximum-depth ontology concept corresponding to c1; likewise, d2 is the depth value in the module of g2, the maximum-depth ontology concept corresponding to c2. Both values can be read directly from the module.
5.3) Compute the distance dis(g1, g2) between the two maximum-depth ontology concepts corresponding to the words to be compared (c1, c2)
In the formula for dis(g1, g2), α is a smoothing factor; its value depends on the specific case and is set by an expert.
3. The ontology concept-based lexical semantic similarity solving method according to claim 1, characterized in that the calculation in step 7 proceeds as follows:
Step 7: Compute the similarity sim(c1, c2) of the two words to be compared (c1, c2). The calculation proceeds as follows:
In the formula for sim(c1, c2), β is a weight factor. When β > 0.5, the common-ancestor depth D(c1, c2) has the larger influence on the similarity sim(c1, c2); otherwise, the distance dis(g1, g2) between the two ontology concepts has the larger influence on sim(c1, c2). Experience indicates that the latter has the larger influence on sim(c1, c2).
CN201610833103.5A 2016-07-28 2016-09-20 Ontology concept-based lexical semantic similarity solving method Pending CN106611038A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2016106059718 2016-07-28
CN201610605971 2016-07-28

Publications (1)

Publication Number Publication Date
CN106611038A true CN106611038A (en) 2017-05-03

Family

ID=58615030

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610833103.5A Pending CN106611038A (en) 2016-07-28 2016-09-20 Ontology concept-based lexical semantic similarity solving method

Country Status (1)

Country Link
CN (1) CN106611038A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101114275A (en) * 2006-07-24 2008-01-30 同济大学 Main body complexity analyzing evaluation method based on concepts model
CN102779288A (en) * 2012-06-26 2012-11-14 中国矿业大学 Ontology analysis method based on field theory

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
曾建勋: "《知识链接及其服务研究》", 30 August 2012, 北京:科学技术文献出版社 *
王颖 等: "一种基于RDF图的本体匹配方法", 《计算机应用》 *
葛斌 等: "基于知网的词汇语义相似度计算方法研究", 《计算机应用研究》 *
郑旭东: "《军队信息化建设加速发展策略 中国电子学会电子系统工程分会 第十九届军队信息化理论学术研讨会论文集》", 31 October 2012 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107895012A (en) * 2017-11-10 2018-04-10 上海电机学院 A kind of body constructing method based on Topic Model
CN109308352A (en) * 2018-08-01 2019-02-05 昆明理工大学 A kind of word correlation prediction method based on shortest path
CN109308352B (en) * 2018-08-01 2021-10-22 昆明理工大学 Word correlation determination method based on shortest path
CN110580338A (en) * 2019-06-11 2019-12-17 福建奇点时空数字科技有限公司 Context relation algorithm of clustered entity based on semantic iteration extraction technology

Similar Documents

Publication Publication Date Title
Teng et al. Context-sensitive lexicon features for neural sentiment analysis
WO2020062770A1 (en) Method and apparatus for constructing domain dictionary, and device and storage medium
CN109325229B (en) Method for calculating text similarity by utilizing semantic information
Ren et al. Deep attention-based neural networks for explainable heart sound classification
Li et al. Improving convolutional neural network for text classification by recursive data pruning
CN111858940B (en) Multi-head attention-based legal case similarity calculation method and system
Jiang et al. Transformer based memory network for sentiment analysis of web comments
CN107358315A (en) A kind of information forecasting method and terminal
CN107305539A (en) A kind of text tendency analysis method based on Word2Vec network sentiment new word discoveries
Zhang et al. Speech emotion recognition using an enhanced kernel isomap for human-robot interaction
CN106611038A (en) Ontology concept-based lexical semantic similarity solving method
CN106489148A (en) A kind of intention scene recognition method that is drawn a portrait based on user and system
Kang et al. ConnotationWordNet: Learning connotation over the word+ sense network
CN109726391A (en) The method, apparatus and terminal of emotional semantic classification are carried out to text
CN106844587A (en) A kind of data processing method and device for talking with interactive system
WO2023116572A1 (en) Word or sentence generation method and related device
CN109033318A (en) Intelligent answer method and device
CN103729431B (en) Massive microblog data distributed classification device and method with increment and decrement function
Jurgovsky et al. Evaluating memory efficiency and robustness of word embeddings
CN103164394A (en) Text similarity calculation method based on universal gravitation
Linders et al. Zipf's Law in Human-Machine Dialog
CN113806545B (en) Comment text emotion classification method based on label description generation
Boran et al. Understanding Customers' Affective Needs with Linguistic Summarization
CN106610939A (en) Improved lexical semantic similarity solution method of ontology concept
Sun et al. Chinese microblog sentiment classification based on deep belief nets with extended multi-modality features

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170503

WD01 Invention patent application deemed withdrawn after publication