CN102156726B - Geographic element querying and extending method based on semantic similarity - Google Patents

Geographic element querying and extending method based on semantic similarity

Info

Publication number
CN102156726B
Authority
CN
China
Prior art keywords
concept
similarity
node
ontology
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201110081505
Other languages
Chinese (zh)
Other versions
CN102156726A (en
Inventor
张福浩
张东
刘纪平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chinese Academy of Surveying and Mapping
Original Assignee
Chinese Academy of Surveying and Mapping
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chinese Academy of Surveying and Mapping filed Critical Chinese Academy of Surveying and Mapping
Priority to CN 201110081505 priority Critical patent/CN102156726B/en
Publication of CN102156726A publication Critical patent/CN102156726A/en
Application granted granted Critical
Publication of CN102156726B publication Critical patent/CN102156726B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a geographic element query expansion method based on semantic similarity. Concept similarity is calculated from the shortest distance between two concept nodes of a geographic ontology, taking into account influence factors such as node depth, node density, the proportion of shared attributes and the density of common ancestor nodes. In addition, the method takes full account of what distinguishes a geographic ontology from a general domain ontology, namely the spatial relationships among geographic concepts, such as direction, topological and metric relations. These are treated as important influence factors and given different weights according to the spatial relationship between the concepts, so that the effect of spatial relationships on semantic similarity is quantified. By fully considering and selecting among the various influence factors, the method achieves geographic element query expansion based on the semantic similarity of a geographic ontology; the information users search for can be obtained quickly and accurately, querying based on simple lexical matching is replaced, and query efficiency is improved.

Description

Geographic element query expansion method based on semantic similarity
Technical field
The invention belongs to the field of geospatial information systems, and in particular relates to a method for expanding geographic element queries based on ontology semantic similarity.
Background art
"Humanity is drowning in data yet starved of knowledge." With the rapid development of computer and Internet technology, the volume of information on the network has grown sharply. Faced with such vast information resources, people feel at a loss and get lost in the huge information network of the Internet; "information overload" and "information disorientation" have become some of the most discussed topics. A central problem in information technology is therefore how to help users retrieve, with high quality, the information that is actually useful to them. Practical information retrieval systems at present rely mainly on manually classified directories and keyword matching. The former requires manual work, so its efficiency and its depth of coverage of massive information resources are very limited. The latter is confined to matching character strings and reveals little of the semantics and pragmatics of the information; heterogeneous information such as polysemous words and synonyms makes the defects of this lexical matching increasingly obvious, so many retrieval systems often return incomplete or irrelevant results.
Spatial information, which accounts for about 80% of human information, is distributed, diverse and heterogeneous, which makes the sharing and interoperability of spatial information difficult. In addition, limited by the state of related technology, providers of spatial information resources currently describe only the outer shell of the data, and the necessary associations between data are missing. As a result, computers cannot understand what the data mean and have difficulty finding the data that meet user needs accurately, which directly affects the speed and efficiency with which users obtain information.
Moreover, the following problems are increasingly apparent in traditional geographic information retrieval:
(1) Diversity of the forms in which geospatial information is expressed. Traditionally, geospatial information appears as place names or spatial data, forms that most people recognize and are familiar with. Even these two common forms, however, vary with country, culture and carrier medium. As human understanding of the geographic world deepens and geospatial information solutions improve, the forms of expression of geospatial information are becoming increasingly diverse, which poses many challenges for geospatial information retrieval.
(2) Ambiguity in the description of geospatial information. This shows in two ways. First, names that cannot be located precisely are used when describing geospatial information. Second, the descriptions themselves are ambiguous: for example, different regions may share the same place name, or a place name may coincide with a direction name. Such names are easily confused under traditional lexical matching and hinder the retrieval of geospatial information.
(3) Conventional information retrieval is implemented mainly by lexical keyword matching, and heterogeneous information such as polysemous words and synonyms makes the defects of this approach increasingly obvious. Retrieval of geospatial attribute information, which was developed on the basis of conventional information retrieval, inherits the same defect.
In addition, users who query geographic information often do not know the names and storage structure of the data in the database. People also perceive the world in different ways, so descriptions of the same geographic phenomenon emphasize different aspects of the object; the resulting differences in viewpoint produce semantic heterogeneity. Because of the semantic heterogeneity between the concepts in the user's mind and the concepts in the system model, the data that users retrieve are often not what they actually need. The system must therefore perform similarity matching on the query expressed by the user and return the information that satisfies it, and an ontology can be used to build intelligent geographic information discovery and query systems and GIS service systems.
A geographic ontology abstracts the knowledge, information and data of the geographic sciences into a system of commonly agreed objects (or entities) organized according to certain relations, conceptualizes and explicitly defines them, and finally expresses them formally as theory and method.
There are currently three approaches to computing similarity. The first uses corpus statistics: the semantic similarity between two concepts is computed from the frequency with which they occur in context. The second uses the tree-like hierarchy of synonyms or sememes in a semantic dictionary such as WordNet or HowNet, and computes the semantic distance between two concepts from their information entropy or semantic distance. The third computes the similarity of concepts from the semantic distance between two concept nodes in an ontology.
The corpus-based approach is relatively objective and reflects the similarities and differences of words in syntax, semantics and pragmatics. However, it depends heavily on the training corpus, is computationally expensive and complex, and is strongly affected by data sparsity and noise, so obvious errors sometimes occur. The dictionary-based approach is simple, effective, intuitive and easy to understand, but its results are strongly influenced by human subjectivity and sometimes fail to reflect objective facts. It also captures similarities and differences between words mainly at the semantic level and gives less consideration to their syntactic and pragmatic features. The third class of methods does take into account factors that affect concept semantic similarity, such as node depth, node density and attribute information, but it considers only basic attribute information and ignores the spatial relationships between geographic concepts. These are precisely the factor with the greatest effect on semantic similarity in a geographic ontology, so such methods are difficult to apply to computing the similarity of geographic elements.
The query expansion algorithm for geographic concepts based on semantic similarity proposed by the present invention builds on the third class of methods. It fully considers and selects among the factors that affect the semantic similarity calculation, with emphasis on the effect of spatial relationships on the semantic similarity of geographic elements, and thereby obtains satisfactory results and solves the problem of semantic query expansion.
Summary of the invention
On the basis of analysing and evaluating existing similarity algorithm models, the present invention fully considers and selects the various influence factors and implements geographic element query expansion based on an ontology semantic similarity model. The information the user is searching for can be obtained quickly and accurately, the query mode of simple lexical matching is replaced, and query efficiency is improved.
Accordingly, the invention provides a method for obtaining similarity based on ontology semantics, comprising the following steps:
(1) extracting basic geographic information from a geographic information system, and obtaining from it the concepts needed to describe the ontology and the relations between those concepts;
(2) describing the concepts and the relations between them by a network topology, in which different nodes represent different concepts and the edges between nodes represent the relations between concepts;
(3) assigning values to the edges between concept nodes according to the spatial relationship between them, each kind of relation being given a different weight coefficient;
(4) finding the shortest path between the two concept nodes to be analysed, and computing the semantic distance between the two concept nodes from the weights of the edges on that shortest path;
(5) finding the common ancestor node of the two concept nodes to be analysed, and computing the depth of each of the two concept nodes relative to the common ancestor node;
(6) computing the common ancestor node density of the two concept nodes to be analysed;
(7) computing the shared attribute ratio (mainly spatial relationships) of the two concept nodes to be analysed;
(8) computing the similarity between the two concepts to be analysed from the quantities obtained in steps (4) to (7): the semantic distance between the two concept nodes, the depths of the two concept nodes relative to the common ancestor node, the common ancestor node density of the two concept nodes, and the shared attributes of the two concept nodes.
In the above technical solution, step (8) further comprises setting an adjustment parameter for each of the semantic distance between the two concept nodes, the depths of the two concept nodes relative to the common ancestor node, the common ancestor node density of the two concept nodes, and the shared attributes of the two concept nodes.
In the above technical solution, the similarity is calculated as follows:
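$$
\mathrm{Sim}(X,Y) = \left(\frac{a}{\mathrm{distant}(X,Y)+a}\right)^{A} \cdot \left(\frac{\mathrm{depth}(X)+\mathrm{depth}(Y)}{\left|\mathrm{depth}(X)-\mathrm{depth}(Y)\right|}\right)^{B} \cdot \left(\frac{\mathrm{parents}(X,Y)}{\mathrm{count}}\right)^{C} \cdot \left(\frac{\mathrm{count}(\mathrm{Attri}(X)\cap\mathrm{Attri}(Y))}{\mathrm{count}(\mathrm{Attri}(X)\cup\mathrm{Attri}(Y))}\right)^{D}
$$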
where X and Y are the two concepts whose similarity is calculated;
$\left(\frac{a}{\mathrm{distant}(X,Y)+a}\right)^{A}$ represents the effect of semantic distance on similarity, a being an adjustable parameter;
$\left(\frac{\mathrm{depth}(X)+\mathrm{depth}(Y)}{\left|\mathrm{depth}(X)-\mathrm{depth}(Y)\right|}\right)^{B}$ represents the effect of directed edge depth on similarity;
$\left(\frac{\mathrm{parents}(X,Y)}{\mathrm{count}}\right)^{C}$ represents the effect of the number of common ancestors on similarity;
$\left(\frac{\mathrm{count}(\mathrm{Attri}(X)\cap\mathrm{Attri}(Y))}{\mathrm{count}(\mathrm{Attri}(X)\cup\mathrm{Attri}(Y))}\right)^{D}$ represents the effect of concept node attributes on similarity;
distant(X, Y) is the semantic distance between X and Y;
depth(X) and depth(Y) are the directed edge depths of X and Y respectively;
parents(X, Y) is the number of common ancestors of X and Y, and count is the total number of nodes in the branches containing nodes X and Y;
count(Attri(X) ∩ Attri(Y)) and count(Attri(X) ∪ Attri(Y)) are the number of attributes in the intersection and in the union of the attribute sets of X and Y respectively.
A, B, C and D are adjustment parameters that act as the influence factors on similarity of the four parts (semantic distance, directed edge depth, number of common ancestors, and concept node attributes), and they satisfy the following relation:
A+B+C+D=1
The relative weight of each of these four parts in the similarity can be adjusted through A, B, C and D.
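As an illustration only, the similarity formula can be evaluated with a short Python function. This is a minimal sketch that reads the four factors as a/(distant+a), (depth(X)+depth(Y))/|depth(X)-depth(Y)|, parents/count, and the ratio of shared to total attributes. The function name, the argument names and the default values of a, A, B, C and D are assumptions made for the example, not values given in the patent. The text does not say how equal depths are handled, so the sketch simply rejects that case.

```python
def similarity(distance, depth_x, depth_y, common_ancestors, branch_nodes,
               attrs_x, attrs_y, a=1.0, A=0.25, B=0.25, C=0.25, D=0.25):
    """Evaluate Sim(X, Y) as the product of the four weighted factors."""
    if abs(A + B + C + D - 1.0) > 1e-9:
        raise ValueError("A + B + C + D must equal 1")
    if depth_x == depth_y:
        # The depth factor divides by |depth(X) - depth(Y)|; the source text
        # does not define the result for equal depths.
        raise ValueError("depth factor is undefined when the depths are equal")
    distance_factor = a / (distance + a)
    depth_factor = (depth_x + depth_y) / abs(depth_x - depth_y)
    ancestor_factor = common_ancestors / branch_nodes
    # attrs_x and attrs_y are Python sets of attribute names.
    attribute_factor = len(attrs_x & attrs_y) / len(attrs_x | attrs_y)
    return (distance_factor ** A * depth_factor ** B
            * ancestor_factor ** C * attribute_factor ** D)
```

Setting A=1 and B=C=D=0 reduces the result to the semantic distance factor alone, which is a convenient way to check each factor in isolation.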
The invention also provides a geographic element query expansion method that uses the above similarity acquisition method based on ontology semantics, characterized in that, in addition to the concept entered directly by the user, related concepts are also queried, a related concept being one whose similarity to the concept entered directly by the user, calculated by the said similarity acquisition method, is greater than a set threshold.
In the above technical solution, the query results for related concepts are sorted by the calculated similarity, with more similar concepts ranked higher.
The present invention achieves the following technical effects:
(1) The information the user wants can be obtained quickly and efficiently. Query expansion at the semantic level improves the semantic relevance between query keywords and makes the query results more accurate and relevant.
(2) It is simple and practical. It does not rely on a large-scale corpus, so its dependencies are reduced.
(3) It advances search engine technology, in particular geographic element search technology. By analysing the shortcomings of existing algorithms and models, making corresponding improvements and applying them in practice, it makes existing technology more complete and better suited to practical applications.
Brief description of the drawings
Fig. 1 is a structural diagram of the ontology obtained;
Fig. 2 is an example diagram of semantic distance.
Detailed description of the embodiments
To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the drawings and specific embodiments.
Having summarized the shortcomings of the existing query expansion methods described above, the present invention uses the hierarchical structure of the ontology and the explicit semantic associations between ontology concepts to propose a geographic element query expansion method based on an ontology semantic similarity model. Building on the semantic-distance similarity model, the method takes into account the following influence factors: the depth of nodes in the ontology hierarchy, node density, the different semantic relations in the ontology (mainly the topological and direction relations of geographic elements, such as In, Disjoint, Touch, Overlap and Equal), the attributes of nodes, and the influence of common ancestor node density on child nodes. It thereby implements geographic element query expansion based on ontology semantic similarity and solves the following technical problems:
(1) query expansion in the presence of semantic heterogeneity;
(2) reducing the complexity of query expansion;
(3) freeing query expansion from dependence on a large-scale corpus;
(4) making query expansion results more accurate and effective.
The technical solution adopted to solve the above technical problems is as follows:
Step 1: Obtain the concepts in the ontology and the semantic relations between them
Basic geographic information is extracted from an existing basic geographic database, and the concepts needed by the ontology and the semantic relations between them (mainly spatial relationships) are obtained from it. These mainly comprise place names, administrative divisions, traffic, landforms, water systems and the like, together with information such as the belonging, adjacency, crossing, direction and metric relations between them.
A basic geographic database built on a geographic information system (GIS) stores basic geographic information at multiple scales, such as administrative divisions, landforms, water systems, settlements, traffic and place names. It has a high degree of domain relevance and rich data, which largely ensures that the ontology expresses relations accurately. There is also a correspondence between the basic geographic database model and the ontology model. A basic geographic database contains multiple data tables; tables are related one-to-one, one-to-many or many-to-many; each data table consists of multiple fields, and the values of the fields are called records. An ontology contains multiple classes; classes are related to one another; each class has multiple attributes, and a set of attribute values is an instance. Using this formal correspondence between the data model of the basic geographic database and the ontology model, the basic geographic ontology is obtained by defining mapping rules.
The structure of the ontology extracted by the present invention is shown in Fig. 1. The extraction method is as follows:
1) According to the data in the basic geographic database, extract the geographic elements and the spatial relationships between them.
The spatial relationships are extracted as follows:
(I) Extraction of topological relations
1. Topological relations between a point feature and other features:
The topological relation is obtained by computing the distance between the point feature and the other feature.
2. Spatial topological relations between line features:
If two line features share a node, they have an adjacency (touch) relation; otherwise their topological relation is determined by computing the distance between the two line features.
3. Spatial topological relations between a line feature and a polygon feature:
The relation is determined from the relation between the two endpoints of the line feature and the polygon feature. When both endpoints lie inside the polygon feature, the line feature and the polygon feature have a belonging (in) relation; if only one endpoint lies inside the polygon feature, they have an overlap relation; otherwise they are disjoint.
4. Spatial topological relations between polygon features:
The spatial topological relation between polygon features can be determined from the relations between the boundary lines that form the polygons. When the boundary line of one polygon feature has a belonging (in) relation with the other polygon feature, the two polygon features have a belonging (in) relation; when the boundary lines of the polygon features overlap, the relation is overlap; when the boundary lines are identical, the relation is equal; when the polygon features share a common boundary line, the relation is adjacency (touch); otherwise the polygon features are disjoint.
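The endpoint rule for a line feature and a polygon feature (both endpoints inside gives in, exactly one gives overlap, none gives disjoint) can be sketched as below. This is a hedged illustration, not the patent's implementation: the function names are hypothetical, the polygon is assumed to be a simple ring given as a list of (x, y) vertices, and the inside test is an ordinary ray-casting check that does not treat points lying exactly on the boundary specially.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point falls inside the polygon ring."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def line_polygon_relation(line, polygon):
    """Classify a line feature against a polygon feature by its two endpoints."""
    start_inside = point_in_polygon(line[0], polygon)
    end_inside = point_in_polygon(line[-1], polygon)
    if start_inside and end_inside:
        return "in"
    if start_inside or end_inside:
        return "overlap"
    return "disjoint"
```

Because only the endpoints are examined, a line that enters and leaves the polygon with both ends outside is reported as disjoint; that follows the rule as stated in the text rather than a full geometric intersection test.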
(II) Extraction of direction relations
Following the eight directions defined in the eight-direction cone model, space is divided into directions, each corresponding to an interval of angle values. In the basic geographic database, polygons are stored with arcs as the basic unit, and every arc is shared by exactly two polygons. The inclination angle α of the chord of an arc can be obtained from the coordinates of the start and end nodes of the arc on the polygon. Comparing α with the defined spatial azimuth intervals gives the spatial direction of the arc, and from that the direction relation between the polygons formed by the arcs.
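The comparison of the chord angle α with the direction intervals might look like the following sketch. The conventions are assumptions made for the example, since the text does not state them: α is measured counter-clockwise from the positive x axis (east), and each of the eight directions owns a 45° sector centred on it. The names `DIRECTIONS` and `chord_direction` are hypothetical.

```python
import math

# Eight sectors of 45 degrees each, listed counter-clockwise starting at east.
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]


def chord_direction(start, end):
    """Direction of the chord from the arc's start node to its end node."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    # Inclination angle of the chord, normalised to [0, 360).
    alpha = math.degrees(math.atan2(dy, dx)) % 360.0
    # Shift by half a sector so that each direction is centred on its sector.
    sector = int((alpha + 22.5) // 45.0) % 8
    return DIRECTIONS[sector]
```

Deriving the direction relation between the two polygons that share the arc would be a further step on top of this classification and is not shown here.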
Through the above steps, the geographic entity data in the basic geographic database, which use a topology-based data model that is complex and hard to interpret, are converted into object-based geospatial conceptual model data that are easy to understand in terms of human thinking.
2) Using the geographic elements and relations extracted above, convert the GIS spatial conceptual model into concepts and relations between concepts, and establish the mapping rules from the basic geographic database to the ontology.
(I) Analyse the data tables in the basic geographic database and take table names as candidate concepts for the application, selecting the key concepts so as to express the geospatial information of the application domain as concisely as possible. The hierarchical relations between concepts can be determined with reference to the classification of basic geographic information; finally, the selected concepts are mapped to classes or subclasses of the ontology.
To facilitate data exchange between different systems, the general-purpose Web Ontology Language (OWL) is used to describe the concepts.
1. A data table T is described as follows:
<owl:Class rdf:ID="TName"/>
This expresses the concept as a class described with the Resource Description Framework (RDF), whose ID is TName.
2. For multiple data tables T1, T2, ... and the relations between them, for example an "equivalence" relation, the description is as follows:
Figure BSA00000464627000091
This states that T1Name and T2Name are equivalent.
(II) Map the fields that store attribute information and spatial information of geographic data in the basic geographic information tables to attributes of ontology concepts:
1. A field F of data table T that contains attribute information is mapped to a datatype property of the ontology:
Figure BSA00000464627000102
2. A field F of data table T that contains spatial information is mapped to an object property of the ontology:
Figure BSA00000464627000103
3. For two data tables T1 and T2 associated through a foreign-key field F, the field F is mapped to an object property of the ontology:
Figure BSA00000464627000104
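The OWL fragments for these schema mapping rules appear only as figures in this copy. The sketch below shows one plausible way to emit such declarations from Python, using standard OWL RDF/XML vocabulary (`owl:Class`, `owl:equivalentClass`, `owl:DatatypeProperty`, `owl:ObjectProperty`, `rdfs:domain`, `rdfs:range`). It is an assumption about what the figures contain, not a transcription of them; the function names are hypothetical, and table and field names are inserted without XML escaping.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def owl_class(table_name):
    """Data table T -> an owl:Class whose rdf:ID is the table name."""
    return f'<owl:Class rdf:ID="{table_name}"/>'


def owl_equivalent_classes(table_a, table_b):
    """Two tables holding equivalent concepts -> owl:equivalentClass."""
    return (
        f'<owl:Class rdf:ID="{table_a}">\n'
        f'  <owl:equivalentClass rdf:resource="#{table_b}"/>\n'
        f'</owl:Class>'
    )


def owl_datatype_property(field, table_name, xsd_type="string"):
    """Attribute field F of table T -> owl:DatatypeProperty with domain T."""
    return (
        f'<owl:DatatypeProperty rdf:ID="{field}">\n'
        f'  <rdfs:domain rdf:resource="#{table_name}"/>\n'
        f'  <rdfs:range rdf:resource="{XSD}{xsd_type}"/>\n'
        f'</owl:DatatypeProperty>'
    )


def owl_object_property(field, domain_table, range_table):
    """Spatial or foreign-key field F -> owl:ObjectProperty linking two classes."""
    return (
        f'<owl:ObjectProperty rdf:ID="{field}">\n'
        f'  <rdfs:domain rdf:resource="#{domain_table}"/>\n'
        f'  <rdfs:range rdf:resource="#{range_table}"/>\n'
        f'</owl:ObjectProperty>'
    )
```

For example, `owl_object_property("CountyId", "River", "County")` would describe a foreign key from a hypothetical River table to a hypothetical County table.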
After the schema of the basic geographic database has been converted to the ontology schema, the geographic data in the database are converted to the corresponding ontology instances, which completes the data conversion.
The mapping rules are:
1. Each tuple in a basic geographic database table is mapped to an instance of the ontology, with the primary key of the tuple used as the instance name.
2. The attribute data of a tuple are mapped to the datatype property values of the ontology instance.
3. The spatial data of a tuple are mapped to the corresponding object property values of the ontology instance.
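The three instance mapping rules can be pictured with a small sketch that turns one database row into an instance record. The dictionary layout, the function name `row_to_instance` and the field names in the usage note are assumptions chosen for illustration; the patent does not prescribe a data structure.

```python
def row_to_instance(row, primary_key, attribute_fields, spatial_fields):
    """Map one table row (a dict) to an ontology instance record."""
    return {
        # Rule 1: the primary key value becomes the instance name.
        "name": row[primary_key],
        # Rule 2: attribute data become datatype property values.
        "datatype_properties": {f: row[f] for f in attribute_fields},
        # Rule 3: spatial data become object property values.
        "object_properties": {f: row[f] for f in spatial_fields},
    }
```

A row such as `{"id": "R001", "name": "Miyun Reservoir", "within": "Miyun"}` with `primary_key="id"`, `attribute_fields=["name"]` and `spatial_fields=["within"]` would yield an instance named `R001` with one datatype property and one object property.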
By the above method, the concepts of the basic ontology and the semantic hierarchical relations between them can be obtained.
Step 2: Assign weights to the relations in the ontology and compute the semantic distance between two concepts in the ontology hierarchy
In the ontology hierarchy, the connections between geographic ontology concepts are first given different weights according to the kind of relation. The semantic distance between two concepts is then computed by finding the shortest path between them and combining the weights of the edges on that path.
Concepts in the ontology are linked by semantic relations, as in the example of Fig. 2. "China" and "china" are different forms of the same name, so the relation between them is Equal. Beijing, Shanghai and Shandong belong to China, so each has an In relation with China; the three are mutually disjoint, so the relation between them is Disjoint. The Miyun Reservoir passes through Miyun County, so its relation with Miyun County is Overlap. Each kind of relation is given a weight: for example, Equal has weight 1, In has weight 4/5, Overlap has weight 3/4 and Disjoint has weight 1/6. The semantic distance between two concepts can then be computed by finding the shortest path between them and summing the weights of its edges.
The semantic distance is computed as follows. First, the shortest path between the two concepts in the ontology is found; for example, the shortest path between "China" and "Miyun Reservoir" is "China-Beijing-Miyun-Miyun Reservoir". If edge weights are ignored, the semantic distance is 1+1+1=3; with the weight of each edge, the semantic distance is 4/5+4/5+3/4=47/20. Ignoring edge weights, the semantic distance between "Shanghai" and "Miyun Reservoir" is also 1+1+1=3, the same as that between "China" and "Miyun Reservoir". With edge weights, the semantic distance between "Shanghai" and "Miyun Reservoir" is 1/6+4/5+3/4=103/60, which differs from that between "China" and "Miyun Reservoir". The algorithm that takes edge weights into account therefore agrees better with users' general understanding.
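The worked example can be reproduced with the sketch below. It makes two assumptions that the text leaves open: "shortest path" is taken to mean the path with the fewest edges (found by breadth-first search, with ties broken by search order), and the edge list is read off the prose description of Fig. 2, since the figure itself is not available here. Exact fractions are used so that the results can be compared with the values in the text.

```python
from collections import deque
from fractions import Fraction

# Relation weights as given in the text.
WEIGHTS = {"equal": Fraction(1), "in": Fraction(4, 5),
           "overlap": Fraction(3, 4), "disjoint": Fraction(1, 6)}

# Edges inferred from the description of Fig. 2 (an assumption).
EDGES = [
    ("China", "china", "equal"),
    ("Beijing", "China", "in"),
    ("Shanghai", "China", "in"),
    ("Shandong", "China", "in"),
    ("Beijing", "Shanghai", "disjoint"),
    ("Beijing", "Shandong", "disjoint"),
    ("Shanghai", "Shandong", "disjoint"),
    ("Miyun", "Beijing", "in"),
    ("Miyun Reservoir", "Miyun", "overlap"),
]


def build_graph(edges):
    """Undirected adjacency map: node -> {neighbour: edge weight}."""
    graph = {}
    for u, v, relation in edges:
        graph.setdefault(u, {})[v] = WEIGHTS[relation]
        graph.setdefault(v, {})[u] = WEIGHTS[relation]
    return graph


def shortest_path(graph, source, target):
    """Breadth-first search: a path with the fewest edges, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def semantic_distance(graph, source, target):
    """Sum of the relation weights along the fewest-edge path."""
    path = shortest_path(graph, source, target)
    if path is None:
        raise ValueError("no path between the two concepts")
    return sum((graph[u][v] for u, v in zip(path, path[1:])), Fraction(0))
```

With `graph = build_graph(EDGES)`, `semantic_distance(graph, "China", "Miyun Reservoir")` gives `Fraction(47, 20)` and `semantic_distance(graph, "Shanghai", "Miyun Reservoir")` gives `Fraction(103, 60)`, matching the two sums in the text.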
Step 3: Adjust the weight according to the common ancestor node density and the attributes of the two nodes in the ontology
The common ancestor node density of the two nodes in the ontology is used as an adjustment factor. In the ontology hierarchy network, the more common ancestors two concepts have, the more similar they are. Besides semantic distance, the common ancestor density and the attributes shared by the nodes are therefore also key factors in semantic similarity: the more attributes two concepts share, the closer the relation between the two nodes and the more similar the concepts. Common ancestor nodes and shared node attributes are thus included in the calculation as components of similarity.
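A minimal sketch of the two quantities used in this step is given below, assuming the hierarchy is a tree stored as a child-to-parent map without cycles and that attributes are held in Python sets. The function names are hypothetical, and returning 0.0 when both attribute sets are empty is a choice made for the example rather than something the text specifies.

```python
def ancestors(parent_of, node):
    """All ancestors of a node in a hierarchy given as a child -> parent map."""
    found = set()
    while node in parent_of:
        node = parent_of[node]
        found.add(node)
    return found


def common_ancestor_count(parent_of, x, y):
    """Number of ancestors shared by concepts x and y."""
    return len(ancestors(parent_of, x) & ancestors(parent_of, y))


def shared_attribute_ratio(attrs_x, attrs_y):
    """Attributes in the intersection divided by attributes in the union."""
    union = attrs_x | attrs_y
    if not union:
        return 0.0
    return len(attrs_x & attrs_y) / len(union)
```

Dividing the common ancestor count by the number of nodes in the branches containing the two concepts would give the density factor used in the similarity formula.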
Step 4: Set adjustment parameters to tune the result
Because the weights of the various semantic relations are assigned empirically during the similarity calculation, the result may not match reality. Adjustment variables are therefore needed to tune the result toward the desired value. The attributes of the concept nodes at the two ends of a directed edge in Fig. 2 serve as an adjustment factor in the semantic calculation: in the hierarchy network of a domain ontology, the more attributes the concepts at the two ends of a directed edge share, the closer the relation between the two nodes and the larger the weight of the directed edge they form should be. Combining the above analysis of the factors that affect semantic similarity, the ontology-based similarity formula is derived on the basis of existing similarity formulas.
The similarity is calculated as follows:
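$$
\mathrm{Sim}(X,Y) = \left(\frac{a}{\mathrm{distant}(X,Y)+a}\right)^{A} \cdot \left(\frac{\mathrm{depth}(X)+\mathrm{depth}(Y)}{\left|\mathrm{depth}(X)-\mathrm{depth}(Y)\right|}\right)^{B} \cdot \left(\frac{\mathrm{parents}(X,Y)}{\mathrm{count}}\right)^{C} \cdot \left(\frac{\mathrm{count}(\mathrm{Attri}(X)\cap\mathrm{Attri}(Y))}{\mathrm{count}(\mathrm{Attri}(X)\cup\mathrm{Attri}(Y))}\right)^{D}
$$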
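where X and Y are the two concepts whose similarity is calculated;
$\left(\frac{a}{\mathrm{distant}(X,Y)+a}\right)^{A}$ represents the effect of semantic distance on similarity, a being an adjustable parameter;
$\left(\frac{\mathrm{depth}(X)+\mathrm{depth}(Y)}{\left|\mathrm{depth}(X)-\mathrm{depth}(Y)\right|}\right)^{B}$ represents the effect of directed edge depth on similarity;
$\left(\frac{\mathrm{parents}(X,Y)}{\mathrm{count}}\right)^{C}$ represents the effect of the number of common ancestors on similarity;
$\left(\frac{\mathrm{count}(\mathrm{Attri}(X)\cap\mathrm{Attri}(Y))}{\mathrm{count}(\mathrm{Attri}(X)\cup\mathrm{Attri}(Y))}\right)^{D}$ represents the effect of concept node attributes on similarity.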
Here distant(X, Y) is the semantic distance between X and Y,
$$
\mathrm{distant}(X,Y) = \sum_{i=1}^{n} \mathrm{weight}_i
$$
where weight_i is the weight of the i-th edge between X and Y, and n is the number of edges on the shortest path between X and Y.
Here depth(X) and depth(Y) are the directed-edge depths of X and Y, respectively.
depth(X) is given by formula image BSA00000464627000141 of the original (not reproduced here), in which n is the number of edges on the shortest path from node X to the nearest common parent node of X and Y.
Likewise, depth(Y) is given by formula image BSA00000464627000142 of the original (not reproduced here), in which n is the number of edges on the shortest path from node Y to the nearest common parent node of X and Y.
Here parents(X, Y) is the number of common ancestors of X and Y, and count is the total number of nodes in the branches that contain X and Y.
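As an illustration of the quantities just defined, the following Python sketch counts the common ancestors of two concepts and the number of edges from each concept to their nearest common ancestor. It assumes a tree-shaped hierarchy stored as a hypothetical child-to-parent map, and it does not count a concept as its own ancestor. How the edge count n is converted into depth(X) is defined by the formula images and is not attempted here. All names are illustrative.

```python
def ancestors(parent, node):
    """List the ancestors of node from nearest to root (child -> parent map)."""
    chain = []
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain

def common_ancestor_info(parent, x, y):
    """Return (number of common ancestors, edges from x, edges from y),
    where the edge counts run to the nearest common ancestor."""
    chain_x, chain_y = ancestors(parent, x), ancestors(parent, y)
    shared = set(chain_x) & set(chain_y)
    if not shared:
        return 0, None, None
    nearest = next(a for a in chain_x if a in shared)
    return len(shared), chain_x.index(nearest) + 1, chain_y.index(nearest) + 1

# Hypothetical hierarchy; the concept names are illustrative only.
toy_parent = {
    "reservoir": "water body",
    "river reach": "river",
    "river": "water body",
    "water body": "geographic feature",
}
```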
Here Attri(X) and Attri(Y) are the attribute sets of X and Y, respectively;
$\mathrm{count}(\mathrm{Attri}(X)\cap\mathrm{Attri}(Y))$ and $\mathrm{count}(\mathrm{Attri}(X)\cup\mathrm{Attri}(Y))$ are the numbers of attributes in the intersection and in the union of the attribute sets of X and Y, respectively.
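The attribute ratio is the size of the intersection of the two attribute sets divided by the size of their union. A minimal Python sketch follows; the function name `attribute_ratio` and the handling of two empty attribute sets (returning 0.0) are assumptions of the sketch, not statements from the patent.

```python
def attribute_ratio(attrs_x, attrs_y):
    """Shared-attribute ratio: |Attri(X) & Attri(Y)| / |Attri(X) | Attri(Y)|."""
    set_x, set_y = set(attrs_x), set(attrs_y)
    union = set_x | set_y
    if not union:
        return 0.0  # assumption: two attribute-free concepts share nothing
    return len(set_x & set_y) / len(union)
```

For example, two concepts with attribute sets {name, area, capacity} and {name, area, length} share two of four distinct attributes, giving a ratio of 0.5.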
A, B, C and D are regulating parameters. They are the influence factors of the four parts (semantic distance, directed-edge depth, number of common ancestors, and concept node attributes) on similarity, and adjusting them changes the weight each of the four parts carries in the similarity. The regulating parameters A, B, C and D satisfy the following relation:
A+B+C+D=1
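The four-part similarity formula can be written as one small Python function. This is a sketch under stated assumptions: the inputs (distance, depths, ancestor count, branch node count, attribute counts) are taken as already computed, the default values a = 1.0 and A = B = C = D = 0.25 are illustrative only, and the function name `similarity` is hypothetical. The depth term as extracted is undefined when depth(X) equals depth(Y), so the sketch raises an error in that case instead of guessing a convention.

```python
def similarity(distance, depth_x, depth_y, common_ancestors, branch_nodes,
               shared_attrs, all_attrs, a=1.0, A=0.25, B=0.25, C=0.25, D=0.25):
    """Combine the four influence terms, each raised to its regulating exponent."""
    if abs(A + B + C + D - 1.0) > 1e-9:
        raise ValueError("regulating parameters must satisfy A + B + C + D = 1")
    if depth_x == depth_y:
        raise ValueError("depth term is undefined when the two depths are equal")
    distance_term = a / (distance + a)
    depth_term = (depth_x + depth_y) / abs(depth_x - depth_y)
    ancestor_term = common_ancestors / branch_nodes
    attribute_term = shared_attrs / all_attrs
    return (distance_term ** A) * (depth_term ** B) \
        * (ancestor_term ** C) * (attribute_term ** D)
```

Raising one exponent while lowering another shifts how strongly the corresponding factor drives the result, which is the adjustment role the text assigns to A, B, C and D.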
The above formula yields the similarity between different geographic ontology concepts. When the user queries a concept X, the query is expanded in addition to returning the information of X itself: the formula is used to calculate the similarity between X and each concept Y that is adjacent or close to X (for example, by traversing the concepts whose shortest path to X contains fewer than n edges), and if the similarity exceeds the set threshold, the information of Y is returned as well. The returned results can also be sorted by the calculated similarity, with the most similar result placed first, to make selection easier for the user.
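The expansion step (score the nearby concepts, keep those above the threshold, sort by similarity) can be sketched as follows in Python. The function name `expand_query`, the `score` callback, and the candidate list are hypothetical; in particular, how the candidates near X are collected is left to the caller.

```python
def expand_query(concept, candidates, score, threshold):
    """Return (related concept, similarity) pairs above threshold, best first.

    score(concept, other) is any similarity function supplied by the caller.
    """
    scored = [(other, score(concept, other))
              for other in candidates if other != concept]
    kept = [(other, value) for other, value in scored if value > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Illustrative scores only; these are not values from the patent.
example_scores = {"Y1": 0.8, "Y2": 0.3, "Y3": 0.6}
expanded = expand_query("X", ["Y1", "Y2", "Y3", "X"],
                        lambda concept, other: example_scores[other], 0.5)
```

With these illustrative scores and a threshold of 0.5, Y1 and Y3 are returned in that order and Y2 is dropped.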
In an actual geographic query and retrieval system, the invention works as follows. When the user enters the query keyword "Miyun Reservoir" in the search interface, geographic concept query expansion is carried out with the semantic similarity algorithm above. The search results list information associated with Miyun Reservoir, such as the five large watersheds, the Ji canal basin, the Caobai River basin, and the Yongdinghe River plain, together with the semantic similarity of each to Miyun Reservoir. The similarity formula also yields the geographic elements (i.e. ontology concepts) related to Miyun Reservoir: Luanping, Changping District, Pinggu District, and Huairou District. This greatly broadens the scope of the user's search and improves recall.
The above embodiment is only one embodiment of the present invention. Although it is described specifically and in detail, it should not be interpreted as limiting the scope of the claims of the present invention. A person of ordinary skill in the art may make variations and improvements without departing from the inventive concept, and these all fall within the scope of protection of the present invention.

Claims (2)

1. A geographic element query expansion method using an ontology-semantics-based similarity acquisition method, characterized in that:
the ontology-semantics-based similarity acquisition method comprises the following steps:
(1) extracting fundamental geographic information from a geographic information database, and obtaining from it the concepts needed to describe the ontology and the spatial relationships between the concepts, the spatial relationships comprising positional relations, topological relations and metric relations;
(2) describing the concepts and the spatial relationships between them with a network topology structure, in which different nodes represent different concepts and the edges between nodes represent the relations between concepts;
(3) assigning values to the edges between concept nodes according to the different spatial relationships between the concept nodes, each kind of relation edge being given a different weight coefficient;
(4) finding the shortest path between the two concept nodes to be analyzed, and calculating the semantic distance between the two concept nodes from the weights of the relation edges on the shortest path;
(5) finding the common ancestor nodes of the two concept nodes to be analyzed, and calculating the depth of each of the two concept nodes to the common ancestor node;
(6) calculating the common ancestor node density of the two concept nodes to be analyzed;
(7) calculating the shared attribute ratio of the two concept nodes to be analyzed;
(8) calculating the similarity between the two concepts to be analyzed from the semantic distance between the two concept nodes, the depths of the two concept nodes to the common ancestor node, the common ancestor node density of the two concept nodes, and the shared attributes of the two concept nodes, as obtained in steps (4) to (7);
wherein step (8) further comprises setting a regulating parameter for each of the semantic distance between the two concept nodes, the depths of the two concept nodes to the common ancestor node, the common ancestor node density of the two concept nodes, and the shared attributes of the two concept nodes;
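and, in step (8), the similarity is calculated with the following formula:
$$\mathrm{Sim}(X,Y) = \left(\frac{a}{\mathrm{distant}(X,Y)+a}\right)^{A} \cdot \left(\frac{\mathrm{depth}(X)+\mathrm{depth}(Y)}{\left|\mathrm{depth}(X)-\mathrm{depth}(Y)\right|}\right)^{B} \cdot \left(\frac{\mathrm{parents}(X,Y)}{\mathrm{count}}\right)^{C} \cdot \left(\frac{\mathrm{count}(\mathrm{Attri}(X)\cap\mathrm{Attri}(Y))}{\mathrm{count}(\mathrm{Attri}(X)\cup\mathrm{Attri}(Y))}\right)^{D}$$
wherein X and Y denote the two concepts whose similarity is calculated;
$\left(\frac{a}{\mathrm{distant}(X,Y)+a}\right)^{A}$ represents the influence of semantic distance on similarity, where a is an adjustable parameter;
$\left(\frac{\mathrm{depth}(X)+\mathrm{depth}(Y)}{\left|\mathrm{depth}(X)-\mathrm{depth}(Y)\right|}\right)^{B}$ represents the influence of directed-edge depth on similarity;
$\left(\frac{\mathrm{parents}(X,Y)}{\mathrm{count}}\right)^{C}$ represents the influence of the number of common ancestors on similarity;
$\left(\frac{\mathrm{count}(\mathrm{Attri}(X)\cap\mathrm{Attri}(Y))}{\mathrm{count}(\mathrm{Attri}(X)\cup\mathrm{Attri}(Y))}\right)^{D}$ represents the influence of concept node attributes on similarity;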
wherein distant(X, Y) is the semantic distance between X and Y;
depth(X) and depth(Y) are the directed-edge depths of X and Y, respectively;
parents(X, Y) is the number of common ancestors of X and Y, and count is the total number of nodes in the branches that contain X and Y;
$\mathrm{count}(\mathrm{Attri}(X)\cap\mathrm{Attri}(Y))$ and $\mathrm{count}(\mathrm{Attri}(X)\cup\mathrm{Attri}(Y))$ are the numbers of attributes in the intersection and in the union of the attribute sets of X and Y, respectively;
A, B, C and D are regulating parameters, being the influence factors of the four parts (semantic distance, directed-edge depth, number of common ancestors, and concept node attributes) on similarity, and the regulating parameters A, B, C and D satisfy the following relation:
A+B+C+D=1
the weight that each of the four parts (semantic distance, directed-edge depth, number of common ancestors, and concept node attributes) carries in the similarity being adjustable through A, B, C and D;
wherein the geographic element query expansion method, in addition to querying the concept directly entered by the user, also expands the query to related concepts, a related concept being one whose similarity to the concept directly entered by the user, as calculated by the ontology-semantics-based similarity acquisition method, is greater than a set threshold; the method thereby replaces query by simple lexical matching and improves query efficiency.
2. The geographic element query expansion method according to claim 1, characterized in that: the query results for the related concepts are sorted by the calculated similarity, with the results for related concepts of higher similarity ranked first.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110081505 CN102156726B (en) 2011-04-01 2011-04-01 Geographic element querying and extending method based on semantic similarity

Publications (2)

Publication Number Publication Date
CN102156726A CN102156726A (en) 2011-08-17
CN102156726B true CN102156726B (en) 2013-12-25

Family

ID=44438225

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110081505 Expired - Fee Related CN102156726B (en) 2011-04-01 2011-04-01 Geographic element querying and extending method based on semantic similarity

Country Status (1)

Country Link
CN (1) CN102156726B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI486793B (en) * 2011-12-15 2015-06-01 Ind Tech Res Inst Geographical location rendering system, method applicable thereto, computer readable record media thereof and computer program product thereof
CN103106264B (en) * 2013-01-29 2016-04-27 河南理工大学 A kind of place name matching process and device
CN103309979A (en) * 2013-06-15 2013-09-18 昆明能讯科技有限责任公司 Knowledge cube model algorithm based on graph theory
CN103761283B (en) * 2014-01-10 2017-01-25 中国测绘科学研究院 Extension method of geographic information processing service
CN104008169B (en) * 2014-05-30 2017-02-22 中国测绘科学研究院 Semanteme based geographical label content safe checking method and device
CN104239546B (en) * 2014-09-23 2021-11-23 北京理工大学 Ontology concept similarity calculation method based on field theory
CN104536957B (en) * 2014-09-26 2017-11-24 北京师范大学 Agricultural land circulation information retrieval method and system
CN104636908A (en) * 2015-01-30 2015-05-20 北京完美和声信息技术有限公司 Information interaction platform and information interaction method
CN104765828B (en) * 2015-04-13 2018-06-19 天脉聚源(北京)传媒科技有限公司 A kind of generation of dictionary data table and application process and device
CN104915396A (en) * 2015-05-28 2015-09-16 杭州电子科技大学 Knowledge retrieving method
CN105808689A (en) * 2016-03-03 2016-07-27 中国地质大学(武汉) Drainage system entity semantic similarity measurement method based on artificial neural network
CN106294583A (en) * 2016-07-28 2017-01-04 南京邮电大学 A kind of OGC geographic information services semantic similarity calculation method
CN106610947A (en) * 2016-08-25 2017-05-03 四川用联信息技术有限公司 New Chinese automatic word segmentation algorithm
CN106610936A (en) * 2016-09-12 2017-05-03 四川用联信息技术有限公司 Improved automatic Chinese word segmentation algorithm
CN106897378A (en) * 2017-01-20 2017-06-27 浙江大学 A kind of semantic parameter search dynamic regulating method towards three-dimensional CAD model
CN107515902A (en) * 2017-07-26 2017-12-26 苏州希格玛科技有限公司 A kind of isomeric data distributed storage method based on semantic tagger
CN110287688B (en) * 2019-06-28 2020-11-24 京东数字科技控股有限公司 Associated account analysis method and device and computer-readable storage medium
CN110659350B (en) * 2019-09-24 2023-03-21 吉林大学 Semantic search system and search method based on domain ontology
CN112487260A (en) * 2020-12-07 2021-03-12 上海市研发公共服务平台管理中心 Instrument project declaration and review expert matching method, device, equipment and medium
CN114485684B (en) * 2021-12-31 2023-11-28 武汉中海庭数据技术有限公司 Map road matching method and system based on ground element topological relation
CN116663662B (en) * 2023-07-31 2023-10-20 航天宏图信息技术股份有限公司 Geographic entity multi-level relation construction method and device based on multi-source semantic network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1633092A (en) * 2004-11-25 2005-06-29 武汉大学 Distributed GIS space information integration apparatus and method based on mobile Agent and GML
CN101286161A (en) * 2008-05-28 2008-10-15 华中科技大学 Intelligent Chinese request-answering system based on concept

Also Published As

Publication number Publication date
CN102156726A (en) 2011-08-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20131225

Termination date: 20140401