WO2012091539A1 - A semantic similarity matching system and a method thereof - Google Patents
A semantic similarity matching system and a method thereof
- Publication number
- WO2012091539A1 (PCT/MY2011/000150)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- conceptual
- semantic similarity
- graphs
- conceptual graphs
- similarity matching
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Abstract
A semantic similarity matching system (100) of a plurality of conceptual graphs is provided, the system (100) includes a semantic similarity matching component (110) which further includes a conceptual graph processor (120), a database (130) operatively connectable to the conceptual graph processor (120) and a semantic similarity calculator (140) wherein output of the semantic similarity calculator (140) is a similarity index (SI) of the plurality of conceptual graphs.
Description
A SEMANTIC SIMILARITY MATCHING SYSTEM AND A METHOD THEREOF
FIELD OF INVENTION
The present invention relates to a semantic similarity matching system of a plurality of conceptual graphs and a method thereof.
BACKGROUND OF INVENTION
Information retrieved from a knowledge base satisfies a user only when it is relevant to the queries the user submits. However, solutions for retrieving relevant search information are still at a developmental stage.
U.S. 6,810,376 B1 describes a system and associated methods to determine the semantic similarity of different sentences to one another. Unfortunately, this prior art requires the sentences to be broken up into words before a similarity calculation can be performed between a first and a second set of words.
U.S. 5,331,554 describes a method and apparatus using automated semantic pattern recognition, where a user may query for information in a text and the described invention displays the location of the information. To do this, the text must first be converted into a tree-like structure to form a knowledge base. A degree of similarity between a query and a node is measured against a predetermined threshold value to display the actual location of the node. However, this method requires the threshold value to be determined ahead of the process, and the effectiveness of the method depends heavily on the criteria used to choose that value.
John F. Sowa (1984), "Conceptual Structures: Information Processing in Mind and Machine", Addison-Wesley, describes the maximal join algorithm: given two graphs that share compatible sub-graphs, the algorithm attempts to build a new graph in which the two initial graphs are fused along their compatible sub-graphs.
Therefore, there is a need for an accurate and efficient solution for matching similarities between natural language queries and a set of data found by any search method.
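As a rough illustration of the fusing idea attributed to Sowa above, the following sketch assumes each graph is held as a set of (source concept, relation, target concept) triples and treats two triples as compatible only when their labels are identical. This is a deliberate simplification: the maximal join in the cited text is more general, and nothing here is taken from the patent itself. The names `naive_join`, `cat_graph`, and `mat_graph` are hypothetical.

```python
from typing import Set, Tuple

# Assumed representation: (source concept, relation, target concept)
Triple = Tuple[str, str, str]


def naive_join(g1: Set[Triple], g2: Set[Triple]) -> Tuple[Set[Triple], Set[Triple]]:
    """Return (shared sub-graph, fused graph) under exact label matching."""
    shared = g1 & g2  # triples that appear in both graphs
    fused = g1 | g2   # the two graphs merged along the shared triples
    return shared, fused


# Hypothetical example graphs
cat_graph = {("Sit", "agent", "Cat"), ("Sit", "location", "Mat")}
mat_graph = {("Sit", "location", "Mat"), ("Mat", "attribute", "Red")}

shared, fused = naive_join(cat_graph, mat_graph)
```

Here the shared sub-graph is the single `("Sit", "location", "Mat")` triple, and the fused graph holds the three distinct triples from both inputs.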
SUMMARY OF INVENTION
Accordingly, there is provided a semantic similarity matching system of a plurality of conceptual graphs, the system includes a semantic similarity matching component which further includes a conceptual graph processor, a database operatively connectable to the conceptual graph processor and a semantic similarity calculator wherein output of the semantic similarity calculator is a similarity index (SI) of the plurality of conceptual graphs.
There is also provided a method of semantic similarity matching of a plurality of conceptual graphs, the method includes the steps of receiving the plurality of conceptual graphs, performing a count of matched concept nodes in the plurality of conceptual graphs, retrieving a total of concept nodes in each conceptual graph and calculating a similarity index between the plurality of conceptual graphs.
The present invention consists of several novel features and a combination of parts hereinafter fully described and illustrated in the accompanying description and drawings, it being understood that various changes in the details may be made without departing from the scope of the invention or sacrificing any of the advantages of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be fully understood from the detailed description given herein below and the accompanying drawings which are given by way of illustration only, and thus are not limitative of the present invention, wherein:
Figure 1 is a block diagram illustrating architecture of a preferred embodiment of a semantic similarity matching system of a plurality of conceptual graphs.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
The present invention relates to a semantic similarity matching system of a plurality of conceptual graphs and a method thereof. Hereinafter, this specification will describe the present invention according to the preferred embodiment of the present invention. However, it is to be understood that limiting the description to the preferred embodiment of the invention is merely to facilitate discussion of the present invention and it is envisioned that those skilled in the art may devise various modifications and equivalents without departing from the scope of the appended claims.
The following detailed description of the preferred embodiment will now be described in accordance with the attached drawings, either individually or in combination.
The present invention provides a semantic similarity matching system (100) of a plurality of conceptual graphs as seen in Figure 1. The system (100) includes a semantic similarity matching component (110). The system (100) further includes a conceptual graph processor (120) and a conceptual graph knowledge base (130) operatively connectable to the conceptual graph processor (120). A semantic similarity calculator (140) is also included in the system (100) wherein output of the semantic similarity calculator (140) is a similarity index (SI) (150) of the plurality of conceptual graphs. The conceptual graph processor (120) functions as a data layer in the system (100).
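One way to picture how these parts could be wired together is sketched below. It is an assumption-laden outline, not the patented implementation: every class and method name is hypothetical, each graph is reduced to a set of concept labels, a "match" is taken to mean an identical label in both graphs, and the calculator applies the index formula given later in this description.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass
class ConceptualGraphKnowledgeBase:
    """Stands in for the knowledge base (130): graph name -> concept labels."""
    graphs: Dict[str, FrozenSet[str]] = field(default_factory=dict)

    def store(self, name: str, concepts: FrozenSet[str]) -> None:
        self.graphs[name] = concepts

    def load(self, name: str) -> FrozenSet[str]:
        return self.graphs[name]


@dataclass
class ConceptualGraphProcessor:
    """Stands in for the processor (120), acting as the data layer."""
    knowledge_base: ConceptualGraphKnowledgeBase

    def matched_concepts(self, name1: str, name2: str) -> int:
        # Assumption: a match is an identical concept label in both graphs.
        return len(self.knowledge_base.load(name1) & self.knowledge_base.load(name2))

    def size(self, name: str) -> int:
        return len(self.knowledge_base.load(name))


class SemanticSimilarityCalculator:
    """Stands in for the calculator (140); its output is the similarity index."""

    @staticmethod
    def similarity_index(matched: int, size1: int, size2: int) -> float:
        denominator = size1 + size2 - matched
        # Assumption: two empty graphs yield 0.0; the source does not cover this.
        return matched / denominator if denominator else 0.0


@dataclass
class SemanticSimilarityMatchingComponent:
    """Stands in for the matching component (110), wiring the parts together."""
    processor: ConceptualGraphProcessor
    calculator: SemanticSimilarityCalculator

    def match(self, name1: str, name2: str) -> float:
        matched = self.processor.matched_concepts(name1, name2)
        return self.calculator.similarity_index(
            matched, self.processor.size(name1), self.processor.size(name2)
        )


# Hypothetical usage
kb = ConceptualGraphKnowledgeBase()
kb.store("cg1", frozenset({"Cat", "Sit", "Mat"}))
kb.store("cg2", frozenset({"Sit", "Mat", "Red"}))
component = SemanticSimilarityMatchingComponent(
    ConceptualGraphProcessor(kb), SemanticSimilarityCalculator()
)
si = component.match("cg1", "cg2")
```

Keeping the processor as the only part that touches the knowledge base mirrors its stated role as the data layer; the calculator stays a pure function of three counts.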
The system (100) matches a plurality of conceptual graphs, such as two conceptual graphs, by applying a maximal join operation to both conceptual graphs, and returns a similarity index (SI) (150) in a range of 0 to 1. A value of 1 indicates the greatest similarity between the conceptual graphs and a value of 0 indicates that they are completely non-identical. For example, the semantic similarity matching component (110) takes two conceptual graphs, such as cg1 and cg2 as seen in Figure 1, performs a maximal join operation by utilizing the conceptual graph processor (120), and returns a count of matched concept nodes in cg1 and cg2. The total number of concept nodes in each conceptual graph is also retrieved. These counts are then used to calculate a similarity index using the formula:
Similarity index = (maxJoinSize) / (cg1size + cg2size - maxJoinSize)
where:
maxJoinSize = the number of concept nodes that are maximally joined in both graphs
cg1size = the number of concept nodes in cg1
cg2size = the number of concept nodes in cg2
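The formula above can be written as a small function. This is a minimal sketch: the function name, the argument checks, and the handling of two empty graphs are assumptions added for illustration and are not stated in the source.

```python
def similarity_index(max_join_size: int, cg1_size: int, cg2_size: int) -> float:
    """Compute maxJoinSize / (cg1size + cg2size - maxJoinSize)."""
    if min(max_join_size, cg1_size, cg2_size) < 0:
        raise ValueError("sizes must be non-negative")
    if max_join_size > min(cg1_size, cg2_size):
        # Assumption: joined concepts come from both graphs, so their count
        # cannot exceed the size of the smaller graph.
        raise ValueError("maxJoinSize cannot exceed the size of either graph")
    denominator = cg1_size + cg2_size - max_join_size
    if denominator == 0:
        # Assumption: two empty graphs are given an index of 0.0.
        return 0.0
    return max_join_size / denominator


partial = similarity_index(2, 3, 3)    # 2 / (3 + 3 - 2) = 0.5
identical = similarity_index(3, 3, 3)  # 3 / (3 + 3 - 3) = 1.0
disjoint = similarity_index(0, 3, 4)   # 0 / (3 + 4 - 0) = 0.0
```

The shape of the expression is that of a Jaccard-style ratio: matched concepts divided by the number of distinct concepts across both graphs.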
In this embodiment, the semantic similarity matching component (110) is able to accept two conceptual graphs at any given time to conduct matching and return one similarity index (SI).
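The stated range of 0 to 1 can be checked directly from the formula. The short derivation below uses the shorthand m, n_1, and n_2 (introduced here, not in the source) for maxJoinSize and the two graph sizes, and assumes that the number of joined concepts cannot exceed the size of either graph and that at least one graph is non-empty.

```latex
\[
  \mathrm{SI} = \frac{m}{n_1 + n_2 - m},
  \qquad 0 \le m \le \min(n_1, n_2), \quad n_1 + n_2 > 0 .
\]
Because $m \le \min(n_1, n_2)$,
\[
  n_1 + n_2 - m \;\ge\; n_1 + n_2 - \min(n_1, n_2) \;=\; \max(n_1, n_2) \;\ge\; m ,
\]
so $0 \le \mathrm{SI} \le 1$. The upper bound is reached only when
$n_1 + n_2 = 2m$, which together with $m \le n_1$ and $m \le n_2$ forces
$m = n_1 = n_2$; the lower bound is reached only when $m = 0$.
For example, with $m = 2$, $n_1 = 3$, $n_2 = 4$:
\[
  \mathrm{SI} = \frac{2}{3 + 4 - 2} = \frac{2}{5} = 0.4 .
\]
```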
The system (100) uses a conceptual graph (CG) representation to compare the similarity of two CGs. A CG is made up of a combination of concept (C) nodes and relation (R) nodes. The example below shows a representation of two conceptual graphs, namely cg1 and cg2:
(R1) * [C1]
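A graph built from concept and relation nodes could be held in memory along the following lines. This sketch does not attempt to reconstruct the example graphs of the source; the class names, the edge layout, and the labels "Sit", "Cat", and "Mat" are all hypothetical. It follows the common convention of writing concepts in square brackets and relations in parentheses.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Concept:
    label: str

    def __str__(self) -> str:
        return f"[{self.label}]"  # concepts are written in square brackets


@dataclass(frozen=True)
class Relation:
    label: str

    def __str__(self) -> str:
        return f"({self.label})"  # relations are written in parentheses


@dataclass
class ConceptualGraph:
    # Each edge links a source concept to a target concept through a relation.
    edges: List[Tuple[Concept, Relation, Concept]] = field(default_factory=list)

    def add(self, source: str, relation: str, target: str) -> None:
        self.edges.append((Concept(source), Relation(relation), Concept(target)))

    def concepts(self) -> Set[Concept]:
        # Distinct concept nodes; their count is the graph size used in the index.
        return {c for s, _, t in self.edges for c in (s, t)}

    def linear_form(self) -> List[str]:
        return [f"{s} -> {r} -> {t}" for s, r, t in self.edges]


# Hypothetical example graph
cg1 = ConceptualGraph()
cg1.add("Sit", "agent", "Cat")
cg1.add("Sit", "location", "Mat")
lines = cg1.linear_form()
```

Because `Concept` is a frozen dataclass, equal labels compare and hash as equal, so `concepts()` counts the shared "Sit" node once.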
CGs express meaning in a form that is logically precise, humanly readable, and computationally tractable. With a direct mapping to language, conceptual graphs serve as an intermediate language for translating computer-oriented formalisms to and from natural languages. With graphic representation, the CGs function as a readable, but formal design and specification language.
The described method and system can be applied to, but are not restricted to, a variety of applications in information retrieval, database design, expert systems, and natural language processing.
Claims
1. A semantic similarity matching system (100) of a plurality of conceptual graphs, the system (100) includes:
a semantic similarity matching component (110) which further includes a conceptual graph processor (120);
a database (130) operatively connectable to the conceptual graph processor (120); and
a semantic similarity calculator (140) wherein output of the semantic similarity calculator (140) is a similarity index (SI) of the plurality of conceptual graphs.
2. The system (100) as claimed in claim 1, wherein the plurality of conceptual graphs are two conceptual graphs.
3. A method of semantic similarity matching of a plurality of conceptual graphs, the method includes the steps of:
i. receiving the plurality of conceptual graphs;
ii. performing a count of matched concept nodes in the plurality of conceptual graphs;
iii. retrieving a total of concept nodes in each conceptual graph; and
iv. indexing similarity on concept nodes in the plurality of conceptual graphs.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MYPI2010006269 | 2010-12-28 | ||
MYPI2010006269 MY151371A (en) | 2010-12-28 | 2010-12-28 | A semantic similarity matching system and a method thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2012091539A1 (en) | 2012-07-05 |
Family
ID=46383349
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/MY2011/000150 WO2012091539A1 (en) | 2010-12-28 | 2011-06-24 | A semantic similarity matching system and a method thereof |
Country Status (2)
Country | Link |
---|---|
MY (1) | MY151371A (en) |
WO (1) | WO2012091539A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015178758A1 (en) * | 2014-05-19 | 2015-11-26 | Mimos Berhad | A system and method for analyzing concept evolution using network analysis |
CN105893671A (en) * | 2016-03-30 | 2016-08-24 | 浙江大学 | Complex mechanical and electrical product system design model verification method based on expansion concept map |
CN105900081A (en) * | 2013-02-19 | 2016-08-24 | 谷歌公司 | Natural language processing based search |
CN106610934A (en) * | 2016-07-08 | 2017-05-03 | 四川用联信息技术有限公司 | Novel semantic similarity solving method in intelligent manufacturing industry |
-
2010
- 2010-12-28 MY MYPI2010006269 patent/MY151371A/en unknown
-
2011
- 2011-06-24 WO PCT/MY2011/000150 patent/WO2012091539A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
ZHONG, J. ET AL.: "Conceptual Graph Matching for Semantic Search", LECTURE NOTES IN COMPUTER SCIENCE, vol. 2393, 2002, pages 92 - 106, XP002355172 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105900081A (en) * | 2013-02-19 | 2016-08-24 | 谷歌公司 | Natural language processing based search |
CN105900081B (en) * | 2013-02-19 | 2020-09-08 | 谷歌有限责任公司 | Search based on natural language processing |
WO2015178758A1 (en) * | 2014-05-19 | 2015-11-26 | Mimos Berhad | A system and method for analyzing concept evolution using network analysis |
CN105893671A (en) * | 2016-03-30 | 2016-08-24 | 浙江大学 | Complex mechanical and electrical product system design model verification method based on expansion concept map |
CN106610934A (en) * | 2016-07-08 | 2017-05-03 | 四川用联信息技术有限公司 | Novel semantic similarity solving method in intelligent manufacturing industry |
Also Published As
Publication number | Publication date |
---|---|
MY151371A (en) | 2014-05-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108804641B (en) | Text similarity calculation method, device, equipment and storage medium | |
CN108182972B (en) | Intelligent coding method and system for Chinese disease diagnosis based on word segmentation network | |
KR102407510B1 (en) | Method, apparatus, device and medium for storing and querying data | |
CN105095204B (en) | The acquisition methods and device of synonym | |
CN108182207B (en) | Intelligent coding method and system for Chinese surgical operation based on word segmentation network | |
US8761512B1 (en) | Query by image | |
CN108154198B (en) | Knowledge base entity normalization method, system, terminal and computer readable storage medium | |
CN105224648A (en) | A kind of entity link method and system | |
JP2020500371A (en) | Apparatus and method for semantic search | |
CN105659225A (en) | Query expansion and query-document matching using path-constrained random walks | |
CN102402561B (en) | Searching method and device | |
CN104199965A (en) | Semantic information retrieval method | |
CN110569328A (en) | Entity linking method, electronic device and computer equipment | |
CN104112005B (en) | Distributed mass fingerprint identification method | |
CN103218373A (en) | System, method and device for relevant searching | |
CN103678336A (en) | Method and device for identifying entity words | |
CN105677725A (en) | Preset parsing method for tourism vertical search engine | |
CN111026877A (en) | Knowledge verification model construction and analysis method based on probability soft logic | |
Gross et al. | How do Computed Ontology Mappings Evolve?-A Case Study for Life Science Ontologies. | |
WO2012091539A1 (en) | A semantic similarity matching system and a method thereof | |
CN109872775A (en) | A kind of document mask method, device, equipment and computer-readable medium | |
US7734633B2 (en) | Listwise ranking | |
CN108287850B (en) | Text classification model optimization method and device | |
CN102314464B (en) | Lyrics searching method and lyrics searching engine | |
CN102915381B (en) | Visual network retrieval based on multi-dimensional semantic presents system and presents control method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 11853675 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 11853675 Country of ref document: EP Kind code of ref document: A1 |