KR101243056B1 - System and Method for searching of entity identification result - Google Patents

System and Method for searching of entity identification result

Info

Publication number
KR101243056B1
Authority
KR
South Korea
Prior art keywords
entity
identification result
group
identifier
identification
Prior art date
Application number
KR1020110067703A
Other languages
Korean (ko)
Other versions
KR20130005967A (en
Inventor
김평
정한민
이미경
이승우
서동민
김진형
성원경
Original Assignee
한국과학기술정보연구원
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한국과학기술정보연구원 filed Critical 한국과학기술정보연구원
Priority to KR1020110067703A priority Critical patent/KR101243056B1/en
Priority to PCT/KR2011/007357 priority patent/WO2013008978A1/en
Publication of KR20130005967A publication Critical patent/KR20130005967A/en
Application granted granted Critical
Publication of KR101243056B1 publication Critical patent/KR101243056B1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a system and a method for retrieving an entity identification result. The system includes a multiple ontology database, in which attribute information of entities is stored as ontologies, and an entity identification result retrieval apparatus. When a query is input, the apparatus obtains an identification result for the query from a triple storage module and compares it with the identification result for the query obtained from the multiple ontology database to determine whether entity identification is required. If it is required, the apparatus obtains the attribute information of the entities to be identified from the multiple ontology database, compares it to identify the entities for the query, and provides result information according to the entity identification.
Therefore, according to the present invention, various types of entities may be identified incrementally using attribute information of multiple ontologies included in Linked Data.

Description

System and Method for searching of entity identification result

The present invention relates to an entity identification result retrieval system and method, and more particularly to a system and method that, when a query is input, obtain an identification result for the query from a triple storage module, compare it with the identification result for the query obtained from a multiple ontology database to determine whether entity identification is necessary, and, when it is necessary, identify the entities for the query by obtaining and comparing the attribute information of the entities to be identified, and provide result information according to the entity identification.

As the importance of information sharing and linking is highlighted, web services that use ontologies as a knowledge representation model are increasing. Recently, data models using ontologies have been attracting attention in connection with semantic web research.

Applications using the semantic web include automated communication between agents, web service automation, semantic-based search services in ubiquitous environments, and information retrieval from heterogeneous multimedia databases. However, these applications basically assume that all agents reference a common ontology. If these agents do not refer to a common ontology, such an application would be impossible. That is, since the semantic web is distributed and heterogeneous like the existing web, it is difficult to assume that all agents refer to a common ontology.

For example, if one agent refers to an ontology that describes a person's home address as "address" while another agent refers to an ontology that describes it as "postal_address", communication between them becomes impossible.

In addition, existing studies on ensuring the uniformity of URIs (Uniform Resource Identifiers), which is required for linking and using semantic web data, have limitations: they support only limited types of entity identification, do not effectively support real-time entity identification for ontologies that are added incrementally, and use errors contained in entity identification relationships as they are.

In addition, they do not provide a management service with which a user can visually check entity identification results, dynamically modify them, and immediately use the modified results for entity identification.

Patent No. 866548 (2008.10.28), title of the invention: sameAS management system of ontology instance and method thereof
Patent No. 842263 (2008.06.24), title of the invention: Method for mapping similar concepts between ontologies and apparatus therefor

The present invention has been made to solve the above problems, and an object of the present invention is to provide a system and method for retrieving an entity identification result that can incrementally identify various types of entities using attribute information of multiple ontologies included in linked data.

Another object of the present invention is to provide an entity identification result retrieval system and method that group entities by type and by identical entity using attribute information of multiple ontologies, improve the accuracy of identifying identical entities by comparing the main attributes of each group, and support incremental entity identification that targets only the URIs found to be newly added through comparison with external services.

Another object of the present invention is to provide an entity identification result retrieval system and method that allow a user to check the entity-URI assignment status and related attributes of linked data by searching for an entity name or URI, and that recommend an entity name or URI for linking with the data the user holds.

Another object of the present invention is to provide an entity identification result retrieval system and method that allow dynamically generated entity identification results to be edited and use the edited results to identify added URIs, enabling fast and accurate identification of incrementally growing URIs, and that provide a management service in which a user can visually check entity identification results, correct them, and have the corrections immediately reflected in entity identification.

According to an aspect of the present invention, in order to achieve the above objects, there is provided an entity identification result retrieval system including: a multiple ontology database in which attribute information about entities is stored as ontologies; and an entity identification result retrieval apparatus that, when a query is input, obtains an identification result for the query from a triple storage module, compares it with the identification result of the query obtained from the multiple ontology database to determine whether entity identification is required, and, if it is required, identifies the entities for the query by obtaining and comparing the attribute information of the entities to be identified from the multiple ontology database, and provides result information according to the entity identification.

The triple storage module stores entity identification results for each entity in triple format.

The entity identification result retrieval apparatus compares the entity identifier list based on the identification result from the triple storage module with the entity identifier list based on the identification result from the multiple ontology database. If the list obtained from the multiple ontology database includes a new entity identifier, the apparatus determines that entity identification is necessary and treats the new entity identifier as the entity to be identified.

In addition, the entity identification result retrieval apparatus groups the entities to be identified by entity type, using the attribute that indicates the type in the attribute information obtained from the multiple ontology database, and regroups each type-based group based on the representative attribute of each entity. It then obtains the associated attribute values of the entities in each regrouped group from the multiple ontology database, compares them, and divides each group into subgroups based on the comparison result, thereby creating the entity identifier groups for the query.

In addition, the entity identification result retrieval apparatus selects a representative entity identifier or a representative entity name for each entity identifier group for the query, and provides, as result information according to the entity identification, entity identifier group summary information including at least one of the number of entities, entity type, representative entity name, and representative entity identifier for each entity identifier group.

When the entity identification result is edited by at least one of modification, deletion, and movement, the entity identification result retrieval apparatus stores the edited information in the triple storage module as the updated entity identification result, and the updated result is used later for entity identification of the query.

According to another aspect of the present invention, there is provided an entity identification result retrieval apparatus including: a triple storage module in which the entity identification result for each entity is stored in triple form; an entity identifier list comparison module that determines whether an identification result for a query exists in the triple storage module and, if it exists, compares the entity identifier list according to the identification result obtained from the triple storage module with the entity identifier list according to the identification result of the query obtained from the multiple ontology database; an entity identification module that, when the comparison shows that the entity identifier list obtained from the multiple ontology database includes a new entity identifier, performs entity identification for the query by obtaining and comparing attribute information on the new entity identifier, and generates entity identifier groups for the query through the entity identification; a representative name selection module that selects and applies a representative name for each generated entity identifier group and stores the entity identification result to which the representative name is applied in the triple storage module as an update; and a visualization module that obtains the information on the entity identification result from the triple storage module and visualizes it.

The entity identification result retrieval apparatus may further include an editing module that, when the visualized entity identification result is edited by at least one of modification, deletion, and movement, stores the edited result in the triple storage module as the updated entity identification result, so that the updated result is used later for entity identification of the query.

When no identification result for the query exists in the triple storage module, the entity identifier list comparison module collects the identification result of the query from the multiple ontology database and displays it through the visualization module.

The triple storage module stores entity identifier group summary information and edited entity identifier group summary information for each entity.

The query is in the form of an entity name or URI.

The entity identification module obtains the attribute information of the new entity identifier from the multiple ontology database, groups the entities by entity type, and regroups each type-based group based on the representative attribute of each entity. It then obtains the associated attribute values of the entities in each regrouped group from the multiple ontology database, compares them, and determines the entity identifier groups for the query by dividing each group into subgroups based on the comparison result.

The representative name selection module selects a representative identifier or a representative entity name for each group by using statistical values of the entity identifiers and entity names in each generated entity identifier group.
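The statistic used for the selection is not specified in the text. As a minimal sketch, assuming that the most frequent entity name in a group is chosen (the function name and the tie-breaking behavior are illustrative assumptions):

```python
from collections import Counter


def pick_representative_name(names):
    """Pick the most frequent name in a group (assumed statistic: frequency)."""
    if not names:
        return None
    # Counter.most_common keeps first-seen order among names with equal counts.
    return Counter(names).most_common(1)[0][0]


# Hypothetical entity names collected for one entity identifier group.
group_names = ["Hong Gil-dong", "Gil-dong Hong", "Hong Gil-dong"]
representative = pick_representative_name(group_names)
```

The same frequency-based choice could be applied to entity identifiers, but other statistics would fit the description equally well.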

The information on the entity identification result includes entity identifier group summary information including at least one of the number of entities, entity type, representative entity name, and representative entity identifier for each entity identifier group.

The visualization module outputs an entity identification result screen including at least one of a query input area, an ontology database association command, an entity identifier group summary list, an entity identifier group summary graph, a detailed entity identifier list for each group, an identification command, and an editing command.

When the identification command is selected on the entity identification result screen, information about the entity identification result updated in the triple storage module is output; when one entity identifier is selected from the detailed entity identifier list for each group, the verification result is output in a predetermined area of the screen.

The editing command performs at least one of the following: editing the representative entity name or representative entity identifier of an entity identifier group, merging different groups, dividing one group into different groups, and moving an entity identifier belonging to one group to another group.
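The merge, split, and move operations can be sketched on a plain dictionary of groups. This is an illustration only: the group keys, the function names, and the in-memory representation are assumptions, and the text does not describe how the editing module is implemented.

```python
def merge_groups(groups, target, source):
    """Merge group `source` into group `target` and drop `source`."""
    groups[target].extend(groups.pop(source))


def move_identifier(groups, uri, source, target):
    """Move one identifier from group `source` to group `target`."""
    groups[source].remove(uri)
    groups[target].append(uri)


def split_group(groups, source, new_group, uris):
    """Split the given identifiers out of `source` into a new group."""
    groups[new_group] = [u for u in groups[source] if u in uris]
    groups[source] = [u for u in groups[source] if u not in uris]


# Hypothetical groups keyed by a group label; values are identifier lists.
groups = {"g1": ["u1", "u2"], "g2": ["u3"]}
move_identifier(groups, "u2", "g1", "g2")  # g1: [u1], g2: [u3, u2]
split_group(groups, "g2", "g3", {"u3"})    # g2: [u2], g3: [u3]
merge_groups(groups, "g1", "g3")           # g1: [u1, u3], g3 removed
```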

According to another aspect of the present invention, there is provided a method by which the entity identification result retrieval apparatus provides an entity identification result for a query, the method including: (a) when a query is input, determining whether an identification result for the query exists in the triple storage module; (b) if it exists, comparing the identification result of the query obtained from the triple storage module with the identification result of the query obtained from a multiple ontology database; (c) when the comparison shows that the identification result obtained from the multiple ontology database includes a new entity identifier, performing entity identification for the query by obtaining and comparing attribute information on the new entity identifier, and generating entity identifier groups for the query through the entity identification; and (d) selecting and applying a representative name for each generated entity identifier group, and storing the entity identification result to which the representative name is applied in the triple storage module as an update.
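The control flow of steps (a) to (d) can be sketched as follows. This is a simplified illustration under stated assumptions: the store is a plain dictionary mapping a query to a list of identifier groups, `fetch_remote_uris` and `identify_and_name` are hypothetical callables standing in for the multiple ontology database lookup and for steps (c) and (d), and representative names are left to the second callable.

```python
def search_identification_result(query, store, fetch_remote_uris, identify_and_name):
    """Return the identifier groups for `query`, updating `store` when needed."""
    remote_uris = fetch_remote_uris(query)
    stored_groups = store.get(query)  # step (a): is there a stored result?
    if stored_groups is None:
        # No stored result: present the remote result as a single group.
        return [list(remote_uris)]
    # Step (b): compare the stored identifier list with the remote one.
    known = {uri for group in stored_groups for uri in group}
    new_uris = [uri for uri in remote_uris if uri not in known]
    if new_uris:
        # Steps (c) and (d): identify the new identifiers, then store the update.
        store[query] = identify_and_name(stored_groups, new_uris)
    return store[query]


def put_in_new_group(groups, new_uris):
    """Stand-in for steps (c) and (d): place the new identifiers in one new group."""
    return groups + [list(new_uris)]


store = {"Hong": [["u1", "u2"]]}
result = search_identification_result(
    "Hong", store, lambda query: ["u1", "u3"], put_in_new_group
)
```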

The entity identification result retrieval method may further include obtaining the information on the entity identification result from the triple storage module and visualizing it.

The method may further include, when the visualized entity identification result is edited by at least one of modification, deletion, and movement, storing the edited result in the triple storage module as the updated entity identification result.

In step (b), if no identification result for the query exists in the triple storage module, the identification result of the query is collected from the multiple ontology database and visualized.

Step (b) compares the entity identifier list based on the identification result from the triple storage module and the entity identifier list based on the identification result from the multiple ontology database.

The query may be in the form of an entity name or a URI.

In step (c), the attribute information of the new entity identifier is obtained from the multiple ontology database, and the entities are grouped by entity type and then regrouped, within each type-based group, based on the representative attribute of each entity. The associated attribute values of the entities in each regrouped group are obtained from the multiple ontology database and compared, and each group is divided into subgroups based on the comparison result to determine the entity identifier groups for the query.

In step (d), the representative identifier or representative entity name for each group is selected by using statistical values of the entity identifiers and entity names in each generated entity identifier group.

The information on the entity identification result includes entity identifier group summary information including at least one of the number of entities, entity type, representative entity name, and representative entity identifier for each entity identifier group.

As described above, according to the present invention, various types of entities may be gradually identified by using attribute information of multiple ontologies included in linked data.

In addition, entities are grouped by type and by identical entity using attribute information of multiple ontologies, and entities are identified by comparing the main attributes of each group, which improves the accuracy of identifying identical entities. Incremental entity identification can also be supported that targets only the URIs found to be newly added through comparison with external services.

In addition, a user can check the entity-URI assignment status and related attributes of linked data by searching for an entity name or a URI, and an entity name or URI can be recommended for linking with the data the user holds.

In addition, dynamically generated entity identification results can be edited, and the edited results are used to identify added URIs, enabling fast and accurate identification of incrementally growing URIs. A management service can also be provided in which a user visually checks the results, modifies them, and has the modifications reflected directly in entity identification.

In addition, the URI management service makes it possible to dynamically modify and delete entity-URI relationships, to select representative URIs and representative entity names, and to check entity identification results obtained by applying the entity identification algorithm. By increasing consistency, the efficiency of the semantic web can be maximized.

In addition, by using the entity identification result and the user's edits to identify added URIs, incrementally growing URIs can be identified quickly and accurately, and the user can visually check the entity identification result, correct it, and have the correction directly reflected in entity identification.

FIG. 1 is a diagram showing an entity identification result retrieval system according to the present invention.
FIG. 2 is a block diagram schematically showing the configuration of the entity identification result retrieval apparatus according to the present invention.
FIG. 3 is a block diagram showing in detail the entity identification module shown in FIG. 2.
FIG. 4 is a flowchart illustrating a method for retrieving and providing an entity identification result by the entity identification result retrieval apparatus according to the present invention.
FIG. 5 is a flowchart illustrating a method for identifying an entity by the entity identification result retrieval apparatus according to the present invention.
FIG. 6 is an exemplary view showing an entity identification result search screen according to the present invention.

The foregoing and other objects, features, and advantages of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings.

First, the terms used in the present invention will be described.

Linked Data is a movement and a set of principles for the web of data at the heart of the next generation of the Web. The goal is to open and connect data freely on the web so that the data can work together to realize a true data web. To this end, HTTP (Hypertext Transfer Protocol) is used for data distribution on the web, and the Resource Description Framework (RDF) and SPARQL (SPARQL Protocol and RDF Query Language) are used to ensure connectivity and accessibility. SPARQL is a standard query language used for querying information contained in ontologies; it performs a function similar to that of Structured Query Language (SQL) in a database management system (DBMS).

In addition, Linked Data names specific concepts with Uniform Resource Identifiers (URIs), allows resources named by URIs to be accessed over HTTP, provides detailed RDF-based information about a URI when it is accessed, and provides access to other related concepts contained in the RDF.

A URI is an identifier of a class, property, or entity, mapped to a unique address on the Web so that information resources can be shared over the Web. In other words, it is an internet address and identifier used to access information about a class, property, or entity through a URL. In this document, the term "entity identifier" refers to a URI.

An ontology is constructed by acquiring, from documents of a specific field, knowledge about the components to be modeled, namely concepts, properties of concepts, and relations between concepts, and then defining the concepts and attributes and establishing the relationships between concepts. In the semantic web, concepts are represented by URIs. For example, the "Person" class is represented in the ontology as a URI such as "http://www.etri.re.kr/example#Person".

The "Person" class has properties such as name, age, birthplace, and so on. The "name" attribute can have a string as its value, the "age" attribute can have an integer as its value, and the "birthplace" attribute can have an instance of "Location" as its value.
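The typing of these attributes can be illustrated with a small data model. This is a sketch only: the `Location` field and all example values are assumptions made for illustration, not part of any ontology named in the text.

```python
from dataclasses import dataclass


@dataclass
class Location:
    name: str  # assumed single field, for illustration only


@dataclass
class Person:
    name: str             # string-valued attribute
    age: int              # integer-valued attribute
    birthplace: Location  # attribute whose value is a Location instance


# Hypothetical instance of the Person class.
person = Person(name="Hong Gil-dong", age=30, birthplace=Location("Seoul"))
```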

Entity identification refers to explicitly establishing relationships between URIs so that the classes, properties, and entities that make up ontologies can be interlinked. For example, if the A-1 entity of ontology A is the same as the B-1 entity of ontology B, specifying that the A-1 and B-1 entities are identical makes it possible to merge the properties related to A-1 and B-1.
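Once two entities are declared identical, merging their properties amounts to taking the union of their attribute values. A minimal sketch, assuming each entity's properties are held as a dictionary from property name to a set of values (the property names and values below are hypothetical):

```python
def merge_properties(props_a, props_b):
    """Union the property values of two entities declared to be the same."""
    merged = {}
    for props in (props_a, props_b):
        for name, values in props.items():
            merged.setdefault(name, set()).update(values)
    return merged


# Hypothetical properties of entity A-1 (ontology A) and B-1 (ontology B).
a1 = {"name": {"Hong Gil-dong"}, "email": {"hong@example.org"}}
b1 = {"name": {"Hong Gil-dong"}, "occupation": {"researcher"}}
merged = merge_properties(a1, b1)
```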

FIG. 1 is a diagram illustrating an entity identification result retrieval system according to the present invention.

Referring to FIG. 1, the entity identification result retrieval system includes a plurality of ontology databases (300a, ..., 300n; hereinafter referred to as 300) and an entity identification result retrieval apparatus 100 that performs entity identification to establish relationships among the URIs associated with a specific URI and provides the result.

The multiple ontology database 300 may be a server that provides services such as Sindice.com service, sameAs.org service, and the like. Here, the Sindice.com service is a semantic web search engine, which continuously collects various types of ontology in real time and, when an object name is input, provides a URI and an associated ontology with the corresponding object name as a search result.

The sameAs.org service refers to a service that collects URIs representing the same entity relationships collected from various ontologies and provides a group of previously identified results when searching for entity names or URIs.

When a query in the form of an entity name or a URI is input, the entity identification result retrieval apparatus 100 obtains an identification result for the query from a triple storage module and compares it with the identification result of the query obtained from the multiple ontology database to determine whether entity identification is required. When entity identification is required, the apparatus identifies the entities associated with the query by obtaining and comparing attribute information of the entities to be identified from the multiple ontology database, and provides result information accordingly. Here, the triple storage module stores the entity identification result for each entity in triple form. Identifying entities associated with the query means acquiring entities associated with the query and establishing relationships, such as synonymy, between the query and the acquired entities.

The entity identification result retrieval apparatus 100 incrementally identifies various types of entities by using attribute information from the multiple ontology database 300 included in the linked data.

In addition, the entity identification result retrieval apparatus 100 compares the entity identifier list based on the identification result from the triple storage module with the entity identifier list based on the identification result from the multiple ontology database. If the list obtained from the multiple ontology database includes a new entity identifier, the apparatus determines that entity identification is required and designates the new entity identifier as the entity to be identified.
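The list comparison reduces to finding identifiers that appear in the freshly obtained list but not in the stored one. A minimal sketch, with a hypothetical function name and example URIs under example.org:

```python
def find_new_identifiers(stored_uris, fetched_uris):
    """Return URIs in the fetched list that are absent from the stored list."""
    known = set(stored_uris)
    seen = set()
    new_uris = []
    for uri in fetched_uris:
        # Keep the order of the fetched list and skip repeated URIs.
        if uri not in known and uri not in seen:
            seen.add(uri)
            new_uris.append(uri)
    return new_uris


stored = ["http://example.org/a/Hong", "http://example.org/b/Hong"]
fetched = ["http://example.org/a/Hong", "http://example.org/c/Hong"]
new_uris = find_new_identifiers(stored, fetched)
needs_identification = bool(new_uris)  # entity identification is needed if any are new
```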

Then, the entity identification result retrieval apparatus 100 collects attribute information of the new entity identifier from the multiple ontology database 300 through a SPARQL endpoint, and performs entity identification using the collected attribute information and the identification result from the triple storage module. Here, the SPARQL endpoint is a service access point that allows access to the multiple ontology database 300 and provides ontology information in RDF or various other formats through a web service.
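A request for the attributes of one URI might be built as follows. This is a sketch under assumptions: the endpoint address and the helper names are hypothetical, the query simply selects every property and value of the URI, and the URI is inserted into the query without validation. The request is only constructed, not sent.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_attribute_query(uri):
    """Build a SPARQL SELECT query for all properties and values of `uri`."""
    return f"SELECT ?property ?value WHERE {{ <{uri}> ?property ?value }}"


def build_request_url(endpoint, uri):
    """Encode the query as the `query` parameter of a SPARQL GET request."""
    return endpoint + "?" + urlencode({"query": build_attribute_query(uri)})


# Hypothetical endpoint and entity URI.
url = build_request_url("http://example.org/sparql", "http://example.org/a/Hong")
decoded_query = parse_qs(urlparse(url).query)["query"][0]
```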

In addition, when the entity identification result is edited by at least one of modification, deletion, and movement, the entity identification result retrieval apparatus 100 stores the edited information in the triple storage module as the updated entity identification result. The updated entity identification result is later used for entity identification of the query.

The entity identification result retrieval apparatus 100 that performs the above role is described in detail with reference to FIG. 2.

FIG. 2 is a block diagram schematically showing the configuration of the entity identification result retrieval apparatus according to the present invention, and FIG. 3 is a block diagram showing in detail the entity identification module shown in FIG. 2.

Referring to FIG. 2, the entity identification result retrieval apparatus 100 includes a communication module 110 for communicating with the multiple ontology database, a user interface module 120 for receiving a query from a user, a triple storage module 130, an entity identifier list comparison module 140, an entity identification module 150, a representative name selection module 160, a visualization module 170, and an editing module 180.

The triple storage module 130 stores entity identification results for each entity in triple form. That is, the triple storage module 130 stores entity identifier group summary information and edited entity identifier group summary information for each entity. The entity identifier group summary information includes the number of entities, entity type, representative entity name, representative entity identifier, etc. for each entity identifier group.

In addition, the triple storage module 130 stores, in ontology form, information about the entity types, the representative attribute values for each type, and the lower attributes to be considered together. That is, the triple storage module 130 stores only the information necessary for entity identification and the identification results among the multiple ontology information; other information is obtained by accessing the multiple ontology database in real time.
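How group summary information might look in triple form can be sketched with a tiny in-memory store. The class, the `http://example.org/ident#` vocabulary, and the predicate names are all assumptions made for illustration; the text does not specify the vocabulary or the storage engine used.

```python
EX = "http://example.org/ident#"  # hypothetical vocabulary namespace


class TripleStore:
    """A tiny in-memory set of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self._triples
            if all(p is None or p == v for p, v in zip(pattern, t))
        )


store = TripleStore()
group = EX + "group1"
store.add(group, EX + "entityType", "Person")
store.add(group, EX + "representativeName", "Hong Gil-dong")
store.add(group, EX + "entityCount", "2")
store.add(group, EX + "member", "http://example.org/a/Hong")
store.add(group, EX + "member", "http://example.org/b/Hong")
members = [o for _, _, o in store.match(group, EX + "member")]
```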

The entity identifier list comparison module 140 determines whether an identification result for a query received through the communication module 110 or input through the user interface module 120 exists in the triple storage module. If it exists, the identification result of the query obtained from the triple storage module is compared with the identification result of the query obtained from the multiple ontology database. The query may be in the form of an entity name or a URI, and the identification result of the query may be an entity identifier list generated by identifying the entities associated with the query.

In addition, when no identification result for the query exists in the triple storage module 130, the entity identifier list comparison module 140 obtains an identification result of the query from the multiple ontology database and displays it through the visualization module 170. In this case, the identification result visualized through the visualization module 170 may be entity identifier group summary information in which relationships among the entities related to the query, such as identity relations and similarity relations, are set.

When the comparison by the entity identifier list comparison module 140 shows that the entity identifier list obtained from the multiple ontology database includes a new entity identifier, the entity identification module 150 performs entity identification by obtaining and comparing attribute information on the new entity identifier, and generates entity identifier groups for the query through the entity identification. That is, when the list obtained from the multiple ontology database includes a new entity identifier, the entity identification module 150 determines that entity identification for the new entity identifier is necessary and performs it.

The entity identification module 150 is described in more detail with reference to FIG. 3.

Referring to FIG. 3, the entity identification module 150 includes an attribute information collecting unit 151, a first grouping unit 152, a second grouping unit 153, an entity identifier group determining unit 154, and a verification unit 155.

The attribute information collecting unit 151 collects attribute information of the new entity identifier from the multiple ontology database. For example, when the new entity identifier is a URI for "Hong Gil-dong", the attribute information collecting unit 151 collects attribute information such as age, occupation, place of birth, and e-mail address for Hong Gil-dong from the multiple ontology database.

The first grouping unit 152 groups the new entity identifier according to entity type by using the attribute representing the type among the attribute information collected by the attribute information collecting unit 151. That is, the first grouping unit 152 determines the entity type of the new entity identifier by using the attribute representing a type among the collected attribute information, selects the group corresponding to that entity type from among the entity identifier groups for the query, and assigns it as the group of the new entity identifier.

The second grouping unit 153 loads the same-attribute mapping information and the representative attribute for each group formed according to entity type from a previously stored attribute information table, and regroups the groups formed according to entity type based on the representative attribute of each entity. That is, the second grouping unit 153 calculates the similarity of the strings corresponding to the attribute values of the representative attribute of the entity identifiers for each entity type, and places entity identifiers whose similarity is equal to or greater than a predetermined threshold into the same group. For example, in the case of a person, since the representative attribute is the person's name, the similarity of the strings corresponding to the names is calculated, and the entity identifiers whose calculated similarity is equal to or greater than the predetermined threshold are placed into the same group.

The entity identifier group determiner 154 obtains the associated attribute values of the entities in each group formed by the second grouping unit 153 from the multiple ontology database and compares them, and divides each group into subgroups based on the comparison result to determine the entity identifier group for the query. That is, the entity identifier group determiner 154 obtains the associated attribute values of the entities in each group from the multiple ontology database, checks whether any entity identifiers share the same attribute value among the associated attribute values of all the entities, and, if so, groups the entity identifiers having the same attribute value into a subgroup. When no entity identifiers share the same attribute value, the group formed by the second grouping unit 153 is maintained.

For example, for entity identifiers (URIs) whose representative attribute value is "Hong Gil-dong", the entity identifier group determiner 154 obtains the associated attribute values of "Hong Gil-dong", such as "workplace" and "e-mail address", from the multiple ontology database, and places the "Hong Gil-dong" identifiers that have the same "workplace" and "e-mail address" values into the same group. In this way, the entity identifier group determiner 154 can group entity identifiers having the same attribute value into subgroups.

The verification unit 155 repeats the process of determining the entity identifier group for the new entity identifier and verifies the entity identifier group determined by the entity identifier group determiner 154.

Referring back to FIG. 2, the representative name selection module 160 selects and applies a representative name for each entity identifier group generated by the entity identification module 150, and stores the entity identification result to which the representative name is applied in the triple storage module 130 to update it. In this case, the representative name selection module 160 selects the representative identifier or representative entity name for each group by using statistical values of the entity identifiers and entity names of each generated entity identifier group.

The visualization module 170 obtains the information on the entity identification result updated by the representative name selection module 160 from the triple storage module 130 and visualizes it. Here, the information on the entity identification result may be entity identifier group summary information including the number of entities, the entity type, the representative entity name, the representative entity identifier, and the like for each entity identifier group.

In addition, the visualization module 170 outputs an entity identification result providing screen that includes a query input area, an ontology database association command, an entity identifier group summary list, an entity identifier group summary graph, a detailed entity identifier list for each group, an identification command, an editing command, and the like.

When the identification command is selected on the entity identification result providing screen, information on the entity identification result updated in the triple storage module 130 is output.

In addition, if one entity identifier list is selected from the entity identifier list for each group, the verification result for the entity identification is output to a predetermined area of the entity identification result providing screen.

In addition, the editing command can be used to edit the entity identifier group information: editing the representative entity name or representative entity identifier of an entity identifier group, merging different groups, dividing one group into different groups, and moving an entity identifier belonging to a specific group to another group.

When the entity identifier group summary information is edited on the entity identification result providing screen visualized by the visualization module 170, the editing module 180 stores the editing result in the triple storage module 130 to update it.

FIG. 4 is a flowchart illustrating a method by which the entity identification result retrieval apparatus according to the present invention searches for and provides an entity identification result, and FIG. 6 is an exemplary view showing an entity identification result search screen according to the present invention.

Referring to FIG. 4, when a query in the form of an entity name or an entity identifier (URI) is input (S402), the entity identification result retrieval apparatus determines whether an identification result for the input query word exists in the triple storage module (S404). The triple storage module stores entity identification results expressed as an ontology, and the entity identification result retrieval apparatus determines whether a new entity name or a new entity identifier (URI) has been input based on whether an identification result for the query word exists in the triple storage module.

If the determination in S404 shows that the identification result for the query word exists in the triple storage module, the entity identification result retrieval apparatus obtains the identification result for the query word from the multiple ontology database (S406). That is, the entity identification result retrieval apparatus obtains the identification result for the query using Sindice.com, sameAs.org, and the like. The identification result for the query word refers to a list of entity identifiers of entities related to the query word.

After performing S406, the entity identification result retrieval apparatus compares the identification result of the query word obtained from the triple storage module with the identification result of the query word obtained from the multiple ontology database (S408), and determines whether the two entity identifier lists are the same (S410). That is, the entity identification result retrieval apparatus compares the entity identifier list according to the identification result from the triple storage module with the entity identifier list according to the identification result from the multiple ontology database, and determines whether the entity identifier list from the multiple ontology database includes a new entity identifier.
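The comparison in S408 and S410 can be pictured as a difference between two identifier lists. The following is only a minimal sketch under stated assumptions: the function name `find_new_identifiers` and the example.org URIs are hypothetical, and the patent does not specify how the lists are represented.

```python
def find_new_identifiers(stored_uris, fetched_uris):
    """Return URIs that appear in the fetched list but not in the stored list.

    stored_uris: identifiers already held in the triple storage module.
    fetched_uris: identifiers obtained from the multiple ontology database.
    Order of first appearance in fetched_uris is preserved.
    """
    known = set(stored_uris)
    seen = set()
    new_uris = []
    for uri in fetched_uris:
        if uri not in known and uri not in seen:
            seen.add(uri)
            new_uris.append(uri)
    return new_uris


# Hypothetical example data (not from the patent).
stored = ["http://example.org/a/hong1", "http://example.org/b/hong2"]
fetched = ["http://example.org/b/hong2", "http://example.org/c/hong3"]

# Entity identification is treated as necessary only when something new shows up.
needs_identification = bool(find_new_identifiers(stored, fetched))
```

Under this reading, an empty result corresponds to the "lists are the same" branch (S412), and a non-empty result to the identification branch (S418).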

If the determination in S410 shows that the two lists are the same, the entity identification result retrieval apparatus visualizes the identification result for the query word obtained from the triple storage module (S412). The identification result of the query word is entity identifier group summary information, which summarizes all groups in which the query word is used and includes the number of entity identifiers, the entity type, the representative entity name, and the representative entity identifier of each group. That is, the entity identification result retrieval apparatus outputs an entity identification result providing screen including at least one of a query input region, an external database association command, an entity identifier group summary list, an entity identifier group summary graph, a detailed entity identifier list for each group, an identification command, and an editing command.

When the user edits the visualized entity identification result (S414), the entity identification result retrieval apparatus stores the editing result in the triple storage module (S416). That is, when the user performs at least one of modification, deletion, and movement on the entity identification result, the entity identification result retrieval apparatus stores the editing result in the triple storage module as the entity identification result to update it.

Referring to FIG. 6, the entity identification result providing screen includes a query input region for entering an entity identifier or an entity name as a query, an entity identifier group summary information providing area in which the entity identifier group summary information for the query is displayed as a list and a graph, and a detailed entity identifier list providing area in which a detailed entity identifier list for each entity identifier group is provided.

There is a command for linking with an external service in the query input region.

The entity identifier group summary information providing area includes an identification command, which is selected when entity identification using the information updated in the triple storage module is desired, and an editing command for editing the entity identifier group information.

Selecting the editing command makes it possible to edit the representative entity name or representative entity identifier of an entity identifier group, merge different groups, divide one group into different groups, and move entity identifiers belonging to a specific group to another group. When the user modifies the representative entity name and the representative entity identifier of an entity identifier group, the entity identification result retrieval apparatus keeps the representative entity identifier and representative entity name modified by the user even if the statistics of the group later change. The edited result is also used later in the identification process.

In addition, the user may use the editing command to move an entity identifier belonging to a specific group to another group, or to delete it. The edited result is stored in the triple storage module again and reflected later in the entity identification process.
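The editing operations named above (merging, dividing, moving, deleting) can be illustrated on a simple in-memory structure. This is a sketch only: representing groups as a dictionary from a group id to a set of URIs, and all function names, are assumptions made for illustration; the patent stores the edited result in the triple storage module and does not prescribe a data structure.

```python
def merge_groups(groups, target, source):
    """Merge the identifiers of group `source` into group `target`."""
    groups[target] |= groups.pop(source)


def split_group(groups, source, uris, new_id):
    """Move the given identifiers out of `source` into a new group `new_id`."""
    moved = groups[source] & set(uris)
    groups[source] -= moved
    groups[new_id] = moved


def move_identifier(groups, uri, source, target):
    """Move one identifier from group `source` to group `target`."""
    groups[source].remove(uri)
    groups.setdefault(target, set()).add(uri)


def delete_identifier(groups, uri, group_id):
    """Remove one identifier from a group."""
    groups[group_id].discard(uri)


# Hypothetical example: two groups of URIs for the same query.
example_groups = {"g1": {"u1", "u2"}, "g2": {"u3"}}
move_identifier(example_groups, "u2", "g1", "g2")
```

In a real system each of these operations would be followed by writing the changed groups back to the store, so that later identification runs start from the edited state.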

When a specific object identifier list is selected in the detailed object identifier list providing area, the object identification result retrieving device outputs a verification result for the object identification to a predetermined area of the object identification result providing screen. That is, the verification result shows a correlation between the various attributes used to identify the object in a graph, and the user can confirm the object identification result through the graph.

In addition, when the individual entity identifier is selected in the detailed entity identifier list providing area, the entity identification result retrieval device also provides a function of identifying related attributes through the SPARQL endpoint service provided by the ontology.

If the determination in S410 shows that the two lists are not the same, the entity identification result retrieval apparatus takes the new entity identifier as the target of identification, performs entity identification by obtaining and comparing the attribute information of that entity identifier, and generates an entity identifier group for the query through the entity identification (S418). That is, when a new entity name or a new entity identifier exists, the entity identification result retrieval apparatus acquires attribute information of the new entity identifier from the multiple ontology database and identifies the entity using the acquired attribute information. When the entity is identified in this way, entity identifier groups containing the new entity identifiers are created. This entity identification process is not performed on every entity identifier every time; it is performed only on entity identifiers that, within the list of entity identifiers collected through the user's query, have been updated or have not yet gone through the entity identification process. How the entity identification result retrieval apparatus identifies an entity will be described in detail with reference to FIG. 5.

When S418 is performed, the entity identification result search apparatus updates the triple storage module by selecting a representative entity identifier or a representative entity name for each entity identifier group generated through the entity identification (S420). That is, the entity identification result retrieval apparatus selects a representative entity identifier and a representative entity name of each group by using an entity identifier belonging to each group and a statistical value of the entity name for the entity identifier group on which the entity identification is completed.
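One way to read "statistical value of the entity name" in S420 is as a simple frequency count. The sketch below assumes exactly that, which the patent does not confirm; the function name `select_representative` and the `pinned` parameter (standing in for a name the user fixed through the editing command) are hypothetical.

```python
from collections import Counter


def select_representative(names, pinned=None):
    """Pick a representative name for one entity identifier group.

    names: entity names observed for the identifiers in the group.
    pinned: a name set by the user through editing; if given, it wins
            regardless of the statistics (assumed behaviour).
    Ties are broken by first occurrence, since Counter.most_common keeps
    equal counts in insertion order.
    """
    if pinned is not None:
        return pinned
    if not names:
        return None
    return Counter(names).most_common(1)[0][0]


# Hypothetical example data.
observed = ["Hong Gil-dong", "Gildong Hong", "Hong Gil-dong"]
representative = select_representative(observed)
```

The same function could be applied to identifiers instead of names to choose a representative entity identifier.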

After performing S420, the entity identification result retrieval apparatus obtains the information on the updated entity identification result from the triple storage module and visualizes it (S422), and then proceeds to S414.

If the determination in S404 shows that the identification result of the query word does not exist in the triple storage module, the entity identification result retrieval apparatus obtains the identification result of the query word from the multiple ontology database and visualizes it (S424).
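The branching of FIG. 4 (S402 to S424) can be summarized as a small control-flow sketch. Everything here is illustrative: the triple storage module is reduced to a dictionary from query to a URI list, `fetch_external` and `identify` are hypothetical callables standing in for the multiple ontology database lookup and the identification of FIG. 5, and the update in S420 is simplified to appending the new URIs.

```python
def search_identification_result(query, triple_store, fetch_external, identify):
    """Return (step label, result) for one query, following FIG. 4.

    triple_store: dict mapping a query to a list of known URIs (assumed form).
    fetch_external: callable(query) -> list of URIs from external ontologies.
    identify: callable(list of new URIs) -> entity identifier groups.
    """
    stored = triple_store.get(query)                  # S404
    if stored is None:
        return "S424", fetch_external(query)          # nothing stored: show external result

    fetched = fetch_external(query)                   # S406
    known = set(stored)
    new_uris = [u for u in fetched if u not in known] # S408, S410
    if not new_uris:
        return "S412", stored                         # lists match: show stored result

    groups = identify(new_uris)                       # S418
    triple_store[query] = stored + new_uris           # S420 (simplified update)
    return "S422", groups                             # show updated result


# Hypothetical usage with stub callables.
demo_store = {"hong": ["u1"]}
demo_step, demo_result = search_identification_result(
    "hong", demo_store, lambda q: ["u1", "u2"], lambda uris: [uris]
)
```

The returned step label is only there to make the chosen branch visible; a real implementation would hand the result to the visualization step instead.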

FIG. 5 is a flowchart illustrating a method by which the entity identification result retrieval apparatus according to the present invention identifies an entity.

Referring to FIG. 5, the entity identification result retrieval apparatus acquires attribute information of a new entity identifier to be identified from the multiple ontology database (S502), and assigns a group to the entity identifier according to entity type by using the attribute representing a type in the acquired attribute information (S504). That is, the entity identification result retrieval apparatus identifies, among the attribute information of the new entity identifier, the attribute that distinguishes the type of the entity, through rdf:type or the parent-child relationship between explicit type classes, and assigns the new entity identifier to the identified entity type.

For example, referring to Table 1, if a subclass of foaf:person exists, that subclass can be determined to be a class denoting a person. That is, if kisti:person, class 1 belonging to Ontology A, is a subclass of foaf:person, the entity identification result retrieval apparatus identifies and stores the fact that kisti:person is an entity type corresponding to a person, and uses it later when determining the type of an entity.

kisti:person rdf:type Identity:Person
foaf:person rdf:type Identity:Person
kisti:institution rdf:type Identity:Institution
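The type mapping of Table 1 lends itself to a lookup table. The sketch below assumes the three mappings listed above, written as prefixed names without spaces, and uses hypothetical function names; how the apparatus actually stores the mapping is not stated in the source.

```python
# Mapping from source-ontology classes to the shared identity types (Table 1).
TYPE_MAP = {
    "kisti:person": "Identity:Person",
    "foaf:person": "Identity:Person",
    "kisti:institution": "Identity:Institution",
}


def entity_type(rdf_types):
    """Return the identity type for the first recognized rdf:type value, else None."""
    for type_name in rdf_types:
        if type_name in TYPE_MAP:
            return TYPE_MAP[type_name]
    return None


def group_by_type(entities):
    """Group URIs by identity type (first grouping, S504).

    entities: dict mapping a URI to the list of its rdf:type values.
    """
    groups = {}
    for uri, rdf_types in entities.items():
        groups.setdefault(entity_type(rdf_types), []).append(uri)
    return groups


# Hypothetical example data.
typed = group_by_type({
    "u1": ["foaf:person"],
    "u2": ["kisti:institution"],
    "u3": ["kisti:person"],
})
```

Identifiers whose types are not in the table end up under the key `None`, which makes unmapped classes easy to spot.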

When S504 is performed, the entity identification result retrieval apparatus loads, from the attribute information table, the same-attribute mapping information and the representative attribute for each group formed according to entity type (S506). That is, the entity identification result retrieval apparatus loads the same-attribute mapping information (e.g., kisti:hasCreator and foaf:maker are the same attribute) and the representative attributes (e.g., the representative attribute of a person is the "foaf:name" attribute of the FOAF ontology).
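The attribute information table loaded in S506 can be sketched as two small dictionaries. The kisti:hasCreator / foaf:maker equivalence and the foaf:name representative attribute come from the text; treating foaf:maker as the canonical form, and the `kisti:name` entry, are assumptions added only to make the example work.

```python
# Same-attribute mapping: attribute name -> canonical attribute name.
SAME_ATTRIBUTE = {
    "kisti:hasCreator": "foaf:maker",  # stated in the text as the same attribute
    "kisti:name": "foaf:name",         # hypothetical entry for illustration
}

# Representative attribute per entity type.
REPRESENTATIVE_ATTRIBUTE = {
    "Identity:Person": "foaf:name",
}


def canonical(attribute):
    """Map an attribute name to its canonical equivalent, if one is known."""
    return SAME_ATTRIBUTE.get(attribute, attribute)


def representative_value(entity_type, attributes):
    """Return the value of the representative attribute for an entity, or None.

    attributes: dict mapping attribute names (from any ontology) to values.
    """
    wanted = REPRESENTATIVE_ATTRIBUTE.get(entity_type)
    if wanted is None:
        return None
    for name, value in attributes.items():
        if canonical(name) == wanted:
            return value
    return None


# Hypothetical usage.
rep = representative_value("Identity:Person", {"kisti:name": "Hong Gil-dong"})
```

Canonicalizing attribute names first means the later similarity step can compare values from different ontologies without knowing where each came from.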

When S506 is performed, the entity identification result retrieval apparatus regroups the groups formed by entity type based on the representative attribute (S508). That is, the entity identification result retrieval apparatus calculates the similarity of the strings corresponding to the attribute values of the representative attribute of the entity identifiers for each entity type, and places entity identifiers whose similarity is equal to or greater than a predetermined threshold into the same group.

For example, similarity is calculated on the person name in the case of people and on the company name in the case of companies, and grouping is performed using the calculated result. In this case, the Jaro-Winkler distance string comparison method, which is suitable for comparing short strings, is used.
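The Jaro-Winkler comparison and the threshold grouping of S508 can be sketched as follows. The similarity functions follow the standard Jaro-Winkler definition (prefix scale 0.1, prefix length capped at 4). The threshold of 0.9 and the greedy one-pass grouping against the first member of each group are assumptions, since the patent only speaks of a "predetermined threshold".

```python
def jaro(s1, s2):
    """Jaro similarity between two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro-Winkler similarity: Jaro boosted by a common prefix of up to 4 chars."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def group_by_name(items, threshold=0.9):
    """Greedy grouping of (uri, name) pairs by name similarity (assumed strategy)."""
    groups = []
    for uri, name in items:
        for group in groups:
            if jaro_winkler(name.lower(), group[0][1].lower()) >= threshold:
                group.append((uri, name))
                break
        else:
            groups.append([(uri, name)])
    return groups
```

For instance, `group_by_name([("u1", "Hong Gil-dong"), ("u2", "Hong Gildong"), ("u3", "Kim Pyung")])` places the two Hong variants in one group and the third name in its own group. A greedy pass is order-dependent; a production system might compare against all members or use clustering instead.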

In addition, the entity identification result retrieval apparatus normalizes the representative attribute names into a predetermined format according to a predetermined rule, and then groups them through similarity calculation. For example, in the case of person names, the similarity is calculated through string comparison that takes the first name, middle name, and last name into account.

In addition, in the case of the institution name, similarity is calculated through string comparisons considering notation names such as ".inc", "co", and "INC".
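The normalization applied before the similarity calculation might look like the sketch below. The suffix list extends the ".inc", "co", and "INC" examples from the text with "corp" and "ltd", which are assumptions, and the person-name splitting assumes Western "First Middle Last" or "Last, First Middle" order, which would not suit names written family-name-first such as "Hong Gil-dong".

```python
import re

# Suffix tokens ignored when comparing institution names.
# "inc" and "co" come from the text; "corp" and "ltd" are hypothetical additions.
LEGAL_SUFFIXES = {"inc", "co", "corp", "ltd"}


def normalize_institution(name):
    """Lower-case an institution name and drop trailing legal-form tokens."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def split_person_name(name):
    """Split a person name into (first, middle, last) under a Western-order assumption."""
    if "," in name:
        last, _, rest = name.partition(",")
        given = rest.split()
        last = last.strip()
    else:
        tokens = name.split()
        given = tokens[:-1]
        last = tokens[-1] if tokens else ""
    first = given[0] if given else ""
    middle = " ".join(given[1:])
    return first, middle, last
```

With these helpers, "ACME Co., INC" and "Acme Inc." both normalize to "acme", and "Smith, John Q." and "John Q. Smith" split into the same three parts, so the subsequent string comparison is not distorted by notation differences.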

After performing S508, the entity identification result retrieval apparatus obtains the associated attribute values of the entity identifiers belonging to each group from the multiple ontology database, compares them, and divides each group into subgroups based on the comparison result, thereby determining the entity identifier group for the query (S510).

That is, the entity identification result retrieval apparatus obtains an association attribute value for an entity identifier belonging to each group from the multiple ontology database and checks whether entity identifiers having the same association attribute value exist. If there are entity identifiers having the same attribute value as the result of the checking, the entity identification result retrieval apparatus determines the entity identifier group for the query by dividing the entity identifiers having the same attribute value into subgroups.
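The subgrouping of S510 can be sketched with a union-find structure that links identifiers sharing at least one associated attribute value. Linking on any single shared (attribute, value) pair, and letting links chain transitively, are assumptions; the patent says only that identifiers "having the same attribute value" are placed in a subgroup and that the group is kept as it is when none are found.

```python
from collections import defaultdict


def split_into_subgroups(group):
    """Split one group into subgroups of identifiers that share attribute values.

    group: dict mapping a URI to a set of (attribute, value) pairs.
    Returns a sorted list of sorted URI lists. If no value is shared by any
    two identifiers, the original group is returned unchanged as one list.
    """
    parent = {uri: uri for uri in group}

    def find(uri):
        # Follow parent links to the root, compressing the path on the way.
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    owners = defaultdict(list)
    for uri, pairs in group.items():
        for pair in pairs:
            owners[pair].append(uri)

    any_shared = False
    for uris in owners.values():
        for other in uris[1:]:
            any_shared = True
            parent[find(other)] = find(uris[0])

    if not any_shared:
        return [sorted(group)]  # keep the group from the previous step

    clusters = defaultdict(list)
    for uri in group:
        clusters[find(uri)].append(uri)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical example: u1 and u2 share an e-mail address, u3 does not.
same_name_group = {
    "u1": {("email", "hong@a.example")},
    "u2": {("email", "hong@a.example"), ("work", "KISTI")},
    "u3": {("email", "gd@b.example")},
}
subgroups = split_into_subgroups(same_name_group)
```

A stricter variant could require all listed attributes to match, as in the "workplace" and "e-mail address" example earlier in the description.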

Through this process, the entity identification result retrieval apparatus can identify entity identifiers that were classified into different groups, or that were in the same group but had no mapped attribute, during the earlier grouping based on entity type or entity name.

That is, when entity identifiers 1 and 2, which belong to different groups, are both mapped to the same association attribute value A, the entity identification result retrieval apparatus obtains the association attribute values of A and checks whether entity identifiers 1 and 2 are both included among them.

For example, if entity identifiers 1 and 2 have different author names but both have an author relationship with paper A, the entity identification result retrieval apparatus collects the associated attributes of paper A and checks that entity identifiers 1 and 2 are both connected to it by the author relationship.

When the S510 is performed, the object identification result retrieval apparatus repeats the steps S502 to S510 to verify the object identification result and finalizes the object identifier group (S512).

Thus, those skilled in the art will appreciate that the present invention may be embodied in other specific forms without departing from its spirit or essential characteristics. It is therefore to be understood that the embodiments described above are to be considered in all respects only as illustrative and not restrictive. The scope of the present invention is defined by the appended claims rather than the detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents are to be construed as being included within the scope of the present invention.

100: object identification result search device 110: communication module
120: user interface module 130: triple storage module
140: URI list comparison module 150: object identification module
160: representative name selection module 170: visualization module
180: edit module 300: multiple ontology database

Claims (28)

A multiple ontology database in which attribute information of entities is stored as an ontology; And
an entity identification result retrieval apparatus which, when a query is input, obtains an identification result of the query from a triple storage module, compares the obtained identification result with an identification result of the query obtained from the multiple ontology database to determine whether entity identification is necessary, and, if entity identification is necessary, identifies the entities by obtaining and comparing attribute information of the entities to be identified from the multiple ontology database and provides result information according to the entity identification,
wherein the entity identification result retrieval apparatus outputs an entity identification result providing screen including at least one of a query input region, an ontology database association command, an entity identifier group summary list, an entity identifier group summary graph, a detailed entity identifier list for each group, an identification command, and an editing command, and, when editing by at least one of modifying, deleting, and moving the result information according to the entity identification is performed through the entity identification result providing screen, stores and updates the edited entity identification result in the triple storage module, and wherein the updated entity identification result is used later for entity identification of the query.
The system of claim 1,
wherein the triple storage module stores the entity identification result for each entity in triple form.
The system of claim 1,
wherein the entity identification result retrieval apparatus compares the entity identifier list according to the identification result from the triple storage module with the entity identifier list according to the identification result from the multiple ontology database and, if the entity identifier list obtained from the multiple ontology database includes a new entity identifier, determines that entity identification is necessary and takes the new entity identifier as the entity to be identified.
The system of claim 1,
wherein the entity identification result retrieval apparatus groups the entities to be identified according to entity type by using an attribute indicating the type in the attribute information obtained from the multiple ontology database, regroups the groups formed according to entity type based on the representative attribute of each entity, obtains the attribute values of the entities in each regrouped group from the multiple ontology database and compares them, and divides each group into subgroups based on the comparison result to generate an entity identifier group.
The system of claim 1,
wherein the entity identification result retrieval apparatus selects a representative entity identifier or a representative entity name for each entity identifier group for the query, and provides, as the result information according to the entity identification, entity identifier group summary information including at least one of the number of entities, the entity type, the representative entity name, and the representative entity identifier for each entity identifier group.
(deleted)
A triple storage module storing entity identification results for each entity in triple form;
an entity identifier list comparison module which determines whether an identification result of a query word exists in the triple storage module and, if it exists, compares the identification result of the query word obtained from the triple storage module with the identification result of the query word obtained from a multiple ontology database to determine whether entity identification is necessary;
an entity identification module which, when the entity identification is necessary, performs entity identification by obtaining and comparing attribute information on a new entity identifier and generates an entity identifier group for the query through the entity identification;
a representative name selection module which selects and applies a representative name for each generated entity identifier group, and stores the entity identification result to which the representative name is applied in the triple storage module to update it;
a visualization module which obtains information on the entity identification result from the triple storage module and visualizes an entity identification result providing screen including at least one of a query input area, an ontology database linkage command, an entity identifier group summary list, an entity identifier group summary graph, a detailed entity identifier list for each group, an identification command, and an editing command; and
an editing module which, when editing by at least one of modification, deletion, and movement of the entity identification result is performed through the visualized entity identification result providing screen, stores the editing result in the triple storage module as the entity identification result to update it, so that the updated entity identification result is later used for entity identification of the query;
An entity identification result retrieval apparatus comprising the above.
(deleted)
(deleted)
The apparatus of claim 7,
wherein the entity identifier list comparison module, when the identification result of the query word does not exist in the triple storage module, obtains an identification result of the query word from the multiple ontology database and has it visualized through the visualization module.
The apparatus of claim 7,
wherein the entity identifier list comparison module compares the entity identifier list according to the identification result of the query word obtained from the triple storage module with the entity identifier list according to the identification result of the query word obtained from the multiple ontology database and, when the comparison shows that the entity identifier list obtained from the multiple ontology database includes a new entity identifier, determines that the entity identification is necessary and takes the new entity identifier as the entity to be identified.
The apparatus of claim 7,
wherein the triple storage module stores entity identifier group summary information and edited entity identifier group summary information for each entity.
The apparatus of claim 7,
wherein the query is in the form of an entity name or a URI.
The apparatus of claim 7,
wherein the entity identification module obtains the attribute information of the new entity identifier from the multiple ontology database and groups it based on entity type, regroups each group formed according to entity type based on the representative attribute of each entity, obtains the associated attribute value of each entity in each regrouped group from the multiple ontology database and compares them, and divides each group into subgroups based on the comparison result to determine an entity identifier group for the query.
The apparatus of claim 7,
wherein the representative name selection module selects a representative identifier or representative entity name for each generated entity identifier group by using statistical values of the entity identifiers and entity names of that group.
The apparatus of claim 7,
wherein the information on the entity identification result is entity identifier group summary information including at least one of the number of entities, the entity type, the representative entity name, and the representative entity identifier for each entity identifier group.
(deleted)
The apparatus of claim 7,
wherein, when the identification command is selected on the entity identification result providing screen, information on the entity identification result updated in the triple storage module is output, and
when one entity identifier list is selected from the detailed entity identifier list for each group, a verification result of the entity identification is output to a predetermined region of the entity identification result providing screen.
The apparatus of claim 7,
wherein, by the editing command, at least one of editing the representative entity name or representative entity identifier of an entity identifier group, merging different groups, dividing one group into different groups, and moving an entity identifier belonging to a specific group to another group is performed.
A method by which an entity identification result retrieval apparatus provides an entity identification result for a query, the method comprising:
(a) when a query is input, determining whether an identification result of the query exists in a triple storage module;
(b) when the determination shows that the identification result exists, comparing the identification result of the query word obtained from the triple storage module with the identification result of the query word obtained from a multiple ontology database;
(c) when the comparison shows that the identification result obtained from the multiple ontology database includes a new entity identifier, performing entity identification by obtaining and comparing attribute information on the new entity identifier, and generating an entity identifier group for the query through the entity identification;
(d) selecting and applying a representative name for each generated entity identifier group, and storing the entity identification result to which the representative name is applied in the triple storage module to update it;
(e) obtaining information on the entity identification result from the triple storage module and visualizing an entity identification result providing screen including at least one of a query input area, an ontology database linkage command, an entity identifier group summary list, an entity identifier group summary graph, a detailed entity identifier list for each group, an identification command, and an editing command; and
(f) when editing by at least one of modification, deletion, and movement of the entity identification result is performed through the entity identification result providing screen, storing the editing result in the triple storage module as the entity identification result to update it.
delete delete 21. The method of claim 20,
In the step (b)
As a result of the determination in step (a), when the identification result for the query word does not exist in the triple storage module, the identification result search for the individual search word is collected and visualized from the multiple ontology database. Way.
The method of claim 20,
wherein step (b) comprises comparing the entity identifier list of the identification result from the triple storage module with the entity identifier list of the identification result from the multiple ontology database.
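One way to read this list comparison is as a difference between two identifier lists. The sketch below assumes identifiers are plain strings and that only identifiers missing from the stored list are of interest; the function name is hypothetical.

```python
def new_entity_identifiers(stored_ids, ontology_ids):
    """Return identifiers present in ontology_ids but not in stored_ids.

    The order of ontology_ids is preserved and duplicates are dropped.
    """
    seen = set(stored_ids)
    result = []
    for uri in ontology_ids:
        if uri not in seen:
            seen.add(uri)  # also guards against duplicates in ontology_ids
            result.append(uri)
    return result
```

An empty return value means the stored identification result already covers everything the ontology databases report.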
The method of claim 20,
wherein the query is in the form of an entity name or a URI.
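A system accepting both forms has to tell them apart somehow. The claim does not say how; the sketch below is one possible heuristic, assuming that only `http`, `https`, and `urn` schemes count as URIs. The function name and the scheme list are assumptions.

```python
from urllib.parse import urlparse

# Assumed set of schemes treated as URIs; anything else is taken as a name.
URI_SCHEMES = {"http", "https", "urn"}

def query_kind(query):
    """Classify a query string as "uri" or "name" (heuristic sketch)."""
    parsed = urlparse(query.strip())
    return "uri" if parsed.scheme in URI_SCHEMES else "name"
```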
The method of claim 20,
wherein step (c) comprises: obtaining the attribute information of the new entity identifier from the multiple ontology database and grouping the entities by entity type; regrouping each entity-type group by the representative attribute of each entity; obtaining from the multiple ontology database, and comparing, the associated attribute values of each entity in each regrouped group; and dividing each group into subgroups based on the comparison result to determine the entity identifier groups for the query.
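The three grouping stages can be sketched as nested grouping. The claim does not specify how associated attribute values are compared, so this sketch assumes exact equality; the record fields `type`, `representative`, and `associated` are hypothetical names for the entity type, the representative attribute, and the associated attribute value.

```python
from collections import defaultdict

def group_entity_identifiers(entities):
    """Group entity records in three stages (illustrative sketch).

    Each record is a dict with the hypothetical keys
    "id", "type", "representative", and "associated".
    """
    # Stage 1: group by entity type.
    by_type = defaultdict(list)
    for entity in entities:
        by_type[entity["type"]].append(entity)

    groups = {}
    for entity_type, members in by_type.items():
        # Stage 2: regroup each type group by representative attribute.
        by_rep = defaultdict(list)
        for entity in members:
            by_rep[entity["representative"]].append(entity)

        # Stage 3: split into subgroups by associated attribute value
        # (exact match is an assumption standing in for the comparison).
        for rep, same_rep in by_rep.items():
            for entity in same_rep:
                key = (entity_type, rep, entity["associated"])
                groups.setdefault(key, []).append(entity["id"])
    return groups
```

With exact matching the result is the same as grouping by the three-part key directly; the stages are kept separate so that a fuzzier comparison could replace stage 3.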
The method of claim 20,
wherein step (d) comprises selecting, for each generated entity identifier group, a representative identifier or representative entity name using statistical values of the entity identifiers and entity names belonging to that group.
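The claim leaves the statistic open. As one plausible reading, the sketch below picks the most frequent entity name in a group; using frequency as the statistic, and the function name itself, are assumptions.

```python
from collections import Counter

def representative_name(names):
    """Pick the most frequent name in a group (frequency is an assumed statistic)."""
    if not names:
        raise ValueError("cannot pick a representative from an empty group")
    # most_common(1) returns a one-element list of (name, count) pairs.
    return Counter(names).most_common(1)[0][0]
```

The same function could be applied to a group's identifiers to choose a representative identifier.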
The method of claim 20,
wherein the entity identifier group summary information includes at least one of the number of entities, the entity type, the representative entity name, and the representative entity identifier for each entity identifier group.
KR1020110067703A 2011-07-08 2011-07-08 System and Method for searching of entity identification result KR101243056B1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020110067703A KR101243056B1 (en) 2011-07-08 2011-07-08 System and Method for searching of entity identification result
PCT/KR2011/007357 WO2013008978A1 (en) 2011-07-08 2011-10-05 Object identification result searching system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020110067703A KR101243056B1 (en) 2011-07-08 2011-07-08 System and Method for searching of entity identification result

Publications (2)

Publication Number Publication Date
KR20130005967A KR20130005967A (en) 2013-01-16
KR101243056B1 true KR101243056B1 (en) 2013-03-13

Family

ID=47506243

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020110067703A KR101243056B1 (en) 2011-07-08 2011-07-08 System and Method for searching of entity identification result

Country Status (2)

Country Link
KR (1) KR101243056B1 (en)
WO (1) WO2013008978A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130275709A1 (en) 2012-04-12 2013-10-17 Micron Technology, Inc. Methods for reading data from a storage buffer including delaying activation of a column select
US10467235B2 (en) * 2013-04-11 2019-11-05 The Boeing Company Identifying contextual results within associative memories
KR101586258B1 (en) 2014-09-30 2016-01-18 경북대학교 산학협력단 Conflict resolution method of patterns for generating linked data, recording medium and device for performing the method
KR102309375B1 (en) * 2019-06-26 2021-10-06 주식회사 카카오 Apparatus and method for knowledge graph indexing
US20240095105A1 (en) * 2022-09-20 2024-03-21 Sap Se Message query service and message generation for a social network

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20090012467A (en) * 2007-07-30 2009-02-04 한국과학기술정보연구원 System and method for providing integrated search using uniform resource identifier database

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100930617B1 (en) * 2008-04-08 2009-12-09 한국과학기술정보연구원 Multiple object-oriented integrated search system and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Paper (2011.06) *

Also Published As

Publication number Publication date
KR20130005967A (en) 2013-01-16
WO2013008978A1 (en) 2013-01-17

Similar Documents

Publication Publication Date Title
US20210286848A1 (en) Query language interoperability in a graph database
CN105706078B (en) Automatic definition of entity collections
US7702685B2 (en) Querying social networks
CN107092666B (en) For the system, method and storage medium of network
US10579678B2 (en) Dynamic hierarchy generation based on graph data
CN108027818A (en) Inquiry based on figure
EP2365447A1 (en) Data integration system
CN102262650B (en) The data base of link
Yeganeh et al. A framework for data quality aware query systems
KR101243056B1 (en) System and Method for searching of entity identification result
Spirin et al. People search within an online social network: Large scale analysis of facebook graph search query logs
Ivánová et al. Searching for spatial data resources by fitness for use
Ioannou et al. Probabilistic entity linkage for heterogeneous information spaces
Drăgan et al. Linking semantic desktop data to the web of data
Matuszka et al. Geodint: towards semantic web-based geographic data integration
US10944756B2 (en) Access control
US9659059B2 (en) Matching large sets of words
CN114880483A (en) Metadata knowledge graph construction method, storage medium and system
Mathew et al. An efficient index based query handling model for neo4j
KR101521112B1 (en) Apparatus and method for data linking and merging
Zhang et al. Semantic web and geospatial unique features based geospatial data integration
Yong-Xin et al. A novel method for data conflict resolution using multiple rules
Kuznetsov Scientific data integration system in the linked open data space
CN115803731A (en) Database management system and method for graph view selection of relational database databases
KR20130076348A (en) Method and apparatus for managing foaf data

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20160202

Year of fee payment: 4

FPAY Annual fee payment

Payment date: 20161228

Year of fee payment: 5

LAPS Lapse due to unpaid annual fee