CN113392202A - Knowledge graph-based question-answering system and method - Google Patents

Knowledge graph-based question-answering system and method

Publication number
CN113392202A
Authority
CN
China
Prior art keywords
engine
layer
data
kbqa
question
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110690975.1A
Other languages
Chinese (zh)
Inventor
张晓阳
沈栋
刘漱琰
丁贤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202110690975.1A
Publication of CN113392202A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

The specification relates to the technical field of knowledge graphs and discloses a knowledge graph-based question-answering system and method. The system comprises an application layer, a standard layer, an engine layer and a data layer. The application layer receives question data input by a user and sends the question data to the standard layer; the standard layer forwards the question data to the engine layer; a KBQA engine in the engine layer acquires a plurality of conceptualized question templates from a template database in the data layer, performs semantic parsing, linking and intent recognition on the received question data to determine a target conceptualized question template, and sends the target query statement configured in the target conceptualized question template to a real-time search engine in the data layer; the real-time search engine queries data in a graph database in the data layer according to the target query statement to obtain result data, and the KBQA engine generates a target answer from the result data. The system can handle the relatively complex question phrasings found in customer service scenarios and helps improve user experience.

Description

Knowledge graph-based question-answering system and method
Technical Field
The specification relates to the technical field of knowledge graphs, in particular to a question-answering system and a question-answering method based on knowledge graphs.
Background
A knowledge graph is a large-scale semantic network composed of entities, concepts, and the semantic relations between them. Nodes represent entities or concepts that exist in the real world, and edges represent the relations between them. A knowledge graph is generally expressed as triples of the form (head entity, relation, tail entity). As an important component of artificial intelligence technology, knowledge graphs have been widely applied to intelligent search, human-machine question answering, personalized recommendation and other areas because of their strong capabilities for interconnected organization, information retrieval and knowledge reasoning, and they provide a technical basis for knowledge organization and intelligent applications in fields such as healthcare and finance.
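The triple representation described above can be illustrated with a minimal Python sketch. The entity and relation names (`Branch A`, `located_in`, and so on) and the helper names are hypothetical and chosen only for illustration; they do not come from the specification.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One edge of the graph: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str


# Hypothetical toy facts, used only to illustrate the representation.
TRIPLES = [
    Triple("Branch A", "located_in", "City X"),
    Triple("City X", "part_of", "Province Y"),
    Triple("Branch A", "offers", "Foreign exchange"),
]


def build_index(triples):
    """Index tail entities by (head, relation) for direct lookup."""
    index = defaultdict(list)
    for t in triples:
        index[(t.head, t.relation)].append(t.tail)
    return index


def tails(index, head, relation):
    """Return every tail entity reachable from `head` via `relation`."""
    return index.get((head, relation), [])
```

Answering a simple question such as "where is Branch A located" then reduces to `tails(index, "Branch A", "located_in")`.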
At present, knowledge graphs are applied to intelligent question answering in organizations such as banks. Existing knowledge graph-based question-answering methods are mainly implemented in one of two ways: deep learning or template configuration. However, the deep learning approach has poor interpretability and its models are difficult to modify, so it cannot meet the requirements of remote bank customer service scenarios for high answer accuracy and for fast, accurate, real-time correction of model errors. The template-based approach is fast and simple, but cannot handle the relatively complex ways in which customers phrase questions in bank customer service scenarios.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the specification provides a question-answering system and a question-answering method based on a knowledge graph, and the question-answering system and the question-answering method can quickly and conveniently meet the requirements of complex scenes.
An embodiment of the present specification provides a knowledge graph-based question-answering system comprising an application layer, a standard layer, an engine layer and a data layer, wherein: the application layer is used for receiving question data input by a user and sending the question data to the standard layer; the standard layer is arranged between the application layer and the engine layer and is used for sending the question data to the engine layer; the data layer comprises a real-time search engine, a graph database and a template database, wherein the graph database is used for storing graph data and the template database is used for storing conceptualized question templates; the engine layer comprises a KBQA engine, wherein the KBQA engine is used for acquiring a plurality of conceptualized question templates from the template database, and is also used for performing semantic parsing, linking and intent recognition on the received question data so as to determine a target conceptualized question template from the plurality of conceptualized question templates and send the target query statement configured in the target conceptualized question template to the real-time search engine in the data layer; the real-time search engine is used for querying data in the graph database according to the target query statement to obtain result data corresponding to the question data and returning the result data to the KBQA engine; the KBQA engine is used for generating a target answer from the result data and returning the target answer to the standard layer; and the standard layer is used for returning the target answer to the application layer so that the target answer is displayed to the user in a question-answer interaction page.
In one embodiment, the KBQA engine is configured to semantically parse the question data based on a deep learning model.
In one embodiment, the engine layer further includes a graph pipeline for building a knowledge graph based on structured data.
In one embodiment, the application layer is used for providing a question-answer interaction page and an operation and maintenance interface; the question-answer interaction page is used for receiving question data input by a user, and the operation and maintenance interface is used for adding and/or modifying synonyms and conceptualized question templates.
In one embodiment, the standard layer is configured to perform first preprocessing on the received question data and send the preprocessed question data to the engine layer; and the standard layer is configured to perform second preprocessing on the received target answer and return the preprocessed target answer to the application layer.
In one embodiment, the standard layer is also used to synchronously update received entity synonyms into the graph pipeline and the KBQA engine in real time.
In one embodiment, the standard layer is further used for updating a newly added ontology synonym into the KBQA engine in real time; the KBQA engine is further configured to store updated ontology synonyms in the template database.
In one embodiment, the standard layer is also used for updating a newly added conceptualized question template to the KBQA engine in real time; the KBQA engine is also configured to store updated conceptualized question templates in the template database.
An embodiment of the specification further provides a knowledge graph-based question-answering method, which is applied to a knowledge graph-based question-answering system comprising an application layer, a standard layer, an engine layer and a data layer. The method comprises the following steps: the application layer receives question data input by a user and sends the question data to the standard layer; the standard layer starts a KBQA engine in the engine layer and sends the question data to the KBQA engine; the KBQA engine reads a plurality of conceptualized question templates from the template database in the data layer; the KBQA engine performs semantic parsing and linking on the question data so as to determine a target conceptualized question template from the plurality of conceptualized question templates; the KBQA engine sends the query statement configured in the target conceptualized question template to a real-time search engine in the data layer; the real-time search engine searches the graph data in the graph database in the data layer according to the query statement and returns the resulting data to the KBQA engine; and the KBQA engine generates a target answer from the result data and returns the target answer to the application layer.
In one embodiment, the KBQA engine performing semantic parsing and linking on the question data to determine a target conceptualized question template from the plurality of conceptualized question templates includes: the KBQA engine performs semantic parsing on the question data based on a deep learning model so as to identify a plurality of features in the question data; the KBQA engine links each of the plurality of features to identify the entities, relationships and attributes of the knowledge graph that are mentioned in the question data; and a target conceptualized question template is determined from the plurality of conceptualized question templates based on the identified entities, relationships and attributes.
In the embodiments of the specification, a knowledge graph-based question-answering system is provided, which comprises an application layer, a standard layer, an engine layer and a data layer. The application layer receives question data input by a user and sends the question data to the standard layer, and the standard layer sends the question data to a KBQA engine in the engine layer. The KBQA engine acquires a plurality of conceptualized question templates from a template database, performs semantic parsing, linking and intent recognition on the received question data to determine a target conceptualized question template from the plurality of conceptualized question templates, and sends the target query statement configured in the target conceptualized question template to a real-time search engine in the data layer. The real-time search engine queries data in the graph database according to the target query statement to obtain result data corresponding to the question data, and returns the result data to the KBQA engine. The KBQA engine generates a target answer from the result data and returns the target answer to the standard layer, and the standard layer then returns the target answer to the application layer so that the target answer is displayed to the user in the question-answer interaction page. In this scheme, the standard layer isolates the application layer from the KBQA engine, so that the result returned by the KBQA engine does not depend on the upper application layer and the upper application layer does not need to be aware of the underlying graph database; the application layer and the KBQA engine interact using the return result format of the standard layer. The real-time search engine isolates the KBQA engine from the graph database, and the two interact using a standard query language, so that the graph database can be updated in real time.
Because the bank customer service question-answering scenario is complex, the question data input by the user may not satisfy the restrictions in a template. In that case the KBQA engine can perform semantic recognition on the question data input by the user, which improves the robustness of the whole system. The question-answering system thus keeps the interpretability and maintainability of template rules while also generalizing to more complex ways of phrasing questions.
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification, are incorporated in and constitute a part of this specification, and are not intended to limit the specification. In the drawings:
FIG. 1 is a schematic diagram of a knowledge-graph based question-answering system in one embodiment of the present specification;
FIG. 2 is a schematic diagram of a knowledge-graph based question-answering system in one embodiment of the present specification;
FIG. 3 is a schematic diagram of an architecture of a knowledge-graph based question-answering system in one embodiment of the present specification;
FIG. 4 illustrates a flow diagram of a knowledge-graph based question-answering method in one embodiment of the present description;
FIG. 5 is a diagram illustrating a flow for creating a new graph in an embodiment of this description;
FIG. 6 is a diagram illustrating a graph-based question-answer usage flow in one embodiment of the present description;
FIG. 7 is a schematic diagram illustrating a graph real-time update flow in one embodiment of the present description;
FIG. 8 is a diagram illustrating an entity synonym batch update flow in one embodiment of the present description;
FIG. 9 is a diagram illustrating a real-time update flow of entity synonyms in one embodiment of the present description;
FIG. 10 is a diagram illustrating a bulk update flow of ontology synonyms in an embodiment of the present specification;
FIG. 11 is a diagram illustrating a real-time ontology synonym update flow in an embodiment of the present specification;
FIG. 12 is a diagram illustrating a question template batch update flow in one embodiment of the present description;
FIG. 13 is a schematic diagram illustrating a real-time question template update flow in an embodiment of the present specification.
Detailed Description
The principles and spirit of the present description will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely to enable those skilled in the art to better understand and to implement the present description, and are not intended to limit the scope of the present description in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present description may be embodied as a system, an apparatus, a method, or a computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
The embodiment of the specification provides a question-answering system based on a knowledge graph. Fig. 1 is a schematic structural diagram of a knowledge-graph-based question-answering system in an embodiment of the present specification.
As shown in fig. 1, the knowledge-graph-based question-answering system in the present embodiment may include an application layer 101, a standard layer 102, an engine layer 103, and a data layer 104. The application layer 101 may be configured to receive question data input by a user, and send the question data to the standard layer 102. For example, the question data may include text information or voice information input by the user. The standard layer 102 is interposed between the application layer 101 and the engine layer 103, and can be used for sending the question data to the engine layer 103. For voice data, the standard layer 102 may parse the voice data into textual information.
The data layer 104 may include a real-time search engine, a graph database and a template database. The graph database is used for storing graph data, and the template database is used for storing conceptualized question templates. A conceptualized question template is obtained by conceptualizing a question template and adding conditions and command parameters, so that it can adapt quickly and conveniently to the requirements of complex scenarios.
The engine layer 103 may include a KBQA (Knowledge Base Question Answering) engine. The KBQA engine may be used to obtain a plurality of conceptualized question templates from the template database. The KBQA engine may also be configured to perform semantic parsing, linking and intent recognition on the received question data to determine a target conceptualized question template from the plurality of conceptualized question templates. Specifically, the KBQA engine may perform semantic parsing on the question data to obtain a plurality of features in the question data. After obtaining the plurality of features, the KBQA engine links each of them to identify the entities, relationships and attributes of the knowledge graph that are mentioned in the question data. The KBQA engine may then determine a target conceptualized question template from the plurality of conceptualized question templates based on the identified entities, relationships and attributes.
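The parse, link and select steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionary-based `extract_features` stands in for the deep-learning parser mentioned elsewhere in the specification, and the vocabularies, identifiers, template names and query strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    name: str
    slots: tuple   # expected sequence of slot kinds, e.g. ("ENT", "REL")
    query: str     # query statement configured for this template


# Hypothetical vocabularies mapping surface forms to graph identifiers.
ENTITIES = {"branch a": "ent_001"}
RELATIONS = {"opening hours": "rel_010", "address": "rel_011"}

TEMPLATES = [
    Template("entity_relation", ("ENT", "REL"), "g.V(ent).out(rel)"),
    Template("entity_only", ("ENT",), "g.V(ent)"),
]


def extract_features(question):
    """Stand-in for semantic parsing: find known terms, in question order."""
    text = question.lower()
    vocab = list(ENTITIES) + list(RELATIONS)
    found = [(text.find(term), term) for term in vocab if term in text]
    return [term for _, term in sorted(found)]


def link(features):
    """Link each feature to an entity or relation identifier in the graph."""
    linked = []
    for feature in features:
        if feature in ENTITIES:
            linked.append(("ENT", ENTITIES[feature]))
        elif feature in RELATIONS:
            linked.append(("REL", RELATIONS[feature]))
    return linked


def select_template(linked):
    """Pick the first template whose slot sequence matches the linked kinds."""
    kinds = tuple(kind for kind, _ in linked)
    for template in TEMPLATES:
        if template.slots == kinds:
            return template
    return None
```

For example, `select_template(link(extract_features("Branch A opening hours")))` selects the `entity_relation` template, while a question that mentions no known term selects nothing.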
The KBQA engine may be configured to issue the target query statement configured in the target conceptualized question template to a real-time search engine in the data layer 104.
The real-time search engine is used for querying data in the graph database according to the target query statement to obtain result data corresponding to the question data and returning the result data to the KBQA engine. The real-time search engine can search and query entities, relationships and attributes, and retrieve the result data corresponding to the question data from the graph database.
The KBQA engine may generate a target answer from the received result data and return the target answer to the standard layer 102. The standard layer 102 is configured to return the target answer to the application layer 101, so as to present the target answer to the user in the question-answer interaction page.
In the question-answering system of this embodiment, the standard layer isolates the application layer from the KBQA engine, so that the result returned by the KBQA engine does not depend on the upper application layer and the upper application layer does not need to be aware of the underlying graph database; the two interact using the return result format of the standard layer. The real-time search engine isolates the KBQA engine from the graph database, and the two interact using a standard query language, so that the graph database can be updated in real time. Because the bank customer service question-answering scenario is complex, the question data input by the user may not satisfy the restrictions in a template. In that case the KBQA engine can perform semantic recognition on the question data input by the user, which improves the robustness of the whole system. The question-answering system thus keeps the interpretability and maintainability of template rules while also generalizing to more complex ways of phrasing questions.
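The layering described above can be sketched as a chain of small Python classes in which each layer only talks to the one directly beneath it. All class and method names are assumptions made for illustration, and the KBQA step is reduced to a trivial split on `" / "` instead of real parsing, linking and template selection.

```python
class GraphDatabase:
    """Toy in-memory graph store keyed by (entity, relation)."""

    def __init__(self, facts):
        self._facts = dict(facts)

    def lookup(self, entity, relation):
        return self._facts.get((entity, relation))


class RealTimeSearchEngine:
    """Sits between the KBQA engine and the graph database."""

    def __init__(self, graph_db):
        self._graph_db = graph_db

    def execute(self, query):
        # `query` is a hypothetical dict standing in for a target query statement.
        return self._graph_db.lookup(query["entity"], query["relation"])


class KBQAEngine:
    def __init__(self, search_engine):
        self._search = search_engine

    def answer(self, request):
        # Placeholder for parsing, linking and template selection.
        entity, relation = request["question"].split(" / ")
        result = self._search.execute({"entity": entity, "relation": relation})
        return {"answer": result}


class StandardLayer:
    """Normalizes requests going down and answers coming back up."""

    def __init__(self, engine):
        self._engine = engine

    def ask(self, text):
        reply = self._engine.answer({"question": text.strip()})
        if reply["answer"] is None:
            return "No answer found."
        return reply["answer"]


class ApplicationLayer:
    def __init__(self, standard_layer):
        self._standard = standard_layer

    def submit(self, user_input):
        return self._standard.ask(user_input)


def build_system(facts):
    """Wire the four layers together from the bottom up."""
    graph_db = GraphDatabase(facts)
    engine = KBQAEngine(RealTimeSearchEngine(graph_db))
    return ApplicationLayer(StandardLayer(engine))
```

Because `ApplicationLayer` never references `GraphDatabase`, the storage backend could be swapped without touching the front end, which is the decoupling the embodiment aims for.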
In some embodiments of the present description, the KBQA engine may be used to semantically parse the question data based on a deep learning model. Performing semantic parsing on the question data with a deep learning model can further improve the answer accuracy of the question-answering system, improve the robustness and adaptability of the system, and improve the user experience.
In some embodiments in this specification, the engine layer may further include a graph pipeline for constructing a knowledge graph based on structured data. The graph pipeline may be used to construct a bank outlet knowledge graph from the structured data. The graph pipeline may include a data preprocessing module, a search index building module and an incremental data asynchronous update module. The data preprocessing module can store a structured processing script for performing structured processing on the data. The structured processing script and the index building module run in the pipeline. The incremental data asynchronous update module is used for updating newly added data to the KBQA engine. In this way, the knowledge graph can be constructed from the structured data, and newly added data can be updated to the KBQA engine in real time.
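The three pipeline modules named above (data preprocessing, search index building, incremental asynchronous update) might be sketched as follows. This is an assumption-laden illustration: the row schema with `head`, `relation` and `tail` keys, the dictionary-based index and the background worker thread are choices made here, not details given in the specification.

```python
import queue
import threading


def preprocess(rows):
    """Turn structured rows into triples, dropping incomplete rows."""
    return [
        (r["head"].strip(), r["relation"].strip(), r["tail"].strip())
        for r in rows
        if r.get("head") and r.get("relation") and r.get("tail")
    ]


def build_search_index(triples):
    """Index (relation, tail) pairs by lower-cased head entity name."""
    index = {}
    for head, relation, tail in triples:
        index.setdefault(head.lower(), []).append((relation, tail))
    return index


class IncrementalUpdater:
    """Applies newly added rows to an existing index on a background thread."""

    def __init__(self, index):
        self.index = index
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, rows):
        """Enqueue new rows and return immediately."""
        self._queue.put(rows)

    def _run(self):
        while True:
            rows = self._queue.get()
            for head, relation, tail in preprocess(rows):
                self.index.setdefault(head.lower(), []).append((relation, tail))
            self._queue.task_done()

    def wait(self):
        """Block until every submitted batch has been applied."""
        self._queue.join()
```

A caller builds the initial index once with `build_search_index(preprocess(rows))` and then feeds later additions through `IncrementalUpdater.submit`, so queries are not blocked while new data is being indexed.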
In some embodiments of the present description, the application layer may be configured to provide a question-answer interaction page and an operation and maintenance interface. The question-answer interaction page may be configured to receive question data input by a user, and the operation and maintenance interface may be configured to add and/or modify synonyms and conceptualized question templates. With the question-answer interaction page, users can conveniently input question data and view the answers; with the operation and maintenance interface, operation and maintenance personnel can conveniently update synonyms and question templates in real time, so that model errors can be corrected quickly.
In some embodiments in this specification, the standard layer may be configured to perform first preprocessing on the received question data and send the preprocessed question data to the engine layer; the standard layer may also be configured to perform second preprocessing on the received target answer and return the preprocessed target answer to the application layer.
Specifically, the standard layer may perform first preprocessing on the question data sent by the application layer, so that the preprocessed question data is in a format that the engine layer can recognize. The standard layer then sends the preprocessed question data to the engine layer. The standard layer may perform second preprocessing on the target answer returned by the engine layer, so that the preprocessed target answer can be displayed to the user by the application layer, and then return the preprocessed answer data to the application layer. In this way, the standard layer isolates the application layer from the engine layer and decouples the KBQA engine from the application layer.
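The two preprocessing steps can be illustrated with a short sketch. The specification does not say what the formats are, so the request dictionary, the JSON reply and the `status` field below are purely hypothetical stand-ins for "a format the engine layer can recognize" and "a format the application layer can display".

```python
import json


def first_preprocess(raw_question):
    """Normalize user input into the (assumed) request format of the engine."""
    text = " ".join(raw_question.split())  # collapse runs of whitespace
    return {"question": text}


def second_preprocess(engine_answer):
    """Convert the engine's answer into the (assumed) display format."""
    answer = engine_answer.get("answer", "")
    payload = {"answer": answer, "status": "ok" if answer else "no_answer"}
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)


def handle(raw_question, engine):
    """Standard-layer round trip: preprocess, call the engine, postprocess."""
    request = first_preprocess(raw_question)
    return second_preprocess(engine(request))
```

Here `engine` is any callable that accepts the request dictionary, so the application layer only ever sees the JSON string and the engine only ever sees the normalized request.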
In some embodiments of the present description, the standard layer may also be used to synchronously update received entity synonyms into the graph pipeline and the KBQA engine in real time. When entity synonyms need to be updated, operation and maintenance personnel can input the newly added entity synonyms through the operation and maintenance page of the application layer. The application layer sends the newly added entity synonyms to the standard layer, and the standard layer updates them into the graph pipeline and the KBQA engine in real time. Updating entity synonyms in real time allows the knowledge graph to be updated in real time, which further improves the accuracy of the question-answering system and the user experience.
In some embodiments in the present description, the standard layer may also be used to update a newly added ontology synonym to the KBQA engine in real time; the KBQA engine may also be used to store updated ontology synonyms in the template database. When an ontology synonym needs to be updated, operation and maintenance personnel can input the newly added ontology synonym through the operation and maintenance page of the application layer. The application layer sends the newly added ontology synonym to the standard layer, the standard layer updates it into the KBQA engine in real time, and the KBQA engine stores it in the template database. Updating ontology synonyms in real time allows the knowledge graph to be updated in real time, which further improves the accuracy of the question-answering system and the user experience.
In some embodiments in the present description, the standard layer may also be used to update a newly added conceptualized question template to the KBQA engine in real time; the KBQA engine may also be used to store updated conceptualized question templates in the template database. When a question template needs to be updated, operation and maintenance personnel can input the newly added conceptualized question template through the operation and maintenance page of the application layer. The application layer sends the newly added conceptualized question template to the standard layer, the standard layer updates it into the KBQA engine in real time, and the KBQA engine stores it in the template database. Updating conceptualized question templates in real time allows the knowledge graph to be updated in real time, which further improves the accuracy of the question-answering system and the user experience.
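The three real-time update paths described above differ mainly in where the new item ends up. A minimal sketch, assuming in-memory dictionaries for every store and hypothetical class and method names, might look like this:

```python
class TemplateDatabase:
    """Stand-in for the template database (MySQL in one embodiment)."""

    def __init__(self):
        self.ontology_synonyms = {}
        self.templates = {}


class GraphPipeline:
    def __init__(self):
        self.entity_synonyms = {}

    def add_entity_synonym(self, synonym, entity_id):
        self.entity_synonyms[synonym] = entity_id


class KBQAEngine:
    def __init__(self, template_db):
        self._db = template_db
        self.entity_synonyms = {}
        self.ontology_synonyms = {}
        self.templates = {}

    def add_entity_synonym(self, synonym, entity_id):
        self.entity_synonyms[synonym] = entity_id

    def add_ontology_synonym(self, synonym, concept):
        # Update the live cache, then persist to the template database.
        self.ontology_synonyms[synonym] = concept
        self._db.ontology_synonyms[synonym] = concept

    def add_template(self, name, template):
        self.templates[name] = template
        self._db.templates[name] = template


class StandardLayer:
    """Routes each kind of update to the components that must receive it."""

    def __init__(self, pipeline, engine):
        self._pipeline = pipeline
        self._engine = engine

    def update_entity_synonym(self, synonym, entity_id):
        # Entity synonyms go to both the graph pipeline and the KBQA engine.
        self._pipeline.add_entity_synonym(synonym, entity_id)
        self._engine.add_entity_synonym(synonym, entity_id)

    def update_ontology_synonym(self, synonym, concept):
        self._engine.add_ontology_synonym(synonym, concept)

    def update_template(self, name, template):
        self._engine.add_template(name, template)
```

In this sketch, entity synonyms fan out to two components, whereas ontology synonyms and templates pass through the KBQA engine, which both caches and persists them.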
The above method is described below with reference to a specific example, however, it should be noted that the specific example is only for better describing the present specification and should not be construed as an undue limitation on the present specification.
Referring to FIG. 2, a schematic structural diagram of a knowledge-graph-based question-answering system in an embodiment of the present specification is shown. As shown in FIG. 2, the question-answering system may include an application layer, a standard layer, an engine layer and a data layer. The application layer may include a question-answer interaction interface 201 and an operation and maintenance interface 202. The standard layer may include a Manager 203. The engine layer may include a KBQA engine 204 and a graph pipeline 205. The data layer may include a real-time search engine 206, a graph database 207 and MySQL (the template database) 208. The functions of the respective modules are described below.
The question-answer interaction interface 201 is the main interactive page used by customer service staff or customers and is the entry point for question answering.
The operation and maintenance interface 202 is used to add and modify synonyms and templates in the knowledge graph.
The Manager 203 is an intermediate layer between the front-end interfaces (the question-answer interaction interface 201 and the operation and maintenance interface 202) and the KBQA engine 204, and is responsible for tasks such as merging the results of the KBQA engine 204 and the real-time search engine 206.
The KBQA engine 204 is the core of knowledge graph question answering. It comprises a dialogue management module, a semantic parsing module, a linking module, an ES authentication forwarding module and the like. Dialogue management mainly stores the data obtained by each underlying module and manages state transitions between dialogue turns. Semantic parsing is responsible for analyzing the incoming utterance and matching it to the stored Gremlin query templates. The linking module is used for entity recognition linking, ontology recognition linking and the like. Entity recognition linking is primarily responsible for linking an entity mentioned in the question to the corresponding entity in the graph. Ontology recognition linking is primarily responsible for linking ontology fields appearing in the question to concepts in the graph. ES (Elasticsearch, a Lucene-based search engine) authentication forwarding is used to forward ES query requests.
The graph pipeline 205 is used primarily to construct a bank outlet knowledge graph from structured data. The structured processing script and the index building module run in the pipeline. The incremental data asynchronous update module is used for updating newly added data to the KBQA engine.
The real-time search engine 206 is responsible for searching and querying entities, ontologies and attributes.
The graph database 207 is used to store graph data.
MySQL 208 is used to store synonyms and conceptualized question templates.
Referring to FIG. 3, a schematic diagram illustrating an architecture of a knowledge-graph based question-answering system in one embodiment of this specification is shown. In the bank outlet customer service scenario, the users, the maintainers and the data storage parties of the system are different, so a new architecture needs to be designed to remove the coupling between them. As shown in FIG. 3, the standard layer isolates the application layer from the KBQA engine, so that the return structure of the KBQA engine does not depend on the upper application layer and the upper application layer is unaware of the underlying graph database; the two interact using the return structure format of the standard layer.
Further, as shown in FIG. 3, to isolate the KBQA engine from the graph database, the two interact using the standard Gremlin query language.
In processing KBQA query results, the operation and maintenance party is the designer of the application-layer application, and the user party is its user. The operation and maintenance party first understands the questions the user party will ask and the result content the user party expects, and designs a display front end for each question type and expected result form. The operation and maintenance party then configures a template for each question type, which determines the form that the expected result content takes in the standard layer, and designs a mapping layer that maps the intermediate result form of the standard layer into the form required for front-end display (the expected result content itself is unchanged). The user party then queries through the application and obtains the result corresponding to the question.
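The mapping layer described above can be illustrated with a minimal sketch. The field names (question_type, items), the widget names, and the outlet_list question type are assumptions made for illustration; the specification does not define the standard-layer format.

```python
from typing import Any, Callable, Dict

# Hypothetical standard-layer result: a question type plus the result content.
StandardResult = Dict[str, Any]


def outlet_list_view(result: StandardResult) -> Dict[str, Any]:
    """Map a standard-layer outlet list into a hypothetical front-end card format."""
    return {
        "widget": "card_list",
        "cards": [
            {"title": row["name"], "subtitle": row["address"]}
            for row in result["items"]
        ],
    }


# One mapper per question type, configured by the operation and maintenance party.
MAPPERS: Dict[str, Callable[[StandardResult], Dict[str, Any]]] = {
    "outlet_list": outlet_list_view,
}


def to_front_end(result: StandardResult) -> Dict[str, Any]:
    """Change only the form of the result; the content is passed through."""
    mapper = MAPPERS.get(result["question_type"])
    if mapper is None:
        # Fall back to plain text when no mapper is configured for the type.
        return {"widget": "text", "text": str(result["items"])}
    return mapper(result)
```

Because the front end depends only on the mapper output, the KBQA engine's return structure can change without touching the display code, which is the decoupling the standard layer is meant to provide.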
Referring to tables 1 and 2, a conceptualized question template and its corresponding query statement are shown.
TABLE 1
Figure BDA0003126159320000091
As shown in Table 1, [ENT][Rel] indicates that the first part of the template should be an entity in the knowledge graph (ENT is an abbreviation for entity) and the second part should be a relationship in the knowledge graph (Rel is an abbreviation for relationship). A constraint may also be added; for example, "ID of rel = XXX" indicates that the ID of the knowledge-graph relationship matched by the template must be XXX. Finally, the query statement corresponding to the conceptualized template is obtained, and this query statement can be used to query the corresponding result directly in the knowledge graph.
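Since Table 1 survives only as an image placeholder, the following is a hedged sketch of how an [ENT][Rel] template with a relationship-ID constraint might be turned into a Gremlin query string. The ConceptTemplate class, the binding names (ent, rel, rel_id), the ID value R001, and the query text are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConceptTemplate:
    slots: List[str]                 # e.g. ["ENT", "REL"]
    query: str                       # Gremlin text with {placeholders}
    constraints: Dict[str, str] = field(default_factory=dict)


def build_query(template: ConceptTemplate, bindings: Dict[str, str]) -> str:
    """Check the template constraints, then fill the query placeholders."""
    for key, required in template.constraints.items():
        if bindings.get(key) != required:
            raise ValueError(f"constraint failed: {key} must be {required}")
    # Plain string formatting is used for illustration only; a real system
    # should pass values as query bindings to avoid injection.
    return template.query.format(**bindings)


# Hypothetical template: "[ENT][REL]" where the relationship ID must be R001.
template = ConceptTemplate(
    slots=["ENT", "REL"],
    query="g.V().has('name', '{ent}').out('{rel}').values('name')",
    constraints={"rel_id": "R001"},
)
```

Calling build_query(template, {"ent": "Outlet A", "rel": "located_in", "rel_id": "R001"}) fills the two placeholders, while a different rel_id raises ValueError, mirroring the "ID of rel = XXX" constraint.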
TABLE 2
Figure BDA0003126159320000092
As shown in Table 2, the template corresponds to questions such as "bank outlets in Xinxiang City, Henan Province". The three parts in the template restriction statement are therefore all entities in the graph. As a further restriction, the first and second entities should carry labels such as province, city, district, or street, and the third entity should carry an outlet label.
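The label restriction for a three-entity template can be sketched as a simple set check. The label names and the data layout below are assumptions chosen to mirror the description; Table 2 itself is not recoverable from the text.

```python
from typing import List, Set

# Hypothetical label vocabulary for administrative regions.
REGION_LABELS: Set[str] = {"province", "city", "district", "street"}

# Allowed labels for each of the three entity slots of the template.
TEMPLATE_LABELS: List[Set[str]] = [REGION_LABELS, REGION_LABELS, {"outlet"}]


def satisfies(entity_labels: List[Set[str]]) -> bool:
    """Return True if every linked entity carries a label allowed for its slot."""
    if len(entity_labels) != len(TEMPLATE_LABELS):
        return False
    return all(
        bool(labels & allowed)
        for labels, allowed in zip(entity_labels, TEMPLATE_LABELS)
    )
```

For example, entities labeled province, city, and outlet satisfy the template, whereas a sequence whose second entity is an outlet does not.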
Considering that the bank customer service question-answering scenario is complex and that customers may not phrase questions as regularly as the templates require, this embodiment also introduces a model-based natural language semantic parsing method, which improves the robustness of the whole system. The resulting dual-channel design combines the interpretability and maintainability of template rules with the generalization ability of the model-based method, and can handle more complex phrasings.
Based on the same inventive concept, the embodiment of the specification also provides a question-answering method based on the knowledge graph. Referring to fig. 4, a flowchart of a knowledge-graph-based question-answering method in an embodiment of the present specification is shown, which is applied to a knowledge-graph-based question-answering system that includes an application layer, a standard layer, an engine layer, and a data layer. Although the present specification provides method operational steps or apparatus configurations as illustrated in the following examples or figures, more or fewer operational steps or modular units may be included in the methods or apparatus based on conventional or non-inventive efforts. In the case of steps or structures which do not logically have the necessary cause and effect relationship, the execution sequence of the steps or the module structure of the apparatus is not limited to the execution sequence or the module structure described in the embodiments and shown in the drawings. When the described method or module structure is applied in an actual device or end product, the method or module structure according to the embodiments or shown in the drawings can be executed sequentially or executed in parallel (for example, in a parallel processor or multi-thread processing environment, or even in a distributed processing environment).
Specifically, as shown in fig. 4, the method for question answering based on a knowledge graph provided by one embodiment of the present specification may include the following steps.
Step S401, the application layer receives the question data input by the user and issues the question data to the standard layer.
Step S402, the standard layer starts the KBQA engine in the engine layer and issues the question data to the KBQA engine.
Step S403, the KBQA engine reads a plurality of conceptualized question templates from the graph database in the data layer.
Step S404, the KBQA engine performs semantic parsing and linking on the question data so as to determine a target conceptualized question template from the plurality of conceptualized question templates.
Step S405, the KBQA engine issues the query statement configured in the target conceptualized question template to the real-time search engine in the data layer.
Step S406, the real-time search engine searches the graph data in the graph database in the data layer according to the query statement, and returns the queried result data to the KBQA engine.
Step S407, the KBQA engine generates a target answer according to the result data and returns the target answer to the application layer.
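The flow of steps S401 to S407 can be sketched with in-memory stand-ins for each layer. All class names, the prefix-based template matching, and the sample data are assumptions for illustration; the actual engine uses semantic parsing and linking rather than prefix matching.

```python
from typing import Dict, List, Tuple


class GraphDatabase:
    """In-memory stand-in for the graph database in the data layer."""

    def __init__(self) -> None:
        # (question prefix, query pattern) pairs play the role of templates.
        self.templates: List[Tuple[str, str]] = [("outlets in ", "outlets_in:{slot}")]
        self.data: Dict[str, List[str]] = {"outlets_in:Chaoyang": ["Outlet A", "Outlet B"]}


class RealTimeSearchEngine:
    def __init__(self, db: GraphDatabase) -> None:
        self.db = db

    def search(self, query: str) -> List[str]:
        return self.db.data.get(query, [])                      # S406


class KBQAEngine:
    def __init__(self, db: GraphDatabase, search: RealTimeSearchEngine) -> None:
        self.db, self.search = db, search

    def answer(self, question: str) -> str:
        templates = self.db.templates                           # S403
        for prefix, query_pattern in templates:                 # S404 (toy matching)
            if question.startswith(prefix):
                slot = question[len(prefix):].strip(" ?")
                rows = self.search.search(query_pattern.format(slot=slot))  # S405
                return ", ".join(rows) if rows else "No result found."      # S407
        return "Question not understood."


class StandardLayer:
    def __init__(self, engine: KBQAEngine) -> None:
        self.engine = engine

    def handle(self, question: str) -> str:
        return self.engine.answer(question)                     # S402


class ApplicationLayer:
    def __init__(self, standard: StandardLayer) -> None:
        self.standard = standard

    def ask(self, question: str) -> str:
        return self.standard.handle(question)                   # S401
```

The application layer holds a reference only to the standard layer, so neither the engine nor the database is visible to it, which matches the isolation the method relies on.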
In the above embodiment, the standard layer isolates the application layer from the KBQA engine, so that the return result of the KBQA engine does not depend on the upper application layer and the upper application layer does not need to be aware of the underlying graph database; the two interact using the return result format of the standard layer. The real-time search engine isolates the KBQA engine from the graph database, and the two interact using a standard query language, so that the graph database can be updated in real time. Because the bank customer service question-answering scenario is complex, the question data input by the user may not satisfy the restrictions in the templates; in that case the KBQA engine can perform semantic recognition on the question data, which improves the robustness of the whole system. The question-answering system thus has the interpretability and maintainability of template rules as well as generalization ability, and can handle more complex phrasings.
In some embodiments of the present description, the KBQA engine performing semantic parsing and linking on the question data to determine a target conceptualized question template from the plurality of conceptualized question templates may include: the KBQA engine performs semantic parsing on the question data based on a deep learning model to obtain a plurality of features in the question data; each of the plurality of features is linked to identify the entities, relationships, and attributes in the knowledge graph that the question data refers to; and a target conceptualized question template is determined from the plurality of conceptualized question templates based on the identified entities, relationships, and attributes. In this way, a model-based natural language semantic parsing method is introduced, which improves the robustness of the whole system; the dual-channel design combines the interpretability and maintainability of template rules with the generalization ability of the model-based method, and can handle more complex phrasings.
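A minimal sketch of the dual-channel idea follows. The specification does not state how the two channels are combined, so trying the rule channel first and falling back to the model channel is an assumption, and the token-overlap scorer is only a stand-in for the deep learning model. The rule text and template ID are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical rule table: conceptualized pattern -> template ID.
RULES: Dict[str, str] = {"business hours of [ENT]": "attribute_of_entity"}


def rule_channel(pattern: str) -> Optional[str]:
    """Template-rule channel: exact lookup of a conceptualized pattern."""
    return RULES.get(pattern)


def model_channel(pattern: str) -> Optional[str]:
    """Stand-in for the model channel: pick the rule sharing the most tokens."""
    tokens = set(pattern.lower().split())
    scored = [
        (len(tokens & set(rule.lower().split())), template_id)
        for rule, template_id in RULES.items()
    ]
    score, template_id = max(scored)
    return template_id if score > 0 else None


def choose_template(pattern: str) -> Tuple[Optional[str], str]:
    """Return the chosen template ID and the channel that produced it."""
    template_id = rule_channel(pattern)
    if template_id is not None:
        return template_id, "rule"
    return model_channel(pattern), "model"
```

Recording which channel produced the answer keeps rule hits auditable while still letting irregular phrasings reach a template through the fallback.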
Referring to FIG. 5, a schematic diagram of the flow for building a new graph in an embodiment of the present specification is shown. As shown in FIG. 5, the specific steps of the new graph building flow are described below.
Step 501, a new graph building request is input to the standard layer.
Step 502, the graph format (schema) is configured on the graph pipeline.
Step 503, source data is read from HDFS (Hadoop Distributed File System).
Step 504, knowledge extraction: the source data is preprocessed.
Step 505, knowledge mapping: the processed data is mapped to the graph ontology.
Step 506, knowledge storage: the graph data is stored in the graph database.
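Steps 502 to 506 can be sketched as a small pipeline. Here a CSV string stands in for the HDFS source and a dictionary stands in for the graph database; the schema layout and the column names (name, district, hours) are assumptions for illustration.

```python
import csv
import io
from typing import Any, Dict, List

# Step 502: hypothetical schema configuration for one vertex type.
SCHEMA = {"vertex_label": "outlet", "key": "name", "properties": ["district", "hours"]}


def read_source(text: str) -> List[Dict[str, str]]:
    """Step 503: read source rows (a CSV string stands in for an HDFS file)."""
    return list(csv.DictReader(io.StringIO(text)))


def preprocess(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Step 504: trim whitespace and drop rows without a key value."""
    return [
        {k: (v or "").strip() for k, v in row.items()}
        for row in rows
        if (row.get("name") or "").strip()
    ]


def map_to_ontology(rows: List[Dict[str, str]], schema: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Step 505: map cleaned rows onto the configured vertex type."""
    return [
        {
            "label": schema["vertex_label"],
            "id": row[schema["key"]],
            "properties": {p: row[p] for p in schema["properties"]},
        }
        for row in rows
    ]


def store(vertices: List[Dict[str, Any]], graph: Dict[str, Dict[str, Any]]) -> None:
    """Step 506: store vertices (a dict stands in for the graph database)."""
    for vertex in vertices:
        graph[vertex["id"]] = vertex
```

Keeping each step a separate function mirrors the pipeline stages, so a stage such as preprocessing can be replaced without affecting mapping or storage.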
Referring to FIG. 6, a schematic diagram of the graph question-answering usage flow in an embodiment of the present specification is shown. As shown in FIG. 6, the specific steps are explained as follows.
Step 601, question data is input to the standard layer from the front-end interface, for example "Which outlets are open in Chaoyang District, Beijing?".
Step 602, the standard layer starts the KBQA engine, reads the conceptualized templates, and issues the question to the KBQA engine.
Step 603, the linking module takes the question (query) as input and outputs one or more masks, each produced by the corresponding type of linking described below.
Time linking: a description of time in the question is linked to a point in time. For example, "noon" is linked to 12:00.
Date linking: a date in the question is linked to the day of the week and to holidays. For example, "today" in a question is linked to December 11, 2020, and further to the day of the week and to whether it is a holiday.
Entity linking: entities in the question text are identified. For example, "Chaoyang District" in a question may be linked to the entity Chaoyang District.
Ontology linking: ontologies in the question text are identified. For example, "Chaoyang District" in a question may be linked to the ontology "district".
Attribute linking: attributes in the question text are identified. For example, "business hours" in a question may be linked to the attribute business hours.
Keyword linking: keywords in the question that have been configured in the keyword library are identified.
Step 604, the intent recognition module matches the masks obtained by the linking module against the templates to obtain the best-matching conceptualized template.
Step 605, the query module queries data in the graph database according to the Gremlin statement configured in the template.
Step 606, the result generation module generates a result statement and feeds it back to the front end for display.
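Steps 603 and 604 can be sketched as dictionary-based linking followed by template selection. The lexicons, mask layout, template names, and Gremlin strings are hypothetical; a production linker would query the real-time search engine rather than use substring matching, which here would also match "noon" inside "afternoon".

```python
import datetime as dt
from typing import Dict, FrozenSet, List, Optional, Tuple

Mask = Tuple[str, str, str]  # (kind, surface text, linked target)

# Hypothetical lexicons, one per linking type.
LEXICONS: Dict[str, Dict[str, str]] = {
    "TIME": {"noon": "12:00"},
    "ENT": {"chaoyang district": "entity:ChaoyangDistrict"},
    "ONT": {"district": "ontology:District"},
    "ATTR": {"business hours": "attribute:business_hours"},
    "KEY": {"open": "keyword:open"},
}
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def link_date(today: dt.date, holidays: FrozenSet[dt.date]) -> Mask:
    """Date linking: resolve 'today' to a date, a weekday, and a holiday flag."""
    weekday = WEEKDAYS[today.weekday()]
    return ("DATE", "today", f"{today.isoformat()}|{weekday}|holiday={today in holidays}")


def link(question: str, today: dt.date, holidays: FrozenSet[dt.date] = frozenset()) -> List[Mask]:
    """Step 603: produce masks for every lexicon entry found in the question."""
    text = question.lower()
    masks = [
        (kind, surface, target)
        for kind, lexicon in LEXICONS.items()
        for surface, target in lexicon.items()
        if surface in text
    ]
    if "today" in text:
        masks.append(link_date(today, holidays))
    return masks


# Hypothetical templates: required mask kinds plus a configured Gremlin statement.
TEMPLATES = {
    "attribute_of_entity": {
        "slots": {"ENT", "ATTR"},
        "gremlin": "g.V().has('name', ent).values(attr)",
    },
    "outlets_in_region": {
        "slots": {"ENT", "ONT", "KEY"},
        "gremlin": "g.V().has('name', ent).in('located_in').hasLabel('outlet').values('name')",
    },
}


def best_template(masks: List[Mask]) -> Optional[str]:
    """Step 604: pick the most specific template whose slots are all covered."""
    kinds = {kind for kind, _, _ in masks}
    candidates = [(len(t["slots"]), name) for name, t in TEMPLATES.items() if t["slots"] <= kinds]
    return max(candidates)[1] if candidates else None
```

Preferring the template with the most covered slots is one simple notion of "best match"; the specification does not say how the intent recognition module scores candidates.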
Referring to FIG. 7, a schematic diagram of the real-time graph update flow in an embodiment of the present specification is shown. Because the architecture decouples the KBQA engine from the rest of the system, many data updates no longer require a time-consuming rebuild of the graph; the specific update method in this embodiment is shown in FIG. 7. After the standard layer receives the update data, it processes the data with a preprocessing script to obtain graph update data. The standard layer synchronizes the graph update data to the graph pipeline and the KBQA engine. The graph pipeline updates the graph and stores the graph update data in the graph database. The graph pipeline then builds the search index and updates the index to the real-time search engine.
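The benefit of incremental updating can be sketched with an in-memory graph and inverted index that are patched per vertex instead of rebuilt. The LiveGraph class and its whitespace tokenization are assumptions standing in for the graph database and the Elasticsearch index.

```python
from collections import defaultdict
from typing import Dict, List, Set


class LiveGraph:
    """Graph store plus inverted index, both updated one vertex at a time."""

    def __init__(self) -> None:
        self.vertices: Dict[str, Dict[str, str]] = {}
        self.index: Dict[str, Set[str]] = defaultdict(set)  # token -> vertex ids

    def upsert(self, vertex_id: str, name: str, **props: str) -> None:
        old = self.vertices.get(vertex_id)
        if old is not None:
            # Remove stale index entries for the previous name only.
            for token in old["name"].lower().split():
                self.index[token].discard(vertex_id)
        self.vertices[vertex_id] = {"name": name, **props}
        for token in name.lower().split():
            self.index[token].add(vertex_id)

    def search(self, token: str) -> List[str]:
        return sorted(self.index.get(token.lower(), set()))
```

Each update touches only the tokens of the affected vertex, so the cost is proportional to the size of the change rather than to the size of the graph.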
Referring to FIG. 8, a schematic diagram of the batch update flow for entity synonyms in an embodiment of the present specification is shown. As shown in FIG. 8, the standard layer may batch update entity synonyms to the graph pipeline. The standard layer may restart the KBQA engine and synchronize the entity synonyms to the KBQA engine. Referring to FIG. 9, a schematic diagram of the real-time update flow for entity synonyms in an embodiment of the present specification is shown. As shown in FIG. 9, the standard layer may update entity synonyms to the graph pipeline and the KBQA engine in real time.
Referring to FIG. 10, a schematic diagram of the batch update flow for ontology synonyms in an embodiment of this specification is shown. As shown in FIG. 10, the standard layer may batch update ontology synonyms to the template database. The standard layer may restart the KBQA engine and batch update the ontology synonyms to the KBQA engine. Referring to FIG. 11, a schematic diagram of the real-time update flow for ontology synonyms in an embodiment of the present specification is shown. As shown in FIG. 11, the standard layer may update ontology synonyms to the KBQA engine in real time. The KBQA engine may update the ontology synonyms to the template database in real time.
Referring to FIG. 12, a schematic diagram of the batch update flow for question templates in an embodiment of the present specification is shown. As shown in FIG. 12, the standard layer may batch update the conceptualized question templates to the KBQA engine. The KBQA engine may reload the templates and batch update the conceptualized question templates to the target database. Referring to FIG. 13, a schematic diagram of the real-time update flow for question templates in an embodiment of the present specification is shown. As shown in FIG. 13, the standard layer updates the conceptualized question template to the KBQA engine, and the KBQA engine reloads the conceptualized question template and updates it to the template database in real time.
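The difference between batch and real-time synonym updates can be sketched with a small in-memory table. The SynonymTable class, its method names, and the persist callback (standing in for a write to the template database) are hypothetical; the specification does not describe the engine's internal data structures.

```python
import threading
from typing import Callable, Dict, Optional


class SynonymTable:
    """In-memory synonym table held by the engine, safe for concurrent reads and updates."""

    def __init__(self, persist: Optional[Callable[[str, str], None]] = None) -> None:
        self._lock = threading.Lock()
        self._map: Dict[str, str] = {}
        self._persist = persist  # hypothetical hook for writing to the template database

    def update_realtime(self, synonym: str, canonical: str) -> None:
        """Real-time path: apply one entry without restarting the engine."""
        with self._lock:
            self._map[synonym.lower()] = canonical
        if self._persist is not None:
            self._persist(synonym, canonical)

    def reload_batch(self, mapping: Dict[str, str]) -> None:
        """Batch path: replace the whole table, as after an engine restart."""
        new_map = {k.lower(): v for k, v in mapping.items()}
        with self._lock:
            self._map = new_map

    def resolve(self, term: str) -> str:
        with self._lock:
            return self._map.get(term.lower(), term)
```

The real-time path changes a single entry in place and then persists it, whereas the batch path swaps in a complete new table, which corresponds to reloading after a restart.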
From the above description, it can be seen that the embodiments of the present specification achieve the following technical effects. The standard layer isolates the application layer from the KBQA engine, so that the return result of the KBQA engine does not depend on the upper application layer and the upper application layer does not need to be aware of the underlying graph database; the two interact using the return result format of the standard layer. The real-time search engine isolates the KBQA engine from the graph database, and the two interact using a standard query language, so that the graph database can be updated. Because the bank customer service question-answering scenario is complex, the question data input by the user may not satisfy the restrictions in the templates; in that case the KBQA engine can perform semantic recognition on the question data, which improves the robustness of the whole system. The question-answering system thus has the interpretability and maintainability of template rules as well as generalization ability, and can handle more complex phrasings.
It will be apparent to those skilled in the art that the modules or steps of the embodiments of the present specification described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed over a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different from that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, embodiments of the present description are not limited to any specific combination of hardware and software.
It is to be understood that the above description is intended to be illustrative, and not restrictive. Many embodiments and many applications other than the examples provided will be apparent to those of skill in the art upon reading the above description. The scope of the description should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
The above description is only a preferred embodiment of the present disclosure, and is not intended to limit the present disclosure, and it will be apparent to those skilled in the art that various modifications and variations can be made in the embodiment of the present disclosure. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present specification shall be included in the protection scope of the present specification.

Claims (10)

1. A knowledge graph-based question-answering system, characterized by comprising an application layer, a standard layer, an engine layer, and a data layer, wherein:
the application layer is used for receiving question data input by a user and sending the question data to the standard layer; the standard layer is arranged between the application layer and the engine layer and is used for sending the question data to the engine layer; the data layer comprises a real-time search engine, a graph database, and a template database, wherein the graph database is used for storing graph data and the template database is used for storing conceptualized question templates;
the engine layer comprises a KBQA engine, wherein the KBQA engine is used for acquiring a plurality of conceptualized question templates from the template database, and is also used for performing semantic parsing, linking, and intent recognition on the received question data so as to determine a target conceptualized question template from the plurality of conceptualized question templates and to issue a target query statement configured in the target conceptualized question template to the real-time search engine in the data layer; the real-time search engine is used for querying data in the graph database according to the target query statement to obtain result data corresponding to the question data and returning the result data to the KBQA engine; the KBQA engine is used for generating a target answer according to the result data and returning the target answer to the standard layer; and the standard layer is used for returning the target answer to the application layer so as to display the target answer to the user in a question-answer interaction page.
2. The system of claim 1, wherein the KBQA engine is configured to perform semantic parsing on the question data based on a deep learning model.
3. The system of claim 1, wherein the engine layer further comprises a graph pipeline for constructing a knowledge graph based on structured data.
4. The system of claim 1, wherein the application layer is configured to provide a question-answer interaction page and an operation and maintenance interface, the question-answer interaction page is configured to receive question data input by a user, and the operation and maintenance interface is configured to add and/or modify synonyms and conceptualized question templates.
5. The system of claim 1, wherein the standard layer is configured to perform a first preprocessing on the received question data and send the preprocessed question data to the engine layer;
and the standard layer is configured to perform a second preprocessing on the received target answer and return the preprocessed target answer to the application layer.
6. The system of claim 3, wherein the standard layer is further configured to synchronously update received entity synonyms to the graph pipeline and the KBQA engine in real time.
7. The system of claim 3, wherein the standard layer is further configured to update newly added ontology synonyms to the KBQA engine in real time; the KBQA engine is further configured to store the updated ontology synonyms in the template database.
8. The system of claim 3, wherein the standard layer is further configured to update newly added conceptualized question templates to the KBQA engine in real time; the KBQA engine is further configured to store the updated conceptualized question templates in the template database.
9. A knowledge graph-based question-answering method, applied to a knowledge graph-based question-answering system, wherein the question-answering system comprises an application layer, a standard layer, an engine layer, and a data layer, and the method comprises the following steps:
the application layer receives question data input by a user and issues the question data to the standard layer;
the standard layer starts a KBQA engine in the engine layer and issues the question data to the KBQA engine;
the KBQA engine reads a plurality of conceptualized question templates from a graph database in the data layer;
the KBQA engine performs semantic parsing and linking on the question data so as to determine a target conceptualized question template from the plurality of conceptualized question templates;
the KBQA engine issues the query statement configured in the target conceptualized question template to a real-time search engine in the data layer;
the real-time search engine searches the graph data in the graph database in the data layer according to the query statement and returns the queried result data to the KBQA engine;
and the KBQA engine generates a target answer according to the result data and returns the target answer to the application layer.
10. The method of claim 9, wherein the KBQA engine performing semantic parsing and linking on the question data to determine a target conceptualized question template from the plurality of conceptualized question templates comprises:
the KBQA engine performs semantic parsing on the question data based on a deep learning model to obtain a plurality of features in the question data;
each of the plurality of features is linked to identify the entities, relationships, and attributes in the knowledge graph that the question data refers to;
and a target conceptualized question template is determined from the plurality of conceptualized question templates based on the identified entities, relationships, and attributes.
CN202110690975.1A 2021-06-22 2021-06-22 Knowledge graph-based question-answering system and method Pending CN113392202A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110690975.1A CN113392202A (en) 2021-06-22 2021-06-22 Knowledge graph-based question-answering system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110690975.1A CN113392202A (en) 2021-06-22 2021-06-22 Knowledge graph-based question-answering system and method

Publications (1)

Publication Number Publication Date
CN113392202A true CN113392202A (en) 2021-09-14

Family

ID=77623265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110690975.1A Pending CN113392202A (en) 2021-06-22 2021-06-22 Knowledge graph-based question-answering system and method

Country Status (1)

Country Link
CN (1) CN113392202A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117149985A (en) * 2023-10-31 2023-12-01 海信集团控股股份有限公司 Question and answer method, device, equipment and medium based on large model

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117149985A (en) * 2023-10-31 2023-12-01 海信集团控股股份有限公司 Question and answer method, device, equipment and medium based on large model
CN117149985B (en) * 2023-10-31 2024-03-19 海信集团控股股份有限公司 Question and answer method, device, equipment and medium based on large model

Similar Documents

Publication Publication Date Title
CN106663101A (en) Ontology mapping method and apparatus
US20140236579A1 (en) Method and Device for Performing Natural Language Searches
EP3671526B1 (en) Dependency graph based natural language processing
WO2021120707A1 (en) Intelligent question-answering method and apparatus, computer device, and computer-readable medium
US20200356726A1 (en) Dependency graph based natural language processing
WO2022052639A1 (en) Data query method and apparatus
CN116244344B (en) Retrieval method and device based on user requirements and electronic equipment
CN114218472A (en) Intelligent search system based on knowledge graph
EP3497580A1 (en) Methods and apparatus for semantic knowledge transfer
US20220245353A1 (en) System and method for entity labeling in a natural language understanding (nlu) framework
Giordani et al. Generating SQL queries using natural language syntactic dependencies and metadata
Qin et al. A survey on text-to-sql parsing: Concepts, methods, and future directions
CN112507089A (en) Intelligent question-answering engine based on knowledge graph and implementation method thereof
CN114528312A (en) Method and device for generating structured query language statement
CN116108194A (en) Knowledge graph-based search engine method, system, storage medium and electronic equipment
CN115114419A (en) Question and answer processing method and device, electronic equipment and computer readable medium
CN113392202A (en) Knowledge graph-based question-answering system and method
CN117271558A (en) Language query model construction method, query language acquisition method and related devices
Pietranik et al. A method for ontology alignment based on semantics of attributes
CN115345153A (en) Natural language generation method based on concept network
CN114880483A (en) Metadata knowledge graph construction method, storage medium and system
Dai et al. Qam: question answering system based on knowledge graph in the military
Forcher et al. Semantic logging: Towards explanation-aware das
Hoi et al. Manipulating Data Lakes Intelligently with Java Annotations
CN116991969B (en) Method, system, electronic device and storage medium for retrieving configurable grammar relationship

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination