WO2017076263A1 - Fusion knowledge base processing method and device, knowledge base management system, and storage medium - Google Patents

Fusion knowledge base processing method and device, knowledge base management system, and storage medium

Info

Publication number
WO2017076263A1
WO2017076263A1 PCT/CN2016/104136 CN2016104136W WO2017076263A1 WO 2017076263 A1 WO2017076263 A1 WO 2017076263A1 CN 2016104136 W CN2016104136 W CN 2016104136W WO 2017076263 A1 WO2017076263 A1 WO 2017076263A1
Authority
WO
WIPO (PCT)
Prior art keywords
ontology
knowledge base
knowledge
module
fusion
Prior art date
Application number
PCT/CN2016/104136
Other languages
English (en)
French (fr)
Inventor
陈虹
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017076263A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Definitions

  • the invention relates to the field of computer and communication technology, in particular to a fusion knowledge base processing method and device, a knowledge base management system and a storage medium.
  • current question answering systems are usually divided into three categories: search-based question answering systems, knowledge base-based question answering systems, and community collaborative question answering systems.
  • search-based question answering systems include, for example, intelligent customer service systems.
  • FAQ: Frequently Asked Questions.
  • knowledge base-based question answering systems, for example expert systems, are rarely used.
  • for community collaborative question answering systems such as Baidu Knows, it is difficult to judge the accuracy of the answers and to process the results; such systems mostly solve problems through group wisdom and usually lack standard rules of judgment.
  • the principles and characteristics of the search-based question answering system and of the knowledge base-based question answering system, which answers through knowledge reasoning, are analyzed below.
  • the search-based question answering system is mainly intended to overcome the shortcomings of search engines.
  • these shortcomings include, for example, the inability to correctly understand and express the user's information needs, and returning too much information rather than the simple answer the user wants.
  • its main feature is the use of information retrieval and shallow natural language processing techniques to extract answers from text libraries or web pages. Its advantages are that it is not restricted to a particular field, understands the user's intent, and returns more accurate answers.
  • the search-based question answering system also has many disadvantages. Because its corpus consists mainly of FAQs, the following problems are unavoidable:
  • FAQs are common questions collected through statistics; their wording is colloquial and not standardized, which usually leads to inaccurate matching and low accuracy;
  • the core technology of the system is retrieval, and the returned answer depends on a similarity calculation, so the accuracy is difficult to improve;
  • knowledge in FAQ form is relatively coarse-grained; for knowledge management personnel there is no hierarchical structure, so the management cost and threshold are high.
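  • as a minimal sketch of the similarity-based FAQ retrieval described above, the following Python example matches a user question against FAQ questions by cosine similarity over word counts. The FAQ entries, the `retrieve` function, and the 0.5 threshold are hypothetical illustrations, not part of the patent; the second call in the usage note shows how colloquial wording can defeat this kind of matching.

```python
import math
from collections import Counter

# Hypothetical FAQ corpus: question -> answer (illustrative data only).
FAQ = {
    "how do i check my data balance": "Dial the balance inquiry code or open the app.",
    "how do i change my tariff package": "Choose a new package under Account > Packages.",
}


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count the tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, threshold=0.5):
    """Return (answer, score) for the most similar FAQ entry.

    The answer is None when the best score is below the (assumed) threshold.
    """
    query = bag_of_words(question)
    scored = ((faq_q, cosine(query, bag_of_words(faq_q))) for faq_q in FAQ)
    best_question, best_score = max(scored, key=lambda pair: pair[1])
    answer = FAQ[best_question] if best_score >= threshold else None
    return answer, best_score
```

  • usage: `retrieve("how do i change my package")` finds the tariff entry, whereas `retrieve("wanna switch plans")` shares no words with any FAQ question and returns no answer.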
  • for the knowledge base-based question answering system, the knowledge base needs to be built first.
  • the answer can be obtained directly from the knowledge base, or it can be inferred on the basis of the knowledge base.
  • this kind of question answering system is characterized by finer-grained knowledge, so more accurate answers are retrieved, and indirect answers can be obtained by reasoning when no answer is retrieved directly.
  • however, the knowledge base in such a system is usually built for a specific field and is limited in size, which restricts the scope of use of the question answering system.
  • in summary, because the search-based, knowledge base-based, and community collaborative question answering systems each have defects in practical applications, the performance of question answering systems in the prior art is difficult to improve.
  • embodiments of the present invention provide a method and device for processing a fusion knowledge base, a knowledge base management system, and a storage medium, to solve the problem that the performance of question answering systems in the prior art is difficult to improve because the search-based question answering system, the knowledge base-based question answering system, and the community collaborative question answering system each have defects in practical applications.
  • an embodiment of the present invention provides a method for processing a fusion knowledge base, including:
  • an embodiment of the present invention provides a fusion knowledge base processing apparatus, including:
  • a directory extraction module configured to extract a knowledge classification directory from a frequently asked questions (FAQ) library and an ontology knowledge base, where the ontology knowledge base includes a knowledge ontology and an ontology relationship;
  • the mounting module is configured to mount the content of the FAQ library to the ontology knowledge base according to the knowledge ontology, the ontology relationship, and the knowledge classification directory extracted by the directory extraction module, to generate a fusion knowledge base.
  • the embodiment of the present invention provides a knowledge base management system, comprising: a question answering device and the fusion knowledge base processing device according to any one of the above second aspects;
  • the question answering device includes: a user interaction module configured to obtain a question input by the user;
  • a language processing module configured to perform language processing on the question obtained by the user interaction module, where the language processing includes word segmentation, part-of-speech tagging, and entity recognition;
  • a normalization module configured to perform normalization processing on the question processed by the language processing module, where the normalization processing includes typo correction, language conversion, and standardization and word recognition;
  • a semantic understanding module configured to perform semantic understanding on the question processed by the normalization module to obtain semantic information of the question;
  • an information retrieval module configured to retrieve the semantic information obtained by the semantic understanding module in the ontology knowledge base to obtain an answer to the question.
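  • the chain of modules listed above can be sketched as a small Python pipeline. This is a toy under stated assumptions, not the patent's implementation: all function names, the typo table, the entity and attribute lists, and the knowledge base values are made up for illustration, and part-of-speech tagging and real entity recognition are omitted.

```python
# Hypothetical toy resources; none of these names or values come from the patent.
TYPO_FIXES = {"pakage": "package", "prise": "price"}
ENTITIES = ["a package", "b package"]
ATTRIBUTES = ["data", "price"]
ONTOLOGY_KB = {
    ("a package", "price"): "30 per month",
    ("b package", "price"): "50 per month",
}


def language_processing(question):
    """Word segmentation only (whitespace split); tagging and NER are omitted here."""
    return question.lower().replace("?", "").split()


def normalize(tokens):
    """Normalization step, reduced here to typo correction via a lookup table."""
    return [TYPO_FIXES.get(token, token) for token in tokens]


def semantic_understanding(tokens):
    """Reduce the question to an (entity, attribute) pair as its semantic information."""
    text = " ".join(tokens)
    entity = next((e for e in ENTITIES if e in text), None)
    attribute = next((a for a in ATTRIBUTES if a in tokens), None)
    return entity, attribute


def information_retrieval(semantics):
    """Look the semantic information up in the toy knowledge base."""
    return ONTOLOGY_KB.get(semantics)


def answer(question):
    """Run the question through all four stages in order."""
    tokens = normalize(language_processing(question))
    return information_retrieval(semantic_understanding(tokens))
```

  • usage: `answer("What is the prise of the A pakage?")` corrects both typos before lookup; a question with no recognizable entity or attribute yields `None`.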
  • an embodiment of the present invention provides a computer storage medium, where the computer storage medium stores computer executable instructions, and the computer executable instructions are configured to execute the fusion knowledge base processing method provided by the first aspect of the present invention.
  • the fusion knowledge base processing method and device, the knowledge base management system, and the storage medium provided by the invention extract the knowledge classification directory from the FAQ library and the ontology knowledge base and, according to the knowledge ontology and ontology relationship in the ontology knowledge base and the extracted knowledge classification directory, mount the contents of the FAQ library onto the ontology knowledge base to generate the fusion knowledge base.
  • when the fusion knowledge base is applied to a question answering system, the answers that were previously obtained separately through the FAQ library and the ontology knowledge base can all be obtained through the fusion knowledge base. The advantages of the FAQ library and the ontology knowledge base are combined, while their respective disadvantages are avoided.
  • the invention thereby solves the problem that the performance of question answering systems in the prior art is difficult to improve because the search-based, knowledge base-based, and community collaborative question answering systems each have defects in practical applications.
  • FIG. 1 is a flowchart of a method for processing a fusion knowledge base according to an embodiment of the present invention
  • FIG. 2 is a flowchart of another method for processing a fusion knowledge base according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a knowledge map in a method for processing a fusion knowledge base provided by the embodiment shown in FIG. 1;
  • FIG. 4 is a flowchart of still another method for processing a fusion knowledge base according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of still another method for processing a fusion knowledge base according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a fusion knowledge base processing apparatus according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of another fusion knowledge base processing apparatus according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram of a knowledge base management system according to an embodiment of the present invention.
  • FIG. 9 is a flow chart of intelligent question and answer through the knowledge base management system shown in FIG. 8;
  • FIG. 10 is a schematic structural diagram of another knowledge base management system according to an embodiment of the present invention.
  • the strengths of the knowledge base-based question answering system can make up for the weaknesses of the search-based question answering system, namely that its accuracy is difficult to improve, that answers cannot be obtained by reasoning, and that its knowledge is coarse-grained.
  • the search-based question answering system can in turn make up for the weakness of the knowledge base-based question answering system, namely that its scope of use is restricted by the limited size of the knowledge base.
  • the advantages and disadvantages of the two systems are thus complementary. On this basis, the present invention proposes a technical solution that fuses the FAQ library with the knowledge base, and applies the fused knowledge base to the question answering system and the management system, so that system performance can be significantly improved.
  • the technical solutions of the present invention are described in detail below through specific embodiments.
  • the ontology knowledge base in the following embodiments of the present invention is a domain-specific knowledge base, which may be based on one technical field or may include content from multiple technical fields.
  • the following specific embodiments of the present invention may be combined with each other, and the same or similar concepts or processes may not be described in some embodiments.
  • FIG. 1 is a flowchart of a method for processing a fusion knowledge base according to an embodiment of the present invention.
  • the fusion knowledge base processing method provided in this embodiment can be applied to a knowledge base system to broaden the scope of use of the knowledge base system.
  • the method can be executed by a fusion knowledge base processing device, which is implemented by a combination of hardware and software.
  • the device can be integrated in the processor of a terminal device and invoked by the processor.
  • the method in this embodiment may include:
  • the method for processing the fusion knowledge base aims to construct a knowledge base with better performance and broad applicability. The main idea is to combine the advantages of the FAQ library and the ontology knowledge base while remedying their respective disadvantages.
  • the ontology knowledge base in the embodiments of the present invention is an established knowledge base, for example a knowledge base for a specific technical domain. The ontology knowledge base usually contains knowledge ontologies, and ontology relationships exist between the knowledge ontologies. For example:
  • the knowledge ontology includes a general ontology and a domain ontology.
  • the ontology relationship includes a generic relationship, an equivalence relationship, an attribute relationship, and a composition relationship.
  • the FAQ library in this embodiment includes an FAQ corpus of frequently asked questions together with keywords. The knowledge classification directory is extracted by combining the FAQ corpus with the ontology knowledge base. The purpose of extracting this directory is to make knowledge easier to organize and manage; for example, the hypernyms (upper-level words) of the knowledge base can be used for the extraction.
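  • one way to picture hypernym-based directory extraction is to group terms from both sources under their upper-level word. The sketch below assumes a ready-made hypernym table; the table, the term names, and the `extract_directory` function are hypothetical, and the patent does not specify how hypernyms are obtained.

```python
from collections import defaultdict

# Hypothetical hypernym table (term -> upper-level word); illustrative data only.
HYPERNYMS = {
    "a service package": "packages",
    "b service package": "packages",
    "roaming fee": "billing",
}


def extract_directory(ontology_terms, faq_keywords):
    """Group terms from the ontology knowledge base and the FAQ keywords by hypernym.

    Terms without a known hypernym are skipped; each hypernym becomes one
    category of the knowledge classification directory.
    """
    directory = defaultdict(set)
    for term in list(ontology_terms) + list(faq_keywords):
        category = HYPERNYMS.get(term)
        if category is not None:
            directory[category].add(term)
    return dict(directory)
```

  • usage: calling `extract_directory(["a service package", "b service package"], ["roaming fee"])` yields two categories, `packages` and `billing`.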
  • once the knowledge classification directory has been extracted, the basic conditions for organizing and managing knowledge are in place. Therefore, according to the relationship between the knowledge ontology and the knowledge classification directory, the relationship of the FAQ library contents to the knowledge classification directory and to the knowledge ontology, and the ontology relationships in the ontology knowledge base, the contents of the FAQ library are mounted onto the ontology knowledge base to construct the fusion knowledge base.
  • the fusion knowledge base constructed in this embodiment retains the knowledge ontology and ontology relationships of the original ontology knowledge base, and also integrates the contents of the FAQ library into the ontology knowledge base in the form of knowledge classification and structural relationships;
  • the answers that were originally obtained separately through the FAQ library and the ontology knowledge base can all be obtained from the fusion knowledge base.
  • FIG. 2 is a flowchart of another method for processing a fusion knowledge base according to an embodiment of the present invention, and illustrates one manner of fusion.
  • S120 may include: S121, mounting the knowledge ontology of the ontology knowledge base onto the knowledge classification directory, which classifies the knowledge ontology; S122, mounting the content of the FAQ library onto the knowledge classification directory and the knowledge ontology, which classifies the FAQ library, the categories being the knowledge classification directory and the knowledge ontology. The content and structure of the completed knowledge base can be represented by a knowledge map.
  • the knowledge classification directory in FIG. 3 includes a directory classification A and a directory classification B, and an ontology.
  • the knowledge ontology in the knowledge base is mounted on the above-mentioned knowledge classification directory.
  • the contents of the FAQ library include FAQ1, FAQ2 and FAQ3, which can be mounted on the knowledge classification directory or the knowledge ontology.
  • the knowledge map also contains the knowledge ontology; behaviors and classes are attributes of the ontology, and the root nodes of the knowledge map represent superclasses, that is, upper-level classifications.
  • for example, the A service package and the B service package are two basic packages with different tariffs.
  • the C upgrade package can be superimposed on the A service package. It follows that the A service package and the B service package are equivalent, while the relationship between the A service package and the C upgrade package is that of upper layer and lower layer; that is, relationships between knowledge ontologies can be compared horizontally or vertically.
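  • the knowledge map and the two mounting steps (S121, S122) can be sketched as a small in-memory structure. The class and method names below are hypothetical, and the assignment of each package and FAQ to a particular directory classification is an assumption made only to have something to run; FIG. 3 itself is not reproduced here.

```python
class FusionKnowledgeBase:
    """Toy knowledge map: directory categories, ontology nodes, relation edges, mounted FAQs."""

    def __init__(self, categories):
        self.categories = set(categories)
        self.ontology_category = {}   # ontology node -> directory category
        self.relations = set()        # (source, relation, target) edges
        self.faq_mount = {}           # FAQ id -> category or ontology node

    def mount_ontology(self, ontology, category):
        """S121: classify an ontology by mounting it on a directory category."""
        if category not in self.categories:
            raise KeyError(f"unknown category: {category}")
        self.ontology_category[ontology] = category

    def relate(self, source, relation, target):
        """Record an ontology relationship as an edge between two mounted ontologies."""
        for node in (source, target):
            if node not in self.ontology_category:
                raise KeyError(f"unknown ontology: {node}")
        self.relations.add((source, relation, target))

    def mount_faq(self, faq_id, target):
        """S122: mount an FAQ entry on a directory category or on an ontology."""
        if target not in self.categories and target not in self.ontology_category:
            raise KeyError(f"unknown mount point: {target}")
        self.faq_mount[faq_id] = target

    def faqs_in_category(self, category):
        """FAQs mounted on the category itself or on any ontology classified under it."""
        return sorted(
            faq for faq, target in self.faq_mount.items()
            if target == category or self.ontology_category.get(target) == category
        )


# Assumed placement of nodes, for illustration only.
kb = FusionKnowledgeBase(["directory classification A", "directory classification B"])
kb.mount_ontology("A service package", "directory classification A")
kb.mount_ontology("B service package", "directory classification A")
kb.mount_ontology("C upgrade package", "directory classification B")
kb.relate("A service package", "equivalence", "B service package")
kb.mount_faq("FAQ1", "directory classification A")
kb.mount_faq("FAQ2", "A service package")
kb.mount_faq("FAQ3", "C upgrade package")
```

  • mounting FAQs on either a category or an ontology gives the FAQ library the hierarchical structure it lacks on its own: `kb.faqs_in_category("directory classification A")` collects FAQ1 (mounted on the category) and FAQ2 (mounted on an ontology inside it).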
  • the fusion knowledge base processing method provided in this embodiment extracts the knowledge classification directory from the FAQ library and the ontology knowledge base and, according to the knowledge ontology and ontology relationship in the ontology knowledge base and the extracted knowledge classification directory, mounts the content of the FAQ library onto the ontology knowledge base to generate the fusion knowledge base.
  • when the fusion knowledge base generated by this embodiment is applied to a question answering system, the answers previously obtained separately through the FAQ library and the ontology knowledge base are all obtained through the fusion knowledge base, so that the advantages of the FAQ library and the ontology knowledge base are combined while their respective disadvantages are avoided.
  • the method provided by this embodiment thereby solves the problem that the performance of question answering systems in the prior art is difficult to improve because the search-based, knowledge base-based, and community collaborative question answering systems each have defects in practical applications.
  • the ontology knowledge base in the embodiments of the present invention has been described as an established knowledge base; it may be, for example, a traditional manually constructed knowledge base. However, because the ontology knowledge base contains a large amount of knowledge and has a complex structure, manual construction is complicated and costly.
  • the ontology knowledge base in this embodiment may therefore also be constructed automatically.
  • FIG. 4 is a flowchart of another method for processing the fusion knowledge base provided in the embodiment of the present invention.
  • in this embodiment, the ontology knowledge base may be constructed as follows. Before S110, the method includes: S100, inputting an initial corpus, the initial corpus including a text corpus and an FAQ corpus; S101, constructing a knowledge base according to the text corpus and the FAQ corpus to generate the ontology knowledge base.
  • the text corpus in this embodiment is the input of knowledge base construction in the prior art; the input of the ontology knowledge base in this embodiment additionally includes the FAQ corpus, which can serve as a basis for the subsequent fusion.
  • the ontology knowledge base is constructed differently from the manual construction of the prior art. S101 includes: S102, determining an ontology target and an extraction rule, the extraction rule including an ontology relationship rule and an extraction algorithm; S103, extracting the knowledge ontology from the text corpus and the FAQ corpus according to the ontology target and the extraction algorithm, the knowledge ontology including a domain ontology and a general ontology; S104, extracting the ontology relationship from the knowledge ontology according to the ontology relationship rule; S105, constructing the knowledge base from the extracted knowledge ontology and ontology relationship to generate the ontology knowledge base.
  • the ontology target and the extraction rules for subsequent use may be determined first, that is, which ontologies are domain ontologies, which are general ontologies, the types of ontology relationship, the formulation of the extraction rules, and so on. This information may be defined by an operator and input to the terminal device through a human-computer interaction interface.
  • after the rules and targets are defined, knowledge ontologies such as entities, attributes, behaviors, and events are extracted through predefined code, which is the concrete form of the extraction algorithm.
  • for the extraction algorithm, Stanford's Named Entity Recognition (NER) can be used, or a custom extraction algorithm can be designed.
  • NER: Named Entity Recognition.
  • the general ontology and the domain ontology are extracted from the initial corpus.
  • the knowledge ontologies are the nodes of the knowledge map, as shown in FIG. 3.
  • the ontology relationships are extracted according to the coded ontology relationship rules. They include but are not limited to generic, equivalence, attribute, and composition relationships, and may be extracted, for example, by defined rules. This step determines the edges between the nodes of the knowledge map. Once the nodes and the edges between them have been determined, the knowledge base is built, forming the knowledge map, that is, the ontology knowledge base.
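  • steps S102 to S105 can be sketched with a simple dictionary lookup standing in for the extraction algorithm and phrase patterns standing in for the ontology relationship rules. This is a custom toy extractor, not Stanford NER and not the patent's implementation; the target table, the rule phrases, and the sample sentences are all hypothetical.

```python
# S102: hypothetical ontology targets (term -> ontology type) and relationship rules
# (trigger phrase -> relationship type), as an operator might define them.
ONTOLOGY_TARGETS = {
    "a service package": "domain",
    "b service package": "domain",
    "price": "general",
}
RELATION_RULES = [
    ("equivalent to", "equivalence"),
    ("has a", "attribute"),
]


def extract_ontologies(corpus):
    """S103: find every target term that occurs in the corpus (the nodes)."""
    found = {}
    for sentence in corpus:
        lowered = sentence.lower()
        for term, kind in ONTOLOGY_TARGETS.items():
            if term in lowered:
                found[term] = kind
    return found


def extract_relations(corpus, ontologies):
    """S104: link ontologies that appear on either side of a rule phrase (the edges)."""
    edges = set()
    for sentence in corpus:
        lowered = sentence.lower()
        for phrase, relation in RELATION_RULES:
            left, sep, right = lowered.partition(phrase)
            if not sep:
                continue
            for source in ontologies:
                for target in ontologies:
                    if source in left and target in right:
                        edges.add((source, relation, target))
    return edges


def build_knowledge_base(text_corpus, faq_corpus):
    """S105: assemble nodes and edges extracted from both corpora."""
    corpus = list(text_corpus) + list(faq_corpus)
    nodes = extract_ontologies(corpus)
    return {"nodes": nodes, "edges": extract_relations(corpus, nodes)}


ontology_kb = build_knowledge_base(
    ["The A service package has a price."],
    ["Is the A service package equivalent to the B service package?"],
)
```

  • here the FAQ sentence contributes the equivalence edge and the text sentence contributes the attribute edge, which illustrates why both corpora are used as input.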
  • the method provided by the embodiment further includes: S106, according to the preset inference rule, extracting the knowledge inference rule in the semantic knowledge base, and generating a knowledge inference rule base.
  • the knowledge inference rule base makes the knowledge base more useful.
  • when a question is answered through the knowledge base, an answer that cannot be obtained directly can be obtained from the knowledge inference rule base according to the preset inference rules, further improving the performance of the fusion knowledge base.
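  • to illustrate how an indirect answer might be derived, the sketch below forward-chains two toy rules over relation edges: equivalence is treated as symmetric, and an ontology inherits the attributes of its generic (upper-level) ontology. Both rules, the reading of "generic" as an is-a relation, and the sample edges are assumptions; the patent does not list its preset inference rules.

```python
def infer(edges):
    """Apply two toy inference rules until no new edge is produced.

    Rule 1 (assumed): (X, equivalence, Y) implies (Y, equivalence, X).
    Rule 2 (assumed): (X, generic, Y) and (Y, attribute, Z) imply (X, attribute, Z).
    """
    facts = set(edges)
    while True:
        new = set()
        for source, relation, target in facts:
            if relation == "equivalence":
                new.add((target, "equivalence", source))
            if relation == "generic":
                for source2, relation2, target2 in facts:
                    if source2 == target and relation2 == "attribute":
                        new.add((source, "attribute", target2))
        if new <= facts:
            return facts
        facts |= new


# Hypothetical edges, for illustration only.
known_edges = {
    ("a service package", "equivalence", "b service package"),
    ("a service package", "generic", "basic package"),
    ("basic package", "attribute", "monthly fee"),
}
inferred_edges = infer(known_edges)
```

  • the edge `("a service package", "attribute", "monthly fee")` is not stored directly but appears in `inferred_edges`, which is the kind of indirect answer reasoning can supply.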
  • traditional knowledge bases are constructed manually.
  • in this embodiment, the ontology knowledge base and the knowledge inference rule base are formed by an automated construction method.
  • the content of the knowledge base can also be monitored by manual proofreading to ensure the accuracy and recall of the knowledge.
  • FIG. 5 is a flowchart of still another method for processing a fusion knowledge base according to an embodiment of the present invention.
  • the automatically constructed ontology knowledge base is fused with the FAQ library to form the fusion knowledge base of the above embodiments.
  • however, the content of the constructed ontology knowledge base is limited and cannot include every knowledge ontology, and the content of the FAQ library is likewise only a summary of common questions based on user input.
  • in actual use, therefore, users' demands on the knowledge base keep growing.
  • the method may further include, after S101: S130, inputting an incremental corpus into the ontology knowledge base; S140, extracting an incremental ontology from the incremental corpus by using the extraction algorithm; S150, extracting an incremental ontology relationship from the incremental ontology according to the ontology relationship rule; S160, updating the ontology knowledge base with the extracted incremental ontology and incremental ontology relationship.
  • the ontology knowledge base is updated in much the same way as it is constructed: the ontology and the ontology relationships are extracted from the input corpus, and the extracted content is then used for construction.
  • in the process of updating, the incremental corpus can be the latest domain knowledge, knowledge content added to the knowledge base, users' question records or logs, and so on. The purpose is to add new knowledge and new question content to the ontology knowledge base. FAQ corpus can be added at the same time, which likewise updates the fusion knowledge base.
  • the embodiment does not limit the execution order of the S130-S160 in the fusion knowledge base processing method, for example, may be performed after S101, may be performed after S106, or may be performed after S120.
  • FIG. 5 shows that S130-S160 are executed after S120 as an example, that is, the embodiment has an update requirement for the knowledge base after constructing the fusion knowledge base.
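  • the incremental update (S130 to S160) can be sketched as merging newly extracted nodes and edges into an existing knowledge base. The dictionary layout, the `update_knowledge_base` function, and the toy extractors are hypothetical stand-ins for the extraction algorithm and ontology relationship rules.

```python
def update_knowledge_base(kb, incremental_corpus, extract_ontologies, extract_relations):
    """Apply S140-S160 to a knowledge base stored as {'nodes': dict, 'edges': set}."""
    inc_nodes = extract_ontologies(incremental_corpus)             # S140
    inc_edges = extract_relations(incremental_corpus, inc_nodes)   # S150
    kb["nodes"].update(inc_nodes)                                  # S160
    kb["edges"] |= inc_edges
    return kb


# Toy extractors; the term table and the " has a " rule are assumptions.
KNOWN_TERMS = {"d service package": "domain", "price": "general"}


def toy_extract_ontologies(corpus):
    """Return every known term that occurs in at least one sentence."""
    return {
        term: kind
        for term, kind in KNOWN_TERMS.items()
        if any(term in sentence.lower() for sentence in corpus)
    }


def toy_extract_relations(corpus, nodes):
    """Link nodes found on either side of ' has a ' with an attribute edge."""
    edges = set()
    for sentence in corpus:
        left, sep, right = sentence.lower().partition(" has a ")
        if sep:
            edges |= {
                (a, "attribute", b)
                for a in nodes for b in nodes
                if a in left and b in right
            }
    return edges


# S130: the incremental corpus, e.g. new domain knowledge or a logged user question.
existing_kb = {"nodes": {"a service package": "domain"}, "edges": set()}
update_knowledge_base(
    existing_kb,
    ["The D service package has a price."],
    toy_extract_ontologies,
    toy_extract_relations,
)
```

  • because the update reuses the same extraction functions as construction, it can run at any of the points the embodiment allows (after S101, S106, or S120) without changing the knowledge base layout.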
  • FIG. 6 is a schematic structural diagram of a fusion knowledge base processing apparatus according to an embodiment of the present invention.
  • the fusion knowledge base processing device provided in this embodiment can be applied to a knowledge base management system to broaden the scope of use of the knowledge base system.
  • the fusion knowledge base processing device can be implemented by a combination of hardware and software, and can be integrated in the processor of a terminal device and invoked by the processor.
  • the fusion knowledge base processing apparatus of this embodiment includes a directory extraction module 11 and a mount module 12.
  • the directory extraction module 11 is configured to extract a knowledge classification directory from a frequently asked questions (FAQ) library and an ontology knowledge base, where the ontology knowledge base includes a knowledge ontology and an ontology relationship.
  • the fusion knowledge base processing device aims to construct a knowledge base with relatively good performance and broad applicability. The main idea is to combine the advantages of the FAQ library and the ontology knowledge base while remedying their respective disadvantages.
  • the ontology knowledge base in each embodiment of the present invention is an established knowledge base.
  • the mounting module 12 is configured to mount the content of the FAQ library to the ontology knowledge base according to the knowledge ontology, the ontology relationship, and the knowledge classification directory extracted by the directory extraction module 11, to generate a fusion knowledge base.
  • the fusion knowledge base may be generated as follows in the embodiments of the present invention. The mounting module 12 includes: a first mounting unit 13 configured to mount the knowledge ontology of the ontology knowledge base onto the knowledge classification directory extracted by the directory extraction module 11; and
  • a second mounting unit 14 configured to mount the content of the FAQ library onto the knowledge classification directory and the knowledge ontology.
  • the fusion knowledge base constructed by the fusion knowledge base processing apparatus provided by the embodiment of the present invention can also be represented by the knowledge map shown in FIG. 3. The content corresponding to the nodes and edges of the knowledge map has been described in the above embodiment and is not repeated here.
  • the fusion knowledge base processing device provided by the embodiment of the present invention is used to execute the fusion knowledge base processing method provided by the embodiments shown in FIG. 1 and FIG. 2 of the present invention, and has corresponding functional modules; its implementation principle and technical effect are similar and are not repeated here.
  • the ontology knowledge base in the embodiments of the present invention has been described as an established knowledge base, which may be, for example, a traditional manually constructed knowledge base.
  • the ontology knowledge base in this embodiment may be constructed as shown in FIG. 7, which is a schematic structural diagram of another fusion knowledge base processing device provided by the embodiment of the present invention. The fusion knowledge base processing device further includes:
  • an input module 15 configured to input an initial corpus before the directory extraction module 11 extracts the knowledge classification directory from the FAQ library and the ontology knowledge base, the initial corpus including a text corpus and an FAQ corpus; and
  • a building module 16 configured to construct the knowledge base according to the text corpus and FAQ corpus input by the input module 15, to generate the ontology knowledge base.
  • the building module 16 may include: a determining unit 17 configured to determine an ontology target and an extraction rule, the extraction rule including an ontology relationship rule and an extraction algorithm;
  • an ontology extraction unit 18 configured to extract, according to the ontology target and the extraction algorithm determined by the determining unit 17, the knowledge ontology from the text corpus and the FAQ corpus input by the input module 15, the knowledge ontology including a domain ontology and a general ontology;
  • a relationship extraction unit 19 configured to extract, according to the ontology relationship rule determined by the determining unit 17, the ontology relationship from the knowledge ontology extracted by the ontology extraction unit 18;
  • a knowledge base generating unit 20 configured to construct the knowledge base according to the knowledge ontology extracted by the ontology extraction unit 18 and the ontology relationship extracted by the relationship extraction unit 19, to generate the ontology knowledge base.
  • the ontology knowledge base generated by the fusion knowledge base processing apparatus provided by this embodiment can also be represented by nodes and edges in the knowledge map.
  • the apparatus provided in this embodiment further includes: an inference rule module 21 configured to extract knowledge inference rules from the ontology knowledge base according to preset inference rules after the building module 16 generates the ontology knowledge base, to generate a knowledge inference rule base.
  • the knowledge inference rule base makes the knowledge base more useful. When a question is answered through the knowledge base, an answer that cannot be obtained directly can be obtained from the knowledge inference rule base according to the preset inference rules, further improving the performance of the fusion knowledge base.
  • The fusion knowledge base processing device provided by this embodiment of the present invention is configured to perform the fusion knowledge base processing method of the embodiment shown in FIG. 4 and has the corresponding functional modules; its implementation principle and technical effects are similar and are not described again here.
  • Furthermore, once the ontology knowledge base has been built automatically, it is fused with the FAQ library to form the fusion knowledge base of the above embodiments. However, the content of the built ontology knowledge base is limited and cannot contain every knowledge ontology. The content of the ontology knowledge base in this embodiment is therefore not fixed, and its scope can be extended further, including by adding FAQ corpus, as follows:
  • the input module 15 is further configured to input an incremental corpus into the ontology knowledge base after the building module 16 generates the ontology knowledge base;
  • the ontology extraction unit 18 is further configured to extract incremental ontologies from the incremental corpus input by the input module 15, using the extraction algorithm determined by the determining unit 17;
  • the relationship extraction unit 19 is further configured to extract incremental ontology relationships from the incremental ontologies extracted by the ontology extraction unit 18, according to the ontology relationship rule determined by the determining unit 17; and
  • correspondingly, the fusion knowledge base processing device in this embodiment further includes an update module, configured to update the ontology knowledge base with the incremental ontologies extracted by the ontology extraction unit 18 and the incremental ontology relationships extracted by the relationship extraction unit 19.
  • It should be noted that, in this embodiment, the update of the knowledge base may be performed, for example, as soon as the ontology knowledge base has been built, after the knowledge inference rule base has been built, or after the fusion knowledge base has been built.
  • The fusion knowledge base processing device provided by this embodiment of the present invention is used to perform the fusion knowledge base processing method of the embodiment shown in FIG. 5 and has the corresponding functional modules; its implementation principle and technical effects are similar and are not described again here.
  • FIG. 8 is a schematic structural diagram of a knowledge base management system according to an embodiment of the present invention.
  • The knowledge base management system provided in this embodiment includes a question answering device 200 and the fusion knowledge base processing device 100 of any of the embodiments shown in FIG. 6 and FIG. 7.
  • The question answering device 200 includes: a user interaction module 210, configured to obtain a question input by the user.
  • the language processing module 220 is configured to perform language processing on the questions acquired by the user interaction module 210, and the language processing includes word segmentation, part-of-speech tagging, and entity recognition.
  • The language processing in this embodiment generally refers to natural language processing. For example, after the user inputs a question, natural language processing such as word segmentation, part-of-speech tagging, and entity recognition is first performed on it.
  • Unlike traditional FAQ question answering, the knowledge base of this embodiment introduces general ontologies and domain ontologies, which mainly act on word segmentation and make it more accurate. Correct semantic understanding is possible only when the word segmentation is accurate.
  • For example, for the input question “天翼A8领航套餐如何收费” (“How is the Tianyi A8 Pilot package charged?”), a traditional word segmentation method basically yields “天/翼/A/8/领/航/套餐/如何/收费”, whereas with the fusion knowledge base generated by the embodiment of the present invention the question can be segmented as “天翼A8领航套餐/如何/收费”.
  • The normalization module 230 is configured to standardize the question processed by the language processing module 220; the standardization includes typo correction, language conversion and standardization, and word recognition.
  • This module mainly standardizes the language, for example correcting typos, converting dialect into standard Mandarin, and standardizing full names, abbreviations, and aliases; it may also identify sensitive words.
  • The semantic understanding module 240 is configured to perform semantic understanding on the question processed by the normalization module 230 to obtain the semantic information of the question.
  • After language processing and standardization, the correct semantics of the input question can usually be understood, for example by obtaining the semantic information of the question. The semantic information mainly includes extracted and expanded keywords, extracted triples and ternary relationships, context-based ellipsis recovery, coreference resolution, intent recognition, sentiment analysis, follow-up questioning, and so on.
  • the information retrieval module 250 is configured to retrieve the semantic information acquired by the semantic understanding module 240 in the ontology knowledge base to obtain an answer to the question.
  • In this embodiment, the semantic information acquired by the semantic understanding module 240, that is, keywords, triples, intent, and so on, may be retrieved in the fusion knowledge base, for example using a retrieval language such as SPARQL (SPARQL Protocol and RDF Query Language). The answer obtained can then be returned to the user.
  • Typically, both the user's input and the viewing of the returned answer are handled through the user interaction module 210, which may be, for example, a graphical user interface (GUI).
  • The knowledge base management system provided in this embodiment is configured to retrieve a question input by a user in the fusion knowledge base and to provide an answer from the content of the fusion knowledge base.
  • the resources for the retrieval in the system provided in this embodiment are generated by the fusion knowledge base processing device provided by the foregoing embodiment, and therefore have the same technical effects as the above embodiments, and are not described herein again.
  • Because the fusion knowledge base processing device 100 of the above embodiment can also generate a knowledge inference rule base after generating the ontology knowledge base, if the information retrieval module 250 obtains no answer in the fusion knowledge base, an answer can be sought by reasoning over the knowledge inference rule base. That is, the question answering device 200 provided in this embodiment further includes a knowledge inference module 260, configured to reason over the knowledge inference rule base generated by the inference rule module when the answer obtained by the information retrieval module 250 is empty, to obtain an answer to the question.
  • Further, if the answer obtained by the knowledge inference module 260 by reasoning over the knowledge inference rule base is also empty, the semantic information acquired by the semantic understanding module 240 may be retrieved in the FAQ library to obtain an answer to the question.
  • In implementation, the semantic information is retrieved in the FAQ library, similarity is calculated for the retrieval results, and the answer that falls within the threshold and ranks first is returned as the answer to the question.
  • It should be noted that the question answering device 200 in this embodiment may further include a cached answer obtaining module 270, configured to obtain the answer to the question from a cache after the normalization module 230 has standardized the question processed by the language processing module 220.
  • This approach helps system performance: common questions and the knowledge base can be kept in memory, and once the language has been standardized the answer is first looked up in the cache, which avoids a large amount of retrieval and computation.
  • In addition, the question answering device 200 provided in this embodiment may further include an answer returning module 280, configured to return the answer obtained by the information retrieval module 250, the knowledge inference module 260, or the cached answer obtaining module 270, and to output the answer to the user interaction module 210.
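  • FIG. 9 is a flowchart of intelligent question answering through the knowledge base management system shown in FIG. 8. The flow includes:
  • S201: Obtain the question input by the user.
  • S202: Perform natural language processing on the question input by the user. A general ontology library, a domain ontology library, a synonym library, and a stop-word library are typically used in this step.
  • S203: Perform language standardization on the question input by the user. A sensitive-word library and a dialect library are typically used in this step.
  • S204: Determine whether the answer to the question can be obtained from the cache. If not, perform S205; if so, perform S210.
  • S205: Perform semantic understanding on the standardized question and obtain semantic information.
  • S206: Determine whether the answer to the question can be obtained from the ontology knowledge base. If not, perform S207; if so, perform S210.
  • S207: Determine whether the answer to the question can be obtained from the knowledge inference rule base. If not, perform S208; if so, perform S210.
  • S208: Retrieve the semantic information in the FAQ library and obtain retrieval results.
  • S209: Perform similarity calculation on the retrieval results.
  • S210: Obtain and return the answer to the question.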
  • FIG. 10 is a schematic structural diagram of another knowledge base management system according to an embodiment of the present invention.
  • On the basis of the structure of the knowledge base management system shown in FIG. 8, the system provided in this embodiment further includes a management device 300. The management device 300 includes: a knowledge editing module 310, configured to edit and proofread the knowledge ontologies extracted by the fusion knowledge base processing device 100 and to add new knowledge ontologies. Knowledge can also be added manually.
  • the knowledge review module 320 is configured to perform multi-level auditing on the knowledge ontology in the fusion knowledge base. It is usually necessary to pass multiple levels of personnel review to ensure the correctness of the knowledge.
  • The statistical analysis module 330 is configured to collect statistics, while questions are asked and answered through the question answering device 200, on one or more of: popular questions, popular knowledge, knowledge ontologies in the fusion knowledge base that have not been accessed, and knowledge ontologies that produced wrong answers; and to analyze the accuracy of the answers obtained through the question answering device 200.
  • the online test module 340 is configured to verify the knowledge ontology edited by the knowledge editing module 310 to obtain an online test result.
  • The rule management module 350 is configured to manage and edit the extraction methods and rules, for example through manual editing.
  • the data management module 360 is configured to store the initial content in the FAQ library and the ontology knowledge base, and store the content in the fusion knowledge base according to a preset time. This module manages the imported original attachments, such as storage, multi-version comparison, and attachment archiving.
  • the intelligent quality check module 370 is configured to determine the accuracy of the answer obtained by the question answering device 200 by sampling test, and is further configured to automatically detect the accuracy of the answer during the operation of the question answering device 200.
  • Each module of the fusion knowledge base processing device in the embodiments of the present invention, and each submodule of each module, may be implemented by a processor in the terminal, or alternatively by a logic circuit. In implementation, the processor may be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or the like.
  • It should be noted that, in the embodiments of the present invention, if the above fusion knowledge base processing method is implemented in the form of a software function module and sold or used as a stand-alone product, it may also be stored in a computer readable storage medium.
  • Based on this understanding, the part of the technical solution of the embodiments of the present invention that is essential or contributes to the prior art may be embodied in the form of a software product. The software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention.
  • The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read only memory (ROM), a magnetic disk, or an optical disc. The embodiments of the present invention are thus not limited to any specific combination of hardware and software.
  • the embodiment of the present invention further provides a computer storage medium, where the computer storage medium stores computer executable instructions, and the computer executable instructions are configured to execute the fusion knowledge base processing method in the embodiment of the present invention.
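  • Based on the foregoing embodiments, an embodiment of the present invention further provides a terminal, including a storage medium and a processor, where the storage medium is configured to store executable instructions, the processor is configured to execute the stored executable instructions, and the executable instructions are configured to perform the following steps:
  • extracting a knowledge classification directory from a frequently asked questions (FAQ) library and an ontology knowledge base, the ontology knowledge base including knowledge ontologies and ontology relationships; and
  • mounting the content of the FAQ library into the ontology knowledge base according to the knowledge ontologies, the ontology relationships, and the knowledge classification directory, to generate a fusion knowledge base.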
  • Based on this understanding, the part of the technical solution of the present invention that is essential or contributes to the prior art may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc), including a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present invention.
  • It should be noted that the knowledge base management system provided in this embodiment can also synchronize knowledge. This mainly means synchronizing users' question records and logs into the knowledge base management system, where an administrator or a knowledge specialist performs quality checks on the records and monitors them in real time.
  • In addition, the knowledge base management system of this embodiment can extract knowledge from questions that were answered inaccurately or not answered at all, and incrementally update it into the fusion knowledge base so that it can be retrieved next time.
  • The synchronization and the update are performed in the same way as the update by the fusion knowledge base processing device described above, and are therefore not described again here.
  • In the embodiments of the present invention, a knowledge classification directory is extracted from the FAQ library and the ontology knowledge base, and the content of the FAQ library is mounted into the ontology knowledge base according to the knowledge ontologies and ontology relationships in the ontology knowledge base and the extracted knowledge classification directory, generating a fusion knowledge base.
  • When the fusion knowledge base generated in this way is applied to a question answering system, answers that previously had to be obtained through the FAQ library and the ontology knowledge base separately can all be obtained through the fusion knowledge base.
  • This combines the advantages of the FAQ library and the ontology knowledge base while avoiding their respective disadvantages. The method provided by the present invention thus addresses the problem that the performance of prior-art question answering systems is difficult to improve because retrieval-based, knowledge-base-based, and community collaborative question answering systems each have defects in practical applications.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

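A fusion knowledge base processing method and device, a knowledge base management system, and a storage medium. The fusion knowledge base processing method includes: extracting a knowledge classification directory from a frequently asked questions (FAQ) library and an ontology knowledge base, the ontology knowledge base including knowledge ontologies and ontology relationships (S110); and mounting the content of the FAQ library into the ontology knowledge base according to the knowledge ontologies, the ontology relationships, and the knowledge classification directory, to generate a fusion knowledge base (S120). The method improves the performance of a question answering system.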

Description

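Fusion knowledge base processing method and device, knowledge base management system, and storage medium
Technical Field
The present invention relates to the field of computer and communication technologies, and in particular to a fusion knowledge base processing method and device, a knowledge base management system, and a storage medium.
Background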
目前的问答系统通常分为三类,包括:检索式问答系统、基于知识库的问答系统和社区协作式问答系统。其中,广泛应用的是检索式问答系统,如智能客服系统,都是基于常见问题(Frequently Asked Questions,FAQ)的检索式问答系统;基于知识库的问答系统,通常很少被采用,应用例如为专家系统等;社会协作式问答系统,例如百度知道等,很难判断问答的准确性和处理问答结果,更多的是通过群体智慧来解决问题,通常缺乏判断的标准规则。
对检索式问答系统和基于知识库的知识推理问答系统的问答原理和特点进行分析。对于检索式问答系统,主要是为了克服搜索引擎的不足应运而生的,搜索引擎的不足例如包括无法正确理解和表达用户的信息需求意图、返回的信息太多非用户想要的简单答案,其主要的特点是利用信息检索以及浅层的自然语言技术从文本库或网页中抽取出答案,具有不受领域限制、能理解用户意图,返回的答案更准确等优点,但该问答系统也存在很多缺点,由于检索式问答系统的语料主要是FAQ,则必然存在以下问题:第一、FAQ为常见问题,通过统计得出,存在口语化想象,不够标准,通常会导致匹配不精准,准确率低等问题;第二、该系统的本质技术是检索,答案的返回依靠相似度计算,准确率难以提高;第三、对于没有答案的问题,无法通过推理获取答案,系统的智能性较差;第四、FAQ形式的知识 粒度较粗,对于知识管理人员而言,没有层次结构,管理成本和门槛较高。对于基于知识库的问答系统,需要先构建知识库,答案可以从知识库中直接获取得到,还可以在知识库的基础上经过推理得到,该问答系统的特点是知识更细粒度化,因此可以检索到更加精准的答案,并且在没有检索到答案时,还可以通过推理获得间接答案,但该问答系统中的知识库通常基于特定领域,知识库的规模有限,影响了问答系统的使用范围。
综上所述,现有技术中的问答系统,由于检索式问答系统、基于知识库的问答系统和社区协作式问答系统在实际应用中都存在各项缺陷,而导致问答系统的性能难以提高。
发明内容
为了解决上述技术问题,本发明实施例提供了一种融合知识库处理方法和装置及知识库管理系统、存储介质,以解决现有技术中的问答系统,由于检索式问答系统、基于知识库的问答系统和社区协作式问答系统在实际应用中都存在各项缺陷,而导致问答系统的性能难以提高。
第一方面,本发明实施例提供一种融合知识库处理方法,包括:
从常见问题FAQ库和本体知识库中抽取知识分类目录,所述本体知识库中包括知识本体和本体关系;
根据所述知识本体、所述本体关系和所述知识分类目录,将所述FAQ库的内容挂载到所述本体知识库中,生成融合知识库。
第二方面,本发明实施例提供一种融合知识库处理装置,包括:
目录抽取模块,配置为从常见问题FAQ库和本体知识库中抽取知识分类目录,所述本体知识库中包括知识本体和本体关系;
挂载模块,配置为根据所述知识本体、所述本体关系和所述目录抽取模块抽取的知识分类目录,将所述FAQ库的内容挂载到本体知识库中,生成融合知识库。
第三方面,本发明实施例提供一种知识库管理系统,包括:问答装置和如上述第二方面中任一项所述的融合知识库处理装置;
其中,所述问答装置包括:用户交互模块,配置为获取用户输入的问题;
语言处理模块,配置为对所述用户交互模块获取的问题进行语言处理,所述语言处理包括分词、词性标注和实体识别;
标准化模块,配置为对所述语言处理模块处理后的问题进行标准化处理,所述标准化处理包括错别字纠正、语言转换和标准化和词语识别;
语义理解模块,配置为对所述标准化模块处理后的问题进行语义理解,获取所述问题的语义信息;
信息检索模块,配置为将所述语义理解模块获取的语义信息在所述本体知识库中进行检索,获取所述问题的答案。
第四方面,本发明实施例提供一种计算机存储介质,所述计算机存储介质中存储有计算机可执行指令,该计算机可执行指令配置为执行本发明第一方面实施例提供的融合知识库处理方法。
本发明提供的融合知识库处理方法和装置及知识库管理系统、存储介质,通过从FAQ库和本体知识库中抽取知识分类目录,并根据本体知识库中的知识本体、本体关系和已抽取的知识分类目录,将FAQ库的内容挂载到本体知识库中,生成融合知识库,将本实施例生成的融合知识库应用于问答系统中,实现了原先通过FAQ库和本体知识库进行知识问答的答案都可以通过该融合知识库得出,在结合FAQ库和本体知识库优点的同时,可以避免FAQ库和本体知识库各自的缺点;本发明提供的方法解决了现有技术中的问答系统,由于检索式问答系统、基于知识库的问答系统和社区协作式问答系统在实际应用中都存在各项缺陷,而导致问答系统的性能难以提高。
附图说明
附图用来提供对本发明技术方案的进一步理解,并且构成说明书的一部分,与本申请的实施例一起用于解释本发明的技术方案,并不构成对本发明技术方案的限制。
图1为本发明实施例提供的一种融合知识库处理方法的流程图;
图2为本发明实施例提供的另一种融合知识库处理方法的流程图;
图3为本发明图1所示实施例提供的融合知识库处理方法中一种知识图谱示意图;
图4为本发明实施例提供的又一种融合知识库处理方法的流程图;
图5为本发明实施例提供的再一种融合知识库处理方法的流程图;
图6为本发明实施例提供的一种融合知识库处理装置的结构示意图;
图7为本发明实施例提供的另一种融合知识库处理装置的结构示意图;
图8为本发明实施例提供的一种知识库管理系统的结构示意图;
图9为通过图8所示知识库管理系统进行智能问答的流程图;
图10为本发明实施例提供的另一种知识库管理系统的结构示意图。
具体实施方式
为使本发明的目的、技术方案和优点更加清楚明白,下文中将结合附图对本发明的实施例进行详细说明。需要说明的是,在不冲突的情况下,本申请中的实施例及实施例中的特征可以相互任意组合。
在附图的流程图示出的步骤可以在诸如一组计算机可执行指令的计算机系统中执行。并且,虽然在流程图中示出了逻辑顺序,但是在某些情况下,可以以不同于此处的顺序执行所示出或描述的步骤。
基于现有技术中检索式问答系统和基于知识库的问答系统的各项缺点,可以发现,基于知识库的问答系统的优点可以一定程度上弥补检索式问答系统中准确率难以提高、无法通过推理获取答案和知识粒度较粗等问 题,而检索式问答系统也能弥补基于知识库的问答系统,由于知识库规模受限而带来的各项缺点。显然地,检索式问答系统和基于知识库的问答系统两者的优缺点可以形成互补,因此,本发明基于上述两种问答系统优缺点的互补性,提出一种融合FAQ和知识库的技术方案,并且将融合后的知识库应用于问答系统和管理系统,从而可以显著的提高系统性能。
下面通过具体的实施例对本发明的技术方案进行详细说明,本发明以下各实施例中的本体知识库为基于特定领域的知识库,其可以是基于某一项技术领域,也可以是包含多项技术领域的内容。本发明提供以下几个具体的实施例可以相互结合,对于相同或相似的概念或过程可能在某些实施例不再赘述。
图1为本发明实施例提供的一种融合知识库处理方法的流程图。本实施例提供的融合知识库处理方法可以应用于知识库系统中,以提高知识库系统的使用范围,该方法可以由融合知识库处理装置执行,该融合知识库处理装置通过硬件和软件结合的方式来实现,该装置可以集成在终端设备的处理器中,供处理器调用使用。如图1所示,本实施例的方法可以包括:
S110,从FAQ库和本体知识库中抽取知识分类目录,该本体知识库中包括知识本体和本体关系。
本发明实施例提供的融合知识库处理方法,目的在于构建一种使用性能较为完善并且普遍适用的知识库,其主要思想是结合FAQ库和本体知识库的优点,并且可以改善FAQ库和本体知识库各自的缺点。本发明各实施例中的本体知识库为一个已建立的知识库,例如为基于特定技术领域的知识库,该本体知识库中通常具有知识本体,并且各知识本体之间存在本体关系,例如知识本体包括通用本体和领域本体等,本体关系包括类属关系、等同关系、属性关系和组成关系等。
本实施例的FAQ库中包括FAQ语料,为一些经常提问到的问题和关键 词,结合FAQ语料和本体知识库中的知识本体抽取知识分类目录,抽取该知识分类目录的目的在于便于知识梳理和管理,例如可以采用知识库的上位词进行抽取。
S120,根据知识本体、本体关系和知识分类目录,将FAQ库的内容挂载到本体知识库中,生成融合知识库。
本实施例中抽取出知识分类目录,即已经提供了知识梳理和管理的基础条件,因此,可以根据各知识本体与知识分类目录的关系,FAQ索引库中的内容分别于知识分类目录和知识本体的关系,以及结合本体知识库中的本体关系,将FAQ库的内容挂载到本体知识库中以构建出融合知识库。本实施例中构建的融合知识库不仅具有原本体知识库中的知识本体和本体关系,还将FAQ库的内容以知识分类和结构关系的方式融合到本体知识库中,构成了融合知识库;使得原先通过FAQ库和本体知识库进行知识问答的答案都可以通过该融合知识库得出。
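One implementation of this embodiment is shown in FIG. 2, a flowchart of another fusion knowledge base processing method provided by an embodiment of the present invention, which illustrates how the fusion is performed. S120 may include: S121, mounting the knowledge ontologies of the ontology knowledge base onto the knowledge classification directory, which classifies the knowledge ontologies; and S122, mounting the content of the FAQ library onto the knowledge classification directory and the knowledge ontologies, which classifies the FAQ library, with the knowledge classification directory and the knowledge ontologies as the classes. The content and structure of the fused knowledge base can be represented by a knowledge graph. For example, FIG. 3 is a schematic diagram of a knowledge graph in the fusion knowledge base processing method provided by the embodiment shown in FIG. 1. The knowledge classification directory in FIG. 3 includes directory category A and directory category B; the knowledge ontologies in the ontology knowledge base are all mounted onto this directory; and the content of the FAQ library, for example FAQ1, FAQ2, and FAQ3, can be mounted onto the knowledge classification directory or onto a knowledge ontology. In addition, the knowledge graph also contains the behaviors of the knowledge ontologies and the attributes belonging to them, and the root node of the knowledge graph represents the superclass, that is, the upper-level classification.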
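The two mounting steps can be illustrated with a minimal Python sketch. The specification does not prescribe any data structure, so the `Node` class, the `mount` and `build_fusion_kb` functions, and the sample names are all assumptions made for illustration; the sketch only shows the shape of S121 and S122.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A knowledge-graph node: a directory category, a knowledge ontology, or an FAQ entry."""
    name: str
    kind: str                                   # "category", "ontology", or "faq"
    children: list = field(default_factory=list)


def mount(parent: Node, child: Node) -> None:
    """Attach child under parent, which adds one edge to the graph."""
    parent.children.append(child)


def build_fusion_kb(categories, ontology_to_category, faq_to_target):
    """Hypothetical helper: S121 mounts ontologies, S122 mounts FAQ entries."""
    root = Node("root", "category")             # the root node stands for the superclass
    nodes = {}
    for cat in categories:
        nodes[cat] = Node(cat, "category")
        mount(root, nodes[cat])
    for onto, cat in ontology_to_category.items():      # S121
        nodes[onto] = Node(onto, "ontology")
        mount(nodes[cat], nodes[onto])
    for faq, target in faq_to_target.items():           # S122
        nodes[faq] = Node(faq, "faq")
        mount(nodes[target], nodes[faq])
    return root, nodes


# Made-up sample data in the spirit of FIG. 3.
root, nodes = build_fusion_kb(
    categories=["Category A", "Category B"],
    ontology_to_category={"Ringback tone service": "Category A"},
    faq_to_target={"FAQ1": "Category B", "FAQ2": "Ringback tone service"},
)
```

In this sketch an FAQ entry may hang either from a category or from an ontology, which mirrors the statement that FAQ content is mounted onto the knowledge classification directory or a knowledge ontology.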
以下通过一个具体实例说明知识本体与其行为、属性等,例如问题“彩铃业务如何办理”,其中,“彩铃业务”为知识本体,“办理”为事件行为,彩铃业务的收费标准为属性;再例如,A业务套餐和B业务套餐为两个资费不同的套餐,均为基础套餐,在A业务套餐的基础上可以叠加C升级套餐,可知,A业务套餐与B业务套餐是等同关系,A业务套餐与C升级套餐为上层和下层的关系,即知识本体间的关系可以是横向对比的,也可以是纵向对比的。
本实施例所提供的融合知识库处理方法,通过从FAQ库和本体知识库中抽取知识分类目录,并根据本体知识库中的知识本体、本体关系和已抽取的知识分类目录,将FAQ库的内容挂载到本体知识库中,生成融合知识库,将本实施例生成的融合知识库应用于问答系统中,实现了原先通过FAQ库和本体知识库进行知识问答的答案都可以通过该融合知识库得出,在结合FAQ库和本体知识库优点的同时,可以避免FAQ库和本体知识库各自的缺点;本实施例提供的方法解决了现有技术中的问答系统,由于检索式问答系统、基于知识库的问答系统和社区协作式问答系统在实际应用中都存在各项缺陷,而导致问答系统的性能难以提高。
需要说明的是,上述实施例中已经说明本发明各实施例中的本体知识库为一个已建立的知识库,该本体知识库例如可以为传统的人工构建的知识库,然而,由于本体知识库的知识量大、结构关系复杂,人工构建较为复杂且成本较高。另外,本实施例中的本体知识库还可以为自动构建的,如图4所示,为本发明实施例提供的又一种融合知识库处理方法的流程图,在图4所示实施例中,本体知识库的构建方式可以为:在S110之前还包括:S100,输入初始语料,该初始语料包括文本语料和FAQ语料;S101,根据文本语料和FAQ语料进行知识库的构建,生成本体知识库。本实施例中的文本语料为现有技术中知识库的输入项,本实施例中本体知识库的输入项 还包括FAQ语料,可以作为后续融合的基础。
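In this embodiment, the ontology knowledge base is not built manually as in the prior art. S101 includes: S102, determining ontology targets and extraction rules, the extraction rules including ontology relationship rules and an extraction algorithm; S103, extracting knowledge ontologies from the text corpus and the FAQ corpus using the ontology targets and the extraction algorithm, the knowledge ontologies including domain ontologies and general ontologies; S104, extracting ontology relationships from the knowledge ontologies according to the ontology relationship rules; and S105, constructing the knowledge base from the extracted knowledge ontologies and ontology relationships to generate the ontology knowledge base. In this embodiment, the targets of the knowledge ontologies and the extraction rules to be used later can be defined first, that is, which ontologies are domain ontologies and which are general ontologies, the types of ontology relationships, how the extraction rules are formulated, and so on. This information can be defined by an operator and entered into the terminal device through a human-machine interface. Once the rules and targets are defined, knowledge ontologies such as entities, attributes, behaviors, and events are extracted by predefined code, which is the form the extraction algorithm takes. The extraction algorithm may be, for example, Stanford's named entity recognizer (Named Entity Recognition, NER), or a self-designed algorithm. General ontologies and domain ontologies are finally extracted from the initial corpus; the knowledge ontologies are the nodes of the knowledge graph, as shown in FIG. 3. After the knowledge ontologies are determined, a further extraction is performed, this time extracting ontology relationships according to code for the ontology relationship rules. Ontology relationships include, but are not limited to, generic relationships, equivalence relationships, attribute relationships, and composition relationships, and can be extracted, for example, by defined rules. Relationship extraction determines the edges between nodes of the knowledge graph. Once the nodes and the edges between them are determined, the knowledge base is constructed so that it forms a knowledge graph, that is, the ontology knowledge base.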
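Steps S102 to S105 can be sketched as follows. This is an illustration under stated assumptions, not the extraction algorithm of the specification: a plain term lookup stands in for the extraction algorithm (the text mentions a named entity recognizer as one option), regular expressions stand in for the ontology relationship rules, and every term, pattern, and relation name below is hypothetical.

```python
import re

# S102: ontology targets and extraction rules (illustrative values only).
ONTOLOGY_TARGETS = {
    "domain": ["Package A", "Package B", "Upgrade C"],
    "general": ["fee", "user"],
}
RELATION_RULES = [
    # (relation name, pattern with two capture groups: head and tail)
    ("equivalent", re.compile(r"(Package \w+) and (Package \w+) are both basic packages")),
    ("upgrade_of", re.compile(r"(Upgrade \w+) can be added on top of (Package \w+)")),
]


def extract_ontologies(corpus):
    """S103: a trivial stand-in extraction algorithm that looks up target terms."""
    found = {}
    for kind, terms in ONTOLOGY_TARGETS.items():
        for term in terms:
            if any(term in sentence for sentence in corpus):
                found[term] = kind
    return found


def extract_relations(corpus, ontologies):
    """S104: apply the relationship rules, keeping triples between known ontologies."""
    triples = []
    for sentence in corpus:
        for name, pattern in RELATION_RULES:
            for head, tail in pattern.findall(sentence):
                if head in ontologies and tail in ontologies:
                    triples.append((head, name, tail))
    return triples


def build_ontology_kb(text_corpus, faq_corpus):
    """S105: ontologies become nodes, relationships become edges."""
    corpus = list(text_corpus) + list(faq_corpus)
    nodes = extract_ontologies(corpus)
    edges = extract_relations(corpus, nodes)
    return {"nodes": nodes, "edges": edges}


kb = build_ontology_kb(
    text_corpus=["Package A and Package B are both basic packages.",
                 "Upgrade C can be added on top of Package A."],
    faq_corpus=["What is the fee for Package A?"],
)
```

Keeping the targets and rules as data rather than code reflects the point that an operator defines them and the extraction step merely applies them.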
进一步地,在本体知识库构建完成之后,本实施例提供的方法在S101之后还包括:S106,根据预置的推理规则,在语义知识库中进行知识推理规则的抽取,生成知识推理规则库。知识推理规则库的形成有利于该知识库的使用,在通过该知识库进行问答时,无法直接得到的答案可以根据预置的推理规则从知识推理规则库中得到,进一步提高了融合知识库的使用 性能。
需要说明的是,传统的知识库构建均为人工构建,本实施例采用自动化构建的方式形成本体知识库和知识推理规则库,在此基础上,还可以采用人工校对的方式监测知识库的内容,以确保知识的准确率和召回率。
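Furthermore, FIG. 5 is a flowchart of yet another fusion knowledge base processing method provided by an embodiment of the present invention. On the basis of the above embodiments, after the ontology knowledge base has been built automatically, it is fused with the FAQ library to form the fusion knowledge base of the above embodiments. However, the content of the built ontology knowledge base is limited and cannot contain all knowledge ontologies, and the content of the FAQ library is likewise only a summary of common questions derived from user input, so in actual use users' demands on the knowledge base keep growing. The content of the ontology knowledge base in this embodiment is therefore not fixed; its scope can be extended further, including by adding FAQ corpus. After S101, the method may further include: S130, inputting an incremental corpus into the ontology knowledge base; S140, extracting incremental ontologies from the incremental corpus using the extraction algorithm; S150, extracting incremental ontology relationships from the incremental ontologies according to the ontology relationship rules; and S160, updating the ontology knowledge base with the extracted incremental ontologies and incremental ontology relationships.
In this embodiment, the ontology knowledge base is updated in much the same way as it is built: ontologies and ontology relationships are extracted from the input corpus, and the knowledge base is constructed from what is extracted. During an update, the incremental corpus may be the latest domain knowledge, knowledge content added for the knowledge base, or users' question records or logs. The aim is to add new knowledge and new question content to the ontology knowledge base; FAQ corpus can be added at the same time, which also updates the fusion knowledge base.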
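The incremental update of S130 to S160 can be sketched as a merge of newly extracted nodes and edges into an existing knowledge base. The function and variable names are hypothetical, and the two toy extractors are placeholders for whatever extraction algorithm and relationship rules were fixed when the knowledge base was first built.

```python
def update_ontology_kb(kb, incremental_corpus, extract_ontologies, extract_relations):
    """Extract from the incremental corpus only, then merge into the existing KB."""
    new_nodes = extract_ontologies(incremental_corpus)               # S140
    new_edges = extract_relations(incremental_corpus, new_nodes)     # S150
    added_nodes = set(new_nodes) - kb["nodes"]
    added_edges = set(new_edges) - kb["edges"]
    kb["nodes"] |= added_nodes                                       # S160
    kb["edges"] |= added_edges
    return added_nodes, added_edges


def toy_ontologies(corpus):
    """Placeholder extractor: look up a fixed list of terms."""
    terms = ["Package A", "Package D"]
    return {t for t in terms if any(t in s for s in corpus)}


def toy_relations(corpus, ontologies):
    """Placeholder rule: two known terms in one sentence are recorded as 'related'."""
    triples = set()
    for sentence in corpus:
        present = sorted(t for t in ontologies if t in sentence)
        for i, head in enumerate(present):
            for tail in present[i + 1:]:
                triples.add((head, "related", tail))
    return triples


kb = {"nodes": {"Package A"}, "edges": set()}
added_nodes, added_edges = update_ontology_kb(
    kb,
    ["Package D replaces Package A for new users."],   # S130: incremental corpus
    toy_ontologies,
    toy_relations,
)
```

Returning only what was actually added makes it easy to route new entries to manual proofreading, which the description suggests as a complement to automatic construction.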
需要说明的是,本实施例不限制S130~S160在融合知识库处理方法中的执行顺序,例如可以是在S101之后执行的,还可以是在S106之后执行的,还可以是在S120之后执行的,图5为S130~S160在S120之后执行为例予以示出,即本实施例在构建融合知识库之后对知识库有更新需求。
图6为本发明实施例提供的一种融合知识库处理装置的结构示意图。本实施例提供的融合知识库处理装置可以应用于知识库管理系统中,以提高知识库系统的使用范围,该融合知识库处理装置可以通过硬件和软件结合的方式来实现,该装置可以集成在终端设备的处理器中,供处理器调用使用。如图6所示,本实施例的融合知识库处理装置包括:目录抽取模块11和挂载模块12。
其中,目录抽取模块11,配置为从常见问题FAQ库和本体知识库中抽取知识分类目录,该本体知识库中包括知识本体和本体关系。
本发明实施例提供的融合知识库处理装置,目的在于构建一种使用性能较为完善并且普遍适用的知识库,其主要思想是结合FAQ库和本体知识库的优点,并且可以改善FAQ库和本体知识库各自的缺点;本发明各实施例中的本体知识库为一个已建立的知识库。
挂载模块12,配置为根据知识本体、本体关系和目录抽取模块11抽取的知识分类目录,将FAQ库的内容挂载到本体知识库中,生成融合知识库。
本发明各实施例中生成融合知识库的方式可以为,挂载模块12包括:第一挂载单元13,配置为将本体知识库的知识本体挂载到目录抽取模块11抽取的知识分类目录上;第二挂载单元14,配置为将FAQ库的内容挂载到知识分类目录和知识本体上。通过本发明实施例提供的融合知识库处理装置构建的融合知识库同样可以用图3所示知识图谱来表示,知识图谱中节点和边对应的内容在上述实施例中已经说明,故在此不再赘述。
本发明实施例提供的融合知识库处理装置用于执行本发明图1和图2所示实施例提供的融合知识库处理方法,具备相应的功能模块,其实现原理和技术效果类似,此处不再赘述。
需要说明的是,上述实施例中已经说明本发明各实施例中的本体知识库为一个已建立的知识库,例如可以为传统的人工构建的知识库,也可以 为自动构建的本体知识库,本实施例中构建本体知识库的方式如图7所示,为本发明实施例提供的另一种融合知识库处理装置的结构示意图,融合知识库处理装置还包括:输入模块15,配置为在目录抽取模块11从FAQ库和本体知识库中抽取知识分类目录之前,输入初始语料,该初始语料包括文本语料和FAQ语料;构建模块16,配置为根据输入模块15输入的文本语料和FAQ语料进行知识库的构建,生成本体知识库。
本实施例在实现中,构建模块16可以包括:确定单元17,配置为确定本体目标和抽取规则,该抽取规则包括本体关系规则和抽取算法;本体抽取单元18,配置为通过确定单元17确定的本体目标和抽取算法从输入模块15输入的文本语料和FAQ语料中抽取知识本体,该知识本体包括领域本体和通用本体;关系抽取单元19,配置为根据确定单元17确定的本体关系规则从本体抽取单元18抽取的知识本体中抽取本体关系;知识库生成单元20,配置为根据本体抽取单元18抽取的知识本体和关系抽取单元19抽取的本体关系进行知识库的构建,生成本体知识库。通过本实施例提供的融合知识库处理装置生成的本体知识库同样可以通过知识图谱中的节点和表来表示。
进一步地,在本体知识库构建完成之后,本实施例提供的装置还包括:推理规则模块21,配置为在构建模块16生成本体知识库之后,根据预置的推理规则,在本体知识库中进行知识推理规则的抽取,生成知识推理规则库。知识推理规则库的形成有利于该知识库的使用,在通过该知识库进行问答时,无法直接得到的答案可以根据预置的推理规则从知识推理规则库中得到,进一步提高了融合知识库的使用性能。
本发明实施例提供的融合知识库处理装置配置为执行本发明图4所示实施例提供的融合知识库处理方法,具备相应的功能模块,其实现原理和技术效果类似,此处不再赘述。
更进一步地,本实施例在完成本体知识库的自动构建后,该自动构建出的本体知识库配置为和FAQ库融合形成上述实施例中的融合知识库;然而,已构建的本体知识库的内容是有限的,不可能包含所有的知识本体,因此,本实施例中的本体知识库的内容不是一成不变的,可以进一步增加本体知识库的范围,其中包括增加FAQ语料,该方式可以为:输入模块15,还配置为在构建模块16生成本体知识库之后,在该本体知识库中输入增量语料;本体抽取单元18,还配置为通过确定单元17确定的抽取算法从输入模块15输入的增量语料中抽取增量本体;关系抽取单元19,还配置为根据确定单元17确定的本体关系规则从本体抽取单元18抽取的增量本体中抽取增量本体关系;相应地,本实施例中的融合知识库处理装置还包括:更新模块,配置为通过本体抽取单元18抽取的增量本体和关系抽取单元19抽取的增量本体关系对本体知识库进行更新。
需要说明的是,本实施例更新知识库的执行顺序,例如可以是在本体知识库构建完成后就执行更新,还可以是在知识推理规则库构建完成后执行更新,还可以是在是在融合知识库构建完成后执行更新。
本发明实施例提供的融合知识库处理装置用于执行本发明图5所示实施例提供的融合知识库处理方法,具备相应的功能模块,其实现原理和技术效果类似,此处不再赘述。
图8为本发明实施例提供的一种知识库管理系统的结构示意图。本实施例提供的知识库管理系统包括问答装置200,以及上述图6和图7所示任一实施例中的融合知识库处理装置100。
其中,该问答装置200包括:用户交互模块210,配置为获取用户输入的问题。
语言处理模块220,配置为对用户交互模块210获取的问题进行语言处理,语言处理包括分词、词性标注和实体识别。
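The language processing in this embodiment generally refers to natural language processing. For example, after the user inputs a question, natural language processing such as word segmentation, part-of-speech tagging, and entity recognition is first performed on the question. Unlike traditional FAQ question answering, the knowledge base of this embodiment introduces general ontologies and domain ontologies, which mainly act on word segmentation and make it more accurate; correct semantic understanding is possible only when the word segmentation is accurate. For example, for the input question “天翼A8领航套餐如何收费” (“How is the Tianyi A8 Pilot package charged?”), a traditional word segmentation method basically yields “天/翼/A/8/领/航/套餐/如何/收费”, whereas with the fusion knowledge base generated by the embodiment of the present invention the question can be segmented as “天翼A8领航套餐/如何/收费”.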
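The effect of adding domain ontologies to the segmenter's vocabulary can be reproduced with a small dictionary-based segmenter. The specification does not name a segmentation algorithm; forward maximum matching is used here purely as an assumption, and the two lexicons are made up for the example.

```python
def segment(text, lexicon, max_len=8):
    """Forward maximum matching: take the longest lexicon entry at each position,
    falling back to a single character when nothing matches."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


GENERAL_LEXICON = {"套餐", "如何", "收费"}                   # general vocabulary only
DOMAIN_LEXICON = GENERAL_LEXICON | {"天翼A8领航套餐"}        # plus one domain ontology

question = "天翼A8领航套餐如何收费"
without_domain = segment(question, GENERAL_LEXICON)
with_domain = segment(question, DOMAIN_LEXICON)
print("/".join(without_domain))   # 天/翼/A/8/领/航/套餐/如何/收费
print("/".join(with_domain))      # 天翼A8领航套餐/如何/收费
```

With only general vocabulary the product name falls apart into single characters; once the domain ontology is in the lexicon it survives as one token, which is the behavior the paragraph describes.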
标准化模块230,配置为对语言处理模块220处理后的问题进行标准化处理,标准化处理包括错别字纠正、语言转换和标准化和词语识别。
本实施例中的该模块主要将语言标准化,比如错别字纠正、方言转化为标准普通话、全/简/别称的标准化处理,还可以包括敏感词的识别等。
语义理解模块240,配置为对标准化模块230处理后的问题进行语义理解,获取问题的语义信息。
在语言处理和标准化后,通常可以理解输入问题的正确语义,例如可以是获取到问题的语义信息,该语义信息主要包括抽取和扩展的关键词、抽取的三元组和三元关系、基于上下文的省略回复、指代消解、意图识别、情感分析、问题追问等。
信息检索模块250,用配置为将语义理解模块240获取的语义信息在本体知识库中进行检索,获取问题的答案。
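In this embodiment, the semantic information acquired by the semantic understanding module 240, that is, keywords, triples, intent, and so on, can be retrieved in the fusion knowledge base, for example using a retrieval language such as SPARQL (SPARQL Protocol and RDF Query Language). The answer obtained can then be returned to the user. Typically, both the user's input and the viewing of the returned answer are handled through the user interaction module 210, which may be, for example, a graphical user interface (Graphical User Interface, GUI).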
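To make the retrieval step concrete, the sketch below shows what a SPARQL query for one extracted triple pattern might look like, together with a dependency-free matcher that evaluates the same pattern over an in-memory list of triples. The `ex:` namespace, the property names, and the sample facts are invented for illustration; a real deployment would send the query to an RDF store rather than use the toy `match` function.

```python
# Hypothetical SPARQL text for "what is the fee of the ringback tone service?".
# The ex: namespace and the property names are illustrative assumptions.
SPARQL_QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?fee WHERE { ex:RingbackToneService ex:fee ?fee . }
"""

# Made-up triples standing in for the fusion knowledge base.
TRIPLES = [
    ("RingbackToneService", "fee", "5 yuan per month"),
    ("RingbackToneService", "category", "Value-added services"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None plays the role of a SPARQL variable."""
    return [
        (s, p, o) for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Same pattern as SPARQL_QUERY: subject and predicate bound, object unknown.
answers = [o for (_, _, o) in match(TRIPLES, subject="RingbackToneService", predicate="fee")]
```

The triple extracted during semantic understanding maps directly onto the query pattern: the known elements are bound, and the missing element is the variable whose binding is the answer.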
本实施例提供的知识库管理系统配置为对用户输入的问题在融合知识库中进行检索,并通过融合知识库的内容给出答案。本实施例提供的系统中的用于检索的资源为上述实施例提供的融合知识库处理装置所生成的,因此具有与上述实施例相同的技术效果,此处不再赘述。
由于本发明上述实施例提供的融合知识库处理装置100在生成本体知识库后,还能进一步生成知识推理规则库,若上述通过信息检索模块250在融合知识库中没有得到答案,则可以进一步根据知识推理规则库,进行推理来获得答案;即本实施例提供的问答装置200还包括:知识推理模块260,还配置为在信息检索模块250获取到的答案为空时,根据推理规则模块生成的知识推理规则库进行推理,获取上述问题的答案。
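Further, if the answer obtained by the knowledge inference module 260 by reasoning over the knowledge inference rule base is also empty, the semantic information acquired by the semantic understanding module 240 can be retrieved in the FAQ library to obtain the answer to the question. In implementation, the semantic information is retrieved in the FAQ library, similarity is calculated for the retrieval results, and the answer that falls within the threshold and ranks first is returned as the answer to the question.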
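The FAQ fallback can be sketched as rank-then-threshold. The specification names neither the similarity measure nor the threshold, so Jaccard similarity over tokens, the value 0.5, the reading of "within the threshold" as "at least the threshold", and the sample FAQ entries are all assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections (one possible similarity measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def faq_answer(query_tokens, faq, threshold=0.5):
    """Rank FAQ entries by similarity; return the top answer if it reaches the threshold."""
    scored = sorted(
        ((jaccard(query_tokens, q_tokens), answer) for q_tokens, answer in faq),
        key=lambda pair: pair[0],
        reverse=True,
    )
    if scored and scored[0][0] >= threshold:
        return scored[0][1]
    return None


# Made-up FAQ entries: (tokens of the stored question, stored answer).
FAQ = [
    (("ringback", "tone", "how", "subscribe"), "Subscribe through the service hall."),
    (("package", "how", "charge"), "See the tariff table."),
]

hit = faq_answer(("package", "A8", "how", "charge"), FAQ)
miss = faq_answer(("weather",), FAQ)
```

Returning `None` below the threshold keeps the "no answer" case explicit, so that such questions can later be collected for incremental updates.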
需要说明的是,本实施提供中的问答装置200还可以包括:缓存答案获取模块270,配置为在标准化模块230对语言处理模块220处理后问题进行标准化处理之后,从缓存中获取问题的答案。该方式有利于提高系统性能,可以将常用问题和知识库放入内存中,在语言标准后的基础上,先从缓存中获取答案,避免了大量的检索和计算过程。另外,本实施例提供的问答装置200还可以包括答案返回模块280,配置为返回信息检索模块250、知识推理模块260或缓存答案获取模块270获取的答案,并将答案输出到用户交互模块210中。
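The processing flow for intelligent question answering through the knowledge base management system provided by this embodiment is shown in FIG. 9, a flowchart of intelligent question answering through the knowledge base management system shown in FIG. 8. The flow includes:
S201, obtaining the question input by the user.
S202, performing natural language processing on the question input by the user. A general ontology library, a domain ontology library, a synonym library, and a stop-word library are typically used in this step.
S203, performing language standardization on the question input by the user. A sensitive-word library and a dialect library are typically used in this step.
S204, determining whether the answer to the question can be obtained from the cache. If not, S205 is performed; if so, S210 is performed: if the cache stores the knowledge content for the question, the answer can be obtained and returned.
S205, performing semantic understanding on the standardized question to obtain semantic information.
S206, determining whether the answer to the question can be obtained from the ontology knowledge base. If not, S207 is performed; if so, S210 is performed: if the ontology knowledge base stores the knowledge content for the question, the answer can be obtained and returned.
S207, determining whether the answer to the question can be obtained from the knowledge inference rule base. If not, S208 is performed; if so, S210 is performed: if the knowledge content for the question can be inferred from the knowledge inference rule base, the answer can be obtained and returned.
S208, retrieving the semantic information in the FAQ library to obtain retrieval results.
S209, performing similarity calculation on the retrieval results.
S210, obtaining and returning the answer to the question.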
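The flow S201 to S210 is essentially a cascade of answer sources tried in a fixed order. The sketch below captures only that control flow; every callable is a hypothetical stand-in for the corresponding module, and the toy lambdas exist solely so the example runs.

```python
def answer_question(question, normalize, cache, understand,
                    search_ontology_kb, infer, search_faq):
    """Try cache, ontology KB, inference rules, then FAQ; return (answer, source)."""
    question = normalize(question)                      # S202-S203
    if question in cache:                               # S204
        return cache[question], "cache"                 # S210
    semantics = understand(question)                    # S205
    stages = (
        ("ontology_kb", search_ontology_kb),            # S206
        ("inference", infer),                           # S207
        ("faq", search_faq),                            # S208-S209
    )
    for source, lookup in stages:
        answer = lookup(semantics)
        if answer is not None:
            return answer, source                       # S210
    return None, "none"


# Toy stand-ins for the real modules, for illustration only.
stubs = dict(
    normalize=lambda q: q.strip().lower(),
    cache={"hello": "Hi!"},
    understand=lambda q: q.split(),
    search_ontology_kb=lambda sem: None,
    infer=lambda sem: "inferred answer" if sem == ["fee"] else None,
    search_faq=lambda sem: "faq answer" if "package" in sem else None,
)

print(answer_question(" Hello ", **stubs))        # ('Hi!', 'cache')
print(answer_question("fee", **stubs))            # ('inferred answer', 'inference')
print(answer_question("package price", **stubs))  # ('faq answer', 'faq')
```

Returning the source alongside the answer is a design choice of this sketch rather than something the flow requires; it makes it straightforward to gather the per-source statistics that the management device is described as analyzing.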
图10为本发明实施例提供的另一种知识库管理系统的结构示意图。在上述图8所提供的知识库管理系统的结构基础上,本实施例提供的系统还包括管理装置300,该管理装置300包括:知识编辑模块310,配置为对融合知识库处理装置100所抽取的知识本体进行编辑、校对,以及增加新的知识本体。还可以人工增加知识。
知识审核模块320,配置为对融合知识库中的知识本体进行多级审核。通常需要通过多级人员审核,以确保知识的正确性。
统计分析模块330,配置为在通过问答装置200进行提问和获取答案的过程中,统计热门问题、热门知识,融合知识库中未访问的知识本体,以及出现错误答案的知识本体中的一项或多项,并分析通过问答装置200获取答案的准确率。
在线测试模块340,配置为对知识编辑模块310所编辑的知识本体进行验证,获取在线测试结果。
规则管理模块350,配置为对抽取的方式和规则进行管理编辑。例如可以是人工编辑。
资料管理模块360,配置为存储FAQ库和本体知识库中的初始内容,并且按照预置的时间存储融合知识库中的内容。该模块可以对导入的原始附件进行管理,如存储、多版本对比、附件归档等。
智能质检模块370,配置为通过抽样检验确定通过问答装置200获取答案的准确率,还配置为在问答装置200的工作过程中自动检测答案的准确率。
本发明实施例中融合知识库处理装置所包括的各模块,以及各模块所包括的各子模块,都可以通过终端中处理器来实现;当然还可以通过逻辑电路来实现,在实施的过程中,处理器可以为中央处理器(CPU)、微处理器(MPU)、数字信号处理器(DSP)或现场可编程门阵列(FPGA)等。
需要说明的是,本发明实施例中,如果以软件功能模块的形式实现上述的融合知识库处理方法,并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本发明实施例的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机、服务器、或者网络设备等)执行本发明各个实施例所述方法的全部或部分。而前述的存储介质包括:U盘、 移动硬盘、只读存储器(ROM,Read Only Memory)、磁碟或者光盘等各种可以存储程序代码的介质。这样,本发明实施例不限制于任何特定的硬件和软件结合。
相应地,本发明实施例再提供一种计算机存储介质,所述计算机存储介质中存储有计算机可执行指令,该计算机可执行指令配置为执行本发明实施例中融合知识库处理方法。
基于前述的实施例,本发明实施例再提供一种终端,包括:存储介质和处理器,其中存储介质配置为存储可执行指令,处理器配置为执行存储的可执行指令,所述可执行指令配置为执行下面的步骤:
从常见问题FAQ库和本体知识库中抽取知识分类目录,所述本体知识库中包括知识本体和本体关系;
根据所述知识本体、所述本体关系和所述知识分类目录,将所述FAQ库的内容挂载到所述本体知识库中,生成融合知识库。
需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。
上述本发明实施例序号仅仅为了描述,不代表实施例的优劣。通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如ROM/RAM、磁碟、光盘) 中,包括若干指令用以使得一台终端设备(可以是手机,计算机,服务器,空调器,或者网络设备等)执行本发明各个实施例所述的方法。
需要说明的是,本实施例提供的知识库管理系统还可以对知识进行同步,主要是将用户在知识库管理系统的提问记录和日志,同步到知识库管理系统中,由管理员或知识专业人员对记录进行质检,实时检测;另外,本实施例的知识库管理系统还可以将提问时没有准确回答的问题或没有答案的问题,进行知识抽取,增量更新到融合知识库中,以便于下次能检索到。其中,同步的方式和更新的方式均与上述融合知识库处理装置进行更新的方式相同,故在此再赘述。
虽然本发明所揭露的实施方式如上,但所述的内容仅为便于理解本发明而采用的实施方式,并非用以限定本发明。任何本发明所属领域内的技术人员,在不脱离本发明所揭露的精神和范围的前提下,可以在实施的形式及细节上进行任何的修改与变化,但本发明的专利保护范围,仍须以所附的权利要求书所界定的范围为准。
工业实用性
本发明实施例中,通过从FAQ库和本体知识库中抽取知识分类目录,并根据本体知识库中的知识本体、本体关系和已抽取的知识分类目录,将FAQ库的内容挂载到本体知识库中,生成融合知识库,将本实施例生成的融合知识库应用于问答系统中,实现了原先通过FAQ库和本体知识库进行知识问答的答案都可以通过该融合知识库得出,在结合FAQ库和本体知识库优点的同时,可以避免FAQ库和本体知识库各自的缺点;本发明提供的方法解决了现有技术中的问答系统,由于检索式问答系统、基于知识库的问答系统和社区协作式问答系统在实际应用中都存在各项缺陷,而导致问答系统的性能难以提高。

Claims (18)

  1. 一种融合知识库处理方法,包括:
    从常见问题FAQ库和本体知识库中抽取知识分类目录,所述本体知识库中包括知识本体和本体关系;
    根据所述知识本体、所述本体关系和所述知识分类目录,将所述FAQ库的内容挂载到所述本体知识库中,生成融合知识库。
  2. 根据权利要求1所述的融合知识库处理方法,其中,所述根据所述知识本体、所述本体关系和知识分类目录将所述FAQ库的内容挂载到本体知识库中,包括:
    将所述本体知识库的知识本体挂载到所述知识分类目录上;
    将所述FAQ库的内容挂载到所述知识分类目录和所述知识本体上。
  3. 根据权利要求1所述的融合知识库处理方法,其中,所述从常见问题FAQ库和本体知识库中抽取知识分类目录之前,还包括:
    输入初始语料,所述初始语料包括文本语料和FAQ语料;
    根据所述文本语料和所述FAQ语料进行知识库的构建,生成所述本体知识库。
  4. 根据权利要求3所述的融合知识库处理方法,其中,所述根据所述文本语料和所述FAQ语料进行知识库的构建,生成所述本体知识库,包括:
    确定本体目标和抽取规则,所述抽取规则包括本体关系规则和抽取算法;
    通过所述本体目标和所述抽取算法从所述文本语料和所述FAQ语料中抽取知识本体,所述知识本体包括领域本体和通用本体;
    根据所述本体关系规则从所述知识本体中抽取本体关系;
    根据所抽取到的知识本体和本体关系进行知识库的构建,生成本体 知识库。
  5. 根据权利要求3或4所述的融合知识库处理方法,其中,所述生成本体知识库之后,还包括:
    根据预置的推理规则,在所述本体知识库中进行知识推理规则的抽取,生成知识推理规则库。
  6. 根据权利要求3或4所述的融合知识库处理方法,其中,所述生成本体知识库之后,还包括:
    在所述本体知识库中输入增量语料;
    通过所述抽取算法从所述增量语料中抽取增量本体;
    根据所述本体关系规则从所述增量本体中抽取增量本体关系;
    通过所抽取到的增量本体和增量本体关系对所述本体知识库进行更新。
  7. 一种融合知识库处理装置,包括:
    目录抽取模块,配置为从常见问题FAQ库和本体知识库中抽取知识分类目录,所述本体知识库中包括知识本体和本体关系;
    挂载模块,配置为根据所述知识本体、所述本体关系和所述目录抽取模块抽取的知识分类目录,将所述FAQ库的内容挂载到本体知识库中,生成融合知识库。
  8. 根据权利要求7所述的融合知识库处理装置,其中,所述挂载模块包括:
    第一挂载单元,配置为将所述本体知识库的知识本体挂载到所述目录抽取模块抽取的知识分类目录上;
    第二挂载单元,配置为将所述FAQ库的内容挂载到所述知识分类目录和所述知识本体上。
  9. 根据权利要求7所述的融合知识库处理装置,其中,还包括:
    输入模块,配置为在所述目录抽取模块从所述FAQ库和所述本体知识库中抽取知识分类目录之前,输入初始语料,所述初始语料包括文本语料和FAQ语料;
    构建模块,配置为根据所述输入模块输入的文本语料和所述FAQ语料进行知识库的构建,生成所述本体知识库。
  10. 根据权利要求9所述的融合知识库处理装置,其中,所述构建模块包括:
    确定单元,配置为确定本体目标和抽取规则,所述抽取规则包括本体关系规则和抽取算法;
    本体抽取单元,配置为通过所述确定单元确定的本体目标和抽取算法从所述输入模块输入的文本语料和所述FAQ语料中抽取知识本体,所述知识本体包括领域本体和通用本体;
    关系抽取单元,配置为根据所述确定单元确定的本体关系规则从所述本体抽取单元抽取的知识本体中抽取本体关系;
    知识库生成单元,配置为根据所述本体抽取单元抽取的知识本体和所述关系抽取单元抽取的本体关系进行知识库的构建,生成本体知识库。
  11. 根据权利要求9或10所述的融合知识库处理装置,其中,还包括:推理规则模块,配置为在所述构建模块生成本体知识库之后,根据预置的推理规则,在所述本体知识库中进行知识推理规则的抽取,生成知识推理规则库。
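  12. The fusion knowledge base processing apparatus according to claim 9 or 10, wherein the input module is further configured to input an incremental corpus into the ontology knowledge base after the construction module generates the ontology knowledge base;
    the ontology extraction unit is further configured to extract incremental ontologies from the incremental corpus input by the input module, by means of the extraction algorithm determined by the determination unit;
    the relation extraction unit is further configured to extract incremental ontology relations from the incremental ontologies extracted by the ontology extraction unit, according to the ontology relation rules determined by the determination unit;
    the fusion knowledge base processing apparatus further comprises: an update module, configured to update the ontology knowledge base with the incremental ontologies extracted by the ontology extraction unit and the incremental ontology relations extracted by the relation extraction unit.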
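  13. A knowledge base management system, comprising: a question answering apparatus and the fusion knowledge base processing apparatus according to any one of claims 7 to 12;
    wherein the question answering apparatus comprises: a user interaction module, configured to obtain a question input by a user;
    a language processing module, configured to perform language processing on the question obtained by the user interaction module, wherein the language processing comprises word segmentation, part-of-speech tagging, and entity recognition;
    a standardization module, configured to standardize the question processed by the language processing module, wherein the standardization comprises typo correction, language conversion and standardization, and word recognition;
    a semantic understanding module, configured to perform semantic understanding on the question processed by the standardization module, to obtain semantic information of the question;
    an information retrieval module, configured to search the ontology knowledge base with the semantic information obtained by the semantic understanding module, to obtain an answer to the question.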
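  14. The knowledge base management system according to claim 13, wherein the fusion knowledge base processing apparatus comprises the inference rule module; and the question answering apparatus further comprises: a knowledge inference module, configured to, when the answer obtained by the information retrieval module is empty, perform inference according to the knowledge inference rule base generated by the inference rule module, to obtain the answer to the question.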
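  15. The knowledge base management system according to claim 14, wherein the information retrieval module is further configured to, when the answer obtained by the knowledge inference module is empty, search the FAQ library with the semantic information obtained by the semantic understanding module, to obtain the answer to the question.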
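  16. The knowledge base management system according to claim 13, wherein the question answering apparatus further comprises: a cached answer obtaining module, configured to obtain the answer to the question from a cache after the standardization module standardizes the question processed by the language processing module.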
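  17. The knowledge base management system according to any one of claims 13 to 16, further comprising a management apparatus;
    wherein the management apparatus comprises: a knowledge editing module, configured to edit and proofread the knowledge ontologies extracted by the fusion knowledge base processing apparatus, and to add new knowledge ontologies;
    a knowledge review module, configured to perform multi-level review of the knowledge ontologies in the fusion knowledge base;
    a statistical analysis module, configured to, while questions are asked and answers are obtained through the question answering apparatus, collect statistics on one or more of: popular questions, popular knowledge, knowledge ontologies in the fusion knowledge base that have not been accessed, and knowledge ontologies for which wrong answers occurred; and to analyze the accuracy of the answers obtained through the question answering apparatus;
    an online testing module, configured to verify the knowledge ontologies edited by the knowledge editing module, to obtain an online test result;
    a rule management module, configured to manage and edit the extraction methods and rules;
    a material management module, configured to store the initial content of the FAQ library and the ontology knowledge base, and to store the content of the fusion knowledge base at preset times;
    an intelligent quality inspection module, configured to determine, by sampling inspection, the accuracy of the answers obtained through the question answering apparatus, and further configured to automatically check the accuracy of answers while the question answering apparatus is operating.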
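  18. A computer storage medium storing computer-executable instructions, wherein the computer-executable instructions are used to execute the fusion knowledge base processing method according to any one of claims 1 to 6.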
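
The answer lookup order described in claims 13 to 16 (cache, ontology knowledge base, knowledge inference, FAQ library) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `answer_question`, the representation of semantic information as an `(entity, attribute)` tuple, the dictionary-based knowledge base, and the form of the inference rules (a list of relations along which an attribute may be inherited) are hypothetical and do not come from the application.

```python
def answer_question(semantics, ontology_kb, inference_rules, faq_library, cache=None):
    """Return (answer, source), trying each knowledge source in turn.

    semantics: hypothetical (entity, attribute) tuple from semantic understanding.
    ontology_kb: dict mapping entity -> {attribute or relation: value}.
    inference_rules: dict mapping attribute -> relations it may be inherited along.
    faq_library: dict mapping semantics -> answer text.
    """
    # Cached answers are consulted first (simplified: keyed on the semantics).
    if cache is not None and semantics in cache:
        return cache[semantics], "cache"

    entity, attribute = semantics
    answer = ontology_kb.get(entity, {}).get(attribute)
    source = "ontology"

    # If retrieval returns nothing, fall back to inference over ontology relations.
    if answer is None:
        for relation in inference_rules.get(attribute, []):
            related = ontology_kb.get(entity, {}).get(relation)
            if related is not None:
                answer = ontology_kb.get(related, {}).get(attribute)
                if answer is not None:
                    source = "inference"
                    break

    # If inference also returns nothing, fall back to the FAQ library.
    if answer is None:
        answer = faq_library.get(semantics)
        source = "faq" if answer is not None else "none"

    if cache is not None and answer is not None:
        cache[semantics] = answer
    return answer, source
```

Ordering the sources this way means the structured ontology knowledge base is preferred when it can answer directly, while the FAQ library only acts as a last resort for questions that neither retrieval nor inference can resolve.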
PCT/CN2016/104136 2015-11-03 2016-10-31 Fusion knowledge base processing method and apparatus, knowledge base management system, and storage medium WO2017076263A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510738095.1 2015-11-03
CN201510738095.1A CN106649394A (zh) 2015-11-03 2015-11-03 Fusion knowledge base processing method and apparatus, and knowledge base management system

Publications (1)

Publication Number Publication Date
WO2017076263A1 (zh) 2017-05-11

Family

ID=58661667

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/104136 WO2017076263A1 (zh) 2015-11-03 2016-10-31 Fusion knowledge base processing method and apparatus, knowledge base management system, and storage medium

Country Status (2)

Country Link
CN (1) CN106649394A (zh)
WO (1) WO2017076263A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060085750A1 (en) * 2004-10-19 2006-04-20 International Business Machines Corporation Intelligent web based help system
CN101373532A (zh) * 2008-07-10 2009-02-25 昆明理工大学 Method for implementing a Chinese FAQ question answering system for the tourism domain
CN103902652A (zh) * 2014-02-27 2014-07-02 深圳市智搜信息技术有限公司 Automatic question answering system
US8832064B2 (en) * 2005-11-30 2014-09-09 At&T Intellectual Property Ii, L.P. Answer determination for natural language questioning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850539B (zh) * 2015-05-28 2017-08-25 宁波薄言信息技术有限公司 Natural language understanding method and tourism question answering system based on the method

Also Published As

Publication number Publication date
CN106649394A (zh) 2017-05-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 16861531; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 16861531; Country of ref document: EP; Kind code of ref document: A1