CN117933384A - Map generation method, device, equipment and storage medium - Google Patents

Info

Publication number
CN117933384A
Authority
CN
China
Prior art keywords
data
entity
action
description text
triplet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410048559.5A
Other languages
Chinese (zh)
Inventor
刘佳
杜韦宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202410048559.5A priority Critical patent/CN117933384A/en
Publication of CN117933384A publication Critical patent/CN117933384A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G06N5/025 Extracting rules from data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of this specification provide a map generation method, apparatus, device and storage medium. The map generation method comprises: acquiring action data and first entity data in a rule description text; confirming the association relation between the action data and the first entity data; constructing triplet data based on the association relation, the action data and the first entity data; and generating a knowledge graph corresponding to the rule description text based on the triplet data.

Description

Map generation method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for generating a map.
Background
A user manual is a written document that guides the user in properly using a product or service. It generally contains basic information about the product or service and instructions on installation, setup, operation, maintenance and troubleshooting, to help the user fully understand and properly use the product or service. The contents of the user manual are typically written by the manufacturer or provider of the product or service to ensure that the user can make maximum use of its functionality and performance. The user manual may be provided in paper or electronic form, and in multiple languages to meet the needs of different locales and cultures. In the modern technological age, user manuals play a vital role in ensuring the quality of products or services and user satisfaction.
The text contained in a user manual may be regarded as rule description text. In the course of customer service, it is often necessary to answer questions posed by users with reference to the user manual, and reading the user manual directly to confirm how to properly use the product or service is relatively complex and time-consuming. Therefore, how to understand and organize the contents of the user manual, so that the target knowledge can be confirmed from it quickly, is a problem to be solved.
Disclosure of Invention
The main purpose of the present specification is to provide a map generation method, apparatus, device and storage medium, which aim to solve the problem that rule description text is difficult to understand. The technical scheme is as follows:
In a first aspect, embodiments of the present disclosure provide a map generating method, including:
acquiring action data in a rule description text;
acquiring first entity data in the rule description text;
confirming the association relation between the action data and the first entity data;
and constructing triplet data based on the association relation, the action data and the first entity data, and generating a knowledge graph corresponding to the rule description text based on the triplet data.
In a second aspect, embodiments of the present disclosure provide a map generation model training method, including:
extracting sample triplet data from a sample rule description text set based on heuristic rules;
constructing a corresponding sample map data set based on the sample triplet data;
and optimizing network parameters of the map generation model with a self-supervised training module based on the sample map data set, and obtaining the trained map generation model when the network parameters meet preset training requirements.
In a third aspect, embodiments of the present disclosure provide a map generating apparatus, including:
a first acquisition module, used for acquiring action data in the rule description text;
a second acquisition module, used for acquiring the first entity data in the rule description text;
a fusion module, used for confirming the association relation between the action data and the first entity data;
and a construction module, used for constructing triplet data based on the association relation, the action data and the first entity data, and for generating a knowledge graph corresponding to the rule description text based on the triplet data.
In a fourth aspect, embodiments of the present disclosure provide a map generating apparatus, including:
a data set construction module, used for extracting sample triplet data from the sample rule description text set based on heuristic rules;
a feature extraction module, used for constructing a corresponding sample map data set based on the sample triplet data;
and a self-supervised training module, used for optimizing network parameters of the map generation model based on the sample map data set, and for obtaining the trained map generation model when the network parameters meet the preset training requirements.
In a fifth aspect, embodiments of the present disclosure provide an electronic device, the device comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor, performs the steps of the method as described above.
In a sixth aspect, embodiments of the present description provide a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method as described above.
In a seventh aspect, embodiments of the present description provide a computer program product comprising: a computer program which, when executed by a processor of an electronic device, causes the processor to at least implement the method as described in the first to second aspects.
In the embodiments of this specification, the action data and the first entity data in the rule description text are acquired, the association relation between the action data and the first entity data is confirmed, triplet data is constructed based on the association relation, the action data and the first entity data, and a knowledge graph corresponding to the rule description text is generated based on the triplet data. Acquiring the action data parses and extracts the procedural information in the rule description text, and acquiring the first entity data extracts the factual information in the rule description text. Confirming the association relation between the action data and the first entity data then associates the procedural information with the factual information, so that a knowledge graph corresponding to the rule description text can be built automatically and efficiently from the extracted action data, first entity data and association relation. The generated knowledge graph captures the rule description text more fully, and because the factual information and the procedural information are represented jointly in the graph, a knowledge graph constructed by this method can answer factual, procedural and inconsistency questions about the rule description text, which improves the answering effect when the knowledge graph is applied to question answering.
Drawings
In order to more clearly illustrate the embodiments of the present description or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present description, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is an exemplary schematic diagram of a method of generating a map according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart of a method for generating a map according to an embodiment of the present disclosure;
FIG. 3 is an exemplary schematic diagram of a method of generating a map according to an embodiment of the present disclosure;
FIG. 4 is an exemplary schematic diagram of a method of generating a map according to an embodiment of the present disclosure;
FIG. 5 is an exemplary schematic diagram of a method of generating a map according to an embodiment of the present disclosure;
FIG. 6 is a schematic flow chart of a method for generating a map according to an embodiment of the present disclosure;
FIG. 7 is an exemplary schematic diagram of a method of generating a map according to an embodiment of the present disclosure;
FIG. 8 is an exemplary schematic diagram of a method of generating a map according to an embodiment of the present disclosure;
FIG. 9 is an exemplary schematic diagram of a method of generating a map according to an embodiment of the present disclosure;
FIG. 10 is a schematic flow chart of a method for generating a map according to an embodiment of the present disclosure;
FIG. 11 is an exemplary schematic diagram of a method of generating a map according to an embodiment of the present disclosure;
FIG. 12 is a schematic flow chart of a method for generating a map according to an embodiment of the present disclosure;
FIG. 13 is a schematic structural view of a map generating apparatus provided in the embodiment of the present specification;
FIG. 14 is a schematic structural view of a map generating apparatus provided in the embodiment of the present specification;
FIG. 15 is a schematic structural view of a map generating apparatus provided in the embodiment of the present specification;
FIG. 16 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
The technical solutions of the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is apparent that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
In addition, it should be noted that the user information and data (including, but not limited to, data for analysis, stored data and presented data) involved in the embodiments of the present disclosure are all information and data authorized by the user or fully authorized by all parties. The collection, use and processing of the related data must comply with the relevant laws, regulations and standards of the relevant countries and regions, and a corresponding operation entry is provided for the user to choose to authorize or refuse.
The map generating device provided in the embodiments of the present disclosure may be a terminal device such as a mobile phone, a computer, a tablet computer, a smart watch or a vehicle-mounted device, or may be a module in the terminal device for implementing the map generating method. The map generating device may acquire action data and first entity data in a rule description text, confirm the association relation between the action data and the first entity data, construct triplet data based on the association relation, the action data and the first entity data, and generate a knowledge graph corresponding to the rule description text based on the triplet data.
Alternatively, the map generating device may train the map generation model. The map generating device can extract sample triplet data from a sample rule description text set based on heuristic rules, construct a corresponding sample map data set based on the sample triplet data, optimize the network parameters of the map generation model with a self-supervised training module based on the sample map data set, and obtain the trained map generation model when the network parameters meet preset training requirements.
Referring to FIG. 1, an exemplary schematic diagram of a map generating method is provided for an embodiment of the present disclosure. The map generating device can obtain action data and first entity data from a rule description text and confirm the association relation between them, and then construct (entity-relation-entity) triplet data from the first entity data, the action data and the corresponding association relation, so as to generate a knowledge graph composed of triplets.
The map generation method provided in the present specification will be described in detail with reference to specific examples.
Referring to fig. 2, a schematic flow chart of a map generating method is provided in the embodiment of the present disclosure. As shown in fig. 2, the method of the embodiments of the present specification may include the following steps S102-S108.
S102, acquiring action data in a rule description text;
In one embodiment, the rule description text refers to a rule description associated with a particular target product; it constrains the behavior of the product's users and describes the attributes of the product and of its users. In some cases, the rule description text may also be referred to as a "user manual", without limitation. The rule description text typically contains two types of information: procedural information, such as operation steps, and factual information, such as related notes. Take the rule description text of the "code cashback" activity as an example: "The user may log in to the client. Click 'Scan'. Pay by scanning the two-dimensional code. The user can get a scratch card on the payment interface. There are 100000 scratch cards in total, and they cannot be transferred to others. The winning rate of the scratch card is 100%. The same user can hold at most 10 scratch cards during the activity. A scratch card expires 1 hour after it is obtained. Only real-name authenticated users may participate in the activity." Here, "The user may log in to the client. Click 'Scan'. Pay by scanning the two-dimensional code. The user can get a scratch card on the payment interface." is procedural information, which describes the flow of the rule; "There are 100000 scratch cards in total, and they cannot be transferred to others. The winning rate of the scratch card is 100%. The same user can hold at most 10 scratch cards during the activity. A scratch card expires 1 hour after it is obtained. Only real-name authenticated users may participate in the activity." is factual information, which explains the additional conditions of the rule.
Specifically, the action data may be extracted from the procedural information in the rule description text. The action data may be the action words that appear in the rule description text together with information related to each action, for example the order in which the actions are executed, the number of executions, and the subject and object of the execution. In one possible implementation, word segmentation may be performed on the rule description text, the part of speech of each word may be identified, the action words whose part of speech is verb may be selected, and the words associated with each action word may then be identified according to semantic features, thereby obtaining the action data. FIG. 3 is an exemplary schematic diagram of a map generating method according to an embodiment of the present disclosure. FIG. 3 shows action data stored in table form: the action "get" is extracted from the sentence "the user can get a scratch card on the payment interface" of the rule description text together with the parameters related to "get", and an n-tuple can be used to represent the action data corresponding to the "get" step.
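The n-tuple representation mentioned for FIG. 3 can be pictured with a short sketch. This is only an illustration: the `ActionRecord` class and its field layout are assumptions and do not come from the specification, and the order value 4 simply follows the sentence order of the "code cashback" example.

```python
from dataclasses import dataclass, astuple
from typing import Optional


@dataclass(frozen=True)
class ActionRecord:
    """One row of action data; the field layout is a hypothetical choice."""
    action: str                     # the action word (a verb)
    order: int                      # position in the execution sequence
    subject: Optional[str] = None   # who performs the action
    obj: Optional[str] = None       # what the action is applied to
    place: Optional[str] = None     # where the action happens


# "The user can get a scratch card on the payment interface."
get_step = ActionRecord(
    action="get",
    order=4,  # assumed: login, click, payment, get
    subject="user",
    obj="scratch card",
    place="payment interface",
)

# The same record viewed as a plain n-tuple.
get_tuple = astuple(get_step)
print(get_tuple)
```

Keeping the optional fields as `None` when a sentence does not state them lets every step share one tuple shape, which is convenient when the rows are later turned into graph edges.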
S104, acquiring first entity data in the rule description text;
In an embodiment, the first entity data may be obtained from the factual information of the rule description text. The first entity data refers to the entities included in the rule description text and the information related to those entities; an entity may specifically be an object, a person, a product and so on, and the information related to an entity may be its attributes, states and so on. In one possible implementation, the first entity data in the rule description text may be identified by a natural language processing (NLP) algorithm. FIG. 4 is an exemplary schematic diagram of a map generating method according to an embodiment of the present disclosure. FIG. 4 shows first entity data stored in table form: the entity "user", extracted from the sentence "only real-name authenticated users may participate in the activity" of the rule description text, and its related information are stored in a four-level entity hierarchy tree.
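A four-level entity hierarchy tree of this kind might be held as a nested dictionary. The sketch below is an assumption throughout: FIG. 4 is not reproduced here, so the meaning given to each level (entity, sub-entity, description type, value) and the helper name `leaf_paths` are illustrative only.

```python
# Hypothetical four-level layout: entity -> sub-entity -> description type -> value.
entity_tree = {
    "user": {
        "real-name authenticated user": {
            "state": "may participate in the activity",
        },
    },
}


def leaf_paths(tree, prefix=()):
    """Yield every root-to-leaf path of the nested dict as a tuple."""
    for key, value in tree.items():
        if isinstance(value, dict):
            # Descend one level, remembering the keys seen so far.
            yield from leaf_paths(value, prefix + (key,))
        else:
            # A non-dict value is a leaf: emit the full path.
            yield prefix + (key, value)


paths = list(leaf_paths(entity_tree))
print(paths)
```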
S106, confirming the association relation between the action data and the first entity data;
In an embodiment, in order to better understand the rule description text, the association relation between the action data and the first entity data is confirmed. FIG. 5 is an exemplary schematic diagram of a map generating method provided in the embodiment of the present disclosure. FIG. 5 shows an association relation between action data and first entity data: the "user" in the action data and the "user" in the first entity data can be linked as the same entity, that is, an association relation exists between them.
S108, constructing triplet data based on the association relation, the action data and the first entity data, and generating a knowledge graph corresponding to the rule description text based on the triplet data.
In one embodiment, triplet data is constructed according to the association relation, the action data and the first entity data, and a knowledge graph corresponding to the rule description text is generated according to the triplet data. A knowledge graph is a directed heterogeneous information network that contains a large number of entities (e.g., project entities, attribute entities and other entities) and the relations between entities. The triplet data may contain a plurality of triplets, each of the form (entity-relation-entity). Based on the triplet data, it can be confirmed whether each entity is associated with other entities, and a knowledge graph is generated based on the entities in the triplets and the relations between them.
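As a minimal sketch of how (entity-relation-entity) triplets can become a directed graph, the code below folds a list of triplets into an adjacency dictionary. The function name `build_graph`, the relation labels and the sample triplets are assumptions made for illustration; the specification does not prescribe a data structure.

```python
from collections import defaultdict

# Sample triplets in (head entity, relation, tail entity) form.
triples = [
    ("get", "object", "scratch card"),
    ("get", "place", "payment interface"),
    ("scratch card", "attribute", "total 100000"),
]


def build_graph(triples):
    """Return {head: [(relation, tail), ...]} with every entity present as a key."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))  # directed edge head -> tail
        graph.setdefault(tail, [])            # tails with no outgoing edge still become nodes
    return dict(graph)


graph = build_graph(triples)
for node, out_edges in graph.items():
    print(node, "->", out_edges)
```

Because each edge keeps its relation label, the same structure can hold both action-to-parameter edges and entity-to-description edges, which is what makes the network heterogeneous.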
It will be appreciated that, in order to better understand and summarize rule description text, heuristic rules may be designed in advance to constrain the knowledge extraction task on the rule description text. The heuristic rules may specifically include: 1. What are the actions in the procedure? 2. What are the parameters of each action (the object, time, place and manner of the action)? 3. What is the order of the actions? 4. What are the underlying entities of the procedure? 5. What are the details of the entities (sub-entities, attributes, states)? 6. Is an entity or sub-entity associated with a parameter of an action? 7. Is a state associated with a step? The heuristic rules indicate which elements of the rule description text to focus on, namely the first entity data and the action data that need to be acquired, and guide the matching of the association relation between the first entity data and the action data. Therefore, in a possible implementation, the rule description text may be preprocessed by a dependency syntax analyzer, including word segmentation, part-of-speech tagging, entity extraction and dependency syntax analysis, to obtain the first entity data and the action data; the heuristic rules in the rule base are then parsed, and the results of the dependency analysis are matched against the rules, so that one piece of triplet data is obtained each time a rule is matched.
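The rule-matching step can be sketched on hand-written dependency arcs, without calling a real parser. Everything here is an assumption for illustration: the arcs are typed in by hand, the labels follow the Universal Dependencies naming (`nsubj`, `obj`, `obl`, `det`), and the `RULES` table that maps a label to a relation name is hypothetical. In particular, treating `obl` as "place" is a simplification that holds only for this example sentence.

```python
# Hand-written arcs (head word, dependency label, dependent word) for
# "The user can get a scratch card on the payment interface."
arcs = [
    ("get", "nsubj", "user"),
    ("get", "obj", "scratch card"),
    ("get", "obl", "payment interface"),
    ("scratch card", "det", "a"),
]

# Hypothetical rule base: which dependency label answers which heuristic question.
RULES = {
    "nsubj": "subject",   # who performs the action
    "obj": "object",      # what the action is applied to
    "obl": "place",       # simplification: oblique nominal read as the place
}


def match_rules(arcs, rules):
    """Emit one (head, relation, tail) triplet for every arc that matches a rule."""
    triples = []
    for head, label, dependent in arcs:
        relation = rules.get(label)
        if relation is not None:      # arcs without a matching rule are ignored
            triples.append((head, relation, dependent))
    return triples


matched = match_rules(arcs, RULES)
print(matched)
```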
After the knowledge graph is generated, the data in the knowledge graph may be stored by using a graph database. The graph database is a database that uses graph structures for semantic queries. So-called semantic queries, i.e., queries and analyses that allow for associative and contextual nature, can utilize the grammatical, semantic, and structural information contained in the data to retrieve explicitly and implicitly derived information. The constructed atlas can serve various downstream tasks, such as a multi-round dialogue system based on the knowledge atlas, is used for replacing manual customer service to answer the questions of users, and can also be applied to retrieval based on the knowledge atlas, recommendation based on the knowledge atlas and the like.
In the embodiments of this specification, the action data and the first entity data in the rule description text are acquired, the association relation between the action data and the first entity data is confirmed, triplet data is constructed based on the association relation, the action data and the first entity data, and a knowledge graph corresponding to the rule description text is generated based on the triplet data. Acquiring the action data parses and extracts the procedural information in the rule description text, and acquiring the first entity data extracts the factual information in the rule description text. Confirming the association relation between the action data and the first entity data then associates the procedural information with the factual information, so that a knowledge graph corresponding to the rule description text can be built automatically and efficiently from the extracted action data, first entity data and association relation. The generated knowledge graph therefore captures the rule description text more fully, which improves the completion of downstream tasks that apply the knowledge graph.
Referring to fig. 6, a flowchart of a map generating method is provided in an embodiment of the present disclosure. As shown in fig. 6, the method of the embodiments of the present specification may include the following steps S202-S238.
S202, acquiring an execution action in the rule description text;
Specifically, the execution actions that exist in the rule description text are confirmed. Taking the rule description text of the "code cashback" activity in the above embodiment as an example, the execution actions "login", "click", "payment", "get" and "win" may be extracted from it. Here, "win" may be an action derived from the rule description text: after the "get" action, the current user action may be determined to be "win", that is, the scratch card has been successfully obtained.
S204, confirming the execution sequence of the execution action based on the rule description text;
Specifically, for the acquired execution actions, the execution order of each execution action is confirmed according to the rule description text.
S206, confirming the parameters of the execution action from the rule description text based on the key parameter types;
Specifically, parameters related to executing the action are confirmed from the rule description text according to the key parameter types. It will be appreciated that the rule description text may be a pre-processed text, i.e. a set of key elements that have been extracted, from which the parameters of the execution action are validated according to a pre-set type of key parameters of interest.
In one embodiment of the present description, the key parameter types include one or more of an object, a time, a place, and a manner of performing the action.
S208, confirming action data according to the execution sequence and the parameters;
Specifically, the execution order and the parameters are used as the basis for forming the edge relations of the execution actions, thereby obtaining the action data. Referring to FIG. 7, FIG. 7 is an exemplary schematic diagram of a map generating method according to an embodiment of the present disclosure, in which the execution actions are arranged in order to form a relation graph between the execution actions, with the execution direction marked by unidirectional arrows. FIG. 7 also shows a relation graph formed by an execution action and its parameters, where the relation between a parameter and the execution action is given by the key parameter type; for example, the place of the execution action "get" is the payment interface, and its object is the scratch card.
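The two kinds of edge described for FIG. 7 can be sketched as follows. The action names come from the "code cashback" example; the edge label `"next"` for the execution order and the function name `action_edges` are assumptions, since the specification only says that the order and the key parameter types serve as edges.

```python
actions = ["login", "click", "payment", "get", "win"]   # already in execution order
params = {
    "get": {"place": "payment interface", "object": "scratch card"},
}


def action_edges(actions, params):
    """Build order edges between consecutive actions, then typed parameter edges."""
    edges = []
    # Execution order: one directed edge from each action to the one that follows it.
    for current, following in zip(actions, actions[1:]):
        edges.append((current, "next", following))
    # Parameters: the key parameter type is used as the edge label.
    for action, typed_params in params.items():
        for param_type, value in typed_params.items():
            edges.append((action, param_type, value))
    return edges


edges = action_edges(actions, params)
for edge in edges:
    print(edge)
```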
S210, acquiring a first entity in the rule description text;
Specifically, the first entity contained in the rule description text is acquired from the rule description text.
S212, confirming description information related to the first entity from the rule description text based on the key description type;
Specifically, description information related to the first entity is confirmed according to the key description type. It will be appreciated that there may be a variety of information associated with the first entity, and thus descriptive information associated with the first entity may be obtained in the rule description text based on the pre-identified key description types.
In one embodiment of the present description, the key description type includes one or more of a sub-entity, an attribute and a state of the first entity. Each entity can be classified by hierarchy, so a sub-entity is an entity contained under another entity in the entity hierarchy.
S214, constructing and obtaining first entity data based on the description information and the first entity;
Specifically, the relation between the description information and the first entity is confirmed according to the description information. For example, the first entity may be the head entity, the description information may be the tail entity, and the key description type corresponding to the description information may be the edge relation; a relation graph corresponding to each first entity can then be constructed to obtain the first entity data. Referring to FIG. 8, FIG. 8 is an exemplary schematic diagram of a map generating method provided in the embodiment of the present disclosure. FIG. 8 shows the entities mined from the procedure described in the rule description text, and a graph of the correspondence between each entity and its description information is constructed from the obtained description information; for example, the attribute "total" of the scratch card is 100000, and the state "expiry time" of the scratch card is 1 hour after it is obtained.
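The head entity, edge relation and tail entity roles described above map directly onto triplets. The sketch below assumes a hypothetical helper `entity_triples` and uses the scratch card facts from the "code cashback" example; the exact wording of each description string is illustrative.

```python
def entity_triples(entity, descriptions):
    """Turn (key description type, description) pairs into triplets.

    The first entity is the head, the key description type is the edge,
    and the description information is the tail.
    """
    return [(entity, desc_type, info) for desc_type, info in descriptions]


scratch_card = entity_triples(
    "scratch card",
    [
        ("attribute", "total: 100000"),
        ("attribute", "winning rate: 100%"),
        ("state", "expires 1 hour after being obtained"),
    ],
)
for triple in scratch_card:
    print(triple)
```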
S216, confirming a second entity contained in the action data;
S218, confirming a third entity identical to the second entity from the first entity data;
S220, establishing an association relation between the action data and the first entity data based on the second entity and the third entity;
Specifically, when confirming the association relation between the action data and the first entity data, the two may be associated by searching for entities that appear in both. The second entity contained in the action data is confirmed first; a third entity identical to the second entity is then confirmed in the first entity data; and the third entity in the first entity data is associated with the identical second entity in the action data, so that the association relation between the action data and the first entity data is established. Referring to FIG. 9, FIG. 9 is an exemplary schematic diagram of a map generating method provided in this embodiment. For example, the object of the "get" action in the action data is "scratch card"; meanwhile, "scratch card" is an entity in the first entity data whose attribute is "100000 in total" and whose state is "expires one hour after being obtained". The "get" action in the action data may then be associated with the state "expires one hour after being obtained" in the first entity data, and the "scratch card" in the action data may be associated with the attribute "100000 in total" in the first entity data.
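Steps S216 to S220 amount to finding entities shared by the two data sets. The sketch below assumes that "identical" means an exact string match, which is a simplification; the function name `associate` and the sample triplets are hypothetical.

```python
action_data = [
    ("get", "object", "scratch card"),
    ("get", "place", "payment interface"),
    ("login", "subject", "user"),
]
entity_data = [
    ("scratch card", "attribute", "total: 100000"),
    ("scratch card", "state", "expires 1 hour after being obtained"),
    ("user", "state", "real-name authenticated"),
]


def associate(action_data, entity_data):
    """Map each entity shared by both data sets to its descriptions."""
    # Second entities: the entities that appear as parameters in the action data.
    second_entities = {tail for _, _, tail in action_data}
    links = {}
    for head, relation, tail in entity_data:
        # A third entity is a first entity identical to some second entity.
        if head in second_entities:
            links.setdefault(head, []).append((relation, tail))
    return links


links = associate(action_data, entity_data)
print(links)
```

In this toy data "payment interface" has no counterpart in the entity data, so it receives no link; only "scratch card" and "user" bridge the two sides.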
S222, taking the execution action in the action data as a first node;
Specifically, when the triplets are constructed based on the action data, the first entity data and the association relation, each execution action in the action data may be taken as a node, namely a first node, which serves as an entity in the triplets.
S224, confirming the parameters of the execution action in the action data as a second node;
Specifically, the parameters related to the execution action in the action data are confirmed as the second node.
S226, taking the sequence between the first nodes as edges, and constructing a first connection relation between the first nodes;
Specifically, since the first nodes are execution actions, the first connection relation between the first nodes is confirmed according to the previously confirmed execution sequence of the execution actions.
S228, constructing a second connection relation between the first node and the second node by taking the key parameter type of the parameter as an edge;
Specifically, the key parameter type corresponding to the parameter is taken as an edge, which forms the second connection relation between the first node and the second node.
S230, taking a first entity in the first entity data as a third node;
Specifically, each first entity included in the first entity data is taken as a third node.
S232, confirming the description information related to the first entity in the first entity data as a fourth node;
Specifically, the description information corresponding to the first entity is confirmed as the fourth node.
S234, confirming a third connection relation between the third node and the fourth node by taking the key description type of the description information as an edge;
Specifically, the third connection relation between the third node and the fourth node is confirmed according to the key description type; that is, the key description type is taken as an edge that connects the third node and the fourth node.
S236, confirming a fourth connection relation between the first node and the second node and the third node and the fourth node based on the association relation;
Specifically, according to the association relation between the first entity data and the action data, the fourth connection relation is confirmed between the first and second nodes corresponding to the action data and the third and fourth nodes corresponding to the first entity data. It can be understood that the association relation indicates which entities in the first entity data are associated with the action data, so the associated nodes may be connected directly by an edge to obtain the fourth connection relation.
S238, generating triplet data based on the first connection relation, the second connection relation, the third connection relation and the fourth connection relation, and generating a knowledge graph corresponding to the rule description text based on the triplet data;
Specifically, the basic unit of the knowledge graph is the triple (entity-relationship-entity). Triplet data is generated according to the confirmed first, second, third and fourth connection relations, knowledge fusion such as entity alignment is performed on the triplet data, and the knowledge graph corresponding to the rule description text is then generated based on the triplet data.
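The node and edge construction of steps S222 to S238 can be sketched as follows. This is only an illustration under stated assumptions: the edge labels "next" and "associated_with" and the function name `build_triples` are hypothetical, and knowledge fusion is reduced to removing exact duplicate triples.

```python
def build_triples(actions, entities, links):
    """Build (head, relation, tail) triples from the four connection relations.

    actions:  ordered list of (action, {key parameter type: value})
    entities: list of (entity, {key description type: value})
    links:    list of (action, entity) association pairs
    """
    triples = []
    # First connection relation: execution order between action nodes.
    for (prev, _), (nxt, _) in zip(actions, actions[1:]):
        triples.append((prev, "next", nxt))
    # Second connection relation: the key parameter type is the edge.
    for action, params in actions:
        for param_type, value in params.items():
            triples.append((action, param_type, value))
    # Third connection relation: the key description type is the edge.
    for entity, info in entities:
        for description_type, value in info.items():
            triples.append((entity, description_type, value))
    # Fourth connection relation: edges derived from the association relation.
    for action, entity in links:
        triples.append((action, "associated_with", entity))
    # Stand-in for knowledge fusion: drop exact duplicates, keep order.
    return list(dict.fromkeys(triples))


actions = [("payment", {}), ("pick up", {"object": "scratch card"})]
entities = [("scratch card", {"attribute": "100000 in total",
                              "state": "expires one hour after pickup"})]
links = [("pick up", "scratch card")]
triples = build_triples(actions, entities, links)
for triple in triples:
    print(triple)
```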
Further, in one embodiment of the present specification, the method may include the steps of:
S302, acquiring sample triplet data labeled as positive examples or negative examples;
Specifically, sample triplet data labeled as positive or negative examples is obtained. For example, sample triples may be extracted from sample description text and then labeled according to whether they are reasonable.
S304, evaluating the triplet data based on the sample triplet data to obtain an evaluation result;
Specifically, the generated triplet data is evaluated against the labeled sample triplet data, for example by judging whether the relation type between the entities is correct, so as to obtain an evaluation result for the triplet data.
S306, removing erroneous triples from the triplet data based on the evaluation result to obtain target triples in the triplet data;
Specifically, the erroneous triples are confirmed according to the evaluation result. It can be understood that an erroneous triple may be a triple with a wrong association or a triple with low accuracy; the target triples are obtained after the erroneous triples are removed from the triplet data.
S308, generating a knowledge graph corresponding to the rule description text based on the target triples.
Specifically, the knowledge graph corresponding to the rule description text is generated from the screened target triples, so as to improve the quality of the generated knowledge graph.
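One possible way to realize the screening of steps S302 to S308 is sketched below. The specification does not fix the evaluation method, so the approach here is an assumption: each triple is reduced to a signature (kind of head node, relation, kind of tail node), and a generated triple is kept only if its signature occurs among the positive samples and not among the negative ones. The names `signature`, `filter_triples` and the `kind` mapping are hypothetical.

```python
def signature(triple, kind):
    """Abstract a triple to (head kind, relation, tail kind)."""
    head, relation, tail = triple
    return (kind[head], relation, kind[tail])


def filter_triples(triples, samples, kind):
    """Keep triples whose signature is seen in positive samples only.

    samples: list of ((head, relation, tail), is_positive)
    kind:    node name -> "action" / "entity" (assumed node typing)
    """
    positive = {signature(t, kind) for t, ok in samples if ok}
    negative = {signature(t, kind) for t, ok in samples if not ok}
    return [t for t in triples
            if signature(t, kind) in positive and signature(t, kind) not in negative]


kind = {"payment": "action", "pick up": "action", "scratch card": "entity"}
samples = [
    (("payment", "next", "pick up"), True),           # positive example
    (("pick up", "object", "scratch card"), True),    # positive example
    (("scratch card", "next", "payment"), False),     # negative example
]
generated = [
    ("payment", "next", "pick up"),
    ("scratch card", "next", "pick up"),   # entity placed in the action order: erroneous
    ("pick up", "object", "scratch card"),
]
targets = filter_triples(generated, samples, kind)
print(targets)
```

Treating unseen signatures as erroneous is a strict choice; a softer variant could keep them and only drop signatures that match negative samples.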
In this embodiment of the present specification, the execution actions in the rule description text are obtained, their execution sequence is confirmed based on the rule description text, their parameters are confirmed from the rule description text based on the key parameter types, and the action data is confirmed according to the execution sequence and the parameters. A first entity in the rule description text is obtained, description information related to the first entity is confirmed from the rule description text based on the key description types, and the first entity data is constructed based on the description information and the first entity. A second entity contained in the action data is confirmed, a third entity identical to the second entity is confirmed from the first entity data, and the association relation between the action data and the first entity data is established based on the second entity and the third entity. The execution actions in the action data are taken as first nodes and their parameters as second nodes; the first connection relation between the first nodes is constructed with the sequence between the first nodes as edges, and the second connection relation between the first nodes and the second nodes is constructed with the key parameter types as edges. The first entities in the first entity data are taken as third nodes and the related description information as fourth nodes; the third connection relation between them is confirmed with the key description types as edges, and the fourth connection relation between the first and second nodes and the third and fourth nodes is confirmed based on the association relation. Triplet data is generated based on the four connection relations, and the knowledge graph corresponding to the rule description text is generated based on the triplet data.
By sequentially extracting the elements of interest in the rule description text, both the procedural information and the factual information in the rule description text can be mined; and because the association relation is built from the identical entities contained in the first entity data and the action data, the procedural information and the factual information can be represented jointly in the generated knowledge graph. In addition, after the triplet data is generated, it can be screened against the labeled sample triples to obtain the target triples, which further improves the quality of the knowledge graph.
Referring to fig. 10, a flowchart of a map generating method is provided in an embodiment of the present disclosure. As shown in fig. 10, the method of the embodiment of the present specification may include the following steps S502 to S510.
S502, acquiring question text characterization data of the input question data;
Specifically, in the question-answering dialogue application scenario, after the knowledge graph is generated, the questions can be answered according to the knowledge graph. The input question data is acquired, and question text characterization data of the question data is acquired, for example, the question text characterization data can be obtained by extracting feature data of the question data through a text encoder.
Further, in an embodiment of the present disclosure, before the question data is matched with the knowledge graph based on the heuristic rule to obtain matching information, the method further includes the following steps S5022 to S5026:
S5022, obtaining candidate text characterization data of a candidate rule description text;
Specifically, after the input question data is acquired, the rule description text needs to be selected from the candidate rule description texts. It will be appreciated that each type of rule may correspond to one or more candidate rule description texts; therefore, in order to identify the rule description text most likely to contain the answer, candidate text characterization data of the candidate rule description texts may be obtained first.
S5024, confirming second semantic similarity of the question text characterization data of the question data and the candidate text characterization data;
Specifically, the second semantic similarity between the question text characterization data of the question data and the candidate text characterization data is calculated; for example, the degree of similarity between the question text and a candidate rule description text can be obtained by calculating the cosine similarity.
S5026, confirming the rule description text from the candidate rule description texts based on the second semantic similarity.
Specifically, the candidate rule description texts are ranked by their second semantic similarity, and the candidate with the highest similarity is taken as the rule description text in which the answer is to be found.
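Steps S5022 to S5026 can be sketched as below. The bag-of-words `encode` function is only a toy stand-in for the text encoder mentioned above, and the candidate texts and function names are hypothetical; the cosine similarity itself follows its standard definition.

```python
import math
from collections import Counter


def encode(text):
    """Toy stand-in for a text encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def select_rule_text(question, candidates):
    """Return the candidate rule description text most similar to the question."""
    q = encode(question)
    return max(candidates, key=lambda text: cosine(q, encode(text)))


question = "why can I not pick up the scratch card"
candidates = [
    "scratch card rules: pick up the scratch card after payment",
    "refund rules: a refund is issued within seven days",
]
selected = select_rule_text(question, candidates)
print(selected)
```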
S504, confirming the question type of the question data based on the question text characterization data;
Specifically, the question type of the question data is confirmed according to the question text characterization data. Questions are divided into three types. Type 1 is the factual question, which asks for information about an entity, for example what the elements of the entity are. Type 2 is the procedural question, which relates to the actions in the graph, for example the time or place at which an action occurs. Type 3 is the inconsistency question, in which the situation described by the user does not directly match what is described in the document, so that further reasoning is required to find the likely cause of the inconsistency. In one possible implementation, a classification model determines which of the three types a question belongs to; specifically, corresponding data is manually annotated and a classification model is then trained on it to classify questions.
S506, confirming a corresponding heuristic rule according to the problem type;
Specifically, each question type has specifically written heuristic rules that guide the reasoning toward the answer, so as to improve the accuracy of the answer obtained. For example, the heuristic rules corresponding to procedural questions may be: 1. What are the actions in the procedure? 2. What are the parameters of each action (the object, time, place and manner of the action)? 3. What is the order of the actions? The heuristic rules corresponding to factual questions may be: 4. What are the entities underlying the procedure? 5. What are the details of each entity (sub-entities, attributes, states)? The heuristic rules corresponding to inconsistency questions may be: 6. Is an entity or sub-entity associated with a parameter of an action? 7. Is a state associated with a step?
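The correspondence between question types and heuristic rules in step S506 can be held in a simple lookup table, sketched below. The dictionary keys and the helper name `rules_for` are hypothetical labels for the three question types; the rule wording paraphrases the seven rules listed above.

```python
# Question type -> heuristic rules that guide matching against the graph.
HEURISTIC_RULES = {
    "procedural": [
        "What are the actions in the procedure?",
        "What are the parameters of each action (object, time, place, manner)?",
        "What is the order of the actions?",
    ],
    "factual": [
        "What are the entities underlying the procedure?",
        "What are the details of each entity (sub-entities, attributes, states)?",
    ],
    "inconsistency": [  # the third question type
        "Is an entity or sub-entity associated with a parameter of an action?",
        "Is a state associated with a step?",
    ],
}


def rules_for(question_type):
    """Return the heuristic rules for a question type predicted by the classifier."""
    return HEURISTIC_RULES[question_type]


print(rules_for("factual"))
```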
S508, matching the problem data with the knowledge graph based on the heuristic rule to obtain matching information;
Specifically, after the heuristic rule is obtained, the question data is matched with the knowledge graph under the guidance of the heuristic rule; that is, the heuristic rule determines the direction of matching, for example which aspect of the knowledge graph the data relevant to the question should be retrieved from.
Further, in an embodiment of the present disclosure, matching the question data with the knowledge graph based on the heuristic rule to obtain matching information includes the following steps S5082 to S5086:
S5082, confirming action keywords and entity keywords in the question data;
Specifically, the actions and entities in the question data are the content of interest, so action keywords and entity keywords are determined from the question data. For example, for "I have completed payment, but I have not picked up the scratch card", the extracted keywords may be "payment", "pick up" and "scratch card".
S5084, matching the action keywords and the entity keywords with the knowledge graph to obtain matching nodes in the knowledge graph;
Specifically, the nodes in the knowledge graph that match the action keywords and the entity keywords are confirmed as the matching nodes.
S5086, confirming the matching information related to the matching node based on the heuristic rule.
Specifically, starting from the matching node, the knowledge related to the matching node is retrieved from the knowledge graph according to the different heuristic rules. For example, if the matching node is an action, the actions that occur before it, the elements of the action, the entities related to the action and the elements of those entities can be retrieved, so that all related matching information is obtained.
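Steps S5082 to S5086 can be sketched as a keyword lookup followed by a hop-limited traversal from each matching node. Everything named here is an assumption for illustration: exact string matching of keywords to node names, the edge label "next", the `max_hops` limit, and the `allowed` set of relation labels through which a heuristic rule narrows the search.

```python
def match_nodes(keywords, triples):
    """Return the keywords that name a node of the graph (exact match assumed)."""
    nodes = {n for head, _, tail in triples for n in (head, tail)}
    return [k for k in keywords if k in nodes]


def related_triples(start, triples, max_hops=2, allowed=None):
    """Collect triples reachable from `start` within `max_hops` hops.

    allowed: optional set of relation labels; a heuristic rule can use it to
    restrict the traversal, e.g. {"next"} for questions about action order.
    """
    frontier, seen, found = {start}, {start}, []
    for _ in range(max_hops):
        next_frontier = set()
        for triple in triples:
            head, relation, tail = triple
            if allowed is not None and relation not in allowed:
                continue
            if (head in frontier or tail in frontier) and triple not in found:
                found.append(triple)
                for node in (head, tail):
                    if node not in seen:
                        seen.add(node)
                        next_frontier.add(node)
        frontier = next_frontier
    return found


triples = [
    ("payment", "next", "pick up"),
    ("pick up", "object", "scratch card"),
    ("scratch card", "attribute", "100000 in total"),
    ("scratch card", "state", "expires one hour after pickup"),
    ("pick up", "place", "payment interface"),
]
matched = match_nodes(["payment", "pick up", "scratch card"], triples)
info = related_triples("pick up", triples)
order_only = related_triples("pick up", triples, allowed={"next"})
print(matched)
print(order_only)
```

With the default settings, the traversal from "pick up" reaches the preceding action, the parameters of the action, and the attribute and state of the associated entity, which corresponds to the kinds of matching information described above.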
S510, generating answer data of the question data based on the matching information.
Specifically, from all the related matching information, the content closest to the question is found by semantic matching and used as the answer data of the question data.
Further, in one embodiment of the present disclosure, the generating answer data of the question data based on the matching information includes the following steps S5102-S5106:
S5102, obtaining matching information text characterization data of the matching information;
S5104, calculating first semantic similarity based on the matching information text characterization data and the question text characterization data;
S5106, confirming answer data of the question data from the matching information based on the first semantic similarity.
Specifically, the matching information is encoded to obtain matching information text characterization data, the first semantic similarity between the matching information text characterization data and the question text characterization data is calculated, and the answer data of the question data is confirmed from the matching information according to the first semantic similarity. The first semantic similarity may also be obtained by calculating a cosine distance.
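As an illustration of steps S5104 and S5106, the selection can be written out as follows, where the symbols are assumptions introduced here: q denotes the question text characterization vector, m_i the characterization vector of the i-th piece of matching information, and i* the index of the piece chosen as the answer. Under the common convention that cosine distance is one minus cosine similarity, maximizing the similarity and minimizing the distance select the same piece.

```latex
\[
\operatorname{sim}(q, m_i) = \frac{q \cdot m_i}{\lVert q \rVert \, \lVert m_i \rVert},
\qquad
d_{\cos}(q, m_i) = 1 - \operatorname{sim}(q, m_i),
\qquad
i^{*} = \arg\max_{i} \operatorname{sim}(q, m_i) = \arg\min_{i} d_{\cos}(q, m_i)
\]
```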
Referring to fig. 11, fig. 11 is an exemplary schematic diagram of the map generation method provided in the embodiment of the present specification. Take the question data "I have completed payment, but I have not received a scratch card" as an example. The customer's problem is that payment has been completed but the scratch card has not been picked up, so the cause can be sought from two aspects. On the one hand, attention is paid to whether the actions that precede the pick-up action have been completed. On the other hand, the entity "scratch card" corresponding to the pick-up action is located to check whether some specific requirement or limitation of the scratch card is not satisfied, that is, clues are sought in the triples corresponding to the scratch card. For example, the generated answer data may be: "The scratch card needs to be picked up on the payment interface; the validity period of the scratch card is one hour after pickup, and pickup fails once the validity period is exceeded; the total number of scratch cards is 100000, available until all have been picked up."
Optionally, the generated answer data can be provided to a robot to reply to the user's question. It can also be provided to a human customer service agent, with the answer highlighted in the rule description text, so as to assist the agent in replying and help the agent locate the answer quickly.
In the embodiment of the present specification, question text characterization data of the input question data is acquired, the question type of the question data is confirmed based on the question text characterization data, the corresponding heuristic rule is confirmed according to the question type, the question data is matched with the knowledge graph based on the heuristic rule to obtain matching information, and answer data of the question data is generated based on the matching information. Using the heuristic rules, answer data can be matched from the knowledge graph extracted from the rule description text according to the input question data, which improves the interpretability of the answer. Factual, procedural and inconsistency questions about the rule description text can all be answered, so that complex input questions can be handled.
Referring to fig. 12, a flowchart of a training method of a graph generation model is provided in the embodiment of the present disclosure. As shown in fig. 12, the method of the embodiment of the present specification may include the following steps S602 to S606.
S602, extracting sample triplet data from a sample rule description text set based on heuristic rules;
It will be appreciated that, in one embodiment, the process of building a knowledge graph from rule description text may be implemented by a pre-trained graph generation model. In one embodiment, the graph generation model may be trained by self-supervised learning. Self-supervised learning learns from unlabeled data and does not require labeled data, so learning based on unlabeled samples can be realized. In the embodiment of the present specification, the samples for self-supervised training are graphs constructed by heuristically extracting information from within the sample rule description text set.
When constructing the sample graph data set, each sample graph may be obtained manually or automatically in the following way: sample action data in a sample rule description text is obtained according to the heuristic rules, sample entity data in the sample rule description text is obtained, a sample association relation between the sample action data and the sample entity data is confirmed, and sample triplet data is constructed based on the sample association relation, the sample action data and the sample entity data. The sample rule description text set refers to texts selected for training that contain rule descriptions related to a specific target product, constraining the behavior of product users and describing the attributes of the product and its users. Depending on the objects from which the knowledge graph is to be extracted, a corresponding sample rule description text set can be introduced in the training stage. The sample triplet data is triplet data extracted from the sample rule description text set, and may specifically take the form (entity-relationship-entity). Heuristic rules are rules that guide how data is extracted from the sample rule description text. Illustratively, the heuristic rules may include: 1. What are the actions described in the sample rule description text? 2. What are the parameters of each action (the object, time, place and manner of the action)? 3. What is the order of the actions? 4. What are the entities underlying the procedure? 5. What are the details of each entity (sub-entities, attributes, states)? 6. Is an entity or sub-entity associated with a parameter of an action? 7. Is a state associated with a step? The heuristic rules indicate the elements of interest in the rule description text, namely the sample entity data and the sample action data that need to be acquired, and guide the matching of the association relation between the sample entity data and the sample action data.
Optionally, the acquiring sample action data in the sample rule description text according to the heuristic rule includes:
Acquiring a sample execution action in the sample rule description text;
Confirming the sample execution sequence of the sample execution action based on the sample rule description text;
Confirming sample parameters of the sample execution action from the sample rule description text based on key parameter types;
And confirming sample action data according to the sample execution sequence and the sample parameters.
Optionally, the acquiring sample entity data in the sample rule description text includes:
Acquiring a first sample entity in the sample rule description text;
Confirming sample description information related to the first sample entity from the sample rule description text based on a key description type;
and constructing sample entity data based on the sample description information and the first sample entity.
Optionally, the confirming the sample association relationship between the sample action data and the sample entity data includes:
Confirming a second sample entity contained in the sample action data;
identifying a third sample entity from the sample entity data that is identical to the second sample entity;
And establishing a sample association relation between the sample action data and the sample entity data based on the second sample entity and the third sample entity.
S604, constructing a corresponding sample graph data set based on the sample triplet data;
Specifically, the nodes and the edge relations are confirmed according to the sample triplet data to form the sample graph data set.
Optionally, one sample graph in the sample graph data set may be constructed in the following way:
Taking a sample execution action in the sample action data as a first sample node;
Confirming sample parameters of the sample execution action in the sample action data as a second sample node;
Taking the sequence between the first sample nodes as edges, and constructing a first sample connection relation between the first sample nodes;
Constructing a second sample connection relation between the first sample node and the second sample node by taking the key parameter type of the sample parameter as an edge;
Taking a first sample entity in the sample entity data as a third sample node;
Confirming sample description information related to the first sample entity in the sample entity data as a fourth sample node;
Confirming a third sample connection relation between the third sample node and the fourth sample node by taking the key description type of the sample description information as an edge;
Confirming a fourth sample connection relation between the first and second sample nodes and the third and fourth sample nodes based on the sample association relation;
Generating sample triplet data based on the first sample connection relation, the second sample connection relation, the third sample connection relation and the fourth sample connection relation, and generating a sample knowledge graph corresponding to the sample rule description text based on the sample triplet data.
S606, optimizing network parameters of the graph generation model by adopting a self-supervised training module based on the sample graph data set, and obtaining the trained graph generation model when the network parameters meet preset training requirements.
In one embodiment, the graph generation model may be a graph attention network (GAT). Its core working principle is to compute the relation between nodes through an attention mechanism and to use known edge information to guide the learning of the graph attention network, so that for any two nodes the probability that an edge exists between them can be calculated. For example, the self-supervised training may be contrastive training or generative training; the training process may follow the prior art and is not described here. For example, the edge relation predicted by the graph generation model may be checked against the edge relation existing in the sample graph data set: if they are the same, the model's prediction is accurate; if they differ, the model needs further training. The self-supervised training module optimizes the network parameters of the graph generation model and judges whether the model with the updated parameters has converged. If it has converged, training stops and the trained graph generation model is obtained; if not, training continues on the sample graph data set until the model converges. The trained graph generation model can automatically parse an input rule description text into a knowledge graph (graph structure).
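The self-supervised signal and the convergence check described above can be illustrated with a deliberately simplified sketch. It is not a graph attention network: plain learned node embeddings with a logistic edge score are swapped in so that the example stays short and dependency-free. Edges present in a sample graph serve as positive examples, absent edges as negative examples, and training stops when the loss change falls below a tolerance. All names, the toy graph and the hyperparameters are assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def edge_probability(z, u, v):
    """Predicted probability that an edge exists between nodes u and v."""
    return sigmoid(sum(a * b for a, b in zip(z[u], z[v])))


def train_link_predictor(num_nodes, edges, dim=4, lr=0.1, tol=1e-4,
                         max_epochs=500, seed=0):
    rng = random.Random(seed)
    z = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_nodes)]
    edge_set = {frozenset(e) for e in edges}
    non_edges = [(u, v) for u in range(num_nodes) for v in range(u + 1, num_nodes)
                 if frozenset((u, v)) not in edge_set]
    # Self-supervision: labels come from the graph itself, not from annotation.
    pairs = [(u, v, 1.0) for u, v in edges] + [(u, v, 0.0) for u, v in non_edges]
    prev_loss = float("inf")
    for _ in range(max_epochs):
        loss = 0.0
        for u, v, y in pairs:
            p = edge_probability(z, u, v)
            loss -= y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
            g = p - y  # gradient of the logistic loss with respect to the score
            zu, zv = z[u][:], z[v][:]
            for k in range(dim):
                z[u][k] -= lr * g * zv[k]
                z[v][k] -= lr * g * zu[k]
        if abs(prev_loss - loss) < tol:  # convergence check
            break
        prev_loss = loss
    return z


# Toy sample graph: two groups of three mutually connected nodes.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
z = train_link_predictor(6, edges)
print(edge_probability(z, 0, 1) > 0.5, edge_probability(z, 0, 3) < 0.5)
```

In a graph attention network the embeddings would instead be produced by attention-weighted aggregation over neighboring nodes; the comparison of predicted edges against existing edges and the stop-on-convergence loop would keep the same shape.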
By learning from the sample graph data set, the graph generation model can, when a rule description text is acquired, automatically acquire the action data in the rule description text, acquire the first entity data in the rule description text, confirm the association relation between the action data and the first entity data, construct triplet data based on the association relation, the action data and the first entity data, and generate the knowledge graph corresponding to the rule description text based on the triplet data. This achieves a better understanding of the knowledge in the rule description text, so that the knowledge graph is constructed efficiently and accurately.
In the embodiment of the present specification, sample triplet data is extracted from a sample rule description text set based on heuristic rules, a corresponding sample graph data set is constructed based on the sample triplet data, network parameters of the graph generation model are optimized by a self-supervised training module based on the sample graph data set, and the trained graph generation model is obtained when the network parameters meet preset training requirements. Self-supervised learning is performed with the graph attention network by automatically mining training data from the graphs, which strengthens the model's grasp of graph structure information.
The map generating apparatus according to the embodiment of the present invention will be described in detail with reference to fig. 13. It should be noted that the map generating apparatus in fig. 13 is used to perform the methods of the embodiments shown in fig. 2 to 11 of the present specification. For convenience of explanation, only the portions relevant to the embodiment of the present specification are shown; for specific technical details that are not disclosed, please refer to the embodiments shown in fig. 2 to 11 of the present specification.
Referring to fig. 13, a schematic diagram of a map generating apparatus according to an exemplary embodiment of the present disclosure is shown. The map generating apparatus may be implemented as all or part of the apparatus by software, hardware or a combination of both. The apparatus 1 comprises a first obtaining module 11, a second obtaining module 12, a fusion module 13 and a construction module 14.
A first obtaining module 11, configured to obtain action data in the rule description text;
a second obtaining module 12, configured to obtain first entity data in the rule description text;
a fusion module 13, configured to confirm an association relationship between the action data and the first entity data;
And the construction module 14 is configured to construct triplet data based on the association relationship, the action data and the first entity data, and generate a knowledge graph corresponding to the rule description text based on the triplet data.
Optionally, the first obtaining module 11 is specifically configured to obtain an execution action in the rule description text;
Confirming the execution sequence of the execution action based on the rule description text;
validating parameters of the execution action from the rule description text based on key parameter types;
and confirming action data according to the execution sequence and the parameters.
Optionally, the key parameter type includes one or more of an object, a time, a place, and a mode of the performing action.
Optionally, the second obtaining module 12 is specifically configured to obtain a first entity in the rule description text;
Confirming descriptive information related to the first entity from the rule description text based on a key description type;
And constructing and obtaining first entity data based on the description information and the first entity.
Optionally, the key description type includes one or more of a sub-entity, an attribute, and a state of the first entity.
Optionally, the fusion module 13 is specifically configured to confirm the second entity included in the action data;
Identifying a third entity identical to the second entity from the first entity data;
And establishing an association relation between the action data and the first entity data based on the second entity and the third entity.
Optionally, the building module 14 is specifically configured to take the execution action in the action data as a first node;
Confirming the parameters of the execution action in the action data as a second node;
taking the sequence between the first nodes as edges, and constructing a first connection relation between the first nodes;
Constructing a second connection relation between the first node and the second node by taking the key parameter type of the parameter as an edge;
taking a first entity in the first entity data as a third node;
confirming descriptive information related to the first entity in the first entity data as a fourth node;
Confirming a third connection relation between the third node and the fourth node by taking the key description type of the description information as an edge;
confirming a fourth connection relationship between the first node and the second node and the third node and the fourth node based on the association relationship;
Generating triplet data based on the first connection relation, the second connection relation, the third connection relation and the fourth connection relation, and generating a knowledge graph corresponding to the rule description text based on the triplet data.
Optionally, the construction module 14 is specifically configured to obtain sample triplet data labeled as a positive example or a negative example;
Evaluating the triplet data based on the sample triplet data to obtain an evaluation result;
Removing an error triplet from the triplet data based on the evaluation result to obtain a target triplet in the triplet data;
and generating a knowledge graph corresponding to the rule description text based on the target triplet.
Alternatively, please refer to fig. 14, which illustrates a schematic structural diagram of the map generating apparatus provided in an exemplary embodiment of the present specification. As shown in fig. 14, the map generating apparatus further includes a question-answering module 15 for: acquiring question text characterization data of the input question data;
confirming a question type of the question data based on the question text characterization data;
confirming a corresponding heuristic rule according to the problem type;
Matching the problem data with the knowledge graph based on the heuristic rule to obtain matching information;
And generating answer data of the question data based on the matching information.
Optionally, the question and answer module 15 is specifically configured to confirm action keywords and entity keywords in the question data;
matching the action keywords and the entity keywords with the knowledge graph to obtain matching nodes in the knowledge graph;
And confirming the matching information related to the matching node based on the heuristic rule.
Optionally, the question-answering module 15 is specifically configured to: obtain matching information text characterization data of the matching information;
calculate a first semantic similarity based on the matching information text characterization data and the question text characterization data;
and confirm answer data for the question data from the matching information based on the first semantic similarity.
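A minimal sketch of the similarity-based answer selection above, assuming the text characterization data are fixed-length vectors and that cosine similarity serves as the first semantic similarity. Both are assumptions: the specification names neither the encoder nor the similarity measure, and the vectors below are made-up values rather than real embeddings.

```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pick_answer(question_vec: Sequence[float], candidates: Dict[str, Sequence[float]]) -> str:
    """Return the matching-information text most similar to the question."""
    return max(candidates, key=lambda text: cosine(question_vec, candidates[text]))


question_vec = [1.0, 0.0, 1.0]
matching_info = {
    "insert the memory card": [0.9, 0.1, 0.8],
    "close the lid": [0.0, 1.0, 0.1],
}
answer = pick_answer(question_vec, matching_info)
```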
Optionally, the question-answering module 15 is specifically configured to: obtain candidate text characterization data of candidate rule description texts;
confirm a second semantic similarity between the question text characterization data of the question data and the candidate text characterization data;
and confirm the rule description text from the candidate rule description texts based on the second semantic similarity.
Further, refer to the map generating apparatus shown in fig. 15, which is used to perform the method of the embodiment shown in fig. 12 of the present specification. For convenience of explanation, only the portions relevant to the embodiments of the present specification are shown; for specific technical details not disclosed here, refer to the embodiment shown in fig. 12 of the present specification.
Referring to fig. 15, a schematic diagram of a map generating apparatus according to an exemplary embodiment of the present disclosure is shown. The map generation apparatus may be implemented as all or part of the apparatus by software, hardware or a combination of both. The apparatus 2 comprises a dataset construction module 21, a feature extraction module 22, a self-supervised training module 23.
A data set construction module 21, configured to extract sample triplet data from a sample rule description text set based on heuristic rules;
a feature extraction module 22, configured to construct a corresponding sample map data set based on the sample triplet data;
a self-supervised training module 23, configured to optimize network parameters of the map generation model through self-supervised training based on the sample map data set, and to obtain a trained map generation model when the network parameters meet a preset training requirement.
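One way to read "self-supervised" here is that training labels are derived from the graph itself rather than from manual annotation. The sketch below illustrates that idea with negative sampling, a common technique for link-prediction style training: each observed triplet becomes a positive sample and a corrupted copy becomes a negative sample. This is an assumed stand-in, not the training procedure stated in the specification, and `build_training_samples` is a hypothetical name.

```python
import random
from typing import List, Tuple

Triplet = Tuple[str, str, str]


def build_training_samples(
    triplets: List[Triplet], seed: int = 0
) -> List[Tuple[Triplet, int]]:
    """Pair every observed triplet (label 1) with one corrupted copy (label 0)."""
    rng = random.Random(seed)
    observed = set(triplets)
    entities = sorted({t[0] for t in triplets} | {t[2] for t in triplets})
    samples: List[Tuple[Triplet, int]] = []
    for head, rel, tail in triplets:
        samples.append(((head, rel, tail), 1))
        # Try replacement tails in random order until an unseen triplet appears.
        for candidate in rng.sample(entities, len(entities)):
            corrupted = (head, rel, candidate)
            if candidate != head and corrupted not in observed:
                samples.append((corrupted, 0))
                break
    return samples


sample_triplets = [
    ("open lid", "next", "insert card"),
    ("insert card", "object", "memory card"),
]
training_samples = build_training_samples(sample_triplets)
```

A model trained to separate the two labels has to pick up on which connections are plausible in the graph, which is the kind of structural signal the training module is meant to exploit.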
It should be noted that, when the map generating apparatus provided in the foregoing embodiments performs the map generation method and the map generation model training method, the division into the foregoing functional modules is used only as an example. In practical applications, the foregoing functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the map generating apparatus, the map generation method, and the map generation model training method provided in the foregoing embodiments belong to the same concept; their detailed implementation is described in the method embodiments and is not repeated here.
The foregoing embodiment numbers of the present specification are merely for description, and do not represent advantages or disadvantages of the embodiments. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiment of the present disclosure further provides a storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method for generating a map according to the embodiment shown in fig. 2 to fig. 12 is implemented, and the specific implementation process may refer to the specific description of the embodiment shown in fig. 2 to fig. 12, which is not repeated herein.
Referring to fig. 16, a schematic structural diagram of an electronic device according to an exemplary embodiment of the present disclosure is shown. The electronic device in this specification may include one or more of the following: processor 110, memory 120, input device 130, output device 140, and bus 150. The processor 110, the memory 120, the input device 130, and the output device 140 may be connected by a bus 150.
The processor 110 may include one or more processing cores. The processor 110 connects the various parts of the electronic device through various interfaces and lines, and performs the functions of the electronic device and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 120 and by invoking data stored in the memory 120. Optionally, the processor 110 may be implemented in at least one hardware form of digital signal processing (Digital Signal Processing, DSP), field-programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA). The processor 110 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is responsible for rendering and drawing display content; the modem handles wireless communication. It will be appreciated that the modem may also not be integrated into the processor 110 and may instead be implemented by a separate communication chip.
The memory 120 may include random access memory (Random Access Memory, RAM) or read-only memory (Read-Only Memory, ROM). Optionally, the memory 120 includes a non-transitory computer-readable storage medium (Non-Transitory Computer-Readable Storage Medium). The memory 120 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 120 may include a program storage area and a data storage area. The program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the foregoing method embodiments, and the like. The operating system may be the Android system (including systems developed in depth on the basis of Android), the iOS system developed by Apple Inc. (including systems developed in depth on the basis of iOS), or another system.
Memory 120 may be divided into an operating system space in which the operating system runs and a user space in which native and third party applications run. In order to ensure that different third party application programs can achieve better operation effects, the operating system allocates corresponding system resources for the different third party application programs. However, the requirements of different application scenarios in the same third party application program on system resources are different, for example, under the local resource loading scenario, the third party application program has higher requirement on the disk reading speed; in the animation rendering scene, the third party application program has higher requirements on the GPU performance. The operating system and the third party application program are mutually independent, and the operating system often cannot timely sense the current application scene of the third party application program, so that the operating system cannot perform targeted system resource adaptation according to the specific application scene of the third party application program.
To enable the operating system to distinguish the specific application scenarios of a third-party application program, data communication between the third-party application program and the operating system needs to be opened up, so that the operating system can obtain the current scenario information of the third-party application program at any time and adapt system resources in a targeted manner based on the current scenario.
The input device 130 is configured to receive input instructions or data, and includes, but is not limited to, a keyboard, a mouse, a camera, a microphone, or a touch device. The output device 140 is configured to output instructions or data, and includes, but is not limited to, a display device, a speaker, and the like. In one example, the input device 130 and the output device 140 may be combined into a single touch display screen.
The touch display screen may be designed as a full screen, a curved screen, or a special-shaped screen. It may also be designed as a combination of a full screen and a curved screen, or a combination of a special-shaped screen and a curved screen, which is not limited in the embodiments of the present specification.
In addition, those skilled in the art will appreciate that the configuration of the electronic device shown in the above-described figures does not constitute a limitation of the electronic device, and the electronic device may include more or fewer components than illustrated, may combine certain components, or may have a different arrangement of components. For example, the electronic device may further include components such as a radio frequency circuit, an input unit, a sensor, an audio circuit, a Wi-Fi module, a power supply, and a Bluetooth module, which are not described herein.
In the electronic device shown in fig. 16, the processor 110 may be configured to invoke a computer application program stored in the memory 120, and specifically perform the following operations:
Acquiring action data in a rule description text;
acquiring first entity data in the rule description text;
confirming the association relation between the action data and the first entity data;
And constructing triplet data based on the association relation, the action data and the first entity data, and generating a knowledge graph corresponding to the rule description text based on the triplet data.
In one embodiment, when acquiring the action data in the rule description text, the processor 110 specifically performs the following operations:
acquiring an execution action in the rule description text;
confirming the execution sequence of the execution action based on the rule description text;
confirming parameters of the execution action from the rule description text based on key parameter types;
and confirming the action data according to the execution sequence and the parameters.
In one embodiment, the key parameter types include one or more of an object, a time, a place, and a manner in which the action is performed.
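The action data described above can be pictured as a small record per execution action, holding its position in the execution sequence and its parameters keyed by key parameter type. The sketch below is one possible in-memory representation; the class and function names are hypothetical, and the extraction of actions and parameters from raw text (which would need an NLP component) is assumed to have happened already.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# The four key parameter types named in the embodiment.
KEY_PARAMETER_TYPES = ("object", "time", "place", "manner")


@dataclass
class Action:
    name: str
    order: int  # position in the execution sequence, starting at 1
    parameters: Dict[str, str] = field(default_factory=dict)

    def add_parameter(self, param_type: str, value: str) -> None:
        """Attach a parameter, rejecting types outside the key parameter types."""
        if param_type not in KEY_PARAMETER_TYPES:
            raise ValueError(f"unsupported key parameter type: {param_type}")
        self.parameters[param_type] = value


def build_action_data(steps: List[Tuple[str, Dict[str, str]]]) -> List[Action]:
    """Build action data from (action name, parameters) pairs given in text order."""
    actions = []
    for index, (name, params) in enumerate(steps, start=1):
        action = Action(name=name, order=index)
        for param_type, value in params.items():
            action.add_parameter(param_type, value)
        actions.append(action)
    return actions


steps = [
    ("open lid", {"object": "lid"}),
    ("insert card", {"object": "memory card", "place": "card slot"}),
]
action_data = build_action_data(steps)
```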
In one embodiment, when acquiring the first entity data in the rule description text, the processor 110 specifically performs the following operations:
acquiring a first entity in the rule description text;
Confirming descriptive information related to the first entity from the rule description text based on a key description type;
And constructing and obtaining first entity data based on the description information and the first entity.
In one embodiment, the key description type includes one or more of a sub-entity, an attribute, and a state of the first entity.
In one embodiment, when confirming the association relationship between the action data and the first entity data, the processor 110 specifically performs the following operations:
Confirming a second entity contained in the action data;
Identifying a third entity identical to the second entity from the first entity data;
And establishing an association relation between the action data and the first entity data based on the second entity and the third entity.
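The association step above amounts to finding entities that appear both in the action data and in the first entity data. A minimal sketch, assuming "identical" means exact string equality of entity names (a real system might normalize or fuzzy-match names; that is not covered here) and using hypothetical dictionary layouts for both kinds of data:

```python
from typing import Dict, List, Tuple


def link_actions_to_entities(
    action_entities: Dict[str, List[str]],
    entity_data: Dict[str, Dict[str, str]],
) -> List[Tuple[str, str]]:
    """Return (action, entity) pairs where an entity mentioned by the action
    (the second entity) also exists in the entity data (the third entity)."""
    links = []
    for action, mentioned in action_entities.items():
        for second_entity in mentioned:
            if second_entity in entity_data:  # identical third entity found
                links.append((action, second_entity))
    return links


# Entities mentioned in each action's parameters (hypothetical sample data).
action_entities = {
    "insert card": ["memory card", "card slot"],
    "open lid": ["lid"],
}
# First entity data: entity name -> {key description type: description}.
entity_data = {
    "memory card": {"attribute": "64 GB maximum"},
    "lid": {"state": "closed"},
}
links = link_actions_to_entities(action_entities, entity_data)
```

Here `"card slot"` produces no link because it has no entry in the entity data.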
In one embodiment, when constructing the triplet data based on the association relationship, the action data and the first entity data and generating the knowledge graph corresponding to the rule description text based on the triplet data, the processor 110 specifically performs the following operations:
taking the execution action in the action data as a first node;
taking the parameters of the execution action in the action data as a second node;
constructing a first connection relationship between first nodes by taking the sequence between the first nodes as edges;
constructing a second connection relationship between the first node and the second node by taking the key parameter type of the parameter as an edge;
taking the first entity in the first entity data as a third node;
taking the description information related to the first entity in the first entity data as a fourth node;
confirming a third connection relationship between the third node and the fourth node by taking the key description type of the description information as an edge;
confirming, based on the association relationship, a fourth connection relationship linking the first node and the second node to the third node and the fourth node;
and generating triplet data based on the first, second, third and fourth connection relationships, and generating the knowledge graph corresponding to the rule description text based on the triplet data.
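The node and edge construction above can be sketched as a function that emits one (head, relation, tail) triplet per connection. The input layouts, the `"next"` label for sequence edges, and the `"involves"` label for the fourth connection relationship are all hypothetical choices made for illustration; the specification only states which elements become nodes and which become edges.

```python
from typing import Dict, List, Tuple

Triplet = Tuple[str, str, str]


def build_triplets(
    actions: List[Dict],
    entities: Dict[str, Dict[str, str]],
) -> List[Triplet]:
    """actions: [{"name": str, "params": {key parameter type: value}}] in execution order.
    entities: entity name -> {key description type: description}."""
    triplets: List[Triplet] = []
    # First connection relationship: execution order between action nodes.
    for prev, nxt in zip(actions, actions[1:]):
        triplets.append((prev["name"], "next", nxt["name"]))
    # Second: action node -> parameter node, edge labeled by key parameter type.
    for action in actions:
        for param_type, value in action["params"].items():
            triplets.append((action["name"], param_type, value))
    # Third: entity node -> description node, edge labeled by key description type.
    for entity, descriptions in entities.items():
        for desc_type, value in descriptions.items():
            triplets.append((entity, desc_type, value))
    # Fourth: link the action side to the entity side where a parameter names
    # an entity that also exists in the entity data.
    for action in actions:
        for value in action["params"].values():
            if value in entities:
                triplets.append((action["name"], "involves", value))
    return triplets


actions = [
    {"name": "open lid", "params": {"object": "lid"}},
    {"name": "insert card", "params": {"object": "memory card", "place": "card slot"}},
]
entities = {"memory card": {"attribute": "64 GB maximum"}}
graph_triplets = build_triplets(actions, entities)
```

The resulting list mixes procedural triplets (sequence and parameters) with factual ones (descriptions), joined through the shared `"memory card"` entity.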
In one embodiment, when constructing the triplet data based on the association relationship, the action data and the first entity data and generating the knowledge graph corresponding to the rule description text based on the triplet data, the processor 110 specifically performs the following operations:
acquiring sample triplet data labeled as positive examples or negative examples;
evaluating the triplet data based on the sample triplet data to obtain an evaluation result;
removing erroneous triplets from the triplet data based on the evaluation result to obtain target triplets in the triplet data;
and generating the knowledge graph corresponding to the rule description text based on the target triplets.
In one embodiment, the processor 110 may also be used to invoke a computer application program stored in the memory 120 and specifically perform the following operations:
acquiring question text characterization data of input question data;
confirming a question type of the question data based on the question text characterization data;
confirming a corresponding heuristic rule according to the question type;
matching the question data with the knowledge graph based on the heuristic rule to obtain matching information;
and generating answer data for the question data based on the matching information.
In one embodiment, when matching the question data with the knowledge graph based on the heuristic rule to obtain matching information, the processor 110 specifically performs the following operations:
confirming action keywords and entity keywords in the question data;
matching the action keywords and the entity keywords with the knowledge graph to obtain matching nodes in the knowledge graph;
and confirming the matching information related to the matching nodes based on the heuristic rule.
In one embodiment, when generating the answer data for the question data based on the matching information, the processor 110 specifically performs the following operations:
Acquiring matching information text characterization data of the matching information;
Calculating a first semantic similarity based on the matching information text characterization data and the question text characterization data;
And confirming answer data of the question data from the matching information based on the first semantic similarity.
In one embodiment, before matching the question data with the knowledge graph based on the heuristic rule, the processor 110 further performs the following operations:
acquiring candidate text characterization data of candidate rule description texts;
confirming a second semantic similarity between the question text characterization data of the question data and the candidate text characterization data;
and confirming the rule description text from the candidate rule description texts based on the second semantic similarity.
In one embodiment, the processor 110 may also be used to invoke a computer application program stored in the memory 120 and specifically perform the following operations:
extracting sample triplet data from a sample rule description text set based on heuristic rules;
constructing a corresponding sample map data set based on the sample triplet data;
and optimizing network parameters of the map generation model through self-supervised training based on the sample map data set, and obtaining the trained map generation model when the network parameters meet a preset training requirement.
In the embodiments of this specification, action data and first entity data are acquired from the rule description text, the association relationship between the action data and the first entity data is confirmed, triplet data is constructed based on the association relationship, the action data and the first entity data, and a knowledge graph corresponding to the rule description text is generated based on the triplet data. Acquiring the action data parses and extracts the procedural information in the rule description text, while acquiring the first entity data extracts its factual information. Confirming the association relationship between the two then links the procedural information to the factual information. Based on the extracted action data, first entity data and association relationship, the knowledge graph corresponding to the rule description text can be constructed automatically and efficiently, so that the generated knowledge graph captures the rule description text more fully and downstream tasks that use the knowledge graph perform better.
Further, the execution actions in the rule description text are acquired, their execution sequence is confirmed based on the rule description text, and their parameters are confirmed from the rule description text based on the key parameter types. A first entity in the rule description text is acquired, description information related to the first entity is confirmed from the rule description text based on the key description types, and the first entity data is constructed from the description information and the first entity. A second entity contained in the action data is confirmed, a third entity identical to the second entity is confirmed from the first entity data, and the association relationship between the action data and the first entity data is established based on the second entity and the third entity. The execution actions in the action data are taken as first nodes and their parameters as second nodes; a first connection relationship between first nodes is constructed with the sequence between them as edges, and a second connection relationship between the first nodes and the second nodes is constructed with the key parameter types as edges. The first entity in the first entity data is taken as a third node and its related description information as a fourth node; a third connection relationship between them is confirmed with the key description type as the edge, and a fourth connection relationship linking the first and second nodes to the third and fourth nodes is confirmed based on the association relationship. Triplet data is then generated from the four connection relationships, and the knowledge graph corresponding to the rule description text is generated from the triplet data.
By sequentially extracting the elements of interest in the rule description text, both the procedural information and the factual information in the rule description text can be mined, and by building an association relationship from the identical entities contained in the first entity data and the action data, the procedural information and the factual information can be represented jointly when the knowledge graph is generated. In addition, after the triplet data is generated, it can be screened against the labeled sample triplets to obtain target triplets, which further improves the quality of the knowledge graph.
Further, question text characterization data of the input question data is acquired, the question type of the question data is confirmed based on the question text characterization data, a corresponding heuristic rule is confirmed according to the question type, the question data is matched with the knowledge graph based on the heuristic rule to obtain matching information, and answer data for the question data is generated based on the matching information. Because the heuristic rules match answer data from the knowledge graph extracted from the rule description text, the answers are more interpretable, and factual, procedural and inconsistency-type questions about the rule description text can all be answered, so complex input questions can be handled.
Further, sample triplet data is extracted from a sample rule description text set based on heuristic rules, a corresponding sample map data set is constructed based on the sample triplet data, network parameters of the map generation model are optimized through self-supervised training based on the sample map data set, and the trained map generation model is obtained when the network parameters meet the preset training requirement. Self-supervised learning uses a graph attention network and automatically mines training data from the graph itself, which strengthens the model's grasp of graph structure information.
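For readers unfamiliar with graph attention networks, the sketch below shows the core attention computation for a single node in plain Python: each neighbor j of node i receives a score LeakyReLU(a · [h_i ‖ h_j]), the scores are softmax-normalized, and the neighbor features are averaged with those weights. The learned linear projection that a full graph attention layer applies to the features is omitted (features are assumed to be already projected), and the feature values and attention vector `a` are made-up numbers; none of this is taken from the specification.

```python
import math
from typing import Dict, List


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def attention_weights(
    features: Dict[str, List[float]],
    neighbors: Dict[str, List[str]],
    a: List[float],
    node: str,
) -> List[float]:
    """Softmax-normalized attention of `node` over its neighbors."""
    h_i = features[node]
    scores = []
    for j in neighbors[node]:
        concat = h_i + features[j]  # list concatenation [h_i || h_j]
        scores.append(leaky_relu(sum(w * x for w, x in zip(a, concat))))
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(
    features: Dict[str, List[float]],
    neighbors: Dict[str, List[str]],
    a: List[float],
    node: str,
) -> List[float]:
    """Attention-weighted sum of neighbor features for `node`."""
    weights = attention_weights(features, neighbors, a, node)
    dim = len(features[node])
    return [
        sum(w * features[j][d] for w, j in zip(weights, neighbors[node]))
        for d in range(dim)
    ]


features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b", "c"]}

uniform = attention_weights(features, neighbors, [0.0, 0.0, 0.0, 0.0], "a")
skewed = attention_weights(features, neighbors, [0.0, 0.0, 1.0, 0.0], "a")
pooled = aggregate(features, neighbors, [0.0, 0.0, 0.0, 0.0], "a")
```

With an all-zero attention vector every neighbor gets the same weight; a non-zero vector shifts weight toward neighbors whose features score higher, which is how such a network can learn to emphasize the more informative connections in the graph.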
In addition, embodiments of the present description provide a computer program product comprising a computer program which, when executed by a processor of an electronic device, causes the processor to at least implement a method as provided in the embodiments shown in the foregoing fig. 2-12.
Those skilled in the art will appreciate that all or part of the methods in the above embodiments may be implemented by a computer program stored on a computer-readable storage medium, and that the program, when executed, may include the steps of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
The foregoing discloses only preferred embodiments of the present invention and is not intended to limit the scope of the claims; equivalent changes made in accordance with the claims of the present invention therefore still fall within its scope.

Claims (18)

1. A map generation method, the method comprising:
Acquiring action data in a rule description text;
acquiring first entity data in the rule description text;
confirming the association relation between the action data and the first entity data;
And constructing triplet data based on the association relation, the action data and the first entity data, and generating a knowledge graph corresponding to the rule description text based on the triplet data.
2. The method of claim 1, wherein the acquiring action data in a rule description text comprises:
acquiring an execution action in the rule description text;
confirming the execution sequence of the execution action based on the rule description text;
confirming parameters of the execution action from the rule description text based on key parameter types;
and confirming the action data according to the execution sequence and the parameters.
3. The method of claim 2, wherein the key parameter types comprise one or more of an object, a time, a place, and a manner of the execution action.
4. The method of claim 1, wherein the acquiring first entity data in the rule description text comprises:
acquiring a first entity in the rule description text;
Confirming descriptive information related to the first entity from the rule description text based on a key description type;
And constructing and obtaining first entity data based on the description information and the first entity.
5. The method of claim 4, wherein the key description type includes one or more of a sub-entity, an attribute, and a state of the first entity.
6. The method of claim 1, wherein the confirming the association relation between the action data and the first entity data comprises:
Confirming a second entity contained in the action data;
Identifying a third entity identical to the second entity from the first entity data;
And establishing an association relation between the action data and the first entity data based on the second entity and the third entity.
7. The method of claim 1, wherein the constructing triplet data based on the association relation, the action data and the first entity data, and generating a knowledge graph corresponding to the rule description text based on the triplet data, comprises:
taking the execution action in the action data as a first node;
taking the parameters of the execution action in the action data as a second node;
constructing a first connection relation between first nodes by taking the sequence between the first nodes as edges;
constructing a second connection relation between the first node and the second node by taking the key parameter type of the parameter as an edge;
taking a first entity in the first entity data as a third node;
taking description information related to the first entity in the first entity data as a fourth node;
confirming a third connection relation between the third node and the fourth node by taking the key description type of the description information as an edge;
confirming, based on the association relation, a fourth connection relation linking the first node and the second node to the third node and the fourth node;
and generating triplet data based on the first connection relation, the second connection relation, the third connection relation and the fourth connection relation, and generating a knowledge graph corresponding to the rule description text based on the triplet data.
8. The method of claim 1, wherein the constructing triplet data based on the association relation, the action data and the first entity data, and generating a knowledge graph corresponding to the rule description text based on the triplet data, comprises:
acquiring sample triplet data labeled as positive examples or negative examples;
evaluating the triplet data based on the sample triplet data to obtain an evaluation result;
removing erroneous triplets from the triplet data based on the evaluation result to obtain target triplets in the triplet data;
and generating a knowledge graph corresponding to the rule description text based on the target triplets.
9. The method of claim 1, the method further comprising:
Acquiring question text characterization data of the input question data;
confirming a question type of the question data based on the question text characterization data;
confirming a corresponding heuristic rule according to the question type;
matching the question data with the knowledge graph based on the heuristic rule to obtain matching information;
And generating answer data of the question data based on the matching information.
10. The method of claim 9, wherein the matching the question data with the knowledge graph based on the heuristic rule to obtain matching information comprises:
confirming action keywords and entity keywords in the question data;
matching the action keywords and the entity keywords with the knowledge graph to obtain matching nodes in the knowledge graph;
And confirming the matching information related to the matching node based on the heuristic rule.
11. The method of claim 9, wherein the generating answer data of the question data based on the matching information comprises:
Acquiring matching information text characterization data of the matching information;
Calculating a first semantic similarity based on the matching information text characterization data and the question text characterization data;
And confirming answer data of the question data from the matching information based on the first semantic similarity.
12. The method of claim 9, wherein before the matching the question data with the knowledge graph based on the heuristic rule to obtain matching information, the method further comprises:
acquiring candidate text characterization data of candidate rule description texts;
confirming a second semantic similarity between the question text characterization data of the question data and the candidate text characterization data;
and confirming the rule description text from the candidate rule description texts based on the second semantic similarity.
13. A map generation model training method, the method comprising:
extracting sample triplet data from a sample rule description text set based on heuristic rules;
constructing a corresponding sample map data set based on the sample triplet data;
and optimizing network parameters of the map generation model through self-supervised training based on the sample map data set, and obtaining the trained map generation model when the network parameters meet a preset training requirement.
14. A map generation apparatus comprising:
the first acquisition module is used for acquiring action data in the rule description text;
the second acquisition module is used for acquiring the first entity data in the rule description text;
The fusion module is used for confirming the association relation between the action data and the first entity data;
And the construction module is used for constructing triplet data based on the association relation, the action data and the first entity data and generating a knowledge graph corresponding to the rule description text based on the triplet data.
15. A map generation apparatus comprising:
The data set construction module is used for extracting sample triplet data from the sample rule description text set based on heuristic rules;
the feature extraction module is used for constructing a corresponding sample map data set based on the sample triplet data;
and the self-supervised training module is used for optimizing network parameters of the map generation model through self-supervised training based on the sample map data set, and obtaining the trained map generation model when the network parameters meet the preset training requirement.
16. An electronic device, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor to perform the steps of the method according to any one of claims 1 to 13.
17. A storage medium storing a computer program which, when executed by a processor, implements the steps of the method of any one of claims 1 to 13.
18. A computer program product comprising a computer program which, when executed by a processor of an electronic device, causes the processor to perform the steps of the method according to any one of claims 1 to 13.
CN202410048559.5A 2024-01-11 2024-01-11 Map generation method, device, equipment and storage medium Pending CN117933384A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410048559.5A CN117933384A (en) 2024-01-11 2024-01-11 Map generation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410048559.5A CN117933384A (en) 2024-01-11 2024-01-11 Map generation method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117933384A (en) 2024-04-26

Family

ID=90751653

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410048559.5A Pending CN117933384A (en) 2024-01-11 2024-01-11 Map generation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117933384A (en)

Similar Documents

Publication Publication Date Title
CN110377716B (en) Interaction method and device for conversation and computer readable storage medium
CN112819153B (en) Model transformation method and device
CN117521675A (en) Information processing method, device, equipment and storage medium based on large language model
CN112948534A (en) Interaction method and system for intelligent man-machine conversation and electronic equipment
CN116737908A (en) Knowledge question-answering method, device, equipment and storage medium
CN107958059B (en) Intelligent question answering method, device, terminal and computer readable storage medium
CN108304376B (en) Text vector determination method and device, storage medium and electronic device
CN116521893A (en) Control method and control device of intelligent dialogue system and electronic equipment
CN111444677A (en) Reading model optimization method, device, equipment and medium based on big data
EP4364044A1 (en) Automated troubleshooter
CN114706966A (en) Voice interaction method, device and equipment based on artificial intelligence and storage medium
CN112966076A (en) Intelligent question and answer generating method and device, computer equipment and storage medium
CN110069769A (en) Using label generating method, device and storage equipment
CN115803734A (en) Natural language enrichment using action interpretation
CN116797195A (en) Work order processing method, apparatus, computer device, and computer readable storage medium
CN118114679A (en) Service dialogue quality control method, system, electronic equipment and storage medium
CN117667979A (en) Data mining method, device, equipment and medium based on large language model
CN117371950A (en) Robot flow automation method, device, all-in-one machine and storage medium
CN117332062A (en) Data processing method and related device
CN116701604A (en) Question and answer corpus construction method and device, question and answer method, equipment and medium
CN111400443A (en) Information processing method, device and storage medium
CN117933384A (en) Map generation method, device, equipment and storage medium
CN114626388A (en) Intention recognition method and device, electronic equipment and storage medium
CN111753548A (en) Information acquisition method and device, computer storage medium and electronic equipment
CN118535715B (en) Automatic reply method, equipment and storage medium based on tree structure knowledge base

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination