CN115688920B - Knowledge extraction method, training device, training equipment and training medium for model - Google Patents


Info

Publication number
CN115688920B
CN115688920B (application CN202211466616.9A)
Authority
CN
China
Prior art keywords
knowledge
type
text
label
types
Prior art date
Legal status
Active
Application number
CN202211466616.9A
Other languages
Chinese (zh)
Other versions
CN115688920A (en)
Inventor
张鑫睿
刘欢
陈坤斌
何伯磊
Current Assignee
Baidu International Technology Shenzhen Co ltd
Original Assignee
Baidu International Technology Shenzhen Co ltd
Priority date
Filing date
Publication date
Application filed by Baidu International Technology Shenzhen Co., Ltd.
Priority to CN202211466616.9A
Publication of CN115688920A
Application granted
Publication of CN115688920B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a knowledge extraction method, and a training method, apparatus, device and medium for a knowledge extraction model, relating to the fields of natural language processing, deep learning, cloud computing, and the like. The specific implementation scheme is as follows: acquiring a text to be extracted and prompt information, wherein the prompt information is used for indicating the information to be extracted from the text to be extracted; splicing the text to be extracted with the prompt information to obtain an input text; extracting features of the input text to obtain semantic features of the input text; and performing knowledge extraction on the semantic features by using knowledge extraction models respectively corresponding to a plurality of knowledge types, to obtain knowledge points under at least one of the plurality of knowledge types. In this way, the knowledge extraction models corresponding to the known knowledge types can extract the knowledge points matching the prompt information from the text to be extracted, and knowledge points under multiple knowledge types can be extracted jointly, so that the extraction requirements of different users can be met.

Description

Knowledge extraction method, and training method, apparatus, device and medium for knowledge extraction model
Technical Field
The disclosure relates to the field of artificial intelligence, in particular to the technical fields of natural language processing, deep learning, cloud computing, and the like, and specifically to a knowledge extraction method, and a training method, apparatus, device and medium for a knowledge extraction model.
Background
In an enterprise office scenario, the texts deposited by employees (such as knowledge base documents, conference content, daily reports, weekly reports, and OKRs (Objectives and Key Results)) include attention-point knowledge (ATT) in multiple specific fields, and employees in different job families have different attention points according to their own working directions. How to effectively mine useful key information from such unstructured text, so as to assist knowledge recommendation and improve the office efficiency of employees, is therefore very important.
Disclosure of Invention
The present disclosure provides a knowledge extraction method, and a training method, apparatus, device and medium for a knowledge extraction model.
According to an aspect of the present disclosure, there is provided a knowledge extraction method, including:
acquiring a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted;
splicing the text to be extracted and the prompt information to obtain an input text;
extracting features of the input text to obtain semantic features of the input text;
and carrying out knowledge extraction on the semantic features by adopting knowledge extraction models corresponding to a plurality of knowledge types respectively to obtain knowledge points under at least one knowledge type in the plurality of knowledge types.
According to another aspect of the present disclosure, there is provided a training method of a knowledge extraction model, including:
acquiring training texts and prompt information, wherein the prompt information is used for indicating information to be extracted in the training texts, and the labeling information of the training texts is used for indicating the corresponding relation between at least one labeling knowledge type and labeling knowledge points;
splicing the training text and the prompt information to obtain an input text;
extracting features of the input text to obtain semantic features of the input text;
knowledge extraction is respectively carried out on the semantic features by adopting knowledge extraction models corresponding to a plurality of knowledge types so as to obtain predicted knowledge points under at least one predicted knowledge type in the plurality of knowledge types;
and carrying out joint training on the knowledge extraction models of the knowledge types according to the labeling knowledge points under the labeling knowledge types and the prediction knowledge points under the prediction knowledge types.
According to still another aspect of the present disclosure, there is provided a knowledge extraction apparatus including:
an acquisition module, configured to acquire a text to be extracted and prompt information, wherein the prompt information is used for indicating the information to be extracted from the text to be extracted;
the splicing module is used for splicing the text to be extracted and the prompt information to obtain an input text;
a feature extraction module, configured to perform feature extraction on the input text to obtain semantic features of the input text;
and a knowledge extraction module, configured to perform knowledge extraction on the semantic features by using knowledge extraction models respectively corresponding to a plurality of knowledge types, to obtain knowledge points under at least one of the plurality of knowledge types.
According to still another aspect of the present disclosure, there is provided a training apparatus of a knowledge extraction model, including:
an acquisition module, configured to acquire a training text and prompt information, wherein the prompt information is used for indicating the information to be extracted from the training text, and the labeling information of the training text is used for indicating the correspondence between at least one labeling knowledge type and labeling knowledge points;
the splicing module is used for splicing the training text and the prompt information to obtain an input text;
a feature extraction module, configured to perform feature extraction on the input text to obtain semantic features of the input text;
a knowledge extraction module, configured to perform knowledge extraction on the semantic features by using the knowledge extraction models corresponding to a plurality of knowledge types, to obtain predicted knowledge points under at least one predicted knowledge type of the plurality of knowledge types;
and a training module, configured to jointly train the knowledge extraction models of the plurality of knowledge types according to the labeling knowledge points under the labeling knowledge types and the prediction knowledge points under the prediction knowledge types.
According to yet another aspect of the present disclosure, there is provided an electronic device comprising at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the knowledge extraction method set forth in one aspect of the disclosure or to perform the training method of the knowledge extraction model set forth in another aspect of the disclosure.
According to yet another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used for causing a computer to perform the knowledge extraction method set forth in the above aspect of the present disclosure, or the training method of the knowledge extraction model set forth in the above aspect of the present disclosure.
According to a further aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the knowledge extraction method set forth in the above aspect of the present disclosure, or implements the training method of the knowledge extraction model set forth in the above aspect of the present disclosure.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart of a knowledge extraction method according to an embodiment of the disclosure;
fig. 2 is a flow chart of a knowledge extraction method according to a second embodiment of the disclosure;
FIG. 3 is a schematic diagram of a knowledge extraction framework corresponding to the knowledge extraction model provided in the present disclosure;
fig. 4 is a flow chart of a knowledge extraction method according to a third embodiment of the disclosure;
FIG. 5 is a flowchart of a training method of a knowledge extraction model according to a fourth embodiment of the disclosure;
FIG. 6 is a flowchart of a training method of a knowledge extraction model according to a fifth embodiment of the disclosure;
FIG. 7 is a flowchart of a training method of a knowledge extraction model according to a sixth embodiment of the disclosure;
FIG. 8 is an example of a training sample provided by the present disclosure;
FIG. 9 is a schematic diagram of a complete flow of the Prompt+MRC-based domain-specific knowledge point extraction service provided by the present disclosure;
fig. 10 is a schematic structural diagram of a knowledge extraction device according to a seventh embodiment of the disclosure;
FIG. 11 is a schematic structural diagram of a training device for knowledge extraction model according to an embodiment of the present disclosure;
FIG. 12 illustrates a schematic block diagram of an example electronic device that may be used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In an enterprise office scenario, the texts deposited by employees contain attention-point knowledge (ATT) in multiple specific fields. How to effectively mine useful key information from such unstructured texts, and to recommend suitable related documents, solutions, and the like according to the information of interest, is one of the most important problems in enterprise knowledge management.
Although employees deposit a large number of texts, the effective information contained in them is sparse, and there is no unified method for acquiring knowledge information (i.e., knowledge points) under multiple knowledge types, which makes it difficult to build a domain-specific information resource pool in practical scenarios. Therefore, how to mine the effective information in the texts generated by employees, so as to assist knowledge recommendation and improve office efficiency, is important.
Currently, knowledge information extraction tasks are mainly classified into "nested" (a knowledge point may belong to multiple knowledge types, such as "knowledge recommendation" belongs to both product and algorithm types) information extraction tasks and "non-nested" (a knowledge point belongs to only one knowledge type) information extraction tasks.
The 'non-nested' information extraction task is mostly solved by sequence labeling, with the following basic framework: (1) first, the input text is encoded by a deep neural network (such as a CNN (Convolutional Neural Network), an LSTM (Long Short-Term Memory network), a Transformer, and the like) to extract the semantic features of the input text; (2) then a CRF (Conditional Random Field) is used as the sequence labeling layer to predict the boundaries of knowledge points (for example, in the BIO scheme, B marks the starting position of a knowledge point, I marks the other positions of a knowledge point, and O marks other characters), from which the final result is parsed.
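The decoding half of step (2) can be illustrated as follows (a minimal sketch, not code from the patent; the tag set is simplified to bare B/I/O labels without type suffixes):

```python
def decode_bio(tokens, tags):
    """Decode per-token BIO tags into (start, end, text) spans.

    tags holds one label per token: "B" marks the first character of a
    knowledge point, "I" its remaining characters, "O" everything else.
    """
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:  # close the previous span
                spans.append((start, i, "".join(tokens[start:i])))
            start = i
        elif tag == "O":
            if start is not None:
                spans.append((start, i, "".join(tokens[start:i])))
                start = None
        # "I" simply extends the current span
    if start is not None:  # span running to the end of the text
        spans.append((start, len(tokens), "".join(tokens[start:])))
    return spans
```

Note that each token carries exactly one tag, which is precisely the single-label limitation discussed later for nested knowledge points.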
It should be noted that conventional neural network methods represented by CNN and LSTM require a large amount of supervised in-domain corpora and train network parameters from scratch so that the model can learn task-related information well. The appearance of large-scale pre-trained models (such as BERT (Bidirectional Encoder Representations from Transformers) and Ernie (Enhanced language Representation with Informative Entities, a knowledge-enhanced semantic representation model)) not only relieves the need for large-scale corpora and training from scratch, but also enriches the encoded representation of text and brings stronger semantic information.
The usual method for solving the 'nested' information extraction task is to construct a pipelined model structure, i.e., first extract all entity information in the input text, and then judge the entity type of each entity.
Although information extraction tasks are common in real business scenarios and neural-network-based information extraction has been studied extensively, the existing solutions to the 'nested' and 'non-nested' information extraction tasks still have the following drawbacks:
On the one hand, the sequence labeling method needs to predict the label (such as BIO: B for the starting position of an entity, I for the other positions of an entity, and O for other characters) corresponding to each character of a knowledge point (such as an entity). That is, only a single label can be assigned to a given character, so only the 'non-nested' case can be handled. If the 'nested' information extraction task is to be handled by sequence labeling, multiple CRF layers need to be trained at the output layer, which increases the complexity of the model.
On the other hand, the current mature solution to the 'nested' information extraction task generally adopts a pipeline structure (first extracting all entity information in the input text and then classifying the entities). Although this approach is simple, errors propagate between the two split subtasks: errors in entity extraction directly reduce the overall accuracy. Moreover, treating the 'nested' and 'non-nested' information extraction tasks separately does not yield a unified solution.
In view of at least one of the above problems, the present disclosure proposes a knowledge extraction method, a training device, a training apparatus, and a training medium for a knowledge extraction model.
The knowledge extraction method, the training method of the knowledge extraction model, the device, the equipment and the medium of the embodiment of the disclosure are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a knowledge extraction method according to an embodiment of the disclosure.
The knowledge extraction method may be configured in a knowledge extraction apparatus, and the knowledge extraction apparatus may be applied to any electronic device, so that the electronic device can perform the knowledge extraction function.
The electronic device may be any device with computing capability, for example, a PC (Personal Computer), a mobile terminal, a server, and the like; the mobile terminal may be, for example, a vehicle-mounted device, a mobile phone, a tablet computer, a personal digital assistant, a wearable device, or other hardware devices with various operating systems, touch screens, and/or display screens.
As shown in fig. 1, the knowledge extraction method may include the steps of:
step 101, obtaining a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted.
In the embodiment of the present disclosure, the text to be extracted may include knowledge base documents, conference content, daily reports, weekly reports, OKRs, and the like, which is not limited by the present disclosure.
In the embodiment of the present disclosure, the manner of obtaining the text to be extracted is not limited, for example, the text to be extracted may be collected online by a web crawler technology, or the text to be extracted may be collected offline, or the text to be extracted may be input by a user, etc., which is not limited in this disclosure.
In the embodiment of the disclosure, the prompt information may be used to indicate information to be extracted in the text to be extracted. The prompt information can be set according to the requirement.
As an example, the prompt information may be set according to a specific domain direction (such as a specific application domain direction), for example, when the application domain direction is an algorithm direction, the prompt information may be "skill attention point of algorithm direction", for example, when the application domain direction is a recommended service direction, the prompt information may be "attention point of recommended service direction", and so on, which are not listed herein.
As another example, a broad hint may be set or structured to apply to different knowledge type scenarios in order to apply to all application domain directions or application scenarios. For example, the hint information may be "knowledge points in text", "entities in text", etc.
For example, assume the text to be extracted is "Update the user behavior model: itg updates the user behavior model and improves the recognition rate; this relies on the service of the map star cloud, and communication with the map star cloud team has taken place to obtain performance test permission." The prompt information may then be "knowledge points in the text".
Step 102, splicing the text to be extracted and the prompt information to obtain an input text.
In the embodiment of the disclosure, the text to be extracted and the prompt information can be spliced to obtain the input text.
Still referring to the above example, after the text to be extracted and the prompt information are spliced, the obtained input text may be the prompt information followed by the text to be extracted ("knowledge points in the text" followed by the example text above), or the text to be extracted followed by the prompt information (the example text above followed by "knowledge points in the text").
It should be noted that, the above example of splicing the text to be extracted and the prompt information is merely exemplary, and in practical application, a specific splicing manner may be selected according to needs, so that the text to be extracted and the prompt information are spliced, which is not limited in this disclosure.
And 103, extracting features of the input text to obtain semantic features of the input text.
In the embodiment of the disclosure, feature extraction may be performed on the input text to obtain its semantic features; for example, a BERT model, an Ernie model, or the like may be used for the feature extraction.
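The input/output shape of this step can be illustrated with a dependency-free stand-in for the encoder (the `encode` helper and its random embedding table are purely hypothetical; a real system would call a pre-trained BERT/Ernie model here, which additionally contextualizes each character against the whole spliced input):

```python
import random

def encode(text, dim=8, seed=0):
    """Toy stand-in for a pre-trained encoder: maps each character of the
    spliced input text to a fixed-size feature vector. Only the shapes are
    meaningful -- one dim-sized vector per character."""
    rng = random.Random(seed)
    table = {}  # hypothetical per-character embedding table
    features = []
    for ch in text:
        if ch not in table:
            table[ch] = [rng.uniform(-1, 1) for _ in range(dim)]
        features.append(table[ch])
    return features  # shape: [len(text), dim]
```

Because the prompt and the text to be extracted are encoded as one sequence, a real contextual encoder lets them interact semantically, which is the motivation for the splicing in step 102.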
Step 104, performing knowledge extraction on the semantic features by using the knowledge extraction models respectively corresponding to the plurality of knowledge types, to obtain knowledge points under at least one of the plurality of knowledge types.
In the disclosed embodiments, knowledge points may include entities, phrases, important vocabularies, text fragments, and the like.
It should be noted that the number of knowledge points may be one or may be plural, which is not limited in this disclosure.
In the embodiment of the disclosure, the knowledge type may be used to indicate a type to which the knowledge point belongs, for example, the knowledge point is "distributed deployment", the type to which the knowledge point belongs (i.e., knowledge type) may be "backend", and for another example, the knowledge point is "knowledge recommendation", and the type to which the knowledge point belongs (i.e., knowledge type) may be "product" or "algorithm".
In the disclosed embodiments, the plurality of knowledge types may be preset, i.e., the plurality of knowledge types are known.
In the disclosed embodiments, for any of a plurality of knowledge types, the any knowledge type may have a corresponding knowledge extraction model.
In the embodiment of the disclosure, knowledge extraction models corresponding to a plurality of knowledge types may be used to extract knowledge from semantic features, so as to obtain knowledge points under at least one knowledge type in the plurality of knowledge types.
That is, after knowledge extraction is performed on the semantic features by using knowledge extraction models corresponding to the plurality of knowledge types, knowledge points under part of the knowledge types in the plurality of knowledge types may be obtained, or knowledge points under all of the knowledge types in the plurality of knowledge types may be obtained, which is not limited in the present disclosure.
According to the knowledge extraction method of the embodiment of the disclosure, the text to be extracted and the prompt information are obtained, wherein the prompt information is used for indicating the information to be extracted from the text to be extracted; the text to be extracted and the prompt information are spliced to obtain an input text; feature extraction is performed on the input text to obtain semantic features of the input text; and knowledge extraction is performed on the semantic features by using knowledge extraction models respectively corresponding to a plurality of knowledge types, to obtain knowledge points under at least one of the plurality of knowledge types. In this way, the knowledge extraction models corresponding to the known knowledge types can extract the knowledge points matching the prompt information from the text to be extracted, while simultaneously determining the knowledge type to which each knowledge point belongs. Moreover, knowledge points under multiple knowledge types can be extracted jointly, so as to meet the extraction requirements of different users.
It should be noted that, in the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and other processing of the user's personal information are all performed on the premise of obtaining the user's consent, comply with the relevant laws and regulations, and do not violate public order and good customs.
In order to clearly explain how, in the above embodiments, knowledge extraction is performed on the semantic features by using the knowledge extraction models corresponding to a plurality of knowledge types, to obtain knowledge points under at least one of the plurality of knowledge types, the disclosure further provides a knowledge extraction method.
Fig. 2 is a flowchart of a knowledge extraction method according to a second embodiment of the disclosure.
As shown in fig. 2, the knowledge extraction method may include the steps of:
step 201, obtaining a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted.
Step 202, splicing the text to be extracted and the prompt information to obtain an input text.
The explanation of steps 201 to 202 may be referred to the relevant descriptions in any embodiment of the disclosure, and will not be repeated here.
In one possible implementation of the embodiments of the present disclosure, a separator may be added before the first character of the text to be extracted, and a hint may be stitched before the separator, where the last character of the hint may be adjacent to the separator.
In embodiments of the present disclosure, the separator may be used to isolate the text to be extracted from the prompt information, and may be, for example, [SEP].
As an example, assume the text to be extracted is "Complete model distributed deployment, online knowledge recommendation project." and the prompt information Prompt is "text attention points". A separator [SEP] is added before the first character of the text to be extracted, and the prompt information is spliced before the separator; the obtained input text may be "[CLS] text attention points [SEP] Complete model distributed deployment, online knowledge recommendation project. [SEP]", where [CLS] is the start marker.
In this way, the text to be extracted and the prompt information are connected through the separator, i.e., the text to be extracted and the prompt information are spliced, which can facilitate sufficient semantic interaction between the text to be extracted and the prompt information in subsequent processing.
In another possible implementation of the embodiments of the present disclosure, a separator may be added after the last character of the text to be extracted, and a hint may be stitched after the separator, where the first character of the hint may be adjacent to the separator.
Still using the above example, the input text may be, for example, "[CLS] Complete model distributed deployment, online knowledge recommendation project. [SEP] text attention points [SEP]", where [CLS] is the start marker.
Therefore, the text to be extracted and the prompt information can be spliced in different modes, and the flexibility and applicability of the method can be improved.
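The two splicing orders above can be sketched as follows (a minimal illustration; the `splice` helper name is hypothetical, and the literal [CLS]/[SEP] strings follow the BERT-style convention used in the example):

```python
def splice(text, prompt, prompt_first=True):
    """Concatenate the text to be extracted with the prompt information,
    joined by a [SEP] separator, led by a [CLS] start marker, and closed
    by a trailing [SEP], as in the examples above."""
    if prompt_first:
        body = f"{prompt}[SEP]{text}"  # prompt before the separator
    else:
        body = f"{text}[SEP]{prompt}"  # prompt after the separator
    return f"[CLS]{body}[SEP]"
```

In a real system the tokenizer of the pre-trained encoder would normally insert these special tokens itself when given the text/prompt pair.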
Step 203, extracting features of the input text to obtain semantic features of the input text.
The explanation of step 203 may be referred to the relevant descriptions in any embodiment of the disclosure, and will not be repeated here.
Step 204, for any knowledge type in the plurality of knowledge types, classifying the semantic features by using a first classification network corresponding to the knowledge type to obtain a first classification label.
It should be noted that the explanation of the knowledge points and the knowledge types in step 104 is also applicable to this embodiment, and will not be repeated here.
In an embodiment of the present disclosure, the first classification label may be used to indicate a starting position of the knowledge point in the text to be extracted, i.e. to indicate a position of a first character of the knowledge point in the text to be extracted.
In an embodiment of the present disclosure, for any knowledge type of a plurality of knowledge types, a knowledge extraction model corresponding to the any knowledge type may include a first classification network, and the semantic features may be classified by using the first classification network corresponding to the any knowledge type, so as to obtain a first classification label.
It should be noted that, when the semantic features are classified by using the first classification network corresponding to any knowledge type, the number of the obtained first classification labels may be, but is not limited to, one, which is not limited in the disclosure.
As an example, for any knowledge type, the semantic features may be classified by the first classification network corresponding to that knowledge type to obtain a classification label for each character in the text to be extracted, and the first classification label may be determined from these classification labels. For example, the first classification network may be a binary classification network: a label of 0 output by the network indicates that the corresponding character is not the first character of a knowledge point, and a label of 1 indicates that it is, so the first classification label may be a label of 1.
Step 205, classifying the semantic features by using a second classification network corresponding to the knowledge type to obtain a second classification label.
In an embodiment of the present disclosure, the second classification label may be used to indicate an end position of the knowledge point in the text to be extracted, i.e. indicate a position of a tail character of the knowledge point in the text to be extracted.
In an embodiment of the present disclosure, for any knowledge type of a plurality of knowledge types, the knowledge extraction model corresponding to the any knowledge type may include a second classification network, and the semantic features may be classified by using the second classification network corresponding to the any knowledge type, so as to obtain a second classification label.
It should be noted that, when the semantic features are classified by using the second classification network corresponding to any knowledge type, the number of the obtained second classification labels may be, but is not limited to, one, which is not limited in the disclosure.
As an example, the semantic features may be classified by the second classification network corresponding to that knowledge type to obtain a classification label for each character in the text to be extracted, and the second classification label may be determined from these classification labels. For example, the second classification network may be a binary classification network: a label of 0 output by the network indicates that the corresponding character is not the tail character of a knowledge point, and a label of 1 indicates that it is, so the second classification label may be a label of 1.
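Steps 204 and 205 can be sketched as a pair of per-character binary classifiers over the semantic features (a dependency-free sketch with untrained random weights, so its outputs are meaningless; only the shapes and the 0/1 start/end labels reflect the description above):

```python
import math
import random

class SpanHeads:
    """Per-knowledge-type start/end heads: each maps a character's feature
    vector to a 0/1 label (1 = this character starts / ends a knowledge
    point of this type). One SpanHeads instance per knowledge type."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        # Untrained placeholder weights for the two binary classifiers.
        self.w_start = [rng.uniform(-1, 1) for _ in range(dim)]
        self.w_end = [rng.uniform(-1, 1) for _ in range(dim)]

    @staticmethod
    def _score(w, x):
        # Logistic score of one character's feature vector.
        return 1 / (1 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))

    def predict(self, features, threshold=0.5):
        starts = [int(self._score(self.w_start, x) > threshold) for x in features]
        ends = [int(self._score(self.w_end, x) > threshold) for x in features]
        return starts, ends  # one 0/1 label per character, per head
```

Running every knowledge type's heads over the same shared semantic features is what allows knowledge points under multiple types (including nested ones) to be extracted jointly.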
Step 206, determining the knowledge points under the knowledge type according to the first classification label and the second classification label.
In the embodiment of the disclosure, knowledge points under any knowledge type can be determined according to the first classification label and the second classification label.
As a possible implementation manner, for any knowledge type, the text to be extracted may be intercepted from the starting position of the knowledge point indicated by the first classification label corresponding to the knowledge type to the ending position of the knowledge point indicated by the second classification label corresponding to the knowledge type, so as to obtain the knowledge point under the knowledge type.
As an example, when the number of the obtained first classification tag and the number of the obtained second classification tag are both one, the text to be extracted may be directly intercepted from the start position of the knowledge point indicated by the first classification tag to the end position of the knowledge point indicated by the second classification tag, so as to obtain the knowledge point under the knowledge type.
As still another example, when a plurality of first classification tags and second classification tags are obtained, for example, when the first classification tags include first classification tag 1 and first classification tag 2 and the second classification tags include second classification tag 1 and second classification tag 2, the text to be extracted may be intercepted from the start position of the knowledge point indicated by first classification tag 1 to the end position indicated by the second classification tag closest to it, here second classification tag 2 (that is, the distance between the character corresponding to second classification tag 2 and the character corresponding to first classification tag 1 is smaller than the distance between the character corresponding to second classification tag 1 and the character corresponding to first classification tag 1), so as to obtain knowledge point 1 under the knowledge type; similarly, the text may be intercepted from the start position indicated by first classification tag 2 to the end position indicated by the second classification tag closest to it, here second classification tag 1, so as to obtain knowledge point 2 under the knowledge type.
It should be noted that, the direction of intercepting the knowledge points under the knowledge type in the text to be extracted may be from left to right, or may be from right to left, which is not limited in the disclosure.
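The interception described above can be sketched as a simple pairing rule: each start position is matched with the nearest end position at or to its right, and the text between them is sliced out inclusively. This left-to-right pairing is one of the interception directions the text permits; the function name and indices are illustrative.

```python
# Illustrative sketch of the interception step: pair each start position
# (first classification label) with the nearest end position (second
# classification label) at or after it, then slice the text.
def extract_spans(text, starts, ends):
    """starts/ends: character indices flagged by the two classifiers."""
    spans = []
    for s in starts:
        candidates = [e for e in ends if e >= s]
        if candidates:
            e = min(candidates)          # nearest end position to the right
            spans.append(text[s:e + 1])  # intercept inclusively
    return spans

text = "complete model distributed deployment"
print(extract_spans(text, [15], [36]))  # -> ['distributed deployment']
```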
Therefore, knowledge points under knowledge types can be effectively and accurately determined through the first classification labels and the second classification labels.
It will be appreciated that the first classification labels and the second classification labels occur in pairs, i.e., the knowledge points under each knowledge type correspond to at least one first classification label and at least one second classification label.
In any one of the embodiments of the present disclosure, for any one of the at least one knowledge type, after determining knowledge points under the knowledge type in the text to be extracted, candidate resources matching the knowledge type may be obtained from a resource pool; and the target resource matched with the knowledge point under the knowledge type can be determined from the candidate resources, and the target resource can be displayed.
In embodiments of the present disclosure, information resources under multiple knowledge types may be stored in a resource pool.
In the embodiment of the present disclosure, the candidate resources may be information resources in the resource pool, which are matched with any knowledge type, and the number of the candidate resources may be one or may be multiple, which is not limited in the present disclosure.
In the embodiment of the present disclosure, the target resource may be an information resource that matches with a knowledge point under any knowledge type in the candidate resource, and the number of target resources may be one or may be multiple, which is not limited in the present disclosure.
For example, in an enterprise office scenario, after a knowledge point under a certain knowledge type is mined from a text generated by employee A (the text to be extracted in the present disclosure), candidate resources matching the knowledge type may be obtained from a resource pool, a target resource matching the knowledge point under the knowledge type may be determined from the candidate resources, and the target resource may be displayed to employee A. Knowledge can thus be recommended to employee A accurately, improving the employee's office efficiency.
In summary, after determining knowledge points under each knowledge type in the text to be extracted, target resources recommended to the user can be accurately obtained from the resource pool based on each knowledge type and the knowledge points under each knowledge type, and the use experience of the user can be improved.
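The two-stage lookup described above (knowledge type selects the candidate resources, knowledge point selects the target resources) can be sketched as follows. The resource names, the dictionary layout, and the substring match standing in for a real relevance model are all illustrative assumptions.

```python
# Hypothetical resource pool keyed by knowledge type; entries are illustrative.
resource_pool = {
    "algorithm": ["Intro to ranking models", "Knowledge recommendation design doc"],
    "back end":  ["Distributed deployment guide"],
}

def recommend(knowledge_type, knowledge_point):
    # Candidate resources: all resources under the matched knowledge type.
    candidates = resource_pool.get(knowledge_type, [])
    # Target resources: candidates matching the extracted knowledge point
    # (a simple substring match stands in for a real relevance model).
    return [r for r in candidates if knowledge_point.lower() in r.lower()]

print(recommend("algorithm", "knowledge recommendation"))
# -> ['Knowledge recommendation design doc']
```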
As an example, to handle extraction tasks for both "nested" and "non-nested" information at the same time, the knowledge extraction model of the present disclosure may use 2×num binary classification networks (where num is the number of knowledge types) at the output end. That is, the knowledge extraction model corresponding to any knowledge type includes a first classification network for predicting the start position of a knowledge point under that knowledge type in the text to be extracted, and a second classification network for predicting the end position of that knowledge point. The knowledge extraction network framework corresponding to the knowledge extraction model may be as shown in fig. 3. In this way, the knowledge extraction model can not only extract the knowledge points matched with the prompt information from the text to be extracted, but also determine, at the same time, the knowledge type to which each knowledge point belongs.
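The 2×num output structure can be sketched structurally as follows. The "networks" here are stand-in functions over per-character scores; a real implementation would use learned heads (e.g., linear layers with sigmoid) over the shared semantic features, and the knowledge type names and toy features are assumptions.

```python
# Structural sketch of the output end: 2 x num binary classification
# networks, one (start, end) pair per knowledge type.
KNOWLEDGE_TYPES = ["back end", "front end", "product", "algorithm"]  # num = 4

def make_head():
    # Placeholder binary classifier: per-character score >= 0 -> label 1.
    return lambda char_features: [1 if f >= 0 else 0 for f in char_features]

heads = {t: {"start": make_head(), "end": make_head()} for t in KNOWLEDGE_TYPES}
assert len(heads) * 2 == 2 * len(KNOWLEDGE_TYPES)  # 2 x num networks in total

features = [-1.0, 0.5, -0.2, 0.7]  # one (toy) feature value per character
start_labels = heads["product"]["start"](features)
print(start_labels)  # -> [0, 1, 0, 1]
```

All heads share the same input features, so one forward pass over the spliced input yields the start/end predictions for every knowledge type at once.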
According to the knowledge extraction method, semantic features are classified by adopting a first classification network corresponding to a knowledge type aiming at any knowledge type in a plurality of knowledge types to obtain a first classification label, wherein the first classification label is used for indicating the initial position of a knowledge point in a text to be extracted; classifying semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label, wherein the second classification label is used for indicating the end position of the knowledge point in the text to be extracted; and determining knowledge points under the knowledge type according to the first classification label and the second classification label. Therefore, the method and the system can realize the prediction of the starting position and the ending position of the knowledge points in the text to be extracted through the first classification network and the second classification network corresponding to any knowledge type, and effectively and accurately determine the knowledge points matched with the prompt information.
In order to clearly explain how to use knowledge extraction models corresponding to a plurality of knowledge types to extract knowledge from semantic features in the above embodiment, the disclosure further provides a knowledge extraction method.
Fig. 4 is a flowchart of a knowledge extraction method according to a third embodiment of the disclosure.
As shown in fig. 4, the knowledge extraction method may include the steps of:
step 401, obtaining a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted.
And step 402, splicing the text to be extracted and the prompt information to obtain an input text.
Step 403, extracting features of the input text to obtain semantic features of the input text.
The execution of steps 401 to 403 may refer to the execution of any embodiment of the present disclosure, and will not be described herein.
And step 404, for any knowledge type in the plurality of knowledge types, performing sequence labeling on the semantic features by adopting a sequence labeling model corresponding to the knowledge type to obtain a first labeling label and a second labeling label.
It should be noted that the explanation of the knowledge points and the knowledge types in step 104 is also applicable to this embodiment, and will not be repeated here.
In the embodiment of the disclosure, the first labeling tag may be used to indicate a starting position of the knowledge point in the text to be extracted, for example, the first labeling tag may be represented by B (Begin).
In an embodiment of the disclosure, the second labeling label may be used to indicate the end position of the knowledge point in the text to be extracted, e.g., the second labeling label may be denoted by E (End).
In the embodiment of the disclosure, for any knowledge type of a plurality of knowledge types, the knowledge extraction model corresponding to the any knowledge type may include a sequence labeling model, and the sequence labeling model corresponding to the any knowledge type may be used to label the semantic features in sequence, so as to obtain a first labeling label and a second labeling label.
It should be noted that, when the sequence labeling model corresponding to any knowledge type is used for sequence labeling of the semantic features, the obtained first labeling label and second labeling label are in pairs, and the number of the first labeling label and the second labeling label can be one, but is not limited to one.
As an example, for any knowledge type, the sequence labeling model corresponding to that knowledge type may be used to perform sequence labeling on the semantic features, so as to obtain a labeling label for each character in the text to be extracted, from which the first labeling label and the second labeling label may be determined. For example, tag B may be the first labeling label and tag E may be the second labeling label.
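Decoding the resulting tag sequence can be sketched as pairing each B with the nearest following E; any other tag is treated as a non-boundary character. This pairing rule is one plausible reading of the scheme, not the only possible decoder.

```python
# Sketch of decoding sequence labels: pair each B (first labeling label)
# with the nearest following E (second labeling label).
def decode_be(tags):
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            start = i
        elif tag == "E" and start is not None:
            spans.append((start, i))  # inclusive character index span
            start = None
    return spans

print(decode_be(["O", "B", "O", "E", "O"]))  # -> [(1, 3)]
```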
Step 405, determining knowledge points under the knowledge type according to the first labeling label and the second labeling label.
In the embodiment of the disclosure, knowledge points under any knowledge type can be determined according to the first labeling label and the second labeling label.
As a possible implementation manner, for any knowledge type, the text to be extracted may be intercepted from the starting position of the knowledge point indicated by the first labeling label corresponding to the knowledge type to the ending position of the knowledge point indicated by the second labeling label corresponding to the knowledge type, so that the knowledge point under the knowledge type may be obtained.
As an example, when the number of the obtained first label tag and the number of the obtained second label tag are both one, the text to be extracted may be directly intercepted from the start position of the knowledge point indicated by the first label tag to the end position of the knowledge point indicated by the second label tag, so as to obtain the knowledge point under the knowledge type.
As another example, when a plurality of first labeling tags and second labeling tags are obtained, for example, when the first labeling tags include first labeling tag 1 and first labeling tag 2 and the second labeling tags include second labeling tag 1 and second labeling tag 2, the text to be extracted may be intercepted from the start position of the knowledge point indicated by first labeling tag 1 to the end position indicated by second labeling tag 1, the one closest to first labeling tag 1, so as to obtain knowledge point 1 under the knowledge type; similarly, the text may be intercepted from the start position indicated by first labeling tag 2 to the end position indicated by second labeling tag 2, the one closest to first labeling tag 2, so as to obtain knowledge point 2 under the knowledge type.
It should be noted that, the direction of intercepting the knowledge points under the knowledge type in the text to be extracted may be from left to right, or may be from right to left, which is not limited in the disclosure.
Therefore, knowledge points under knowledge types can be effectively and accurately determined through the first labeling label and the second labeling label.
According to the knowledge extraction method, for any knowledge type in a plurality of knowledge types, a sequence labeling model corresponding to the knowledge type is adopted to label the semantic features in sequence, so that a first labeling label and a second labeling label are obtained; and determining the knowledge point under the knowledge type according to the first labeling label and the second labeling label, wherein the first labeling label is used for indicating the starting position of the knowledge point in the text to be extracted, and the second labeling label is used for indicating the ending position of the knowledge point in the text to be extracted. Therefore, the method can be used for predicting the starting position and the ending position of the knowledge points in the text to be extracted through the sequence labeling model corresponding to any knowledge type, and effectively and accurately determining the knowledge points under the knowledge type matched with the prompt information.
The above embodiments correspond to the application method of the knowledge extraction model (i.e., the knowledge extraction method), and the disclosure further provides a training method of the knowledge extraction model.
Fig. 5 is a flowchart of a training method of a knowledge extraction model according to a fourth embodiment of the disclosure.
As shown in fig. 5, the training method of the knowledge extraction model may include the following steps:
step 501, obtaining a training text and prompt information, wherein the prompt information is used for indicating information to be extracted in the training text, and the labeling information of the training text is used for indicating a corresponding relation between at least one labeling knowledge type and labeling knowledge points.
In embodiments of the present disclosure, the training text may include knowledge base documents, meeting content, daily reports, weekly reports, OKRs (objectives and key results), and the like, which is not limited in this disclosure.
In the embodiment of the present disclosure, the manner of obtaining the training text is not limited. For example, the training text may be obtained from an existing training set, collected online (e.g., through a web crawler technology), or provided by a user, which is not limited in this disclosure.
In the embodiment of the disclosure, the prompt information may be used to indicate information to be extracted in the training text, and the prompt information may be set according to needs.
As an example, the prompt information may be set according to a specific domain direction (such as a specific application domain direction), for example, when the application domain direction is an algorithm direction, the prompt information may be "skill attention point of algorithm direction", for example, when the application domain direction is a recommended service direction, the prompt information may be "attention point of recommended service direction", and so on, which are not listed herein.
As another example, a broad hint may be set or structured to apply to different knowledge type scenarios in order to apply to all application domain directions or application scenarios. For example, the hint information may be "knowledge points in text", "entities in text", etc.
In embodiments of the present disclosure, labeling knowledge points may be used to indicate knowledge points that training text includes, where knowledge points may include entities, phrases, important vocabularies, text segments, and the like, which is not limiting in this disclosure.
In the embodiment of the present disclosure, the labeling knowledge points may be one or may be multiple, which is not limited in this disclosure.
In the embodiment of the present disclosure, the labeling knowledge type may be used to indicate a knowledge type to which a knowledge point included in the training text belongs, where the knowledge type may be, for example, "back end", "front end", "product", "algorithm", and the like, which is not limited in this disclosure.
In the embodiment of the present disclosure, the labeling knowledge type may be one or may be plural, which is not limited in this disclosure.
In the embodiment of the disclosure, the labeling information of the training text may be used to indicate a correspondence between at least one labeling knowledge type and a labeling knowledge point.
It should be noted that, the number of the correspondence between the labeling knowledge type indicated by the labeling information of the training text and the labeling knowledge point may be one or may be multiple, which is not limited in this disclosure.
As one example, the training text is "Complete model distributed deployment, and bring the knowledge recommendation project online." The annotation information of the training text indicates the correspondence between the annotation knowledge type "back end" and "distributed deployment", the correspondence between the annotation knowledge type "algorithm" and "knowledge recommendation", and the correspondence between the annotation knowledge type "product" and "knowledge recommendation".
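One possible way to lay out this annotation information is sketched below. The JSON-style field names are illustrative assumptions; the disclosure does not mandate a storage format, only that each entry record one correspondence between an annotation knowledge type and an annotation knowledge point.

```python
# Assumed (illustrative) layout of one training sample's labeling information.
sample = {
    "text": "Complete model distributed deployment, and bring the "
            "knowledge recommendation project online.",
    "prompt": "knowledge points in text",
    "labels": [
        {"knowledge_type": "back end",  "knowledge_point": "distributed deployment"},
        {"knowledge_type": "algorithm", "knowledge_point": "knowledge recommendation"},
        {"knowledge_type": "product",   "knowledge_point": "knowledge recommendation"},
    ],
}
# One knowledge point may appear under several knowledge types, as
# "knowledge recommendation" does here under both "algorithm" and "product".
print(len(sample["labels"]))  # -> 3
```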
Step 502, the training text and the prompt message are spliced to obtain an input text.
It should be noted that, the method for splicing the training text and the prompt information is similar to the method for splicing the text to be extracted and the prompt information in step 202, and will not be described in detail herein.
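The splicing step can be sketched minimally as follows, assuming BERT-style special tokens ([CLS]/[SEP]) separate the prompt from the training text; the exact separator convention is an assumption on my part, since the source only says the two are spliced.

```python
# Minimal sketch of splicing the prompt information and the training text
# into one input text, using an assumed BERT-style text-pair convention.
def splice(prompt, text):
    return "[CLS] " + prompt + " [SEP] " + text + " [SEP]"

input_text = splice("knowledge points in text",
                    "Complete model distributed deployment.")
print(input_text)
# -> [CLS] knowledge points in text [SEP] Complete model distributed deployment. [SEP]
```

Placing the prompt in front of the text lets the feature extractor (e.g., a BERT or Ernie model) condition the semantic features of every character on the information to be extracted.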
In step 503, feature extraction is performed on the input text to obtain semantic features of the input text.
In the embodiment of the disclosure, feature extraction may be performed on an input text to obtain semantic features of the input text, for example, a BERT model, an Ernie model, and the like may be used to perform feature extraction on the input text to obtain semantic features of the input text.
And step 504, knowledge extraction is carried out on the semantic features by adopting knowledge extraction models corresponding to the knowledge types so as to obtain predicted knowledge points under at least one predicted knowledge type in the knowledge types.
In the disclosed embodiments, the predicted knowledge type may be one of a plurality of knowledge types, or a plurality of knowledge types, which the present disclosure does not limit.
In the embodiment of the disclosure, knowledge extraction models corresponding to a plurality of knowledge types can be adopted to extract knowledge from semantic features, so that a predicted knowledge point under at least one predicted knowledge type in the plurality of knowledge types can be obtained.
And 505, performing joint training on knowledge extraction models of a plurality of knowledge types according to the labeling knowledge points under each labeling knowledge type and the prediction knowledge points under each prediction knowledge type.
In the embodiment of the disclosure, knowledge extraction models of multiple knowledge types can be jointly trained according to the labeling knowledge points under each labeling knowledge type and the prediction knowledge points under each prediction knowledge type.
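One joint-training objective consistent with this description is sketched below: a binary cross-entropy term over the start and end predictions of every knowledge type, summed so that all 2×num heads are trained together. The choice of binary cross-entropy and the unweighted sum are assumptions; the disclosure does not specify the loss function.

```python
# Hedged sketch of a joint loss over per-type start/end binary classifiers.
import math

def bce(pred, gold, eps=1e-7):
    """Mean binary cross-entropy over one label sequence."""
    return -sum(g * math.log(p + eps) + (1 - g) * math.log(1 - p + eps)
                for p, g in zip(pred, gold)) / len(pred)

def joint_loss(per_type_preds, per_type_golds):
    """per_type_*: {knowledge_type: {"start": [...], "end": [...]}}"""
    total = 0.0
    for ktype, gold in per_type_golds.items():
        pred = per_type_preds[ktype]
        total += bce(pred["start"], gold["start"]) + bce(pred["end"], gold["end"])
    return total

preds = {"product": {"start": [0.9, 0.1], "end": [0.1, 0.9]}}
golds = {"product": {"start": [1, 0],     "end": [0, 1]}}
print(joint_loss(preds, golds))  # small loss: predictions match the labels
```

Because the loss sums over all knowledge types, a single backward pass updates every head, which is what "joint training" of the per-type knowledge extraction models amounts to.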
According to the training method of the knowledge extraction model, the training text and the prompt information are obtained, wherein the prompt information is used for indicating information to be extracted in the training text, and the labeling information of the training text is used for indicating the corresponding relation between at least one labeling knowledge type and labeling knowledge points; splicing the training text and the prompt information to obtain an input text; extracting features of the input text to obtain semantic features of the input text; knowledge extraction is respectively carried out on semantic features by adopting knowledge extraction models corresponding to a plurality of knowledge types so as to obtain predicted knowledge points under at least one predicted knowledge type in the plurality of knowledge types; and carrying out joint training on knowledge extraction models of a plurality of knowledge types according to the labeling knowledge points under each labeling knowledge type and the prediction knowledge points under each prediction knowledge type. Therefore, based on the deep learning technology, the knowledge extraction model is trained, and accuracy and reliability of a model prediction result can be improved.
In order to clearly explain how knowledge extraction models corresponding to a plurality of knowledge types are adopted to extract the semantic features respectively in the above embodiments of the disclosure, so as to obtain predicted knowledge points under at least one predicted knowledge type in the plurality of knowledge types, the disclosure further provides a training method of the knowledge extraction models.
Fig. 6 is a flowchart of a training method of a knowledge extraction model according to a fifth embodiment of the disclosure.
As shown in fig. 6, the training method of the knowledge extraction model may include the following steps:
in step 601, a training text and a prompt message are obtained, wherein the prompt message is used for indicating information to be extracted in the training text, and the labeling message of the training text is used for indicating a corresponding relation between at least one labeling knowledge type and labeling knowledge points.
And step 602, splicing the training text and the prompt information to obtain an input text.
In step 603, feature extraction is performed on the input text to obtain semantic features of the input text.
The execution of steps 601 to 603 may refer to the execution of any embodiment of the present disclosure, and will not be described herein.
Step 604, for any knowledge type in the plurality of knowledge types, classifying the semantic features by using a first classification network corresponding to the knowledge type to obtain a first classification label of each character in the training text.
In the embodiment of the disclosure, the first classification label of each character may be used to indicate whether the corresponding character is the first character of the knowledge point under the corresponding knowledge type, i.e. the first classification label of each character may be used to indicate whether the corresponding character is the starting position of the knowledge point under the corresponding knowledge type in the training text.
In an embodiment of the present disclosure, for any knowledge type of a plurality of knowledge types, a knowledge extraction model corresponding to the any knowledge type may include a first classification network, and the first classification network corresponding to the any knowledge type may be used to classify semantic features to obtain a first classification label of each character in the training text.
Step 605, classifying the semantic features by using a second classification network corresponding to the knowledge type to obtain a second classification label of each character.
In the embodiment of the disclosure, the second classification label of each character may be used to indicate whether the corresponding character is a tail character of the knowledge point under the corresponding knowledge type, i.e. the second classification label of each character may be used to indicate whether the corresponding character is an end position of the knowledge point under the corresponding knowledge type in the training text.
In the embodiment of the present disclosure, for any knowledge type of a plurality of knowledge types, the knowledge extraction model corresponding to the any knowledge type may further include a second classification network, and the semantic features may be classified by using the second classification network corresponding to the any knowledge type, so as to obtain a second classification label of each character in the training text.
It is understood that the first classification network and the second classification network of the knowledge extraction model are independent of each other.
In step 606, in the case that each first classification tag includes a first target tag and each second classification tag includes a second target tag, the knowledge type is used as a predicted knowledge type, and a predicted knowledge point under the predicted knowledge type is determined according to the first target tag and the second target tag.
The first target tag may be used to indicate that the corresponding character is the first character of the predicted knowledge point.
It should be noted that, when the semantic features are classified by using the first classification network corresponding to any knowledge type, the number of the first target labels included in the first classification labels of each character in the obtained training text may be, but is not limited to, one, which is not limited in the disclosure.
For example, assume that the training text is "Complete model distributed deployment, and bring the knowledge recommendation project online." The semantic features may be classified by using the first classification network corresponding to one knowledge type (such as "product"), and the obtained first classification label 1 indicates that the first character of "knowledge recommendation" is predicted as the first character of a knowledge point under that knowledge type; the semantic features may also be classified by using the first classification network corresponding to another knowledge type (such as "back end"), and the obtained first classification label 2 indicates that the first character of "distributed deployment" is predicted as the first character of a knowledge point under that knowledge type.
The second target tag may be used to indicate that the corresponding character is the tail character of the predicted knowledge point.
It should be noted that, when the semantic features are classified by using the second classification network corresponding to any knowledge type, the number of the second target labels included in the second classification labels of the characters in the obtained training text may be, but is not limited to, one, which is not limited in the disclosure.
For example, continuing the above example with the training text "Complete model distributed deployment, and bring the knowledge recommendation project online.", the semantic features may be classified by using the second classification network corresponding to one knowledge type (such as "product"), and the obtained second classification label 1 indicates that the tail character of "knowledge recommendation" is predicted as the tail character of a knowledge point under that knowledge type; the semantic features may also be classified by using the second classification network corresponding to another knowledge type (such as "back end"), and the obtained second classification label 2 indicates that the tail character of "distributed deployment" is predicted as the tail character of a knowledge point under that knowledge type.
In the embodiment of the disclosure, when each first classification tag includes a first target tag and each second classification tag includes a second target tag, the corresponding knowledge type may be used as a predicted knowledge type, and a predicted knowledge point under the predicted knowledge type may be determined according to the first target tag and the second target tag.
As a possible implementation manner, the training text may be intercepted from the first character of the predicted knowledge point indicated by the first target label to the tail character of the predicted knowledge point indicated by the second target label, so as to obtain the predicted knowledge point under the predicted knowledge type.
As an example, when the number of the first target tags and the number of the second target tags are both one, the training text may be directly intercepted from the first character of the predicted knowledge point indicated by the first target tag to the tail character of the predicted knowledge point indicated by the second target tag, so as to obtain the predicted knowledge point under the predicted knowledge type.
As yet another example, when a plurality of first target tags and second target tags are obtained, for example, when the first target tags include first target tag 1 and first target tag 2 and the second target tags include second target tag 1 and second target tag 2, the training text may be intercepted from the first character of the predicted knowledge point indicated by first target tag 1 to the tail character indicated by second target tag 1, the one closest to first target tag 1, so as to obtain predicted knowledge point 1 under the predicted knowledge type; similarly, the training text may be intercepted from the first character indicated by first target tag 2 to the tail character indicated by second target tag 2, the one closest to first target tag 2, so as to obtain predicted knowledge point 2 under the predicted knowledge type.
It should be noted that, the direction of intercepting the predicted knowledge points under the predicted knowledge type in the training text may be from left to right, or may be from right to left, which is not limited in the disclosure.
Step 607, performing joint training on knowledge extraction models of multiple knowledge types according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type.
The execution of step 607 may be referred to as the execution of any embodiment of the present disclosure, and will not be described herein.
In one possible implementation manner of the embodiment of the present disclosure, for any knowledge type in the plurality of knowledge types, the knowledge extraction model corresponding to that knowledge type may further include a sequence labeling model, and the semantic features may be input into the sequence labeling model corresponding to that knowledge type to obtain a prediction label for each character in the training text output by that sequence labeling model. In the case where the prediction labels of the characters include a first prediction label and a second prediction label, the knowledge type may be used as a predicted knowledge type, and the predicted knowledge point under the predicted knowledge type may be determined according to the first prediction label and the second prediction label.
The first predictive label may be used to indicate that the corresponding character is the first character of the predictive knowledge point, and the second predictive label may be used to indicate that the corresponding character is the last character of the predictive knowledge point.
In embodiments of the present disclosure, the predictive labels of the characters may be used to indicate whether the corresponding character is a first character, a last character, other characters, etc. of the knowledge point under the corresponding knowledge type.
For example, the training text is "complete model distributed deployment, online knowledge recommendation project.", and the prediction labels of its characters are, in order: O, O, O, O, B, O, O, O, E, O, O, O, B, O, O, E, O, O, O. Here, B indicates that the corresponding character is the first character of a knowledge point under the corresponding knowledge type (B is the first prediction label); E indicates that the corresponding character is the tail character of a knowledge point under the corresponding knowledge type (E is the second prediction label); and O indicates that the corresponding character is some other character relative to the knowledge point under the corresponding knowledge type.
It should be noted that, the representation of the prediction labels of the characters in the above examples is merely exemplary, and in practical applications, the prediction labels of the characters may be set according to needs, which is not limited in this disclosure.
As a possible implementation manner, for the training text, the first character of the predicted knowledge point indicated by the first prediction tag may be cut, and the last character of the predicted knowledge point indicated by the second prediction tag may be cut, so that the predicted knowledge point under the predicted knowledge type may be obtained.
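As a minimal sketch of this cutting step (the helper name and the B/E tag strings are assumptions for illustration), the predicted knowledge point can be sliced out of the training text between the character carrying the first prediction tag and the character carrying the second prediction tag:

```python
def extract_knowledge_point(text, labels, first_tag="B", second_tag="E"):
    """Slice the predicted knowledge point out of the text, from the
    character carrying the first prediction tag (head character) to the
    character carrying the second prediction tag (tail character).
    Returns None when either tag is absent, in which case the knowledge
    type is not treated as a predicted knowledge type."""
    try:
        start = labels.index(first_tag)
        end = labels.index(second_tag, start)
    except ValueError:
        return None
    return text[start:end + 1]

# Toy example: characters at positions 2..5 form the knowledge point.
labels = ["O", "O", "B", "O", "O", "E", "O"]
print(extract_knowledge_point("abcdefg", labels))  # -> cdef
```

When either the head or tail tag is missing, the function returns None, matching the condition that both prediction labels must be present before the knowledge type counts as predicted.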
Therefore, a sequence labeling model corresponding to each knowledge type can be adopted, and effective prediction of knowledge points under each knowledge type in the training text can be realized.
According to the training method of the knowledge extraction model, semantic features are classified by adopting a first classification network corresponding to the knowledge type aiming at any knowledge type in a plurality of knowledge types, so that a first classification label of each character in a training text is obtained; classifying semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label of each character; under the condition that each first classification label comprises a first target label and each second classification label comprises a second target label, taking the knowledge type as a predicted knowledge type, and determining a predicted knowledge point under the predicted knowledge type according to the first target label and the second target label; the first target label is used for indicating that the corresponding character is the first character of the predicted knowledge point, and the second target label is used for indicating that the corresponding character is the last character of the predicted knowledge point. Therefore, the prediction of the head and tail characters of the knowledge points in the training text based on the first classification network and the second classification network corresponding to each knowledge type can be realized, and the prediction knowledge points under the prediction knowledge type can be effectively determined.
To clearly illustrate how, in any embodiment of the present disclosure, the knowledge extraction models of the multiple knowledge types are jointly trained according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type, the present disclosure further provides a training method for the knowledge extraction model.
Fig. 7 is a flowchart of a training method of a knowledge extraction model according to a sixth embodiment of the disclosure.
As shown in fig. 7, the training method of the knowledge extraction model may include the following steps:
step 701, obtaining a training text and prompt information, wherein the prompt information is used for indicating information to be extracted in the training text, and the labeling information of the training text is used for indicating a corresponding relation between at least one labeling knowledge type and labeling knowledge points.
And step 702, splicing the training text and the prompt information to obtain an input text.
In step 703, feature extraction is performed on the input text to obtain semantic features of the input text.
And step 704, knowledge extraction is performed on the semantic features by adopting knowledge extraction models corresponding to the knowledge types, so as to obtain predicted knowledge points under at least one predicted knowledge type in the knowledge types.
The execution of steps 701 to 704 may refer to the execution of any embodiment of the disclosure, and will not be described herein.
Step 705, comparing each labeling knowledge type with each prediction knowledge type to take the same labeling knowledge type and prediction knowledge type as a first knowledge type and/or take different labeling knowledge type and prediction knowledge type as a second knowledge type.
In the embodiment of the present disclosure, the first knowledge type may be a corresponding knowledge type when the labeling knowledge type and the prediction knowledge type are the same in each labeling knowledge type and each prediction knowledge type.
In the embodiment of the present disclosure, the second knowledge type may be any labeled knowledge type other than the first knowledge type, or any predicted knowledge type other than the first knowledge type; that is, a labeled knowledge type or predicted knowledge type on which the two sets of knowledge types differ.
In the embodiment of the disclosure, each labeling knowledge type and each prediction knowledge type may be compared, and the same labeling knowledge type and prediction knowledge type may be used as a first knowledge type, and/or different labeling knowledge type and prediction knowledge type may be used as a second knowledge type.
For example, assume that the multiple knowledge types include knowledge type 1, knowledge type 2, knowledge type 3, knowledge type 4, and knowledge type 5; the labeled knowledge types include knowledge type 1, knowledge type 3, and knowledge type 4; and the predicted knowledge types include knowledge type 1 and knowledge type 2. Then knowledge type 1 is a first knowledge type, and knowledge type 2, knowledge type 3, and knowledge type 4 are second knowledge types.
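The comparison in step 705 amounts to set arithmetic over the two collections of knowledge types; a sketch under that reading (type names are placeholders):

```python
labeled_types = {"knowledge type 1", "knowledge type 3", "knowledge type 4"}
predicted_types = {"knowledge type 1", "knowledge type 2"}

# Same labeled and predicted knowledge type -> first knowledge type.
first_types = labeled_types & predicted_types
# Present in exactly one of the two collections -> second knowledge type.
second_types = (labeled_types | predicted_types) - first_types

print(first_types)           # {'knowledge type 1'}
print(sorted(second_types))  # ['knowledge type 2', 'knowledge type 3', 'knowledge type 4']
```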
Step 706, generating a first loss value according to the difference between the predicted knowledge points and the labeled knowledge points belonging to the first knowledge type, and/or generating a second loss value according to the predicted knowledge points or the labeled knowledge points belonging to the second knowledge type.
In the embodiment of the disclosure, the first loss value may be generated according to a difference between the predicted knowledge point and the labeled knowledge point belonging to the first knowledge type.
As one possible implementation manner, for any first knowledge type, a first difference between a first position of a first character in the training text in a labeled knowledge point belonging to the first knowledge type and a second position of the first character in the training text in a predicted knowledge point belonging to the first knowledge type may be determined; and determining a second difference between a third position of the tail character in the training text in the labeled knowledge point belonging to the first knowledge type and a fourth position of the tail character in the training text in the predicted knowledge point belonging to the first knowledge type; finally, a first loss value may be determined based on the first difference and the second difference.
Therefore, the loss value can be determined according to the positional differences, within the training text, of the first and last characters of the knowledge points (whether labeled or predicted) under the first knowledge type, that is, the positional differences between the positive labels of the first characters and between the positive labels of the last characters. The model can then be trained according to this loss value, improving the training effect of the model, i.e., its prediction accuracy.
It should be noted that, in the training text, the first and last characters of knowledge points (whether labeled or predicted) belonging to the first knowledge type account for only a small proportion of the characters; that is, the positive labels indicating the first and last characters of knowledge points are sparse, while the characters that are not first or last characters of such knowledge points account for a large proportion, i.e., the negative labels are numerous. To prevent the knowledge extraction model from overfitting during training, the loss of the negative labels can also be calculated. Thus, in one possible implementation of the disclosed embodiments, the first loss value may also be determined by the following steps:
First, a first sub-loss value may be determined based on the first difference and the second difference.
The first sub-loss value and the first difference are in a forward relationship (i.e., in a positive correlation), and the first sub-loss value and the second difference are also in a forward relationship (i.e., in a positive correlation), i.e., the smaller the first difference or the second difference is, the smaller the value of the first sub-loss value is, whereas the larger the first difference or the second difference is, the larger the value of the first sub-loss value is.
As an example, a loss value corresponding to the first character may be determined according to the first difference, and a loss value corresponding to the last character may be determined according to the second difference; finally, the sum of the loss value loss_start corresponding to the first character and the loss value loss_end corresponding to the last character may be taken as the first sub-loss value.
Second, a second sub-loss value may be determined based on a third difference between the first text and the second text.
The first text may be obtained by removing the first character and the last character of the labeled knowledge point belonging to the first knowledge type from the training text, and the second text may be obtained by removing the first character and the last character of the predicted knowledge point belonging to the first knowledge type from the training text.
For example, the training text is "complete model distributed deployment, online knowledge recommendation project.". If the labeled knowledge point belonging to the first knowledge type is "knowledge recommendation", the first text is obtained by removing the first character and the last character of "knowledge recommendation" from the training text; similarly, if the predicted knowledge point belonging to the first knowledge type is, say, "knowledge recommendation project", the second text is obtained by removing its first character and last character from the training text.
As one possible implementation manner, a character loss value of each character in the first text may be determined according to a third difference between the first text and the second text; and determining target characters from the characters of the first text according to the set proportion; thus, the second sub-loss value can be determined from the character loss value of each target character.
In the embodiment of the present disclosure, the set proportion may be preset, for example, 50% or 20%, which is not limited in the present disclosure.
In the embodiment of the disclosure, the target character may be determined from the characters of the first text according to the set proportion, for example, the target character of the set proportion may be selected from the characters of the first text randomly according to the set proportion.
As a possible implementation, the set proportion may be determined according to the total number of characters of the training text; for example, the set proportion may be negatively correlated with the total number of characters, i.e., the larger the total number of characters of the training text, the smaller the set proportion, and vice versa. In this way, the number of target characters can be kept within a suitable range, neither too large nor too small.
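Under that reading (a set proportion negatively correlated with the total character count of the training text), one illustrative way to pick the target characters is sketched below; the base constant and the clamp bounds are assumptions, not values fixed by the disclosure:

```python
import random

def sample_target_indices(num_first_text_chars, total_chars, seed=0):
    """Randomly pick target-character indices from the first text at a set
    proportion that shrinks as the training text grows (negative correlation
    with the total character count). The base constant 10.0 and the
    [0.1, 0.5] clamp are illustrative assumptions."""
    proportion = min(0.5, max(0.1, 10.0 / total_chars))
    k = max(1, round(num_first_text_chars * proportion))
    rng = random.Random(seed)  # fixed seed for a reproducible sketch
    return sorted(rng.sample(range(num_first_text_chars), k)), proportion

indices, prop = sample_target_indices(num_first_text_chars=18, total_chars=20)
print(prop)          # 0.5  (short training text -> larger proportion)
print(len(indices))  # 9
```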
It should be noted that the second sub-loss value is positively correlated with the third difference between the first text and the second text: the smaller the third difference, the smaller the character loss values and hence the smaller the second sub-loss value; conversely, the larger the third difference, the larger the character loss values and the larger the second sub-loss value.
Therefore, by performing a negative sampling operation on the negative labels, the loss value corresponding to the negative labels can be determined from the character loss values of only some of the characters carrying negative labels (namely the target characters), rather than all of them. This facilitates subsequent computation and helps prevent the knowledge extraction model from overfitting during training.
Further, a type loss value for the first knowledge type may be determined based on the first sub-loss value and the second sub-loss value.
For example, a sum of the first sub-loss value and the second sub-loss value may be determined, and the sum of the first sub-loss value and the second sub-loss value may be used as a type loss value corresponding to the first knowledge type.
Finally, a first loss value may be determined based on the type loss values for each first knowledge type.
For example, a sum of type penalty values for each first knowledge type may be determined and the sum of type penalty values for each first knowledge type may be taken as the first penalty value. Alternatively, the type penalty values for each first knowledge type may be weighted and summed to obtain a first penalty value.
Therefore, the first loss value can be effectively determined by fusing the first sub-loss value of the positive label and the second sub-loss value of the negative label, and the phenomenon that the knowledge extraction model is fitted in the training process can be avoided.
In the embodiment of the disclosure, the second loss value may be generated according to the predicted knowledge points or the labeled knowledge points belonging to the second knowledge type.
For example, the second loss value may be generated based on the number of predicted knowledge points or annotated knowledge points belonging to the second knowledge type. The number and the second loss value are in a forward relation, namely the larger the number is, the larger the second loss value is, the smaller the number is, and the smaller the second loss value is.
Or, the type loss value of each second knowledge type can be determined according to the position of the first character and the last character in the predicted knowledge point or the labeled knowledge point belonging to each second knowledge type in the training text, so that the second loss value can be determined according to the type loss value of each second knowledge type. For example, a sum of type penalty values for each second knowledge type may be determined and the sum of type penalty values for each second knowledge type may be taken as the second penalty value. Alternatively, the type penalty values for each second knowledge type may be weighted and summed to obtain a second penalty value.
Step 707, performing joint training on the knowledge extraction models of the plurality of knowledge types according to the first loss value and/or the second loss value.
In embodiments of the present disclosure, knowledge extraction models of multiple knowledge types may be jointly trained based on the first loss value and/or the second loss value.
As an example, knowledge extraction models of multiple knowledge types may be jointly trained based on the first loss value to minimize the value of the first loss value; or, according to the second loss value, carrying out joint training on knowledge extraction models of a plurality of knowledge types so as to minimize the value of the second loss value.
As another example, knowledge extraction models of multiple knowledge types may be jointly trained based on the first and second loss values, e.g., the first and second loss values may be weighted and summed to obtain a target loss value, or the sum of the first and second loss values may be used as the target loss value, such that knowledge extraction models of multiple knowledge types may be jointly trained based on the target loss value to minimize the target loss value.
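The combination described above can be sketched as a weighted sum over whichever loss values are available (matching the "and/or" in step 707); the weight values are placeholders:

```python
def target_loss(first_loss=None, second_loss=None, w_first=1.0, w_second=1.0):
    """Combine the first and/or second loss value into the target loss.
    With both weights equal to 1.0 this reduces to the plain sum; a missing
    loss value (None) is simply skipped. Weights are illustrative."""
    total = 0.0
    if first_loss is not None:
        total += w_first * first_loss
    if second_loss is not None:
        total += w_second * second_loss
    return total

print(target_loss(0.5, 0.25))            # plain sum -> 0.75
print(target_loss(0.5, 0.25, 2.0, 1.0))  # weighted sum -> 1.25
print(target_loss(first_loss=0.5))       # only the first loss value -> 0.5
```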
It should be noted that the foregoing examples use minimization of the loss value as the termination condition of model training; other termination conditions may be set in practical applications, for example, the number of training iterations reaching a set number, the training duration reaching a set duration, or the loss value converging, which is not limited in the present disclosure.
It will be appreciated that, when predicting knowledge points in the training text, besides the first knowledge type and the second knowledge type, there may also be knowledge types for which no knowledge point is labeled and no knowledge point is predicted. Continuing the above example: the knowledge types corresponding to the training text include knowledge type 1, knowledge type 2, knowledge type 3, knowledge type 4, and knowledge type 5; the labeled knowledge types include knowledge type 1, knowledge type 3, and knowledge type 4; and the predicted knowledge types include knowledge type 1 and knowledge type 2. Knowledge type 1 is then a first knowledge type, and knowledge type 2, knowledge type 3, and knowledge type 4 are second knowledge types. Knowledge type 5, however, neither belongs to the labeled knowledge types nor has a corresponding predicted knowledge point. In this disclosure, knowledge types other than the first and second knowledge types may be regarded as third knowledge types; in the above example, knowledge type 5 is a third knowledge type. To improve the prediction precision of the model, a loss value for the third knowledge type may also be determined, and the knowledge extraction models of the multiple knowledge types may be jointly trained according to it.
That is, in one possible implementation manner of the embodiment of the present disclosure, the third loss values of the plurality of knowledge types may also be determined according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type; determining a fourth loss value according to the third loss values of the plurality of knowledge types; and carrying out joint training on knowledge extraction models of a plurality of knowledge types according to the fourth loss value.
That is, a third loss value corresponding to each of the multiple knowledge types (including the first, second, and third knowledge types) is determined according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type. A fourth loss value may then be determined from the third loss values of the multiple knowledge types: for example, the sum of the third loss values may be taken as the fourth loss value, or the third loss values may be weighted and summed to obtain the fourth loss value. Thus, in the present disclosure, the knowledge extraction models of the multiple knowledge types may be jointly trained according to the fourth loss value.
As one example, knowledge extraction models of multiple knowledge types may be jointly trained based on the fourth loss value to minimize the fourth loss value.
It should be noted that the foregoing example uses minimization of the fourth loss value as the termination condition of model training; other termination conditions may be set in practical applications, for example, the number of training iterations reaching a set number, the training duration reaching a set duration, or the fourth loss value converging, which is not limited in the present disclosure.
Therefore, the knowledge extraction model is trained by comprehensively considering the loss values corresponding to various knowledge types, so that the prediction accuracy of the model can be improved, namely the accuracy and the reliability of the model extraction result are improved.
According to the training method of the knowledge extraction model, the labeling knowledge types and the prediction knowledge types are compared, so that the same labeling knowledge type and prediction knowledge type are used as a first knowledge type, and/or different labeling knowledge types and prediction knowledge types are used as a second knowledge type; generating a first loss value according to the difference between the predicted knowledge point and the labeled knowledge point belonging to the first knowledge type, and/or generating a second loss value according to the predicted knowledge point or the labeled knowledge point belonging to the second knowledge type; and carrying out joint training on the knowledge extraction models of the multiple knowledge types according to the first loss value and/or the second loss value. Therefore, the knowledge extraction models of the knowledge types can be jointly trained according to the predicted knowledge points and the labeled knowledge points under the knowledge types, so that the accuracy of the model predicted result is improved.
In summary, by using the method provided by any embodiment of the present disclosure, important knowledge points can be automatically extracted from a huge amount of unstructured text, and in particular, when knowledge points under multiple knowledge types need to be extracted simultaneously, the knowledge extraction model can be greatly simplified through the knowledge extraction framework provided in fig. 3.
In the present disclosure, the knowledge extraction problem is converted into an extractive reading comprehension task, and a natural-language extraction target (a Prompt, as used in prompt learning; recorded in this disclosure as the prompt information) is constructed to replace the reading comprehension question, so as to achieve a better extraction effect. The method mainly includes the following aspects:
In the first aspect, an extractive MRC (Machine Reading Comprehension) task paradigm is constructed, with the question replaced by the Prompt, which is combined with the text to be extracted as the model input.
In the second aspect, an input encoder is constructed based on a pre-trained large model; at the output end, knowledge points (ATTs) in the text are determined by predicting the head and tail positions of the knowledge points, label losses are computed jointly, and knowledge points under multiple knowledge types are mined.
In a third aspect, negative label loss is calculated by negative sampling, preventing model overfitting.
Specifically, a natural-language extraction target Prompt may first be constructed according to a specific domain direction; for example, the Prompt may be "skill focus in algorithm direction", "focus in recommended service direction", etc. Alternatively, to be applicable to all application scenarios, a broad extraction target Prompt may be constructed for different knowledge-type scenarios; for example, the Prompt may be "knowledge point in text", "entity in text", etc. The prompt information of the text to be extracted is explicitly introduced into the model through the Prompt, thereby strengthening semantic interactive understanding. Secondly, after the text to be extracted is encoded by the pre-trained large model, the head and tail positions of knowledge points under different knowledge types are predicted by 2 × num (num being the number of knowledge types) binary classification networks or binary classification models, uniformly handling both the nested and the non-nested cases. Finally, when the negative label loss is calculated, a negative sampling operation can be adopted to prevent the model from overfitting.
The knowledge extraction method provided by the disclosure can be used for obtaining a better extraction effect under the condition that a small amount of training samples are adopted to train the model by means of a strong pre-training model and skillfully converting the problem mode.
As an example, using the knowledge extraction model shown in fig. 3, the above process may be described in detail in conjunction with the following steps:
Step 1: data construction. The input data mined by the ATT can be converted into an extractive reading comprehension data form (f<query, text> → answer), where the query is constructed as a Prompt; the Prompt and the text to be extracted are spliced and then used as the input of the knowledge extraction model, so that external natural-language prompt information can be fused efficiently. The answer is the extracted knowledge point.
The traditional information extraction or knowledge extraction task is mainly solved by sequence labeling: the text to be extracted is taken as input, and specific entities or knowledge points are extracted via an encoder + CRF. This paradigm achieves a good extraction effect when entities are not nested (different entities belong to different categories with no character overlap), but cannot handle the nested case (one entity may belong to multiple entity types).
Based on the above, the knowledge extraction method provided by the embodiment of the present disclosure converts the knowledge extraction task into a reading comprehension task. The paradigm is: for a given question, the corresponding answer segment is extracted from the text to be extracted (learning the mapping f<query, text> → answer), and the final entity information is obtained by predicting the start and end positions of the answer. The extraction target can be customized, and the query in the MRC can be constructed as a Prompt; a complete data sample may be as shown in Table 1:
TABLE 1
The extraction target Prompt and the text to be extracted are spliced to construct the input text: "[CLS] Prompt [SEP] text [SEP]".
Taking the data in Table 1 as an example, the constructed input text is: "[CLS] knowledge points in text [SEP] the itg user behavior model is updated, the recognition rate is improved, and the system communicates with the map star cloud team to obtain performance test permission depending on the map star cloud service. [SEP]".
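The splicing rule above is a plain string concatenation; a minimal sketch (the special tokens are written literally here, whereas a real tokenizer would normally add them itself):

```python
def build_input_text(prompt, text):
    """Splice the extraction-target Prompt and the text to be extracted
    into the "[CLS] Prompt [SEP] text [SEP]" input form."""
    return f"[CLS]{prompt}[SEP]{text}[SEP]"

print(build_input_text("knowledge points in text",
                       "the itg user behavior model is updated."))
# -> [CLS]knowledge points in text[SEP]the itg user behavior model is updated.[SEP]
```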
Step 2: the constructed input text is input into a pre-trained large model such as ERNIE, so that after network encoding the text and the Prompt fully interact semantically, yielding the corresponding semantic features.
Step 3: knowledge extraction is performed on the semantic features using the knowledge extraction models corresponding to the multiple knowledge types, so as to predict knowledge points under at least one of the knowledge types.
At the output layer, for the knowledge extraction model corresponding to each knowledge type, two binary classification networks are designed (the output networks of different knowledge types are independent of each other), denoted in this disclosure as the first classification network and the second classification network; they predict the start position and the end position of the knowledge point under that knowledge type, respectively.
Therefore, the text to be extracted can be intercepted from the predicted starting position of the knowledge point under the corresponding knowledge type to the ending position of the knowledge point under the corresponding knowledge type, so that the knowledge point under the corresponding knowledge type can be obtained.
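Decoding one knowledge type's span from the per-character start and end probabilities of its two classification networks can be sketched as follows; the argmax decoding and the 0.5 threshold are illustrative assumptions:

```python
def decode_span(text, start_probs, end_probs, threshold=0.5):
    """Intercept the text from the predicted start position of the knowledge
    point to its predicted end position. Returns None when no position
    clears the threshold or the span is inverted, i.e. no knowledge point
    is predicted under this knowledge type."""
    start = max(range(len(start_probs)), key=start_probs.__getitem__)
    end = max(range(len(end_probs)), key=end_probs.__getitem__)
    if start_probs[start] < threshold or end_probs[end] < threshold or end < start:
        return None
    return text[start:end + 1]

start_probs = [0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.1]
end_probs   = [0.1, 0.1, 0.1, 0.1, 0.1, 0.8, 0.1]
print(decode_span("abcdefg", start_probs, end_probs))  # -> cdef
```

Because each knowledge type has its own pair of networks, running this decoding per type naturally allows the same span to be extracted under several knowledge types at once (the nested case).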
It should be noted that, since the extraction result only exists in the text to be extracted, in the training stage of the knowledge extraction model, only the loss value of the output part corresponding to the training text is calculated, so as to train the knowledge extraction model according to the loss value; in the prediction stage, the final result is predicted only on the output layer corresponding to the text to be extracted.
It should be further noted that, with the above structural design, the knowledge extraction model can not only extract multiple knowledge points from a text, but also distinguish cases in which the same knowledge point belongs to different knowledge types. For example, a sampled "text attention point" case has the text "complete model distributed deployment, online knowledge recommendation project.", where the knowledge types include "backend", "algorithm", and "product"; the extracted knowledge point "knowledge recommendation" belongs to both the knowledge type "algorithm" and the knowledge type "product". The extracted sample is shown, for example, in fig. 8.
Therefore, the knowledge extraction model can solve nested and non-nested knowledge extraction tasks simultaneously: by adopting a prediction mode of 2 × num (num being the number of knowledge types) binary classifications at the output end, the model can extract the head and tail positions of knowledge points in the text while simultaneously predicting the knowledge types of those knowledge points.
In the training phase of the knowledge extraction model, the first sub-loss value may be the sum of a start-position loss and an end-position loss, defined as follows:
loss=loss start +loss end =CE(y start ,p start )+CE(y end ,p end );
wherein loss is start For a first position y in the training text of a first character in the annotation knowledge point according to a first knowledge type (the same annotation knowledge type and the same prediction knowledge type are taken as the first knowledge type) start And a second position p of the first character in the training text in the predicted knowledge point belonging to the first knowledge type start First difference between and by a loss function CE (y start ,p start ) And (5) determining a loss value. Where CE represents the Cross Entropy loss function (Cross Entropy).
Likewise, loss_end is a loss value determined by the loss function CE(y_end, p_end) from the second difference between the third position y_end, in the training text, of the tail character of the annotated knowledge point belonging to the first knowledge type and the fourth position p_end, in the training text, of the tail character of the predicted knowledge point belonging to the first knowledge type.
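The first sub-loss can be sketched numerically as follows. This is a minimal illustration with binary cross entropy over 0/1 head/tail position labels; the label vectors, probabilities and helper names are assumptions, not the patent's implementation:

```python
import math

def bce(y, p, eps=1e-9):
    """Mean binary cross entropy between 0/1 labels y and probabilities p."""
    return -sum(yi * math.log(pi + eps) + (1 - yi) * math.log(1 - pi + eps)
                for yi, pi in zip(y, p)) / len(y)

def first_sub_loss(y_start, p_start, y_end, p_end):
    # loss = loss_start + loss_end = CE(y_start, p_start) + CE(y_end, p_end)
    return bce(y_start, p_start) + bce(y_end, p_end)

# one head position (index 0) and one tail position (index 2)
y_start, p_start = [1, 0, 0, 0], [0.9, 0.1, 0.1, 0.1]
y_end,   p_end   = [0, 0, 1, 0], [0.1, 0.2, 0.8, 0.1]
loss = first_sub_loss(y_start, p_start, y_end, p_end)
```

The closer the predicted probabilities lie to the annotated head/tail labels, the smaller this sub-loss becomes.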
Since positive labels (indicating the head and tail positions of knowledge points) are very sparse in the training text, and negative labels (which do not indicate head or tail positions) account for the vast majority, a negative-label loss (noted as the second sub-loss value in this disclosure) can be calculated to prevent over-fitting during training. When the negative-label loss is calculated, a negative sampling operation can be adopted: when calculating the loss at a position corresponding to a negative label, the loss calculation is probabilistically abandoned according to a preset threshold value (which can be realized through dropout), so that the training effect of the model can be further improved.
Specifically, a character loss value of each character in the first text can be determined according to a third difference between the first text and the second text, wherein the first text is obtained by removing, from the training text, the head and tail characters of the annotated knowledge points belonging to the first knowledge type, and the second text is obtained by removing, from the training text, the head and tail characters of the predicted knowledge points belonging to the first knowledge type; target characters can be determined from the characters of the first text according to a set proportion; and the second sub-loss value can be determined according to the character loss value of each target character.
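The negative sampling described above can be sketched as follows. This is an illustrative simplification: the keep probability, the seeded RNG and the per-character loss values are assumptions, and positive-label positions are always kept:

```python
import random

def sampled_negative_loss(char_losses, is_positive, keep_prob=0.3, rng=None):
    """Average the loss terms, keeping each negative-label term only with
    probability keep_prob (the 'set proportion'); dropped terms are skipped."""
    rng = rng or random.Random(0)
    total, kept = 0.0, 0
    for loss, pos in zip(char_losses, is_positive):
        if pos or rng.random() < keep_prob:  # positives are never dropped
            total += loss
            kept += 1
    return total / max(kept, 1)
```

At keep_prob=1.0 every negative term contributes; at keep_prob=0.0 all negative terms are abandoned, mirroring the probabilistic dropout-style abandonment of loss terms discussed above.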
Further, a type loss value of the first knowledge type may be determined from the first sub-loss value and the second sub-loss value; and the first loss value may be determined based on the type loss values of the respective first knowledge types.
In summary, a complete flow diagram of the Prompt+MRC-based domain-specific knowledge point extraction service of the present disclosure is shown in fig. 9.
In the present disclosure, the knowledge extraction models of the multiple knowledge types may be trained according to the first loss value, i.e., the parameters of the knowledge extraction models of the multiple knowledge types are adjusted so as to minimize the first loss value.
The knowledge extraction method provided by the present disclosure can be applied to knowledge extraction or information extraction problems in any subdivided field, including but not limited to job-sequence knowledge focus extraction, named entity recognition, comment extraction, and the like. For example, the attention points of staff can be actively discovered from the text deposited by the staff of an enterprise, so as to establish full connections of person-to-person, person-to-knowledge and knowledge-to-person, so that knowledge is maximally reused and circulated in work, forming a positive production-consumption cycle.
The knowledge extraction model, the knowledge extraction method and the underlying idea are independent of any particular product and are applicable to knowledge point extraction scenarios in any specific field. They mainly comprise the following parts: input data construction, knowledge point prediction under multiple knowledge types, and negative-sampling loss calculation. In the data construction stage, a suitable extraction target can be constructed according to the specific service scenario, providing additional prompt information for target extraction. For model coding and knowledge point prediction under knowledge types, the currently most advanced pre-trained large models can be substituted in, and the model can be further pre-trained according to actual conditions to improve the coding capacity of the network. For target extraction, 2 × num (the number of knowledge types) binary classification models are adopted to respectively predict the head and tail positions of knowledge attention points (recorded as knowledge points in this disclosure) under different knowledge types, realizing unified extraction of knowledge points under multiple knowledge types. Finally, the loss of the negative labels can be calculated by negative sampling, preventing the model from over-fitting.
The knowledge extraction method can be applied to resource construction of knowledge attention points in different job directions, and integrates knowledge point extraction for a plurality of knowledge types into one model framework through unified modeling, greatly simplifying the complexity of knowledge extraction. Job-sequence knowledge focus mining is the most basic link of the knowledge distribution system projects in some enterprise knowledge management; it serves to provide a high-quality resource base for downstream "knowledge point-document" and "knowledge point-expert" matching and recommendation, promoting the positive cycle of knowledge production and consumption.
The knowledge extraction method of the present disclosure may at least include:
1. an extraction target (Prompt) can be constructed according to service requirements, and the traditional sequence labeling problem is skillfully converted into an MRC (machine reading comprehension) formulation, so that the information extraction capacity is further improved by introducing additional external information;
2. according to the number of knowledge types to which the extracted information (noted as knowledge points in this disclosure) belongs, a plurality of binary classification networks can be constructed, covering both nested and non-nested extraction cases, making the knowledge extraction framework more elegant and concise;
3. the prediction result is post-processed with certain rules according to the specific business, so that the extraction effect can be further improved; the accuracy of extracting knowledge attention points in each job direction exceeds 90%, reaching a practical standard.
In summary, the knowledge extraction method provided by the disclosure can be applied to any business scenario of information extraction, and has strong practical significance and high expansibility.
Corresponding to the knowledge extraction method provided by the embodiments of fig. 1 to 4, the present disclosure further provides a knowledge extraction device, and since the knowledge extraction device provided by the embodiments of the present disclosure corresponds to the knowledge extraction method provided by the embodiments of fig. 1 to 4, the implementation of the knowledge extraction method is also applicable to the knowledge extraction device provided by the embodiments of the present disclosure, and will not be described in detail in the embodiments of the present disclosure.
Fig. 10 is a schematic structural diagram of a knowledge extraction device according to a seventh embodiment of the disclosure.
As shown in fig. 10, the knowledge extraction apparatus 1000 may include: a first acquisition module 1001, a splicing module 1002, an extracting module 1003, and an extraction module 1004.
The first obtaining module 1001 is configured to obtain a text to be extracted and a prompt message, where the prompt message is used to indicate information to be extracted in the text to be extracted.
And a splicing module 1002, configured to splice the text to be extracted and the prompt information to obtain an input text.
The extracting module 1003 is configured to perform feature extraction on the input text to obtain semantic features of the input text.
The extraction module 1004 is configured to perform knowledge extraction on the semantic features by using knowledge extraction models corresponding to the multiple knowledge types, so as to obtain knowledge points under at least one knowledge type in the multiple knowledge types.
In one possible implementation of the embodiment of the disclosure, the knowledge extraction model includes a first classification network and a second classification network, and the extraction module 1004 may specifically be used to: for any knowledge type in a plurality of knowledge types, classifying semantic features by adopting a first classification network corresponding to the knowledge type to obtain a first classification label, wherein the first classification label is used for indicating the initial position of a knowledge point in a text to be extracted; classifying semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label, wherein the second classification label is used for indicating the end position of the knowledge point in the text to be extracted; and determining knowledge points under the knowledge type according to the first classification label and the second classification label.
In one possible implementation of the embodiments of the present disclosure, the extracting module 1004 may specifically be configured to: and intercepting the text to be extracted from the initial position of the knowledge point indicated by the first classification label to the end position of the knowledge point indicated by the second classification label so as to obtain the knowledge point under the knowledge type.
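The interception described above can be sketched as a simple inclusive slice. This is an illustration only; the index convention (0-based, inclusive end position) is an assumption:

```python
def intercept(text, start_idx, end_idx):
    """Take the span from the start position indicated by the first
    classification label to the end position indicated by the second,
    inclusive of both endpoint characters."""
    return text[start_idx:end_idx + 1]
```

For example, with a start position of 0 and an end position of 23, the knowledge point "knowledge recommendation" would be cut out of the text "knowledge recommendation online".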
In one possible implementation manner of the embodiment of the present disclosure, the knowledge extraction model includes a sequence labeling model, and the extraction module 1004 may specifically be used to: aiming at any knowledge type in a plurality of knowledge types, a sequence labeling model corresponding to the knowledge type is adopted to label the semantic features in sequence, so as to obtain a first labeling label and a second labeling label; determining knowledge points under the knowledge type according to the first labeling label and the second labeling label; the first labeling label is used for indicating the initial position of the knowledge point in the text to be extracted, and the second labeling label is used for indicating the end position of the knowledge point in the text to be extracted.
In one possible implementation of the embodiments of the present disclosure, the extracting module 1004 may specifically be configured to: and intercepting the text to be extracted from the initial position of the knowledge point indicated by the first labeling label to the end position of the knowledge point indicated by the second labeling label so as to obtain the knowledge point under the knowledge type.
In one possible implementation of the embodiments of the present disclosure, the splicing module 1002 may specifically be used to: adding a separator before the first character of the text to be extracted; the prompt is spliced before the separator, wherein the last character of the prompt is adjacent to the separator.
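The splicing just described can be sketched as follows; the "[SEP]" separator token is an assumption for illustration and is not specified by this passage:

```python
def build_input(prompt, text, sep="[SEP]"):
    """Splice the prompt before the separator, and the separator before the
    first character of the text to be extracted, so that the last character
    of the prompt is adjacent to the separator."""
    return prompt + sep + text

inp = build_input("extract algorithm focus points", "model deployed online")
```

The resulting input text places the prompt first, then the separator, then the text to be extracted, matching the order described above.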
In one possible implementation of an embodiment of the disclosure, the apparatus 1000 may further include:
and the second acquisition module is used for acquiring candidate resources matched with the knowledge types from the resource pool aiming at any knowledge type of the at least one knowledge type.
And the determining module is used for determining target resources matched with the knowledge points under the knowledge type from the candidate resources.
And the display module is used for displaying the target resource.
According to the knowledge extraction device, the text to be extracted and the prompt information are obtained, wherein the prompt information is used for indicating the information to be extracted in the text to be extracted; the text to be extracted and the prompt information are spliced to obtain an input text; feature extraction is performed on the input text to obtain semantic features of the input text; and knowledge extraction is performed on the semantic features by adopting the knowledge extraction models respectively corresponding to the knowledge types, to obtain knowledge points under at least one of the knowledge types. Therefore, with the knowledge extraction models corresponding to the plurality of known knowledge types, the knowledge points matched with the prompt information can be extracted from the text to be extracted, and the knowledge types to which the knowledge points belong can be determined at the same time. Moreover, joint extraction of knowledge points under a plurality of knowledge types can be realized, so as to meet the extraction requirements of different users.
Corresponding to the training method of the knowledge extraction model provided in the embodiment of fig. 5 to 7, the present disclosure further provides a training device of the knowledge extraction model, and since the training device of the knowledge extraction model provided in the embodiment of the present disclosure corresponds to the training method of the knowledge extraction model provided in the embodiment of fig. 5 to 7, the implementation of the training method of the knowledge extraction model is also applicable to the training device of the knowledge extraction model provided in the embodiment of the present disclosure, which is not described in detail in the embodiment of the present disclosure.
Fig. 11 is a schematic structural diagram of a training device of a knowledge extraction model according to an embodiment of the disclosure.
As shown in fig. 11, the training apparatus 1100 of the knowledge extraction model may include: an acquisition module 1101, a stitching module 1102, an extraction module 1103, an extraction module 1104, and a training module 1105.
The obtaining module 1101 is configured to obtain a training text and a prompt message, where the prompt message is used to indicate information to be extracted in the training text, and the labeling information of the training text is used to indicate a correspondence between at least one labeling knowledge type and a labeling knowledge point.
And the splicing module 1102 is used for splicing the training text and the prompt information to obtain an input text.
The extracting module 1103 is configured to perform feature extraction on the input text to obtain semantic features of the input text.
The extraction module 1104 is configured to perform knowledge extraction on the semantic features by using knowledge extraction models corresponding to the multiple knowledge types, so as to obtain a predicted knowledge point under at least one predicted knowledge type in the multiple knowledge types.
The training module 1105 is configured to perform joint training on knowledge extraction models of multiple knowledge types according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type.
In one possible implementation of an embodiment of the disclosure, the knowledge extraction model includes a first classification network and a second classification network, and the extraction module 1104 may specifically be used to: aiming at any knowledge type in a plurality of knowledge types, classifying semantic features by adopting a first classification network corresponding to the knowledge type to obtain a first classification label of each character in the training text; classifying semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label of each character; under the condition that each first classification label comprises a first target label and each second classification label comprises a second target label, taking the knowledge type as a predicted knowledge type, and determining a predicted knowledge point under the predicted knowledge type according to the first classification label and the second classification label; the first target label is used for indicating that the corresponding character is the first character of the predicted knowledge point, and the second target label is used for indicating that the corresponding character is the last character of the predicted knowledge point.
In one possible implementation of the embodiment of the disclosure, the knowledge extraction model includes a sequence annotation model, and the extraction module 1104 may be specifically configured to: inputting semantic features into a sequence labeling model corresponding to the knowledge types aiming at any knowledge type in the plurality of knowledge types to obtain a prediction label of each character in a training text output by the sequence labeling model corresponding to the knowledge type; under the condition that the prediction labels of the characters comprise a first prediction label and a second prediction label, taking the knowledge type as a prediction knowledge type, and determining a prediction knowledge point under the prediction knowledge type according to the first prediction label and the second prediction label; the first prediction tag is used for indicating that the corresponding character is the first character of the predicted knowledge point, and the second prediction tag is used for indicating that the corresponding character is the last character of the predicted knowledge point.
In one possible implementation of the embodiment of the present disclosure, the training module 1105 may specifically be configured to: comparing the labeling knowledge types with the prediction knowledge types to take the same labeling knowledge type and prediction knowledge type as a first knowledge type and/or taking different labeling knowledge types and prediction knowledge types as a second knowledge type; generating a first loss value according to the difference between the predicted knowledge point and the labeled knowledge point belonging to the first knowledge type, and/or generating a second loss value according to the predicted knowledge point or the labeled knowledge point belonging to the second knowledge type; and carrying out joint training on the knowledge extraction models of the multiple knowledge types according to the first loss value and/or the second loss value.
In one possible implementation of the embodiments of the present disclosure, the training module 1105 may be specifically configured to: determining a first difference between a first position of a first character in the training text in a labeling knowledge point belonging to the first knowledge type and a second position of the first character in the training text in a prediction knowledge point belonging to the first knowledge type according to any first knowledge type; determining a second difference between a third position of the tail character in the training text in the labeled knowledge point belonging to the first knowledge type and a fourth position of the tail character in the training text in the predicted knowledge point belonging to the first knowledge type; a first loss value is determined based on the first difference and the second difference.
In one possible implementation of the embodiments of the present disclosure, the training module 1105 may be specifically configured to: determining a first sub-loss value according to the first difference and the second difference; determining a second sub-loss value based on a third difference between the first text and the second text; the first text is obtained by removing the first characters and the tail characters of the marked knowledge points belonging to the first knowledge type from the training text, and the second text is obtained by removing the first characters and the tail characters of the predicted knowledge points belonging to the first knowledge type from the training text; determining a type loss value of the first knowledge type according to the first sub-loss value and the second sub-loss value; and determining a first loss value according to the type loss value of each first knowledge type.
In one possible implementation of the embodiments of the present disclosure, the training module 1105 may be specifically configured to: determining a character loss value of each character in the first text according to a third difference between the first text and the second text; determining target characters from the characters of the first text according to the set proportion; and determining a second sub-loss value according to the character loss value of each target character.
According to the training device of the knowledge extraction model, training texts and prompt information are obtained, wherein the prompt information is used for indicating information to be extracted in the training texts, and marking information of the training texts is used for indicating the corresponding relation between at least one marking knowledge type and marking knowledge points; splicing the training text and the prompt information to obtain an input text; extracting features of the input text to obtain semantic features of the input text; knowledge extraction is respectively carried out on semantic features by adopting knowledge extraction models corresponding to a plurality of knowledge types so as to obtain predicted knowledge points under at least one predicted knowledge type in the plurality of knowledge types; and carrying out joint training on knowledge extraction models of a plurality of knowledge types according to the labeling knowledge points under each labeling knowledge type and the prediction knowledge points under each prediction knowledge type. Therefore, based on the deep learning technology, the knowledge extraction model is trained, and accuracy and reliability of a model prediction result can be improved.
To achieve the above embodiments, the present disclosure also provides an electronic device that may include at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the knowledge extraction method set forth in any one of the embodiments of the disclosure, or to perform the training method of the knowledge extraction model set forth in any one of the embodiments of the disclosure.
To achieve the above embodiments, the present disclosure further provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the knowledge extraction method set forth in any one of the above embodiments of the present disclosure, or to execute the training method of the knowledge extraction model set forth in any one of the above embodiments of the present disclosure.
To achieve the above embodiments, the present disclosure further provides a computer program product, which includes a computer program that, when executed by a processor, implements the knowledge extraction method set forth in any one of the above embodiments of the present disclosure, or implements the training method of the knowledge extraction model set forth in any one of the above embodiments of the present disclosure.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
FIG. 12 illustrates a schematic block diagram of an example electronic device that may be used to implement embodiments of the present disclosure. The electronic device may include the server and the client in the above embodiments. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 12, the electronic apparatus 1200 includes a computing unit 1201 that can perform various appropriate actions and processes according to a computer program stored in a ROM (Read-Only Memory) 1202 or a computer program loaded from a storage unit 1208 into a RAM (Random Access Memory) 1203. In the RAM 1203, various programs and data required for the operation of the electronic device 1200 may also be stored. The computing unit 1201, the ROM 1202, and the RAM 1203 are connected to each other via a bus 1204. An I/O (Input/Output) interface 1205 is also connected to bus 1204.
Various components in the electronic device 1200 are connected to the I/O interface 1205, including: an input unit 1206 such as a keyboard, mouse, etc.; an output unit 1207 such as various types of displays, speakers, and the like; a storage unit 1208 such as a magnetic disk, an optical disk, or the like; and a communication unit 1209, such as a network card, modem, wireless communication transceiver, etc. The communication unit 1209 allows the electronic device 1200 to exchange information/data with other devices through a computer network, such as the internet, and/or various telecommunications networks.
The computing unit 1201 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 1201 include, but are not limited to, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), various dedicated AI (Artificial Intelligence) computing chips, various computing units running machine learning model algorithms, DSPs (Digital Signal Processors), and any suitable processors, controllers, microcontrollers, and the like. The computing unit 1201 performs the various methods and processes described above, such as the knowledge extraction method described above or the training method of the knowledge extraction model. For example, in some embodiments, the knowledge extraction method described above may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 1208. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 1200 via the ROM 1202 and/or the communication unit 1209. When the computer program is loaded into the RAM 1203 and executed by the computing unit 1201, one or more steps of the knowledge extraction method or training method of the knowledge extraction model described above may be performed. Alternatively, in other embodiments, the computing unit 1201 may be configured to perform the knowledge extraction method or training method of the knowledge extraction model described above in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, FPGAs (Field Programmable Gate Arrays), ASICs (Application-Specific Integrated Circuits), ASSPs (Application-Specific Standard Products), SOCs (Systems On Chip), CPLDs (Complex Programmable Logic Devices), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs, the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, RAM, ROM, EPROM (Erasable Programmable Read-Only Memory) or flash memory, an optical fiber, a CD-ROM (Compact Disc Read-Only Memory), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., CRT (Cathode-Ray Tube) or LCD (Liquid Crystal Display ) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: LAN (Local Area Network ), WAN (Wide Area Network, wide area network), internet and blockchain networks.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system, overcoming the defects of high management difficulty and weak service expansibility of traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system or a server that incorporates a blockchain.
It should be noted that artificial intelligence is the discipline that studies making computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it involves technologies at both the hardware and software levels. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph technologies.
Deep learning is a new research direction in the field of machine learning. It learns the inherent laws and representation hierarchies of sample data, and the information obtained during such learning greatly helps the interpretation of data such as text, images, and sounds. Its ultimate goal is to enable machines to have human-like analytical learning capabilities and to recognize data such as text, images, and sounds.
Cloud computing refers to a technical system that accesses an elastically extensible shared pool of physical or virtual resources through a network, where the resources may include servers, operating systems, networks, software, applications, and storage devices, and can be deployed and managed in an on-demand, self-service manner. Cloud computing technology can provide efficient and powerful data processing capabilities for technical applications such as artificial intelligence and blockchain, as well as for model training.
According to the technical scheme of the embodiments of the present disclosure, the text to be extracted and the prompt information are acquired, wherein the prompt information is used for indicating the information to be extracted in the text to be extracted; the text to be extracted and the prompt information are spliced to obtain an input text; features of the input text are extracted to obtain semantic features of the input text; and knowledge extraction is performed on the semantic features by adopting knowledge extraction models respectively corresponding to a plurality of knowledge types, to obtain knowledge points under at least one of the plurality of knowledge types. Therefore, by means of the knowledge extraction models corresponding to the plurality of known knowledge types, the knowledge points matching the prompt information can be extracted from the text to be extracted, and the knowledge types to which the knowledge points belong can be determined at the same time. Moreover, joint extraction of knowledge points under a plurality of knowledge types can be realized, so as to meet the extraction requirements of different users.
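As an illustrative sketch only, the per-type extraction flow summarized above can be outlined as follows. All function names are hypothetical, and simple keyword matchers stand in for the trained per-type extraction networks; the patent itself does not prescribe any particular implementation:

```python
def extract_knowledge(text, extractors):
    """Run each knowledge type's extractor and collect matched knowledge points.

    Each extractor stands in for one knowledge type's model and returns
    (start, end) character offsets into `text`, or None when the type is absent.
    """
    points = {}
    for ktype, extractor in extractors.items():
        span = extractor(text)
        if span is not None:
            start, end = span
            # Intercept the text from the start position to the end position
            # (end treated as inclusive) to obtain the knowledge point.
            points[ktype] = text[start:end + 1]
    return points


def make_matcher(keyword):
    """Toy stand-in for a per-type classification network: keyword lookup."""
    def matcher(text):
        i = text.find(keyword)
        return (i, i + len(keyword) - 1) if i >= 0 else None
    return matcher


extractors = {"person": make_matcher("Alice"), "place": make_matcher("Paris")}
knowledge = extract_knowledge("Alice moved to Paris in 2020.", extractors)
```

Under this sketch, `knowledge` maps each matched knowledge type to its extracted span; types whose extractor finds nothing are simply omitted, mirroring "knowledge points under at least one knowledge type in the plurality of knowledge types".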
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions presented in the present disclosure are achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (24)

1. A knowledge extraction method, the method comprising:
acquiring a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted;
splicing the text to be extracted and the prompt information to obtain an input text;
extracting features of the input text to obtain semantic features of the input text;
knowledge extraction is carried out on the semantic features by adopting knowledge extraction models corresponding to a plurality of knowledge types respectively so as to obtain knowledge points under at least one knowledge type in the plurality of knowledge types;
the knowledge extraction model includes a first classification network and a second classification network,
the knowledge extraction is performed on the semantic features by respectively adopting knowledge extraction models corresponding to a plurality of knowledge types to obtain knowledge points under at least one knowledge type in the plurality of knowledge types, including:
for any knowledge type in the plurality of knowledge types, classifying the semantic features by adopting a first classification network corresponding to the knowledge type to obtain a first classification label, wherein the first classification label is used for indicating the initial position of a knowledge point in the text to be extracted;
classifying the semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label, wherein the second classification label is used for indicating the end position of the knowledge point in the text to be extracted;
determining knowledge points under the knowledge type according to the first classification tag and the second classification tag;
the knowledge extraction model comprises a sequence annotation model,
the knowledge extraction is performed on the semantic features by respectively adopting knowledge extraction models corresponding to a plurality of knowledge types to obtain knowledge points under at least one knowledge type in the plurality of knowledge types, including:
for any knowledge type in the plurality of knowledge types, performing sequence labeling on the semantic features by adopting a sequence labeling model corresponding to the knowledge type to obtain a first labeling label and a second labeling label;
determining knowledge points under the knowledge type according to the first labeling label and the second labeling label;
the first labeling label is used for indicating the initial position of the knowledge point in the text to be extracted, and the second labeling label is used for indicating the end position of the knowledge point in the text to be extracted.
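The first and second labeling labels of the sequence labeling variant can be decoded into a span roughly as follows; the tag names "B" and "E" are illustrative assumptions, not the patent's actual label scheme:

```python
def decode_span(labels, start_tag="B", end_tag="E"):
    """Find the first start label and the first end label at or after it.

    Returns (start, end) character positions of the knowledge point, or
    None when either label is missing from the predicted sequence.
    """
    try:
        start = labels.index(start_tag)
        end = labels.index(end_tag, start)
    except ValueError:
        return None
    return (start, end)
```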
2. The method of claim 1, wherein the determining a knowledge point under the knowledge type from the first classification tag and the second classification tag comprises:
and starting to intercept the text to be extracted from the initial position of the knowledge point indicated by the first classification label to intercept the text to be extracted to the end position of the knowledge point indicated by the second classification label so as to obtain the knowledge point under the knowledge type.
3. The method of claim 1, wherein the determining a knowledge point under the knowledge type from the first annotation tag and the second annotation tag comprises:
and intercepting the text to be extracted from the initial position of the knowledge point indicated by the first labeling label to the end position of the knowledge point indicated by the second labeling label so as to obtain the knowledge point under the knowledge type.
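In both claim 2 and claim 3, the interception step reduces to a substring slice between the two indicated positions. A minimal sketch, assuming positions are 0-based character indices and the end position is inclusive:

```python
def intercept(text, start, end):
    # Intercept from the start position of the knowledge point through its
    # end position; Python slices exclude the upper bound, hence end + 1.
    return text[start:end + 1]
```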
4. The method according to any one of claims 1-3, wherein the splicing the text to be extracted and the prompt information to obtain an input text includes:
adding a separator before the first character of the text to be extracted;
and splicing the prompt information before the separator, wherein the last character of the prompt information is adjacent to the separator.
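The splicing order of claim 4 (prompt, then separator, then the text to be extracted) can be sketched as below, with "[SEP]" as an assumed BERT-style separator token; the claim does not name a specific separator:

```python
def splice(prompt, text, sep="[SEP]"):
    # The separator is added before the first character of the text to be
    # extracted, and the prompt is spliced immediately before the separator.
    return prompt + sep + text
```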
5. A method according to any one of claims 1-3, wherein the method further comprises:
for any knowledge type of the at least one knowledge type, acquiring candidate resources matched with the knowledge type from a resource pool;
determining target resources matched with knowledge points under the knowledge type from the candidate resources;
and displaying the target resource.
6. A method of training a knowledge extraction model, the method comprising:
acquiring training texts and prompt information, wherein the prompt information is used for indicating information to be extracted in the training texts, and the labeling information of the training texts is used for indicating the corresponding relation between at least one labeling knowledge type and labeling knowledge points;
splicing the training text and the prompt information to obtain an input text;
extracting features of the input text to obtain semantic features of the input text;
knowledge extraction is respectively carried out on the semantic features by adopting knowledge extraction models corresponding to a plurality of knowledge types so as to obtain predicted knowledge points under at least one predicted knowledge type in the plurality of knowledge types;
performing joint training on knowledge extraction models of the knowledge types according to the labeling knowledge points under the labeling knowledge types and the prediction knowledge points under the prediction knowledge types;
the knowledge extraction model includes a first classification network and a second classification network,
the knowledge extraction method for extracting knowledge from semantic features by using knowledge extraction models corresponding to a plurality of knowledge types to obtain predicted knowledge points under at least one predicted knowledge type in the plurality of knowledge types includes:
for any knowledge type in the plurality of knowledge types, classifying the semantic features by adopting a first classification network corresponding to the knowledge type to obtain a first classification label of each character in the training text;
classifying the semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label of each character;
under the condition that each first classification label comprises a first target label and each second classification label comprises a second target label, taking the knowledge type as the predicted knowledge type, and determining a predicted knowledge point under the predicted knowledge type according to the first target label and the second target label;
the first target label is used for indicating that the corresponding character is the first character of the predicted knowledge point, and the second target label is used for indicating that the corresponding character is the last character of the predicted knowledge point;
the knowledge extraction model comprises a sequence annotation model,
the knowledge extraction method for extracting knowledge from semantic features by using knowledge extraction models corresponding to a plurality of knowledge types to obtain predicted knowledge points under at least one predicted knowledge type in the plurality of knowledge types includes:
inputting the semantic features into a sequence labeling model corresponding to the knowledge types aiming at any knowledge type in the plurality of knowledge types to obtain a prediction label of each character in the training text output by the sequence labeling model corresponding to the knowledge type;
under the condition that the prediction labels of the characters comprise a first prediction label and a second prediction label, taking the knowledge type as a prediction knowledge type, and determining a prediction knowledge point under the prediction knowledge type according to the first prediction label and the second prediction label;
the first prediction tag is used for indicating that the corresponding character is the first character of the predicted knowledge point, and the second prediction tag is used for indicating that the corresponding character is the last character of the predicted knowledge point.
7. The method of claim 6, wherein the jointly training the knowledge extraction models of the plurality of knowledge types based on the labeled knowledge points for each of the labeled knowledge types and the predicted knowledge points for each of the predicted knowledge types comprises:
comparing each labeling knowledge type with each predicting knowledge type, so as to take a labeling knowledge type and a predicting knowledge type that are the same as a first knowledge type, and/or take a labeling knowledge type and a predicting knowledge type that are different as a second knowledge type;
generating a first loss value according to the difference between the predicted knowledge point and the marked knowledge point belonging to the first knowledge type, and/or generating a second loss value according to the predicted knowledge point or the marked knowledge point belonging to the second knowledge type;
and carrying out joint training on the knowledge extraction models of the knowledge types according to the first loss value and/or the second loss value.
8. The method of claim 7, wherein the generating a first loss value from differences between predicted knowledge points and annotated knowledge points belonging to the first knowledge type comprises:
determining, for any of the first knowledge types, a first difference between a first position of a first character in the labeled knowledge point belonging to the first knowledge type in the training text and a second position of the first character in the predicted knowledge point belonging to the first knowledge type in the training text;
determining a second difference between a third position in the training text of the tail character in the noted knowledge point belonging to the first knowledge type and a fourth position in the training text of the tail character in the predicted knowledge point belonging to the first knowledge type;
and determining a first loss value according to the first difference and the second difference.
9. The method of claim 8, wherein the determining a first loss value from the first difference and the second difference comprises:
determining a first sub-loss value according to the first difference and the second difference;
determining a second sub-loss value based on a third difference between the first text and the second text; the first text is obtained by removing the first characters and the tail characters of the marked knowledge points belonging to the first knowledge type from the training text, and the second text is obtained by removing the first characters and the tail characters of the predicted knowledge points belonging to the first knowledge type from the training text;
determining a type loss value of the first knowledge type according to the first sub-loss value and the second sub-loss value;
and determining the first loss value according to the type loss value of each first knowledge type.
10. The method of claim 9, wherein the generating a second sub-loss value from a third difference between the first text and the second text comprises:
determining a character loss value of each character in the first text according to a third difference between the first text and the second text;
determining target characters from the characters of the first text according to the set proportion;
and determining the second sub-loss value according to the character loss value of each target character.
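To make the loss composition of claims 8-10 concrete, here is a toy numeric sketch. The absolute-difference and top-k choices are assumptions for illustration only: a real implementation would typically use cross-entropy over position distributions, and the claims do not specify how the "set proportion" of target characters is selected.

```python
def first_sub_loss(gold_start, pred_start, gold_end, pred_end):
    # First difference: head-character positions; second difference:
    # tail-character positions.
    return abs(gold_start - pred_start) + abs(gold_end - pred_end)


def second_sub_loss(char_losses, proportion=0.5):
    # Determine "target characters" as a set proportion of characters
    # (here, assumed to be those with the highest per-character losses)
    # and average their character loss values.
    k = max(1, int(len(char_losses) * proportion))
    selected = sorted(char_losses, reverse=True)[:k]
    return sum(selected) / len(selected)


def type_loss(gold_span, pred_span, char_losses):
    # Type loss for one first knowledge type: combine the two sub-losses.
    return first_sub_loss(gold_span[0], pred_span[0],
                          gold_span[1], pred_span[1]) + second_sub_loss(char_losses)
```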
11. The method of claim 6, wherein the jointly training the knowledge extraction models of the plurality of knowledge types based on the labeled knowledge points for each of the labeled knowledge types and the predicted knowledge points for each of the predicted knowledge types comprises:
determining third loss values of the knowledge types according to the labeling knowledge points under the labeling knowledge types and the prediction knowledge points under the prediction knowledge types;
determining a fourth loss value according to the third loss values of the plurality of knowledge types;
and carrying out joint training on the knowledge extraction models of the knowledge types according to the fourth loss value.
12. A knowledge extraction apparatus, the apparatus comprising:
a first acquisition module, configured to acquire a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted;
the splicing module is used for splicing the text to be extracted and the prompt information to obtain an input text;
the extraction module is used for extracting the characteristics of the input text to obtain semantic characteristics of the input text;
the extraction module is used for extracting knowledge from the semantic features by adopting knowledge extraction models corresponding to a plurality of knowledge types respectively so as to obtain knowledge points under at least one knowledge type in the plurality of knowledge types;
the knowledge extraction model comprises a first classification network and a second classification network, and the extraction module is specifically configured to:
for any knowledge type in the plurality of knowledge types, classifying the semantic features by adopting a first classification network corresponding to the knowledge type to obtain a first classification label, wherein the first classification label is used for indicating the initial position of a knowledge point in the text to be extracted;
classifying the semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label, wherein the second classification label is used for indicating the end position of the knowledge point in the text to be extracted;
determining knowledge points under the knowledge type according to the first classification tag and the second classification tag;
the knowledge extraction model comprises a sequence annotation model, and the extraction module is specifically used for:
for any knowledge type in the plurality of knowledge types, performing sequence labeling on the semantic features by adopting a sequence labeling model corresponding to the knowledge type to obtain a first labeling label and a second labeling label;
determining knowledge points under the knowledge type according to the first labeling label and the second labeling label;
the first labeling label is used for indicating the initial position of the knowledge point in the text to be extracted, and the second labeling label is used for indicating the end position of the knowledge point in the text to be extracted.
13. The apparatus of claim 12, wherein the extraction module is specifically configured to:
and starting to intercept the text to be extracted from the initial position of the knowledge point indicated by the first classification label to intercept the text to be extracted to the end position of the knowledge point indicated by the second classification label so as to obtain the knowledge point under the knowledge type.
14. The apparatus of claim 12, wherein the extraction module is specifically configured to:
and intercepting the text to be extracted from the initial position of the knowledge point indicated by the first labeling label to the end position of the knowledge point indicated by the second labeling label so as to obtain the knowledge point under the knowledge type.
15. The apparatus according to any one of claims 12-14, wherein the stitching module is specifically configured to:
adding a separator before the first character of the text to be extracted;
and splicing the prompt information before the separator, wherein the last character of the prompt information is adjacent to the separator.
16. The apparatus of any of claims 12-14, wherein the apparatus further comprises:
a second obtaining module, configured to obtain, for any one of the at least one knowledge type, a candidate resource matching the knowledge type from a resource pool;
the determining module is used for determining target resources matched with knowledge points under the knowledge type from the candidate resources;
and the display module is used for displaying the target resource.
17. A training apparatus for a knowledge extraction model, the apparatus comprising:
an acquisition module, configured to acquire training texts and prompt information, wherein the prompt information is used for indicating information to be extracted in the training texts, and the annotation information of the training texts is used for indicating the corresponding relation between at least one annotation knowledge type and annotation knowledge points;
the splicing module is used for splicing the training text and the prompt information to obtain an input text;
the extraction module is used for extracting the characteristics of the input text to obtain semantic characteristics of the input text;
the extraction module is used for extracting knowledge from the semantic features by adopting knowledge extraction models corresponding to a plurality of knowledge types to obtain predicted knowledge points under at least one predicted knowledge type in the plurality of knowledge types;
the training module is used for carrying out joint training on the knowledge extraction models of the knowledge types according to the labeling knowledge points under the labeling knowledge types and the prediction knowledge points under the prediction knowledge types;
the knowledge extraction model comprises a first classification network and a second classification network, and the extraction module is specifically configured to:
for any knowledge type in the plurality of knowledge types, classifying the semantic features by adopting a first classification network corresponding to the knowledge type to obtain a first classification label of each character in the training text;
classifying the semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label of each character;
under the condition that each first classification label comprises a first target label and each second classification label comprises a second target label, taking the knowledge type as the predicted knowledge type, and determining a predicted knowledge point under the predicted knowledge type according to the first classification label and the second classification label;
the first target label is used for indicating that the corresponding character is the first character of the predicted knowledge point, and the second target label is used for indicating that the corresponding character is the last character of the predicted knowledge point;
the knowledge extraction model comprises a sequence annotation model, and the extraction module is specifically used for:
inputting the semantic features into a sequence labeling model corresponding to the knowledge types aiming at any knowledge type in the plurality of knowledge types to obtain a prediction label of each character in the training text output by the sequence labeling model corresponding to the knowledge type;
under the condition that the prediction labels of the characters comprise a first prediction label and a second prediction label, taking the knowledge type as a prediction knowledge type, and determining a prediction knowledge point under the prediction knowledge type according to the first prediction label and the second prediction label;
the first prediction tag is used for indicating that the corresponding character is the first character of the predicted knowledge point, and the second prediction tag is used for indicating that the corresponding character is the last character of the predicted knowledge point.
18. The apparatus of claim 17, wherein the training module is specifically configured to:
comparing each labeling knowledge type with each predicting knowledge type, so as to take a labeling knowledge type and a predicting knowledge type that are the same as a first knowledge type, and/or take a labeling knowledge type and a predicting knowledge type that are different as a second knowledge type;
generating a first loss value according to the difference between the predicted knowledge point and the marked knowledge point belonging to the first knowledge type, and/or generating a second loss value according to the predicted knowledge point or the marked knowledge point belonging to the second knowledge type;
and carrying out joint training on the knowledge extraction models of the knowledge types according to the first loss value and/or the second loss value.
19. The apparatus of claim 18, wherein the training module is specifically configured to:
determining, for any of the first knowledge types, a first difference between a first position of a first character in the labeled knowledge point belonging to the first knowledge type in the training text and a second position of the first character in the predicted knowledge point belonging to the first knowledge type in the training text;
determining a second difference between a third position in the training text of the tail character in the noted knowledge point belonging to the first knowledge type and a fourth position in the training text of the tail character in the predicted knowledge point belonging to the first knowledge type;
and determining a first loss value according to the first difference and the second difference.
20. The apparatus of claim 19, wherein the training module is specifically configured to:
determining a first sub-loss value according to the first difference and the second difference;
determining a second sub-loss value based on a third difference between the first text and the second text; the first text is obtained by removing the first characters and the tail characters of the marked knowledge points belonging to the first knowledge type from the training text, and the second text is obtained by removing the first characters and the tail characters of the predicted knowledge points belonging to the first knowledge type from the training text;
determining a type loss value of the first knowledge type according to the first sub-loss value and the second sub-loss value;
and determining the first loss value according to the type loss value of each first knowledge type.
21. The apparatus of claim 20, wherein the training module is specifically configured to:
determining a character loss value of each character in the first text according to a third difference between the first text and the second text;
determining target characters from the characters of the first text according to the set proportion;
and determining the second sub-loss value according to the character loss value of each target character.
22. The apparatus of claim 17, wherein the training module is specifically configured to:
determining third loss values of the knowledge types according to the labeling knowledge points under the labeling knowledge types and the prediction knowledge points under the prediction knowledge types;
determining a fourth loss value according to the third loss values of the plurality of knowledge types;
and carrying out joint training on the knowledge extraction models of the knowledge types according to the fourth loss value.
23. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the knowledge extraction method of any one of claims 1-5 or to perform the training method of the knowledge extraction model of any one of claims 6-11.
24. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the knowledge extraction method of any one of claims 1-5 or to perform the training method of the knowledge extraction model of any one of claims 6-11.
CN202211466616.9A 2022-11-22 2022-11-22 Knowledge extraction method, training device, training equipment and training medium for model Active CN115688920B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211466616.9A CN115688920B (en) 2022-11-22 2022-11-22 Knowledge extraction method, training device, training equipment and training medium for model

Publications (2)

Publication Number Publication Date
CN115688920A CN115688920A (en) 2023-02-03
CN115688920B true CN115688920B (en) 2023-08-25

Family

ID=85053438


Country Status (1)

Country Link
CN (1) CN115688920B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116244517B (en) * 2023-03-03 2023-11-28 北京航空航天大学 Multi-scene multi-task model training method based on hierarchical information extraction network
CN116483982B (en) * 2023-06-25 2023-10-13 北京中关村科金技术有限公司 Knowledge question-answering method, knowledge question-answering device, electronic equipment and readable storage medium
CN117349425B (en) * 2023-12-04 2024-03-22 北京仁科互动网络技术有限公司 Knowledge item generation method, device, equipment and storage medium
CN117421426B (en) * 2023-12-18 2024-03-05 广东信聚丰科技股份有限公司 Knowledge point labeling method and system based on artificial intelligence assistance

Citations (9)

Publication number Priority date Publication date Assignee Title
GB201615371D0 (en) * 2015-11-11 2016-10-26 Adobe Systems Inc Structured knowledge modeling and extraction from images
CN111881298A (en) * 2020-08-04 2020-11-03 上海交通大学 Semi-structured text processing and analyzing method
CN112948535A (en) * 2019-12-10 2021-06-11 复旦大学 Method and device for extracting knowledge triples of text and storage medium
CN113392182A (en) * 2021-05-11 2021-09-14 宜通世纪物联网研究院(广州)有限公司 Knowledge matching method, device, equipment and medium fusing context semantic constraints
CN113656556A (en) * 2021-08-20 2021-11-16 上海大参林医疗健康科技有限公司 Text feature extraction method and knowledge graph construction method
CN114417785A (en) * 2021-12-27 2022-04-29 科大讯飞股份有限公司 Knowledge point annotation method, model training method, computer device, and storage medium
CN114595686A (en) * 2022-03-11 2022-06-07 北京百度网讯科技有限公司 Knowledge extraction method, and training method and device of knowledge extraction model
CN114610894A (en) * 2022-01-26 2022-06-10 清华大学 Multi-task joint knowledge mining method and device based on discourse context
CN114969371A (en) * 2022-05-31 2022-08-30 北京智谱华章科技有限公司 Heat sorting method and device of combined knowledge graph

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US8335754B2 (en) * 2009-03-06 2012-12-18 Tagged, Inc. Representing a document using a semantic structure


Non-Patent Citations (1)

Title
A Novel and Efficient Entity-Relation Extraction Algorithm for Library Knowledge Graphs; Hu Daiwang, Jiao Yiyuan, Li Yanni; Journal of Xidian University (Natural Science Edition); pp. 75-83 *

Also Published As

Publication number Publication date
CN115688920A (en) 2023-02-03

Similar Documents

Publication Publication Date Title
JP7127106B2 (en) Question answering process, language model training method, apparatus, equipment and storage medium
CN115688920B (en) Knowledge extraction method, training device, training equipment and training medium for model
CN112668671B (en) Method and device for acquiring pre-training model
US10831796B2 (en) Tone optimization for digital content
US20210312139A1 (en) Method and apparatus of generating semantic feature, method and apparatus of training model, electronic device, and storage medium
EP3896618A1 (en) Method for generating user interactive information processing model and method for processing user interactive information
CN111221984A (en) Multimodal content processing method, device, equipment and storage medium
KR20210091076A (en) Method and apparatus for processing video, electronic device, medium and computer program
US20220027575A1 (en) Method of predicting emotional style of dialogue, electronic device, and storage medium
US20220358292A1 (en) Method and apparatus for recognizing entity, electronic device and storage medium
CN113590776A (en) Text processing method and device based on knowledge graph, electronic equipment and medium
CN111859953A (en) Training data mining method and device, electronic equipment and storage medium
CN114092759A (en) Training method and device of image recognition model, electronic equipment and storage medium
EP4057283A2 (en) Method for detecting voice, method for training, apparatuses and smart speaker
US20230008897A1 (en) Information search method and device, electronic device, and storage medium
CN112507090A (en) Method, apparatus, device and storage medium for outputting information
WO2023155678A1 (en) Method and apparatus for determining information
CN113392179A (en) Text labeling method and device, electronic equipment and storage medium
CN112380104A (en) User attribute identification method and device, electronic equipment and storage medium
CN114860913B (en) Intelligent question-answering system construction method, question-answering processing method and device
CN111524043A (en) Method and device for automatically generating litigation risk assessment questionnaire
CN115033739A (en) Search method, model training method, device, electronic equipment and medium
CN110991183A (en) Method, device, equipment and storage medium for determining predicate of problem
CN112712056A (en) Video semantic analysis method and device, storage medium and electronic equipment
CN117436438A (en) Emotion analysis method, training method and device for large language model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant