CN115688920A - Knowledge extraction method, model training method, device, equipment and medium - Google Patents

Publication number
CN115688920A (application CN202211466616.9A; granted as CN115688920B)
Authority
CN (China)
Prior art keywords
knowledge, type, text, character, label
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted; Active
Other languages
Chinese (zh)
Inventors
张鑫睿, 刘欢, 陈坤斌, 何伯磊
Current and original assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu International Technology Shenzhen Co., Ltd.
Events
Application filed by Baidu International Technology Shenzhen Co., Ltd.; priority to CN202211466616.9A; publication of CN115688920A; application granted; publication of CN115688920B; anticipated expiration.


Abstract

The present disclosure provides a knowledge extraction method, a training method for a knowledge extraction model, and a corresponding apparatus, device, and medium, and relates to the fields of natural language processing, deep learning, cloud computing, and the like. The specific implementation scheme is as follows: acquiring a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted from the text to be extracted; splicing the text to be extracted and the prompt information to obtain an input text; performing feature extraction on the input text to obtain semantic features of the input text; and performing knowledge extraction on the semantic features by respectively adopting knowledge extraction models corresponding to a plurality of knowledge types, so as to obtain knowledge points under at least one of the plurality of knowledge types. In this way, knowledge points matching the prompt information can be extracted from the text to be extracted based on the knowledge extraction models corresponding to the known knowledge types, and knowledge points under multiple knowledge types can be extracted jointly to meet the extraction requirements of different users.

Description

Knowledge extraction method, model training method, device, equipment and medium
Technical Field
The present disclosure relates to the field of artificial intelligence, and in particular to the technical fields of natural language processing, deep learning, cloud computing, and the like, and more particularly to a knowledge extraction method, a training method for a knowledge extraction model, and a corresponding apparatus, device, and medium.
Background
In enterprise office scenarios, the texts accumulated by employees (such as knowledge base documents, meeting content, daily reports, weekly reports, and OKRs (Objectives and Key Results)) contain points of attention in many specific fields, and employees in different functional roles have different points of attention according to their work directions. How to effectively mine useful key information from these unstructured texts to assist knowledge recommendation and improve employees' office efficiency is therefore very important.
Disclosure of Invention
The present disclosure provides a knowledge extraction method, a training method for a knowledge extraction model, and a corresponding apparatus, device, and medium.
According to an aspect of the present disclosure, there is provided a knowledge extraction method including:
acquiring a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted;
splicing the text to be extracted and the prompt information to obtain an input text;
performing feature extraction on the input text to obtain semantic features of the input text;
and performing knowledge extraction on the semantic features by respectively adopting knowledge extraction models corresponding to a plurality of knowledge types, so as to obtain knowledge points under at least one of the plurality of knowledge types.
According to another aspect of the present disclosure, there is provided a training method of a knowledge extraction model, including:
acquiring a training text and prompt information, wherein the prompt information is used for indicating information to be extracted from the training text, and annotation information of the training text is used for indicating a correspondence between at least one labeled knowledge type and its labeled knowledge points;
splicing the training text and the prompt information to obtain an input text;
performing feature extraction on the input text to obtain semantic features of the input text;
respectively extracting knowledge from the semantic features by adopting knowledge extraction models corresponding to a plurality of knowledge types to obtain a predicted knowledge point under at least one predicted knowledge type in the plurality of knowledge types;
and performing joint training on the knowledge extraction models of the multiple knowledge types according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type.
According to still another aspect of the present disclosure, there is provided a knowledge extraction apparatus including:
an acquisition module, configured to acquire a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted from the text to be extracted;
the splicing module is used for splicing the text to be extracted and the prompt information to obtain an input text;
a feature extraction module, configured to perform feature extraction on the input text to obtain semantic features of the input text;
and a knowledge extraction module, configured to perform knowledge extraction on the semantic features by respectively adopting knowledge extraction models corresponding to a plurality of knowledge types, so as to obtain knowledge points under at least one of the plurality of knowledge types.
According to still another aspect of the present disclosure, there is provided a training apparatus of a knowledge extraction model, including:
the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring a training text and prompt information, the prompt information is used for indicating information to be extracted in the training text, and the marking information of the training text is used for indicating the corresponding relation between at least one marking knowledge type and marking knowledge points;
the splicing module is used for splicing the training text and the prompt information to obtain an input text;
a feature extraction module, configured to perform feature extraction on the input text to obtain semantic features of the input text;
a knowledge extraction module, configured to perform knowledge extraction on the semantic features by respectively adopting knowledge extraction models corresponding to a plurality of knowledge types, so as to obtain predicted knowledge points under at least one predicted knowledge type of the plurality of knowledge types;
and a training module, configured to perform joint training on the knowledge extraction models of the plurality of knowledge types according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type.
According to yet another aspect of the present disclosure, there is provided an electronic device comprising at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of knowledge extraction as set forth in the above-described aspect of the disclosure or a method of training a knowledge extraction model as set forth in the above-described aspect of the disclosure.
According to still another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the knowledge extraction method set forth in the above aspect of the present disclosure or the training method of a knowledge extraction model set forth in the above aspect of the present disclosure.
According to yet another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method of knowledge extraction set forth in the above-mentioned aspect of the present disclosure, or implements the method of training a knowledge extraction model set forth in the above-mentioned aspect of the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic flow chart of a knowledge extraction method according to a first embodiment of the present disclosure;
FIG. 2 is a schematic flow chart of a knowledge extraction method according to a second embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the knowledge extraction framework corresponding to the knowledge extraction model provided by the present disclosure;
FIG. 4 is a schematic flow chart of a knowledge extraction method according to a third embodiment of the present disclosure;
FIG. 5 is a schematic flow chart of a training method of a knowledge extraction model according to a fourth embodiment of the present disclosure;
FIG. 6 is a schematic flow chart of a training method of a knowledge extraction model according to a fifth embodiment of the present disclosure;
FIG. 7 is a schematic flow chart of a training method of a knowledge extraction model according to a sixth embodiment of the present disclosure;
FIG. 8 is a sampling example provided by the present disclosure;
FIG. 9 is a complete flow diagram of a Prompt + MRC based domain-specific knowledge point extraction service provided by the present disclosure;
FIG. 10 is a schematic structural diagram of a knowledge extraction apparatus according to a seventh embodiment of the present disclosure;
FIG. 11 is a schematic structural diagram of a training apparatus for a knowledge extraction model according to an eighth embodiment of the present disclosure;
FIG. 12 is a schematic block diagram of an example electronic device that may be used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In enterprise office scenarios, the texts accumulated by employees contain points of attention in many specific fields. How to effectively mine useful key information from these unstructured texts and recommend appropriate related documents, solutions, and the like according to that attention information is one of the key concerns of enterprise knowledge management.
Although employees accumulate a large number of texts, the effective information contained in them is sparse, and a unified method for acquiring knowledge information (also called knowledge points) under various knowledge types is lacking, so it is difficult to construct a domain-specific information resource pool in actual scenarios. Therefore, how to mine effective information from the texts generated by employees to assist knowledge recommendation and improve office efficiency is very important.
At present, knowledge information extraction tasks are mainly divided into the "nested" information extraction task (a knowledge point may belong to multiple knowledge types; for example, "knowledge recommendation" belongs to both the product and algorithm knowledge types) and the "non-nested" information extraction task (a knowledge point belongs to only one knowledge type).
The "non-nested" information extraction task is mostly solved by sequence labeling, with the following basic framework: (1) first, the input text is encoded by a deep neural network (such as a CNN (Convolutional Neural Network), an LSTM (Long Short-Term Memory) network, or a Transformer) to extract the semantic features of the input text; (2) then, a CRF (Conditional Random Field) is used as the sequence labeling layer to predict the boundaries of knowledge points (e.g., in the BIO scheme, B marks the start position of a knowledge point, I marks its other positions, and O marks other characters) so as to parse the final result.
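The BIO decoding step described above can be sketched as follows. This is a minimal, self-contained illustration of how BIO tags are parsed into knowledge-point spans, not the patent's implementation; the example characters and tags are assumptions.

```python
def decode_bio(tokens, tags):
    """Decode BIO tags into (knowledge_point, start, end) spans.

    B marks the first character of a knowledge point, I marks its
    remaining characters, and O marks all other characters.
    """
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":            # a new knowledge point begins here
            if start is not None:
                spans.append(("".join(tokens[start:i]), start, i))
            start = i
        elif tag == "O":          # outside any knowledge point
            if start is not None:
                spans.append(("".join(tokens[start:i]), start, i))
                start = None
        # tag == "I": continue the current span
    if start is not None:
        spans.append(("".join(tokens[start:]), start, len(tokens)))
    return spans

tokens = list("完成模型分布式部署")
tags = ["O", "O", "O", "O", "B", "I", "I", "I", "I"]
print(decode_bio(tokens, tags))  # [('分布式部署', 4, 9)]
```

Because only one label is assigned per character, this decoding inherently cannot represent two overlapping knowledge points, which is exactly the limitation discussed below for the nested task.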
It should be noted that conventional neural network methods represented by CNN and LSTM require a large amount of supervised corpora in the target field, with network parameters trained from scratch, so that the model can learn task-related information well. The emergence of large-scale pre-trained models (such as BERT (Bidirectional Encoder Representations from Transformers) and ERNIE (Enhanced Language Representation with Informative Entities, a knowledge-enhanced semantic representation model)) not only alleviates the dilemma of needing large-scale corpora for training from scratch, but also enriches the encoded representation of texts, bringing stronger semantic information.
The "nested" information extraction task is typically solved by constructing a pipeline model structure: first, all entity information in the input text is extracted, and then the entity type of each entity is determined.
Although information extraction tasks are widespread in actual business scenarios and neural-network-based information extraction has been studied extensively, the existing solutions to the "nested" and "non-nested" information extraction tasks still have the following shortcomings:
On the one hand, the sequence labeling method needs to predict the label corresponding to each character of a knowledge point (e.g., an entity) (e.g., in the BIO scheme, B marks the entity start position, I marks its other positions, and O marks other characters). That is, only a single label can be assigned to a given character, so this method can only handle the "non-nested" information extraction task. To process the "nested" task by sequence labeling, multiple CRF layers would need to be trained at the output layer, which increases model complexity.
On the other hand, the current mature solutions to the "nested" information extraction task generally adopt a pipeline structure (first extracting all entity information in the input text, then classifying the entities). Although this method is simple, errors propagate between the two split sub-tasks: an error in entity extraction directly reduces the overall accuracy. Moreover, handling the "nested" and "non-nested" information extraction tasks separately is itself far from simple.
In view of at least one of the above problems, the present disclosure provides a knowledge extraction method, a knowledge extraction model training method, an apparatus, a device, and a medium.
A knowledge extraction method, a knowledge extraction model training method, an apparatus, a device, and a medium of the embodiments of the present disclosure are described below with reference to the drawings.
Fig. 1 is a schematic flow chart of a knowledge extraction method according to a first embodiment of the present disclosure.
In the embodiments of the present disclosure, the knowledge extraction method is described as being configured in a knowledge extraction apparatus, which can be applied to any electronic device, so that the electronic device can perform the knowledge extraction function.
The electronic device may be any device having a computing capability, for example, a PC (Personal Computer), a mobile terminal, a server, and the like, and the mobile terminal may be a hardware device having various operating systems, touch screens, and/or display screens, such as an in-vehicle device, a mobile phone, a tablet Computer, a Personal digital assistant, and a wearable device.
As shown in fig. 1, the knowledge extraction method may include the steps of:
step 101, acquiring a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted.
In the embodiment of the present disclosure, the text to be extracted may include knowledge base documents, conference contents, daily reports, weekly reports, OKR, and the like, which is not limited by the present disclosure.
In the embodiment of the present disclosure, an obtaining manner of the text to be extracted is not limited, for example, the text to be extracted may be collected online by using a web crawler technology, or the text to be extracted may also be collected offline, or the text to be extracted may also be input by a user, and the like, which is not limited by the present disclosure.
In the embodiment of the present disclosure, the prompt message may be used to indicate information to be extracted from the text to be extracted. Wherein, the prompt message can be set according to the requirement.
As an example, the prompt information may be set according to a specific domain direction (e.g., a specific application domain direction). For example, when the application domain direction is the algorithm direction, the prompt may be "skill attention points of the algorithm direction"; when the application domain direction is the recommendation service direction, the prompt may be "attention points of the recommendation service direction"; and so on, which are not listed here one by one.
As another example, to be applicable to all application domain directions or application scenarios, a broad (generic) prompt may be set or constructed for use across different knowledge-type scenarios. For example, the prompt may be "knowledge points in text", "entities in text", or the like.
For example, suppose the text to be extracted is "itg updates the user behavior model to improve the recognition rate; this relies on the map star cloud service, and communication with the map star cloud team has been completed to obtain performance test permission." The prompt may then be "knowledge points in text".
And 102, splicing the text to be extracted and the prompt information to obtain an input text.
In the embodiment of the disclosure, the text to be extracted and the prompt information can be spliced to obtain the input text.
Continuing the above example, the text to be extracted and the prompt information are spliced to obtain the input text. For example, the prompt information may be spliced before the text, giving "knowledge points in text" followed by the text to be extracted, or after it, giving the text to be extracted followed by "knowledge points in text".
It should be noted that, the above example of splicing the text to be extracted and the prompt information is only an example, and in practical application, a specific splicing manner may be selected as needed to splice the text to be extracted and the prompt information, which is not limited in this disclosure.
And 103, performing feature extraction on the input text to obtain semantic features of the input text.
In the embodiment of the present disclosure, feature extraction may be performed on an input text to obtain semantic features of the input text, for example, feature extraction may be performed on the input text by using a BERT model, an Ernie model, and the like to obtain the semantic features of the input text.
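As an illustration of this step, the sketch below stands in for a pretrained encoder and shows only the shape of the result: one semantic feature vector per character of the input text. The random-projection "encoder" and the toy hidden size are assumptions for illustration; in practice this step would run a model such as BERT or ERNIE over the spliced input text.

```python
import numpy as np

HIDDEN = 8  # toy hidden size; real encoders such as BERT typically use 768


def encode(text, hidden=HIDDEN, seed=0):
    """Stand-in for a pretrained encoder: maps each character of the
    input text to a deterministic `hidden`-dimensional feature vector."""
    rng = np.random.default_rng(seed)
    # one deterministic vector per distinct character of the text
    vocab = {ch: i for i, ch in enumerate(dict.fromkeys(text))}
    table = rng.standard_normal((len(vocab), hidden))
    return np.stack([table[vocab[ch]] for ch in text])


features = encode("完成模型分布式部署")
print(features.shape)  # (9, 8): one semantic feature vector per character
```

The key property the later steps rely on is this per-character shape: each classification head below scores every character of the text independently.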
And 104, respectively adopting knowledge extraction models corresponding to the multiple knowledge types to extract the knowledge of the semantic features so as to obtain knowledge points under at least one knowledge type in the multiple knowledge types.
In embodiments of the present disclosure, knowledge points may include entities, phrases, important words, text fragments, and the like.
It should be noted that the number of knowledge points may be one or more, and the present disclosure is not limited thereto.
In the embodiment of the present disclosure, the knowledge type may be used to indicate a type to which the knowledge point belongs, for example, the knowledge point is "distributed deployment", the type to which the knowledge point belongs (i.e., knowledge type) may be "backend", and for example, the knowledge point is "knowledge recommendation", and the type to which the knowledge point belongs (i.e., knowledge type) may be "product" or "algorithm".
In the disclosed embodiment, the plurality of knowledge types may be preset, i.e., the plurality of knowledge types are known.
In the disclosed embodiments, for any one of a plurality of knowledge types, the any one knowledge type may have a corresponding knowledge extraction model.
In the embodiment of the disclosure, knowledge extraction may be performed on semantic features by using knowledge extraction models corresponding to a plurality of knowledge types, respectively, to obtain a knowledge point in at least one of the plurality of knowledge types.
That is, after knowledge extraction is performed on semantic features by using knowledge extraction models corresponding to a plurality of knowledge types, knowledge points under some knowledge types of the plurality of knowledge types may be obtained, or knowledge points under all knowledge types of the plurality of knowledge types may also be obtained, which is not limited by the present disclosure.
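The per-type extraction described in step 104 can be sketched as a loop over type-specific extraction models that keeps only the knowledge types yielding at least one knowledge point. The extractor functions below are hypothetical stand-ins, not the disclosure's models.

```python
def extract_all(features, extractors):
    """Run one extraction model per knowledge type over the shared
    semantic features; keep only types that yield knowledge points."""
    results = {}
    for ktype, model in extractors.items():
        points = model(features)
        if points:  # some types may find nothing in this text
            results[ktype] = points
    return results


# Hypothetical stand-in extractors for three known knowledge types.
extractors = {
    "backend": lambda f: ["distributed deployment"],
    "product": lambda f: ["knowledge recommendation"],
    "algorithm": lambda f: [],  # nothing found under this type
}
print(extract_all(None, extractors))
# {'backend': ['distributed deployment'], 'product': ['knowledge recommendation']}
```

This mirrors the statement above: running all per-type models may populate only some of the plurality of knowledge types.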
According to the knowledge extraction method, a text to be extracted and prompt information are acquired, the prompt information indicating information to be extracted from the text to be extracted; the text to be extracted and the prompt information are spliced to obtain an input text; feature extraction is performed on the input text to obtain its semantic features; and knowledge extraction is performed on the semantic features by respectively adopting knowledge extraction models corresponding to a plurality of knowledge types, so as to obtain knowledge points under at least one of the plurality of knowledge types. In this way, knowledge points matching the prompt information can be extracted from the text to be extracted based on the knowledge extraction models corresponding to the known knowledge types, and the knowledge type of each extracted knowledge point is determined at the same time. Moreover, knowledge points under multiple knowledge types can be extracted jointly, meeting the extraction requirements of different users.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and other handling of related users' personal information are all performed with the users' consent, comply with the provisions of relevant laws and regulations, and do not violate public order and good customs.
To clearly illustrate how, in the above embodiments, knowledge extraction models corresponding to a plurality of knowledge types are respectively adopted to perform knowledge extraction on the semantic features so as to obtain knowledge points under at least one of the plurality of knowledge types, the present disclosure further provides a knowledge extraction method.
Fig. 2 is a schematic flow chart of a knowledge extraction method according to a second embodiment of the disclosure.
As shown in fig. 2, the knowledge extraction method may include the steps of:
step 201, acquiring a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted.
And step 202, splicing the text to be extracted and the prompt information to obtain an input text.
For the explanation of step 201 to step 202, reference may be made to the related description in any embodiment of the present disclosure, which is not described herein again.
In a possible implementation manner of the embodiment of the present disclosure, a separator may be added before a first character of the text to be extracted, and the prompt message may be spliced before the separator, where a last character of the prompt message may be adjacent to the separator.
In the embodiment of the present disclosure, the separator may be used to separate the text to be extracted from the prompt message, for example, may be [ SEP ].
As an example, assume the text to be extracted is "complete model distributed deployment, launch the knowledge recommendation project." and the prompt information Prompt is "text focus". A separator [SEP] is added before the first character of the text to be extracted, and the prompt information is spliced before the separator; the resulting input text may be, for example, "[CLS] text focus [SEP] complete model distributed deployment, launch the knowledge recommendation project. [SEP]", where [CLS] is the start token.
Therefore, the text to be extracted and the prompt information are connected through the segmentation symbol, namely the text to be extracted and the prompt information are spliced, and sufficient semantic interaction can be conveniently carried out between the text to be extracted and the prompt information in subsequent processing.
In another possible implementation manner of the embodiment of the present disclosure, a separator may be added after the last character of the text to be extracted, and the prompt message may be concatenated after the separator, where the first character of the prompt message may be adjacent to the separator.
Still using the above example, the input text may be, for example, "[CLS] complete model distributed deployment, launch the knowledge recommendation project. [SEP] text focus [SEP]", where [CLS] is the start token.
Therefore, the text to be extracted and the prompt information can be spliced in different modes, and the flexibility and the applicability of the method can be improved.
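The two splicing variants above can be sketched as a single helper. The exact token spelling ([CLS] and [SEP] without surrounding spaces) is an assumption; the example strings follow the "text focus" example in this section.

```python
def splice(text, prompt, prompt_first=True):
    """Join the prompt and the text to be extracted with [SEP]
    separators, prefixed with the [CLS] start token.  `prompt_first`
    selects which of the two splicing modes described above is used."""
    if prompt_first:
        return f"[CLS]{prompt}[SEP]{text}[SEP]"
    return f"[CLS]{text}[SEP]{prompt}[SEP]"


text = "complete model distributed deployment, launch the knowledge recommendation project."
prompt = "text focus"
print(splice(text, prompt))                      # prompt before the separator
print(splice(text, prompt, prompt_first=False))  # prompt after the separator
```

Either order yields a single sequence, which lets the encoder perform full semantic interaction between the prompt and the text in the feature extraction step.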
Step 203, performing feature extraction on the input text to obtain semantic features of the input text.
For the explanation of step 203, reference may be made to the related description in any embodiment of the present disclosure, which is not described herein again.
Step 204, for any knowledge type of the plurality of knowledge types, classifying the semantic features by adopting the first classification network corresponding to that knowledge type to obtain a first classification label.
It should be noted that the explanation of the knowledge points and knowledge types in step 104 is also applicable to this embodiment, and will not be described herein.
In the embodiment of the present disclosure, the first classification tag may be used to indicate a starting position of the knowledge point in the text to be extracted, that is, to indicate a position of an initial character of the knowledge point in the text to be extracted.
In this disclosure, for any knowledge type of the multiple knowledge types, the knowledge extraction model corresponding to the any knowledge type may include a first classification network, and the semantic features may be classified by using the first classification network corresponding to the any knowledge type to obtain a first classification label.
It should be noted that, when the semantic features are classified by using the first classification network corresponding to any knowledge type, the number of the obtained first classification tags may be, but is not limited to, one, and the disclosure does not limit this.
As an example, for any knowledge type, the semantic features may be classified using the first classification network corresponding to that knowledge type to obtain a classification label for each character in the text to be extracted, and the first classification label may be determined from these labels. For example, the first classification network may be a binary classification network; assuming an output label of 0 indicates that the corresponding character is not the first character of a knowledge point and an output label of 1 indicates that it is, the first classification label is the label whose value is 1.
And step 205, classifying the semantic features by using a second classification network corresponding to the knowledge type to obtain a second classification label.
In the embodiment of the present disclosure, the second classification tag may be used to indicate an end position of the knowledge point in the text to be extracted, that is, to indicate a position of a tail character of the knowledge point in the text to be extracted.
In this disclosure, for any knowledge type of the multiple knowledge types, the knowledge extraction model corresponding to the any knowledge type may include a second classification network, and the second classification network corresponding to the any knowledge type may be used to classify the semantic features to obtain a second classification label.
It should be noted that, when the semantic features are classified by using the second classification network corresponding to any knowledge type, the number of the obtained second classification tags may be, but is not limited to, one, and the disclosure does not limit this.
As an example, for any knowledge type, the semantic features may be classified by using the second classification network corresponding to the knowledge type to obtain a classification tag for each character in the text to be extracted, and the second classification tag may be determined from these classification tags. For example, the second classification network may be a binary classification network; assuming that an output label of 0 indicates that the corresponding character is not the tail character of a knowledge point, and an output label of 1 indicates that it is, the second classification tag may be the label 1.
And step 206, determining knowledge points under the knowledge types according to the first classification tags and the second classification tags.
In the embodiment of the present disclosure, knowledge points under any one of the above knowledge types may be determined according to the first classification tag and the second classification tag.
As a possible implementation manner, for any knowledge type, the text to be extracted may be intercepted from the start position of the knowledge point indicated by the first classification tag corresponding to the knowledge type to the end position of the knowledge point indicated by the second classification tag corresponding to the knowledge type, so as to obtain the knowledge point in the knowledge type.
As an example, when the obtained number of the first classification tag and the second classification tag is one, the text to be extracted may be directly intercepted from the start position of the knowledge point indicated by the first classification tag to the end position of the knowledge point indicated by the second classification tag, so as to obtain the knowledge point in the knowledge type.
As another example, when multiple first classification tags and multiple second classification tags are obtained, for example, first classification tag 1 and first classification tag 2 together with second classification tag 1 and second classification tag 2, each start position may be paired with the nearest end position. The text to be extracted may be intercepted from the start position indicated by first classification tag 1 to the end position indicated by the second classification tag closest to it, say second classification tag 2 (i.e., the character corresponding to second classification tag 2 is closer to the character corresponding to first classification tag 1 than the character corresponding to second classification tag 1 is), so as to obtain knowledge point 1 under the knowledge type; similarly, the text to be extracted may be intercepted from the start position indicated by first classification tag 2 to the end position indicated by the second classification tag closest to it (second classification tag 1), so as to obtain knowledge point 2 under the knowledge type.
It should be noted that the interception direction of the knowledge points under the knowledge type in the text to be extracted may be from left to right or from right to left, which is not limited in this disclosure.
Therefore, the knowledge points under the knowledge types can be effectively and accurately determined through the first classification label and the second classification label.
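The interception rule described above (each predicted start position paired with the nearest end position at or after it) can be sketched as follows; the function name and example indices are illustrative assumptions:

```python
def extract_spans(text, start_positions, end_positions):
    """Pair each predicted start position with the nearest end position
    at or after it (left-to-right interception) and slice the text."""
    spans = []
    for s in start_positions:
        candidates = [e for e in end_positions if e >= s]
        if candidates:
            e = min(candidates, key=lambda e: e - s)  # nearest end position
            spans.append(text[s:e + 1])
    return spans

# two first classification tags (starts) and two second classification tags (ends)
extract_spans("distributed deployment, knowledge recommendation", [0, 24], [10, 32])
# → ["distributed", "knowledge"]
```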
It is to be understood that the first classification tag and the second classification tag appear in pairs, i.e., the knowledge points under each knowledge type correspond to at least one first classification tag and at least one second classification tag.
In any embodiment of the disclosure, for any knowledge type of at least one knowledge type, after determining knowledge points under the knowledge type in the text to be extracted, candidate resources matched with the knowledge type can be acquired from a resource pool; and determining target resources matched with the knowledge points under the knowledge types from the candidate resources, and displaying the target resources.
In the embodiment of the disclosure, information resources under various knowledge types can be stored in the resource pool.
In this embodiment of the present disclosure, the candidate resource may be an information resource in the resource pool that matches any of the above knowledge types, and the number of the candidate resources may be one or multiple, which is not limited by the present disclosure.
In this embodiment of the present disclosure, the target resource may be an information resource that is matched with a knowledge point in any one of the above knowledge types in the candidate resources, and the number of the target resources may be one or multiple, which is not limited by the present disclosure.
For example, in an office scene of an enterprise, after a knowledge point under a certain knowledge type is mined from a text (denoted as a text to be extracted in the present disclosure) generated by an employee a, a candidate resource matched with the knowledge type may be acquired from a resource pool, a target resource matched with the knowledge point under the knowledge type may be determined from the candidate resource, and the target resource may be presented to the employee a, so that knowledge recommendation may be accurately performed to the employee a, and the office efficiency of the employee may be improved.
In summary, after the knowledge points under each knowledge type in the text to be extracted are determined, the target resource recommended to the user can be accurately acquired from the resource pool based on each knowledge type and the knowledge points under each knowledge type, and the use experience of the user can be improved.
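A minimal sketch of this recommendation flow is given below; the resource pool contents and the substring matching rule are illustrative assumptions (a real system may use semantic matching):

```python
# Hypothetical resource pool; entries and the matching rule are assumptions.
resource_pool = [
    {"type": "backend", "title": "Distributed deployment handbook"},
    {"type": "algorithm", "title": "Knowledge recommendation models"},
    {"type": "algorithm", "title": "Ranking strategies overview"},
]

def recommend(knowledge_type, knowledge_point):
    # candidate resources: pool entries under the extracted knowledge type
    candidates = [r for r in resource_pool if r["type"] == knowledge_type]
    # target resources: candidates matching the extracted knowledge point
    return [r["title"] for r in candidates
            if knowledge_point.lower() in r["title"].lower()]

recommend("algorithm", "knowledge recommendation")  # → ["Knowledge recommendation models"]
```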
As an example, in order to simultaneously support extraction of both "nested" and "non-nested" information, the knowledge extraction model of the present disclosure may use 2 × num binary classification networks (or binary classification models) at the output end, where num is the number of knowledge types. That is, the knowledge extraction model corresponding to any knowledge type includes a first classification network for predicting the start positions of knowledge points of that knowledge type in the text to be extracted, and a second classification network for predicting their end positions. The knowledge extraction network framework corresponding to the knowledge extraction model may be as shown in fig. 3. In this way, the knowledge extraction model can not only extract the knowledge points matching the prompt information from the text to be extracted, but also determine the knowledge types to which the knowledge points belong.
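The output end described above, with one start head and one end head per knowledge type (2 × num binary classification networks in total), can be sketched as follows; the toy weights and feature vectors are illustrative assumptions, since a real model learns them during training:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class KnowledgeExtractionHeads:
    """2 x num binary heads: for each knowledge type, one head predicts
    start positions and one head predicts end positions per character."""

    def __init__(self, head_weights):
        # head_weights: {knowledge_type: (start_weight_vec, end_weight_vec)}
        self.head_weights = head_weights

    def predict(self, char_features):
        """char_features: one feature vector per character.
        Returns {knowledge_type: (start_labels, end_labels)}, 0/1 per character."""
        def dot(w, f):
            return sum(wi * fi for wi, fi in zip(w, f))
        out = {}
        for ktype, (w_start, w_end) in self.head_weights.items():
            starts = [int(sigmoid(dot(w_start, f)) >= 0.5) for f in char_features]
            ends = [int(sigmoid(dot(w_end, f)) >= 0.5) for f in char_features]
            out[ktype] = (starts, ends)
        return out
```

Because every knowledge type has its own pair of heads, overlapping ("nested") knowledge points of different types can be predicted independently of one another.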
According to the knowledge extraction method of the present disclosure, for any one of the multiple knowledge types, the first classification network corresponding to the knowledge type is used to classify the semantic features to obtain a first classification tag, where the first classification tag is used to indicate the start position of a knowledge point in the text to be extracted; the second classification network corresponding to the knowledge type is used to classify the semantic features to obtain a second classification tag, where the second classification tag is used to indicate the end position of the knowledge point in the text to be extracted; and the knowledge points under the knowledge type are determined according to the first classification tag and the second classification tag. In this way, the start and end positions of the knowledge points in the text to be extracted can be predicted by the first and second classification networks corresponding to any knowledge type, and the knowledge points matching the prompt information can be effectively and accurately determined.
In order to clearly illustrate how the knowledge extraction model corresponding to the plurality of knowledge types is respectively adopted to extract the knowledge of the semantic features so as to obtain the knowledge points under at least one knowledge type of the plurality of knowledge types, the present disclosure further provides a knowledge extraction method.
Fig. 4 is a schematic flow chart of a knowledge extraction method according to a third embodiment of the present disclosure.
As shown in fig. 4, the knowledge extraction method may include the steps of:
step 401, acquiring a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted.
And step 402, splicing the text to be extracted and the prompt information to obtain an input text.
Step 403, performing feature extraction on the input text to obtain semantic features of the input text.
The execution process of step 401 to step 403 may refer to the execution process of any embodiment of the present disclosure, and is not described herein again.
Step 404, performing sequence labeling on the semantic features by using the sequence labeling model corresponding to any knowledge type of the multiple knowledge types, so as to obtain a first annotation tag and a second annotation tag.
It should be noted that the explanation of the knowledge points and knowledge types in step 104 is also applicable to this embodiment, and will not be described herein.
In the embodiment of the present disclosure, the first annotation tag may be used to indicate the starting position of the knowledge point in the text to be extracted, for example, the first annotation tag may be represented by B (Begin).
In the embodiment of the present disclosure, the second annotation tag may be used to indicate the End position of the knowledge point in the text to be extracted, for example, the second annotation tag may be denoted by E (End).
In this disclosure, for any knowledge type of the multiple knowledge types, the knowledge extraction model corresponding to the any knowledge type may include a sequence labeling model, and sequence labeling may be performed on the semantic features by using the sequence labeling model corresponding to the any knowledge type, so as to obtain a first labeling tag and a second labeling tag.
It should be noted that, when the sequence annotation model corresponding to any knowledge type is used to perform sequence annotation on semantic features, the obtained first annotation tag and the second annotation tag appear in pairs, and the number of the first annotation tag and the number of the second annotation tag may be, but is not limited to, one, and the disclosure does not limit this.
As an example, for any knowledge type, sequence labeling may be performed on the semantic features by using the sequence labeling model corresponding to the knowledge type to obtain an annotation tag for each character in the text to be extracted, and the first annotation tag and the second annotation tag may be determined from these tags. For example, the tag B may be used as the first annotation tag, and the tag E may be used as the second annotation tag.
And step 405, determining the knowledge points under the knowledge type according to the first annotation tag and the second annotation tag.
In the embodiment of the present disclosure, the knowledge point under any one of the above knowledge types may be determined according to the first annotation tag and the second annotation tag.
As a possible implementation manner, for any knowledge type, the text to be extracted may be intercepted from the start position of the knowledge point indicated by the first annotation tag corresponding to the knowledge type to the end position of the knowledge point indicated by the second annotation tag corresponding to the knowledge type, so as to obtain the knowledge points under the knowledge type.
As an example, when the obtained numbers of the first annotation tag and the second annotation tag are both one, the text to be extracted may be directly intercepted from the start position of the knowledge point indicated by the first annotation tag, and intercepted to the end position of the knowledge point indicated by the second annotation tag, so as to obtain the knowledge point in the knowledge type.
As another example, when multiple first annotation tags and multiple second annotation tags are obtained, for example, first annotation tag 1 and first annotation tag 2 together with second annotation tag 1 and second annotation tag 2, the text to be extracted may be intercepted from the start position indicated by first annotation tag 1 to the end position indicated by the second annotation tag closest to it (second annotation tag 1), so as to obtain knowledge point 1 under the knowledge type; and intercepted from the start position indicated by first annotation tag 2 to the end position indicated by the second annotation tag closest to it (second annotation tag 2), so as to obtain knowledge point 2 under the knowledge type.
It should be noted that the interception direction of the knowledge points under the knowledge type in the text to be extracted may be from left to right or from right to left, which is not limited in this disclosure.
Therefore, the knowledge points under the knowledge type can be effectively and accurately determined through the first annotation tag and the second annotation tag.
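The interception by annotation tags can be sketched as follows, assuming one B/E/other tag per character (the function name is an illustrative assumption):

```python
def decode_be_tags(text, tags):
    """Intercept from each B tag (start position) to the nearest
    following E tag (end position), scanning left to right."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            start = i
        elif tag == "E" and start is not None:
            spans.append(text[start:i + 1])
            start = None
    return spans

decode_be_tags("find the span here", list("OOOOOOOOOBOOEOOOOO"))  # → ["span"]
```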
According to the knowledge extraction method of the present disclosure, for any one of the multiple knowledge types, sequence labeling is performed on the semantic features by using the sequence labeling model corresponding to the knowledge type to obtain a first annotation tag and a second annotation tag; and the knowledge points under the knowledge type are determined according to the first annotation tag and the second annotation tag, where the first annotation tag indicates the start position of a knowledge point in the text to be extracted, and the second annotation tag indicates its end position. In this way, the start and end positions of the knowledge points in the text to be extracted can be predicted by the sequence labeling model corresponding to any knowledge type, and the knowledge points under the knowledge type matching the prompt information can be effectively and accurately determined.
In the embodiments corresponding to the application method of the knowledge extraction model (i.e., the knowledge extraction method), the present disclosure also provides a training method of the knowledge extraction model.
Fig. 5 is a schematic flow chart of a training method of a knowledge extraction model according to a fourth embodiment of the present disclosure.
As shown in fig. 5, the training method of the knowledge extraction model may include the following steps:
step 501, acquiring a training text and prompt information, wherein the prompt information is used for indicating information to be extracted in the training text, and the labeling information of the training text is used for indicating a corresponding relation between at least one labeling knowledge type and labeling knowledge points.
In an embodiment of the present disclosure, the training text may include knowledge base documents, meeting content, daily reports, weekly reports, OKR, and the like, which is not limited by the present disclosure.
In the embodiment of the present disclosure, the manner of acquiring the training text is not limited. For example, the training text may be obtained from an existing training set, collected online (e.g., by using a web crawler technology), or provided by the user, which is not limited by the present disclosure.
In the embodiment of the present disclosure, the prompt information may be used to indicate information to be extracted from the training text, and the prompt information may be set as needed.
As an example, the prompt information may be set according to a specific domain direction (e.g., a specific application domain direction), for example, when the application domain direction is an algorithm direction, the prompt information may be "a skill attention point of the algorithm direction", for example, when the application domain direction is a recommended service direction, the prompt information may be "a attention point of the recommended service direction", and so on, which are not listed herein.
As another example, in order to be applicable to all application domain directions or application scenarios, a general-purpose prompt may be set or constructed so as to apply to different knowledge type scenarios. For example, the prompt may be "knowledge points in text", "entities in text", or the like.
In the embodiment of the present disclosure, the labeled knowledge points may be used to indicate the knowledge points included in the training text, where the knowledge points may include entities, phrases, important words, text segments, and the like, which is not limited by the present disclosure.
In the embodiment of the present disclosure, there may be one or more labeled knowledge points, and the present disclosure does not limit this.
In the embodiment of the present disclosure, the annotated knowledge type may be used to indicate a knowledge type to which a knowledge point included in the training text belongs, where the knowledge type may be, for example, "backend", "frontend", "product", "algorithm", and the like, and the present disclosure is not limited thereto.
In the embodiment of the present disclosure, the number of the annotation knowledge types may be one, or may be multiple, and the present disclosure does not limit this.
In the embodiment of the present disclosure, the labeling information of the training text may be used to indicate a correspondence between at least one labeled knowledge type and labeled knowledge points.
It should be noted that the number of the correspondence between the labeled knowledge type indicated by the labeled information of the training text and the labeled knowledge point may be one or multiple, which is not limited in this disclosure.
As an example, the training text is "Complete model distributed deployment, launch the knowledge recommendation project." The labeling information of the training text indicates the correspondence between the labeled knowledge type "backend" and the knowledge point "distributed deployment", the correspondence between the labeled knowledge type "algorithm" and "knowledge recommendation", and the correspondence between the labeled knowledge type "product" and "knowledge recommendation".
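Such correspondences can be turned into per-character supervision signals for the classification networks. The sketch below builds 0/1 start and end label vectors per knowledge type; the helper name and the use of first-occurrence substring search are illustrative assumptions:

```python
def build_labels(text, annotations):
    """annotations: list of (knowledge_type, knowledge_point) pairs.
    Returns {knowledge_type: (start_labels, end_labels)} with one
    0/1 label per character of the training text."""
    labels = {}
    for ktype, point in annotations:
        starts, ends = labels.setdefault(
            ktype, ([0] * len(text), [0] * len(text)))
        pos = text.find(point)              # first occurrence only (assumption)
        if pos >= 0:
            starts[pos] = 1                   # first character of the knowledge point
            ends[pos + len(point) - 1] = 1    # tail character of the knowledge point
    return labels

build_labels("ab knowledge cd", [("product", "knowledge")])
```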
Step 502, the training text and the prompt information are spliced to obtain an input text.
It should be noted that the method for splicing the training text and the prompt information is similar to the method for splicing the text to be extracted and the prompt information in step 202, and is not described herein again.
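The splicing step can be sketched as follows; the BERT-style [CLS]/[SEP] markers are an assumption, since the actual separators depend on the feature extraction model used:

```python
def splice(prompt, text, cls="[CLS]", sep="[SEP]"):
    """Concatenate the prompt information and the training text
    (or text to be extracted) into a single input text."""
    return f"{cls}{prompt}{sep}{text}{sep}"

splice("knowledge points in text", "Complete model distributed deployment.")
```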
Step 503, performing feature extraction on the input text to obtain semantic features of the input text.
In the embodiment of the present disclosure, feature extraction may be performed on an input text to obtain semantic features of the input text, for example, feature extraction may be performed on the input text by using a BERT model, an Ernie model, and the like to obtain the semantic features of the input text.
And 504, respectively extracting knowledge from the semantic features by using knowledge extraction models corresponding to the multiple knowledge types to obtain a predicted knowledge point under at least one predicted knowledge type in the multiple knowledge types.
In the disclosed embodiment, the predicted knowledge type may be one or more of a plurality of knowledge types, which is not limited by the present disclosure.
In the embodiment of the present disclosure, knowledge extraction may be performed on semantic features by using knowledge extraction models corresponding to a plurality of knowledge types, so that a predicted knowledge point in at least one predicted knowledge type of the plurality of knowledge types may be obtained.
And 505, performing joint training on the knowledge extraction models of the multiple knowledge types according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type.
In the embodiment of the present disclosure, joint training may be performed on knowledge extraction models of multiple knowledge types according to labeled knowledge points under each labeled knowledge type and predicted knowledge points under each predicted knowledge type.
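One common way to realize such joint training is to sum the binary cross-entropy losses of all knowledge types' start and end heads into a single objective; the sketch below is an assumption about the loss design for illustration, not the claimed method itself:

```python
import math

def bce(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over one label vector."""
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for p, y in zip(probs, labels)) / len(labels)

def joint_loss(predictions, targets):
    """predictions and targets: {knowledge_type: (start_vec, end_vec)}.
    Summing across all types optimizes the per-type models jointly."""
    total = 0.0
    for ktype, (start_probs, end_probs) in predictions.items():
        start_labels, end_labels = targets[ktype]
        total += bce(start_probs, start_labels) + bce(end_probs, end_labels)
    return total
```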
According to the training method of the knowledge extraction model of the present disclosure, a training text and prompt information are acquired, where the prompt information is used to indicate the information to be extracted from the training text, and the labeling information of the training text is used to indicate the correspondence between at least one labeled knowledge type and labeled knowledge points; the training text and the prompt information are spliced to obtain an input text; feature extraction is performed on the input text to obtain the semantic features of the input text; knowledge extraction is performed on the semantic features by using the knowledge extraction models corresponding to the multiple knowledge types, respectively, to obtain predicted knowledge points under at least one predicted knowledge type of the multiple knowledge types; and the knowledge extraction models of the multiple knowledge types are jointly trained according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type. Therefore, the knowledge extraction models are trained based on the deep learning technology, which can improve the accuracy and reliability of the model prediction results.
In order to clearly illustrate how the knowledge extraction models corresponding to the multiple knowledge types are adopted to respectively extract the knowledge of the semantic features to obtain the predicted knowledge points under at least one predicted knowledge type in the multiple knowledge types in the above embodiments of the present disclosure, the present disclosure further provides a training method of the knowledge extraction models.
Fig. 6 is a flowchart illustrating a training method of a knowledge extraction model according to a fifth embodiment of the disclosure.
As shown in fig. 6, the training method of the knowledge extraction model may include the following steps:
step 601, acquiring a training text and prompt information, wherein the prompt information is used for indicating information to be extracted in the training text, and the labeling information of the training text is used for indicating the corresponding relation between at least one labeling knowledge type and labeling knowledge points.
Step 602, the training text and the prompt information are spliced to obtain an input text.
Step 603, performing feature extraction on the input text to obtain semantic features of the input text.
The execution process of step 601 to step 603 may refer to the execution process of any embodiment of the present disclosure, and is not described herein again.
Step 604, for any knowledge type in the multiple knowledge types, classifying the semantic features by using a first classification network corresponding to the knowledge type to obtain a first classification label of each character in the training text.
In the embodiment of the present disclosure, the first classification label of each character may be used to indicate whether the corresponding character is the first character of the knowledge point under the corresponding knowledge type, that is, the first classification label of each character may be used to indicate whether the corresponding character is the starting position of the knowledge point under the corresponding knowledge type in the training text.
In this disclosure, for any knowledge type of the multiple knowledge types, the knowledge extraction model corresponding to the any knowledge type may include a first classification network, and the semantic features may be classified by using the first classification network corresponding to the any knowledge type, so as to obtain a first classification label of each character in the training text.
And 605, classifying the semantic features by using a second classification network corresponding to the knowledge type to obtain a second classification label of each character.
In the embodiment of the present disclosure, the second classification label of each character may be used to indicate whether the corresponding character is a tail character of the knowledge point in the corresponding knowledge type, that is, the second classification label of each character may be used to indicate whether the corresponding character is an end position of the knowledge point in the corresponding knowledge type in the training text.
In this disclosure, for any knowledge type of the multiple knowledge types, the knowledge extraction model corresponding to the any knowledge type may further include a second classification network, and the second classification network corresponding to the any knowledge type may be used to classify the semantic features to obtain a second classification label of each character in the training text.
It is to be understood that the first classification network and the second classification network of the knowledge extraction model are independent of each other.
And 606, taking the knowledge type as a prediction knowledge type under the condition that each first classification label comprises a first target label and each second classification label comprises a second target label, and determining a prediction knowledge point under the prediction knowledge type according to the first target label and the second target label.
Wherein the first target label may be used to indicate that the corresponding character is the first character of the predictive knowledge point.
It should be noted that, when the semantic features are classified by using the first classification network corresponding to any knowledge type, the number of the first target tags included in the first classification tag of each character in the obtained training text may be, but is not limited to, one, and the disclosure does not limit this.
For example, assume that the training text is "Complete model distributed deployment, launch the knowledge recommendation project." The semantic features may be classified by using the first classification network corresponding to one knowledge type (such as "product"), and the obtained first classification label 1 indicates that the character "knowledge" (the first character of "knowledge recommendation") is predicted to be the first character of a knowledge point under the knowledge type "product"; the semantic features may also be classified by using the first classification network corresponding to another knowledge type (such as "backend"), and the obtained first classification label 2 indicates that the character "distributed" (the first character of "distributed deployment") is predicted to be the first character of a knowledge point under the knowledge type "backend".
Wherein the second target label may be used to indicate that the corresponding character is a tail character of the predictive knowledge point.
It should be noted that, when the semantic features are classified by using the second classification network corresponding to any knowledge type, the number of the second target tags included in the second classification tags of each character in the obtained training text may be, but is not limited to, one, and the disclosure does not limit this.
For example, continuing the above example where the training text is "Complete model distributed deployment, launch the knowledge recommendation project.", the semantic features may be classified by using the second classification network corresponding to one knowledge type (such as "product"), and the obtained second classification label 1 indicates that the character "recommendation" (the tail character of "knowledge recommendation") is predicted to be the tail character of a knowledge point under the knowledge type "product"; the semantic features may also be classified by using the second classification network corresponding to another knowledge type (such as "backend"), and the obtained second classification label 2 indicates that the character "deployment" (the tail character of "distributed deployment") is predicted to be the tail character of a knowledge point under the knowledge type "backend".
In the embodiment of the present disclosure, in a case where each first classification tag includes a first target tag and each second classification tag includes a second target tag, the corresponding knowledge type may be used as the prediction knowledge type, and the prediction knowledge point in the prediction knowledge type may be determined according to the first target tag and the second target tag.
As a possible implementation manner, the training text may be intercepted from the first character of the predictive knowledge point indicated by the first target label to the tail character of the predictive knowledge point indicated by the second target label, so as to obtain the predictive knowledge point under the predictive knowledge type.
As an example, when the number of the first target label and the number of the second target label are both one, for the training text, the training text may be directly intercepted from the first character of the predicted knowledge point indicated by the first target label, and intercepted to the tail character of the predicted knowledge point indicated by the second target label, so as to obtain the predicted knowledge point under the predicted knowledge type.
As another example, when multiple first target labels and multiple second target labels are obtained, for example, first target label 1 and first target label 2 together with second target label 1 and second target label 2, the training text may be intercepted from the first character of the predicted knowledge point indicated by first target label 1 to the tail character indicated by the second target label closest to it (second target label 1), so as to obtain predicted knowledge point 1 under the predicted knowledge type; and intercepted from the first character indicated by first target label 2 to the tail character indicated by the second target label closest to it (second target label 2), so as to obtain predicted knowledge point 2 under the predicted knowledge type.
It should be noted that the interception direction of the predicted knowledge points under the predicted knowledge type in the training text may be from left to right or from right to left, which is not limited in this disclosure.
And 607, performing joint training on the knowledge extraction models of the multiple knowledge types according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type.
The execution process of step 607 may refer to the execution process of any embodiment of the present disclosure, and is not described herein again.
In a possible implementation manner of the embodiment of the present disclosure, for any knowledge type of a plurality of knowledge types, the knowledge extraction model corresponding to the any knowledge type may further include a sequence tagging model, and the semantic features may be input into the sequence tagging model corresponding to the knowledge type to obtain a prediction tag of each character in a training text output by the sequence tagging model corresponding to the knowledge type; in the case where the prediction tag of each character includes the first prediction tag and the second prediction tag, the knowledge type may be used as the prediction knowledge type, and the prediction knowledge point in the prediction knowledge type may be determined based on the first prediction tag and the second prediction tag.
Wherein, the first prediction label can be used for indicating that the corresponding character is the first character of the prediction knowledge point, and the second prediction label can be used for indicating that the corresponding character is the tail character of the prediction knowledge point.
In embodiments of the present disclosure, the predictive tag for each character may be used to indicate whether the corresponding character is a first character, a last character, other characters, etc. of a knowledge point under the corresponding knowledge type.
For example, the training text is "complete model distributed deployment, online knowledge recommendation item.", and the prediction labels of the characters are: O, O, O, O, B, O, O, O, E, O, O, O, B, O, O, E, O, O, O. Here, B may be used to indicate that the corresponding character is the first character of a knowledge point under the corresponding knowledge type (B being the first prediction label); E may be used to indicate that the corresponding character is the tail character of such a knowledge point (E being the second prediction label); and O may be used to indicate that the corresponding character is any other character.
It should be noted that, the representation of the prediction label of each character in the above example is only an example, and in practical applications, the prediction label of each character may be set as needed, which is not limited by the present disclosure.
As a possible implementation manner, for the training text, the training text may be cut from the first character of the predictive knowledge point indicated by the first predictive tag to the last character of the predictive knowledge point indicated by the second predictive tag, so that the predictive knowledge point under the predictive knowledge type may be obtained.
Therefore, the sequence marking model corresponding to each knowledge type can be adopted, and effective prediction of knowledge points in the training text under each knowledge type is achieved.
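The B/E/O decoding step described above can be sketched as follows. This is an illustrative fragment only (the function name `decode_be_tags` is an assumption, not from the original):

```python
def decode_be_tags(text, tags):
    """Cut the text from each character tagged B (first character of a
    predicted knowledge point) to the nearest following character
    tagged E (tail character), yielding the predicted knowledge points."""
    points = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            start = i
        elif tag == "E" and start is not None:
            points.append(text[start:i + 1])
            start = None
    return points
```

Each B...E pair in the sequence labeling output thus yields one predicted knowledge point under the corresponding knowledge type.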
According to the training method of the knowledge extraction model, semantic features are classified by adopting a first classification network corresponding to a knowledge type aiming at any one of a plurality of knowledge types, so that a first classification label of each character in a training text is obtained; classifying the semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label of each character; under the condition that each first classification label comprises a first target label and each second classification label comprises a second target label, the knowledge type is used as a prediction knowledge type, and a prediction knowledge point under the prediction knowledge type is determined according to the first target label and the second target label; the first target label is used for indicating that the corresponding character is a first character of the prediction knowledge point, and the second target label is used for indicating that the corresponding character is a tail character of the prediction knowledge point. Therefore, the prediction of the head and tail characters of the knowledge points in the training text based on the first classification network and the second classification network corresponding to each knowledge type can be realized, and the predicted knowledge points under the predicted knowledge types can be effectively determined.
In order to clearly illustrate how to perform joint training on the knowledge extraction models of a plurality of knowledge types according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type in any embodiment of the disclosure, the disclosure further provides a training method of the knowledge extraction models.
Fig. 7 is a flowchart illustrating a training method of a knowledge extraction model according to a sixth embodiment of the disclosure.
As shown in fig. 7, the training method of the knowledge extraction model may include the following steps:
step 701, acquiring a training text and prompt information, wherein the prompt information is used for indicating information to be extracted in the training text, and the labeling information of the training text is used for indicating a corresponding relation between at least one labeling knowledge type and labeling knowledge points.
And step 702, splicing the training text and the prompt message to obtain an input text.
Step 703, performing feature extraction on the input text to obtain semantic features of the input text.
And 704, respectively extracting knowledge of the semantic features by using a knowledge extraction model corresponding to the plurality of knowledge types to obtain a predicted knowledge point under at least one predicted knowledge type in the plurality of knowledge types.
The execution process of step 701 to step 704 may refer to the execution process of any embodiment of the present disclosure, and is not described herein again.
Step 705, comparing the annotated knowledge types with the predicted knowledge types to use the same annotated knowledge types and predicted knowledge types as the first knowledge types, and/or use different annotated knowledge types and predicted knowledge types as the second knowledge types.
In the embodiment of the present disclosure, the first knowledge type may be a corresponding knowledge type in which the labeled knowledge type and the predicted knowledge type are the same, among the labeled knowledge type and the predicted knowledge type.
In the embodiment of the present disclosure, the second knowledge types may be the knowledge types obtained by removing the first knowledge types from the labeled knowledge types, together with the knowledge types obtained by removing the first knowledge types from the predicted knowledge types; that is, the labeled knowledge types and predicted knowledge types that have no counterpart on the other side.
In the embodiment of the present disclosure, each labeled knowledge type and each predicted knowledge type may be compared, and the same labeled knowledge type and predicted knowledge type may be used as the first knowledge type, and/or different labeled knowledge types and predicted knowledge types may be used as the second knowledge type.
For example, assume that the plurality of knowledge types includes knowledge type 1, knowledge type 2, knowledge type 3, knowledge type 4, and knowledge type 5, the labeled knowledge types include knowledge type 1, knowledge type 3, knowledge type 4, and the predicted knowledge types include knowledge type 1 and knowledge type 2, where knowledge type 1 is a first knowledge type and knowledge type 2, knowledge type 3, and knowledge type 4 are a second knowledge type.
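The comparison in the example above amounts to a set intersection and symmetric difference. A minimal sketch (the function name is hypothetical):

```python
def partition_knowledge_types(labeled, predicted):
    """First knowledge types: present in both the annotation and the
    prediction; second knowledge types: present in exactly one of them."""
    labeled, predicted = set(labeled), set(predicted)
    first = labeled & predicted      # same labeled and predicted types
    second = labeled ^ predicted     # symmetric difference
    return first, second
```

For the example in the text (labeled types 1, 3, 4; predicted types 1, 2), this yields type 1 as the first knowledge type and types 2, 3, 4 as the second knowledge types.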
Step 706, generating a first loss value according to a difference between the predicted knowledge point and the labeled knowledge point belonging to the first knowledge type, and/or generating a second loss value according to the predicted knowledge point or the labeled knowledge point belonging to the second knowledge type.
In embodiments of the present disclosure, the first loss value may be generated based on a difference between the predicted knowledge point and the annotated knowledge point that belong to the first knowledge type.
As a possible implementation, for any first knowledge type, a first difference between a first position of an initial character in the annotated knowledge point belonging to the first knowledge type in the training text and a second position of an initial character in the predicted knowledge point belonging to the first knowledge type in the training text may be determined; and a second difference between a third position of the tail character in the training text in the labeled knowledge point belonging to the first knowledge type and a fourth position of the tail character in the training text in the predicted knowledge point belonging to the first knowledge type can be determined; finally, a first loss value may be determined based on the first difference and the second difference.
Therefore, the loss value can be determined according to the position differences, in the training text, of the first characters and tail characters of the knowledge points (including the labeled knowledge points and the predicted knowledge points) belonging to the first knowledge type, that is, the position differences between the positive labels of the first characters and between the positive labels of the tail characters. The model can then be trained according to this loss value, improving the training effect of the model, i.e., its prediction precision.
It should be noted that, for the training text, the first and tail characters of the knowledge points (including the labeled knowledge points and the predicted knowledge points) belonging to the first knowledge type account for a small proportion of the text; that is, the positive labels indicating the head and tail characters of knowledge points are sparse, while the characters that are not head or tail characters of such knowledge points account for a large proportion, i.e., the negative labels are numerous. To prevent the knowledge extraction model from overfitting during training, the loss of the negative labels can also be calculated. Therefore, in a possible implementation manner of the embodiment of the present disclosure, the first loss value may also be determined through the following steps:
first, a first sub-loss value may be determined based on the first difference and the second difference.
The first sub-loss value is positively correlated with both the first difference and the second difference: the smaller the first difference or the second difference, the smaller the first sub-loss value; conversely, the larger the first difference or the second difference, the larger the first sub-loss value.
As an example, a loss value loss_start corresponding to the first character may be determined according to the first difference, and a loss value loss_end corresponding to the tail character may be determined according to the second difference; the sum of loss_start and loss_end may then be used as the first sub-loss value.
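Using cross-entropy as the per-position loss (as the disclosure does later in its loss definition), the first sub-loss value can be sketched as below. This is an assumption-laden illustration: the positions are treated as one-hot targets over characters, and the function names are not from the original:

```python
import math


def cross_entropy(true_pos, probs):
    """CE between a one-hot true position and a predicted distribution
    over character positions."""
    return -math.log(probs[true_pos])


def first_sub_loss(y_start, p_start, y_end, p_end):
    """loss = loss_start + loss_end
            = CE(y_start, p_start) + CE(y_end, p_end)."""
    return cross_entropy(y_start, p_start) + cross_entropy(y_end, p_end)
```

When the predicted distributions put all mass on the annotated positions, both terms vanish and the first sub-loss value is zero.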
Second, a second sub-penalty value may be determined based on a third difference between the first text and the second text.
Wherein the first text may be a result of removing the first character and the last character of the annotated knowledge point belonging to the first knowledge type from the training text and the second text may be a result of removing the first character and the last character of the predicted knowledge point belonging to the first knowledge type from the training text.
For example, the training text is "complete model distributed deployment, online knowledge recommendation project.". Assume that the first character of the annotated knowledge point belonging to the first knowledge type is "knowledge" and its tail character is "recommendation", while the first character of the predicted knowledge point belonging to the first knowledge type is "knowledge" and its tail character is "project". The first text is then the training text with the first and tail characters of the annotated knowledge point removed, and the second text is the training text with the first and tail characters of the predicted knowledge point removed.
As a possible implementation manner, a character loss value of each character in the first text may be determined according to a third difference between the first text and the second text; determining a target character from all characters of the first text according to a set proportion; thus, the second sub-loss value can be determined based on the character loss value of each target character.
In the embodiment of the present disclosure, the set ratio may be preset, for example, 50%, 20%, and the like, which is not limited by the present disclosure.
In the embodiment of the present disclosure, the target character may be determined from the characters of the first text according to a set ratio, for example, the target character of the set ratio may be randomly selected from the characters of the first text according to the set ratio.
As a possible implementation manner, the set ratio may be determined according to the total number of characters of the training text; for example, the set ratio may be negatively correlated with the total number of characters: the larger the total number of characters of the training text, the smaller the set ratio, and vice versa. In this way, the number of target characters can be kept within a proper range, neither too large nor too small.
It should be noted that the second sub-loss value is positively correlated with the third difference between the first text and the second text: the smaller the third difference, the smaller the character loss values and hence the second sub-loss value; conversely, the larger the third difference, the larger the character loss values and the second sub-loss value.
Therefore, by performing a negative sampling operation on the negative labels, the loss value corresponding to the negative labels can be determined from only part of the characters carrying negative labels, namely the character loss values of the target characters, instead of the character loss values of all such characters. This facilitates subsequent computation and helps avoid overfitting of the knowledge extraction model during training.
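The negative sampling of target characters can be sketched as below. A minimal illustration under stated assumptions: the target characters are drawn uniformly at random at the set ratio, and the function name and fixed seed are hypothetical choices for reproducibility, not from the original:

```python
import random


def negative_sample_loss(char_losses, negative_indices, sample_ratio, seed=0):
    """Accumulate the negative-label loss over only a set proportion of
    the negative-label characters (the target characters), instead of
    over all of them, to prevent overfitting on sparse positive labels."""
    rng = random.Random(seed)
    k = max(1, int(len(negative_indices) * sample_ratio))
    sampled = rng.sample(negative_indices, k)  # randomly selected target characters
    return sum(char_losses[i] for i in sampled)
```

With a set ratio of 50%, only half of the negative-label characters contribute to the second sub-loss value.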
Further, a type penalty value for the first knowledge type may be determined based on the first sub penalty value and the second sub penalty value.
For example, a sum of the first sub-loss value and the second sub-loss value may be determined, and the sum of the first sub-loss value and the second sub-loss value may be used as a type loss value corresponding to the first knowledge type.
Finally, a first loss value may be determined based on the type loss values of the respective first knowledge types.
For example, a sum of type loss values of the first knowledge types may be determined, and the sum of type loss values of the first knowledge types may be taken as the first loss value. Alternatively, the type loss values of the first knowledge types may be weighted and summed to obtain the first loss value.
Therefore, the first loss value can be effectively determined by fusing the first loss sub-value of the positive label and the second loss sub-value of the negative label, and the phenomenon of overfitting of the knowledge extraction model in the training process can be avoided.
In the disclosed embodiment, the second loss value may be generated according to predicted knowledge points or labeled knowledge points belonging to the second knowledge type.
For example, the second loss value may be generated based on the number of predicted knowledge points or annotated knowledge points that belong to the second knowledge type. The number and the second loss value are in a positive relationship, that is, the larger the number is, the larger the second loss value is, the smaller the number is, and the smaller the second loss value is.
Alternatively, the type loss value of each second knowledge type may be determined according to the positions of the first character and the last character in the predictive knowledge point or the annotated knowledge point belonging to each second knowledge type in the training text, so that the second loss value may be determined according to the type loss value of each second knowledge type. For example, the sum of the type loss values of the second knowledge types may be determined, and the sum of the type loss values of the second knowledge types may be taken as the second loss value. Alternatively, the type loss values of the second knowledge types may be weighted and summed to obtain the second loss value.
And 707, performing joint training on the knowledge extraction models of the plurality of knowledge types according to the first loss value and/or the second loss value.
In embodiments of the present disclosure, knowledge extraction models of multiple knowledge types may be jointly trained based on the first loss value and/or the second loss value.
As an example, the knowledge extraction models of multiple knowledge types may be jointly trained according to the first loss value to minimize a value of the first loss value; or, performing joint training on the knowledge extraction models of the multiple knowledge types according to the second loss value so as to minimize the value of the second loss value.
As another example, the knowledge extraction models of multiple knowledge types may be jointly trained according to the first loss value and the second loss value, for example, the first loss value and the second loss value may be weighted and summed to obtain a target loss value, or the sum of the first loss value and the second loss value may be used as the target loss value, so that the knowledge extraction models of multiple knowledge types may be jointly trained according to the target loss value to minimize the target loss value.
It should be noted that, the above example is performed by taking only the termination condition of the model training as the minimization of the loss value, and in practical application, other termination conditions may also be set, for example, the number of times of training reaches the set number of times, the training duration reaches the set duration, the loss value converges, and the like, which is not limited by the present disclosure.
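The combination of loss values and the termination conditions described above can be sketched as follows. This is only an illustrative fragment; the function names, default weights, and tolerance are assumptions, not from the original:

```python
def target_loss(first_loss, second_loss, w_first=1.0, w_second=1.0):
    """Weighted sum of the first and second loss values used as the
    target loss for jointly training the knowledge extraction models."""
    return w_first * first_loss + w_second * second_loss


def should_stop(step, loss_history, max_steps=1000, tol=1e-4):
    """Termination: training-step budget reached, or the loss value
    has converged (change below a small tolerance)."""
    if step >= max_steps:
        return True
    if len(loss_history) >= 2 and abs(loss_history[-1] - loss_history[-2]) < tol:
        return True
    return False
```

Setting both weights to 1 recovers the plain sum of the first loss value and the second loss value mentioned in the text.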
It is understood that, when predicting the knowledge points in the training text, besides the first knowledge type and the second knowledge type there may be knowledge types for which no corresponding knowledge point is predicted or annotated. Continuing the above example, the plurality of knowledge types includes knowledge type 1, knowledge type 2, knowledge type 3, knowledge type 4 and knowledge type 5; the labeled knowledge types include knowledge types 1, 3 and 4; and the predicted knowledge types include knowledge types 1 and 2, so that knowledge type 1 is the first knowledge type and knowledge types 2, 3 and 4 are second knowledge types. Knowledge type 5, however, belongs neither to the labeled knowledge types nor to the predicted knowledge types (i.e., no corresponding knowledge point is predicted for knowledge type 5). In the present disclosure, the knowledge types other than the first knowledge type and the second knowledge type may be taken as third knowledge types; for example, knowledge type 5 in the above example belongs to the third knowledge type. To improve the prediction accuracy of the model, a loss value of the third knowledge type can also be determined, and the knowledge extraction models of the knowledge types jointly trained according to this loss value.
That is, in a possible implementation manner of the embodiment of the present disclosure, the third loss values of the multiple knowledge types may also be determined according to the labeled knowledge points in each labeled knowledge type and the predicted knowledge points in each predicted knowledge type; determining a fourth loss value according to the third loss values of the plurality of knowledge types; and performing joint training on the knowledge extraction models of the plurality of knowledge types according to the fourth loss value.
That is, for any one of the knowledge types (including the first knowledge type, the second knowledge type and the third knowledge type), a third loss value may be determined according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type. A fourth loss value is then determined from the third loss values of the knowledge types; for example, the sum of the third loss values of the knowledge types may be used as the fourth loss value, or the third loss values may be weighted and summed to obtain the fourth loss value. Thus, in the present disclosure, the knowledge extraction models of the plurality of knowledge types may be jointly trained based on the fourth loss value.
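The aggregation of per-type third loss values into the fourth loss value can be sketched as below (function name hypothetical; both the plain sum and the weighted sum described in the text are covered):

```python
def fourth_loss(third_losses, weights=None):
    """Combine the third loss values of all knowledge types into the
    fourth loss value: plain sum by default, weighted sum if per-type
    weights are given."""
    if weights is None:
        return sum(third_losses.values())
    return sum(weights[t] * loss for t, loss in third_losses.items())
```

This lets underrepresented knowledge types be up-weighted if desired, while the default reduces to the simple sum.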
As an example, the knowledge extraction models of the plurality of knowledge types may be jointly trained according to a fourth loss value to minimize the fourth loss value.
It should be noted that, the above example is performed by taking only the termination condition of the model training as the minimization of the fourth loss value, and in practical application, other termination conditions may also be set, for example, the number of times of training reaches the set number of times, the training duration reaches the set duration, the fourth loss value converges, and the like, which is not limited by the present disclosure.
Therefore, the loss values corresponding to various knowledge types are comprehensively considered, the knowledge extraction model is trained, and the prediction precision of the model, namely the accuracy and the reliability of the model extraction result can be improved.
The training method of the knowledge extraction model of the embodiment of the disclosure compares each labeled knowledge type with each predicted knowledge type to take the same labeled knowledge type and predicted knowledge type as a first knowledge type, and/or take different labeled knowledge types and predicted knowledge types as a second knowledge type; generating a first loss value according to the difference between the predicted knowledge point and the labeled knowledge point belonging to the first knowledge type, and/or generating a second loss value according to the predicted knowledge point or the labeled knowledge point belonging to the second knowledge type; and performing joint training on the knowledge extraction models of the plurality of knowledge types according to the first loss value and/or the second loss value. Therefore, joint training of knowledge extraction models of multiple knowledge types can be achieved according to the predicted knowledge points and the labeled knowledge points under multiple knowledge types, and accuracy of model prediction results is improved.
In summary, by using the method provided by any embodiment of the present disclosure, important knowledge points can be automatically extracted from a large amount of unstructured texts, and in particular, when knowledge points under multiple knowledge types need to be extracted simultaneously, the knowledge extraction model can be greatly simplified by the knowledge extraction framework provided by fig. 3.
In the present disclosure, the knowledge extraction problem is converted into an extractive reading comprehension task, and a natural language extraction target (a Prompt, as in prompt learning) is constructed; the extraction target is recorded as the prompt message in the present disclosure and replaces the question in reading comprehension, so as to achieve a better extraction effect. The method mainly includes the following aspects:
In the first aspect, the question is replaced by a Prompt, and the text to be extracted is taken as model input, thereby constructing an extractive MRC (Machine Reading Comprehension) task paradigm.
And in the second aspect, an input encoder is constructed based on a large pre-training model, each knowledge point in the text is determined at the output end in a mode of predicting the head and tail positions of the knowledge points or ATT, the tag loss is calculated in a combined mode, and meanwhile, the knowledge points under various knowledge types are mined.
In a third aspect, negative label loss is calculated by negative sampling to prevent model overfitting.
Specifically, a natural language extraction target Prompt may first be constructed according to a specific domain direction; for example, the Prompt may be "skill attention point in algorithm direction", "attention point in recommended business direction", or the like. Alternatively, to be applicable to all application scenarios, a broad extraction target Prompt may be constructed for different knowledge type scenarios; for example, the Prompt may be "knowledge points in the text", "entities in the text", or the like. The prompt information of the text to be extracted is introduced into the model explicitly via the Prompt, enhancing semantic interaction understanding. Next, after the text to be extracted is encoded by a large pre-trained model, the head and tail positions of knowledge points under different knowledge types are predicted through 2 × num (num being the number of knowledge types) binary classification networks or binary classification models, uniformly handling both the nested and the non-nested cases. Finally, when calculating the loss of the negative labels, a negative sampling operation can be adopted to prevent overfitting of the model.
The knowledge extraction method provided by the disclosure can still obtain a good extraction effect under the scene of training the model by using a small number of training samples by means of a powerful pre-training model and skillfully converting a problem mode.
As an example, using the knowledge extraction model shown in fig. 3, the above process can be described in detail in conjunction with the following steps:
Step one, data construction. The input data mined by ATT can be converted into an extractive reading comprehension data form (f<query, text> → answer), wherein the "query" is constructed in the Prompt manner; the Prompt and the text to be extracted are spliced and then used as the input of the knowledge extraction model, so that external natural language prompt information can be efficiently fused; and the "answer" is the extracted knowledge point.
The traditional information extraction or knowledge extraction task is mainly solved by sequence labeling: the text to be extracted serves as input, and specific entities or knowledge points are extracted by an encoder + CRF. This approach achieves a good extraction effect when entities are not nested (different entities belong to different categories and no characters overlap), but it cannot handle nested entities (one entity may belong to multiple entity types).
Based on this, the knowledge extraction method provided by the embodiment of the disclosure can convert the knowledge extraction task into a reading understanding task. The mode is that for a given question, extracting a corresponding answer segment from the text to be extracted (learning the mapping of f < query, text > → answer): and extracting final entity information by predicting the starting position and the ending position of the answer. The extraction target can be customized, and the query in the MRC is constructed in a Prompt manner, and a complete data sample can be shown in table 1:
TABLE 1
(Table 1 is provided as an image in the original publication; it shows a sample extraction target Prompt, a text to be extracted, and the corresponding extracted answer.)
Splicing the extraction target Prompt and the text to be extracted to construct an input text: "[ CLS ] Prompt [ SEP ] text [ SEP ]".
Taking the data in Table 1 as an example, the constructed input text is: "[CLS] knowledge points in the text [SEP] itg updates the user behavior model, improves the recognition rate, depends on the service of the map cloud, and is communicated with the map cloud team to obtain the performance test permission. [SEP]".
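The splicing step can be sketched in one line. This is a plain-string illustration (real tokenizers insert [CLS]/[SEP] as special token IDs rather than literal text; the function name is hypothetical):

```python
def build_input_text(prompt, text):
    """Splice the extraction target Prompt and the text to be extracted
    into the MRC-style model input: [CLS] Prompt [SEP] text [SEP]."""
    return f"[CLS] {prompt} [SEP] {text} [SEP]"
```

The resulting string follows the "[CLS] Prompt [SEP] text [SEP]" template given above.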
And step two, inputting the constructed input text into a big pre-training model such as Ernie and the like, so that the Prompt and the text can be subjected to semantic interaction fully after network coding, and corresponding semantic features can be obtained.
And thirdly, knowledge extraction is carried out on the semantic features by respectively adopting knowledge extraction models corresponding to the multiple knowledge types so as to predict knowledge points under at least one knowledge type in the multiple knowledge types.
In the output layer of the knowledge extraction model, 2 classification models (the output networks of each knowledge type are independent) are designed for the knowledge extraction model corresponding to each knowledge type, and the classification models are marked as a first classification network and a second classification network in the disclosure, so that the initial position and the end position of a knowledge point under the knowledge type can be respectively predicted.
Therefore, the text to be extracted can be intercepted from the predicted initial position of the knowledge point under the corresponding knowledge type to the end position of the knowledge point under the corresponding knowledge type, so that the knowledge point under the corresponding knowledge type can be obtained.
It should be noted that, because the extraction result only exists in the text to be extracted, in the training stage of the knowledge extraction model, only the loss value of the output part corresponding to the training text is calculated, so as to train the knowledge extraction model according to the loss value; in the prediction stage, the final result is only predicted on the output layer corresponding to the text to be extracted.
It should be further noted that, with the above structural design, the knowledge extraction model can not only extract multiple knowledge points existing in the text but also distinguish cases where the same knowledge point belongs to different knowledge types. For example, in an extraction sample where the Prompt is "focus points in the text" and the text is "complete model distributed deployment, online knowledge recommendation item.", the knowledge types include "back end", "algorithm" and "product", and the extracted knowledge point "knowledge recommendation" belongs to both the knowledge type "algorithm" and the knowledge type "product", as shown for example in fig. 8.
Therefore, by adopting a 2 × num (num being the number of knowledge types) binary classification prediction mode at the output end, the knowledge extraction model can simultaneously solve nested and non-nested knowledge extraction tasks: it can extract the head and tail positions of the knowledge points in the text and also predict the knowledge types to which the knowledge points belong.
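The per-type decoding that enables nesting can be sketched as follows. A simplified illustration under stated assumptions: the 2 × num binary networks are represented by precomputed per-character start/end probabilities for each knowledge type, and the 0.5 threshold and function name are hypothetical:

```python
def extract_by_type(text, start_probs, end_probs, threshold=0.5):
    """For each knowledge type, an independent start network and end
    network score every character; spans are decoded per type, so the
    same knowledge point may appear under several types (nesting)."""
    results = {}
    for ktype in start_probs:
        starts = [i for i, p in enumerate(start_probs[ktype]) if p >= threshold]
        ends = [i for i, p in enumerate(end_probs[ktype]) if p >= threshold]
        spans = []
        for s in starts:
            following = [e for e in ends if e >= s]  # nearest tail character
            if following:
                spans.append(text[s:min(following) + 1])
        if spans:
            results[ktype] = spans
    return results
```

Because each knowledge type has its own pair of binary outputs, one span can be emitted under both "algorithm" and "product" while "back end" emits nothing, matching the nested example above.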
In the training phase of the knowledge extraction model, the first sub-loss value of the training phase may be a sum of the loss of the start position and the loss of the end position, which is defined as follows:
loss = loss_start + loss_end = CE(y_start, p_start) + CE(y_end, p_end);
therein, loss start According to the first position y of the initial character in the labeled knowledge point belonging to the first knowledge type (the same labeled knowledge type and predicted knowledge type are used as the first knowledge type) in the training text start And a second position p of the first character in the predictive knowledge point in the training text belonging to the first knowledge type start First difference between, and by the loss function CE (y) start ,p start ) A determined loss value. Where CE represents the Cross Entropy loss function (Cross entry).
Therein, loss end Is according toThe third position y of the tail character in the labeled knowledge point of the first knowledge type (the same labeled knowledge type and the same predicted knowledge type are taken as the first knowledge type) in the training text end And a fourth position p of the tail character in the training text in the predictive knowledge point belonging to the first knowledge type end Second difference between, and by a loss function CE (y) end ,p end ) A determined loss value.
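The first sub-loss above can be sketched in runnable form as follows, treating each character position as a binary classification and CE as binary cross entropy. The disclosure does not fix the exact reduction; averaging over characters is an assumption made here for illustration.

```python
import math

def ce(y, p, eps=1e-12):
    """Binary cross entropy between 0/1 labels y and predicted probabilities p."""
    return -sum(t * math.log(q + eps) + (1 - t) * math.log(1 - q + eps)
                for t, q in zip(y, p)) / len(y)

def span_loss(y_start, p_start, y_end, p_end):
    # loss = loss_start + loss_end = CE(y_start, p_start) + CE(y_end, p_end)
    return ce(y_start, p_start) + ce(y_end, p_end)

# Toy 4-character text: labeled start at position 1, labeled end at position 3.
y_s, y_e = [0, 1, 0, 0], [0, 0, 0, 1]
good = span_loss(y_s, [0.01, 0.99, 0.01, 0.01], y_e, [0.01, 0.01, 0.01, 0.99])
bad = span_loss(y_s, [0.9, 0.1, 0.9, 0.9], y_e, [0.9, 0.9, 0.9, 0.1])
# a near-perfect prediction yields a much smaller loss than a wrong one
```

In a real model the probabilities would come from sigmoid outputs of the first and second classification networks rather than being hand-written as here.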
Since positive labels (those indicating the head and tail positions of knowledge points) are very sparse in the training text and negative labels (those not indicating head or tail positions) are by far the most numerous, a negative-label loss (denoted as the second sub-loss value in this disclosure) can be calculated to prevent overfitting. When calculating the negative-label loss, a negative sampling operation may be adopted (that is, when calculating the loss at positions with negative labels, the loss calculation is probabilistically skipped according to a preset threshold, which can be implemented with dropout), which can further improve the training effect of the model.
Specifically, a character loss value of each character in a first text may be determined according to a third difference between the first text and a second text, where the first text is obtained by removing the first character and the tail character of the labeled knowledge point belonging to the first knowledge type from the training text, and the second text is obtained by removing the first character and the tail character of the predicted knowledge point belonging to the first knowledge type from the training text; a target character may then be determined from the characters of the first text according to a set proportion; and the second sub-loss value may be determined according to the character loss value of each target character.
Further, a type loss value of the first knowledge type may be determined based on the first sub-loss value and the second sub-loss value; and may determine a first loss value based on the type loss value for each first knowledge type.
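The negative-sampling idea can be sketched as below. The `keep_ratio` stands in for the "set proportion"; treating positive positions as always kept while each negative position contributes with probability `keep_ratio` is an assumption consistent with, but not literally specified by, the description above.

```python
import math
import random

def negative_sampled_loss(labels, probs, keep_ratio=0.3, seed=0):
    """Average per-character cross entropy, dropping negative-label positions
    probabilistically (dropout-style negative sampling sketch)."""
    rng = random.Random(seed)
    kept = []
    for y, p in zip(labels, probs):
        char_loss = -(y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12))
        if y == 1:
            kept.append(char_loss)           # positive labels always contribute
        elif rng.random() < keep_ratio:
            kept.append(char_loss)           # negatives kept at a set proportion
    return sum(kept) / max(len(kept), 1)

# Ten characters, two positive positions (2 and 7), the rest negative.
labels = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
probs  = [0.1, 0.2, 0.9, 0.1, 0.3, 0.1, 0.2, 0.8, 0.1, 0.1]
loss = negative_sampled_loss(labels, probs)
```

Setting `keep_ratio=0.0` degenerates to a loss over positive positions only, which makes the effect of the sampling easy to verify.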
In summary, a complete flow diagram of the Prompt + MRC-based domain-specific knowledge point extraction service of the present disclosure is shown in fig. 9.
In the present disclosure, the knowledge extraction models of the multiple knowledge types may be trained according to the first loss value; that is, the parameters of the knowledge extraction models of the multiple knowledge types may be adjusted to minimize the first loss value.
The knowledge extraction method provided by the present disclosure can be applied to knowledge extraction or information extraction problems in any subdivided-field scenario, including but not limited to the extraction of function-sequence knowledge focus points, named entity recognition, comment extraction, and the like. For example, the focus points of employees can be actively discovered from the texts accumulated by the employees of an enterprise, so that sufficient connections are established between people, between people and knowledge, and between knowledge and people; knowledge is maximally reused and circulated in daily work, forming a positive cycle of production and consumption.
The knowledge extraction model, the knowledge extraction method and the extraction idea adopted therein are product-independent and applicable to knowledge point extraction scenarios in any specific field, and mainly comprise the following parts: input data construction, knowledge point prediction under multiple knowledge types, and negative-sampling loss calculation. In the data construction stage, a suitable extraction target can be constructed according to the specific service scenario, providing additional prompt information for target extraction. For model encoding and knowledge point prediction under the knowledge types, currently advanced pre-trained large models can be substituted in, and the model can be further pre-trained according to the actual situation to improve the encoding capability of the network. For target extraction, 2 × num (the number of knowledge types) binary classification models are adopted to respectively predict the head and tail positions of the knowledge focus points (denoted as knowledge points in this disclosure) under different knowledge types, thereby realizing unified extraction of knowledge points under multiple knowledge types. Finally, the loss function of the negative labels is calculated in a negative sampling manner to prevent the model from overfitting.
The knowledge extraction method can be applied to the resource construction of knowledge focus points in different functional directions, and through unified modeling integrates the extraction of knowledge points of multiple knowledge types into one model framework, which greatly reduces the complexity of knowledge extraction. As a link in the most basic items of the knowledge distribution system in the knowledge management of some enterprises, the mining of function-sequence knowledge focus points provides a high-quality resource base for the downstream matching and recommendation of knowledge points to documents and knowledge points to experts, and promotes a positive cycle of knowledge production and consumption.
The beneficial effects of the knowledge extraction method of the present disclosure at least may include:
1. an extraction target (Prompt) can be constructed according to business requirements, ingeniously converting the traditional sequence labeling problem into an MRC formulation, and the introduction of additional external information further improves the information extraction capability;
2. according to the number of knowledge categories to which the extracted information (denoted as knowledge points in this disclosure) belongs, a plurality of binary classification networks can be constructed, and nested and non-nested cases of the extracted information are handled jointly, making the knowledge extraction framework more elegant and concise;
3. the prediction result is post-processed according to rules specific to the service, which can further improve the extraction effect; the accuracy of extracting knowledge focus points in all functional directions exceeds 90 percent, reaching a practical standard.
in summary, the knowledge extraction method provided by the present disclosure can be applied to any information extraction service scenario, and has strong practical significance and high extensibility.
In correspondence with the knowledge extraction method provided in the embodiment of fig. 1 to 4, the present disclosure also provides a knowledge extraction device, and since the knowledge extraction device provided in the embodiment of the present disclosure corresponds to the knowledge extraction method provided in the embodiment of fig. 1 to 4, the implementation of the knowledge extraction method is also applicable to the knowledge extraction device provided in the embodiment of the present disclosure, and will not be described in detail in the embodiment of the present disclosure.
Fig. 10 is a schematic structural diagram of a knowledge extraction device according to a seventh embodiment of the disclosure.
As shown in fig. 10, the knowledge extraction apparatus 1000 may include: a first obtaining module 1001, a splicing module 1002, an extracting module 1003, and an extraction module 1004.
The first obtaining module 1001 is configured to obtain a text to be extracted and prompt information, where the prompt information is used to indicate information to be extracted in the text to be extracted.
The splicing module 1002 is configured to splice the text to be extracted and the prompt information to obtain an input text.
The extracting module 1003 is configured to perform feature extraction on the input text to obtain semantic features of the input text.
The extraction module 1004 is configured to perform knowledge extraction on the semantic features by using knowledge extraction models corresponding to the multiple knowledge types, respectively, so as to obtain a knowledge point in at least one of the multiple knowledge types.
In a possible implementation manner of the embodiment of the present disclosure, the knowledge extraction model includes a first classification network and a second classification network, and the extraction module 1004 may be specifically configured to: classifying the semantic features by adopting a first classification network corresponding to the knowledge types aiming at any one of the knowledge types to obtain a first classification label, wherein the first classification label is used for indicating the initial position of the knowledge point in the text to be extracted; classifying the semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label, wherein the second classification label is used for indicating the end position of the knowledge point in the text to be extracted; and determining knowledge points under the knowledge types according to the first classification label and the second classification label.
In a possible implementation manner of the embodiment of the present disclosure, the extraction module 1004 may specifically be configured to: and intercepting the text to be extracted from the initial position of the knowledge point indicated by the first classification label to the end position of the knowledge point indicated by the second classification label to obtain the knowledge point under the knowledge type.
In a possible implementation manner of the embodiment of the present disclosure, the knowledge extraction model includes a sequence labeling model, and the extraction module 1004 may be specifically configured to: aiming at any knowledge type in a plurality of knowledge types, carrying out sequence labeling on semantic features by adopting a sequence labeling model corresponding to the knowledge type to obtain a first labeling label and a second labeling label; determining knowledge points under the knowledge types according to the first label and the second label; the first label is used for indicating the initial position of the knowledge point in the text to be extracted, and the second label is used for indicating the end position of the knowledge point in the text to be extracted.
In a possible implementation manner of the embodiment of the present disclosure, the extraction module 1004 may specifically be configured to: and intercepting the text to be extracted from the initial position of the knowledge point indicated by the first label tag to the end position of the knowledge point indicated by the second label tag to obtain the knowledge point under the knowledge type.
In a possible implementation manner of the embodiment of the present disclosure, the splicing module 1002 may be specifically configured to: adding a separator before the first character of the text to be extracted; the prompt is concatenated before the separator, wherein the last character of the prompt is adjacent to the separator.
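The splicing rule of module 1002 can be sketched as follows: the separator sits immediately before the first character of the text to be extracted, and the last character of the prompt sits immediately before the separator. The concrete `"[SEP]"` token is an assumption borrowed from BERT-style models; the disclosure only requires some separator between the two parts.

```python
def build_input(prompt, text, sep="[SEP]"):
    # prompt | separator | text-to-be-extracted, in that order;
    # the last character of the prompt is adjacent to the separator
    return prompt + sep + text

print(build_input("文本关注点", "完成模型分布式部署，上线知识推荐项目"))
# 文本关注点[SEP]完成模型分布式部署，上线知识推荐项目
```

Keeping the prompt in front of the separator lets the output layer mask out all positions up to and including the separator, matching the rule that results are only predicted over the text to be extracted.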
In a possible implementation manner of the embodiment of the present disclosure, the apparatus 1000 may further include:
and the second acquisition module is used for acquiring candidate resources matched with the knowledge types from the resource pool aiming at any knowledge type of the at least one knowledge type.
And the determining module is used for determining the target resource matched with the knowledge point under the knowledge type from the candidate resources.
And the display module is used for displaying the target resource.
The knowledge extraction device of the embodiment of the disclosure acquires a text to be extracted and prompt information, where the prompt information is used to indicate the information to be extracted in the text to be extracted; splices the text to be extracted and the prompt information to obtain an input text; performs feature extraction on the input text to obtain semantic features of the input text; and respectively adopts knowledge extraction models corresponding to multiple knowledge types to perform knowledge extraction on the semantic features, so as to obtain knowledge points under at least one of the multiple knowledge types. Therefore, based on the knowledge extraction models corresponding to the known knowledge types, the knowledge points matched with the prompt information can be extracted from the text to be extracted while their knowledge types are determined. Moreover, joint extraction of knowledge points under multiple knowledge types can be realized, so as to meet the extraction requirements of different users.
Corresponding to the training method of the knowledge extraction model provided in the embodiments of fig. 5 to 7, the present disclosure also provides a training device of the knowledge extraction model, and since the training device of the knowledge extraction model provided in the embodiments of the present disclosure corresponds to the training method of the knowledge extraction model provided in the embodiments of fig. 5 to 7, the implementation of the training method of the knowledge extraction model in the embodiments of the present disclosure is also applicable to the training device of the knowledge extraction model provided in the embodiments of the present disclosure, and will not be described in detail in the embodiments of the present disclosure.
Fig. 11 is a schematic structural diagram of a training apparatus for knowledge extraction model according to an eighth embodiment of the present disclosure.
As shown in FIG. 11, the training apparatus 1100 of the knowledge extraction model may include: an acquisition module 1101, a splicing module 1102, an extracting module 1103, an extraction module 1104, and a training module 1105.
The obtaining module 1101 is configured to obtain a training text and prompt information, where the prompt information is used to indicate information to be extracted in the training text, and the label information of the training text is used to indicate a correspondence between at least one label knowledge type and a label knowledge point.
And a splicing module 1102, configured to splice the training text and the prompt information to obtain an input text.
An extracting module 1103, configured to perform feature extraction on the input text to obtain semantic features of the input text.
And the extraction module 1104 is configured to perform knowledge extraction on the semantic features respectively by using knowledge extraction models corresponding to the multiple knowledge types to obtain a predicted knowledge point in at least one predicted knowledge type of the multiple knowledge types.
The training module 1105 is configured to perform joint training on the knowledge extraction models of multiple knowledge types according to the labeled knowledge points in each labeled knowledge type and the predicted knowledge points in each predicted knowledge type.
In a possible implementation manner of the embodiment of the present disclosure, the knowledge extraction model includes a first classification network and a second classification network, and the extraction module 1104 may be specifically configured to: aiming at any knowledge type in a plurality of knowledge types, adopting a first classification network corresponding to the knowledge type to classify the semantic features so as to obtain a first classification label of each character in the training text; classifying the semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label of each character; under the condition that each first classification label comprises a first target label and each second classification label comprises a second target label, taking the knowledge type as a prediction knowledge type, and determining a prediction knowledge point under the prediction knowledge type according to the first classification label and the second classification label; the first target label is used for indicating that the corresponding character is a first character of the prediction knowledge point, and the second target label is used for indicating that the corresponding character is a tail character of the prediction knowledge point.
In a possible implementation manner of the embodiment of the present disclosure, the knowledge extraction model includes a sequence labeling model, and the extraction module 1104 may be specifically configured to: aiming at any knowledge type in a plurality of knowledge types, inputting semantic features into a sequence labeling model corresponding to the knowledge type to obtain a prediction label of each character in a training text output by the sequence labeling model corresponding to the knowledge type; under the condition that the prediction tag of each character comprises a first prediction tag and a second prediction tag, the knowledge type is used as the prediction knowledge type, and a prediction knowledge point under the prediction knowledge type is determined according to the first prediction tag and the second prediction tag; the first prediction label is used for indicating that the corresponding character is a first character of the prediction knowledge point, and the second prediction label is used for indicating that the corresponding character is a tail character of the prediction knowledge point.
In a possible implementation manner of the embodiment of the present disclosure, the training module 1105 may be specifically configured to: comparing each annotated knowledge type with each predicted knowledge type to take the same annotated knowledge type and predicted knowledge type as a first knowledge type, and/or taking different annotated knowledge types and predicted knowledge types as a second knowledge type; generating a first loss value according to the difference between the predicted knowledge point and the labeled knowledge point belonging to the first knowledge type, and/or generating a second loss value according to the predicted knowledge point or the labeled knowledge point belonging to the second knowledge type; and performing joint training on the knowledge extraction models of the plurality of knowledge types according to the first loss value and/or the second loss value.
In a possible implementation manner of the embodiment of the present disclosure, the training module 1105 may specifically be configured to: for any first knowledge type, determining a first difference between a first position of an initial character in the labeled knowledge point belonging to the first knowledge type in the training text and a second position of the initial character in the predicted knowledge point belonging to the first knowledge type in the training text; determining a second difference between a third position of the tail character in the training text in the labeled knowledge point belonging to the first knowledge type and a fourth position of the tail character in the training text in the predicted knowledge point belonging to the first knowledge type; a first loss value is determined based on the first difference and the second difference.
In a possible implementation manner of the embodiment of the present disclosure, the training module 1105 may be specifically configured to: determining a first sub-loss value according to the first difference and the second difference; determining a second sub-loss value according to a third difference between the first text and the second text; the first text is obtained by removing the first character and the tail character of the labeled knowledge point belonging to the first knowledge type from the training text, and the second text is obtained by removing the first character and the tail character of the predicted knowledge point belonging to the first knowledge type from the training text; determining a type loss value of the first knowledge type according to the first sub loss value and the second sub loss value; first loss values are determined based on the type loss values of each first knowledge type.
In a possible implementation manner of the embodiment of the present disclosure, the training module 1105 may be specifically configured to: determining a character loss value of each character in the first text according to a third difference between the first text and the second text; determining a target character from all characters of the first text according to a set proportion; and determining a second sub-loss value according to the character loss value of each target character.
The training device of the knowledge extraction model of the embodiment of the disclosure acquires a training text and prompt information, wherein the prompt information is used for indicating information to be extracted in the training text, and the marking information of the training text is used for indicating a corresponding relation between at least one marking knowledge type and marking knowledge points; splicing the training text and the prompt information to obtain an input text; performing feature extraction on the input text to obtain semantic features of the input text; respectively extracting knowledge from the semantic features by adopting knowledge extraction models corresponding to the multiple knowledge types to obtain a predicted knowledge point under at least one predicted knowledge type in the multiple knowledge types; and performing combined training on the knowledge extraction models of the multiple knowledge types according to the labeled knowledge points under the labeled knowledge types and the predicted knowledge points under the predicted knowledge types. Therefore, the knowledge extraction model is trained based on the deep learning technology, and the accuracy and the reliability of the model prediction result can be improved.
To implement the above embodiments, the present disclosure also provides an electronic device, which may include at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executable by the at least one processor to enable the at least one processor to perform a method for knowledge extraction as set forth in any one of the above-described embodiments of the disclosure, or a method for training a knowledge extraction model as set forth in any one of the above-described embodiments of the disclosure.
To achieve the above embodiments, the present disclosure also provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the knowledge extraction method proposed by any of the above embodiments of the present disclosure or the training method of the knowledge extraction model proposed by any of the above embodiments of the present disclosure.
To achieve the above embodiments, the present disclosure also provides a computer program product, which includes a computer program that, when being executed by a processor, implements the knowledge extraction method proposed by any of the above embodiments of the present disclosure, or implements the training method of the knowledge extraction model proposed by any of the above embodiments of the present disclosure.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 12 shows a schematic block diagram of an example electronic device that may be used to implement embodiments of the present disclosure. The electronic device may include the server and the client in the above embodiments. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 12, the electronic apparatus 1200 includes a computing unit 1201 which can perform various appropriate actions and processes in accordance with a computer program stored in a ROM (Read-Only Memory) 1202 or a computer program loaded from a storage unit 1208 into a RAM (Random Access Memory) 1203. In the RAM 1203, various programs and data necessary for the operation of the electronic apparatus 1200 may also be stored. The computing unit 1201, the ROM1202, and the RAM 1203 are connected to each other by a bus 1204. An I/O (Input/Output) interface 1205 is also connected to the bus 1204.
Various components in the electronic device 1200 are connected to the I/O interface 1205, including: an input unit 1206 such as a keyboard, a mouse, or the like; an output unit 1207 such as various types of displays, speakers, and the like; a storage unit 1208, such as a magnetic disk, optical disk, or the like; and a communication unit 1209 such as a network card, modem, wireless communication transceiver, etc. The communication unit 1209 allows the electronic device 1200 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 1201 may be a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of the computing Unit 1201 include, but are not limited to, a CPU (Central Processing Unit), a GPU (graphics Processing Unit), various dedicated AI (Artificial Intelligence) computing chips, various computing Units running machine learning model algorithms, a DSP (Digital Signal Processor), and any suitable Processor, controller, microcontroller, and the like. The calculation unit 1201 performs the respective methods and processes described above, such as the above-described knowledge extraction method or the training method of the knowledge extraction model. For example, in some embodiments, the above-described knowledge extraction methods may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 1208. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 1200 via the ROM1202 and/or the communication unit 1209. When the computer program is loaded into the RAM 1203 and executed by the computing unit 1201, one or more steps of the knowledge extraction method or the training method of the knowledge extraction model described above may be performed. Alternatively, in other embodiments, the computing unit 1201 may be configured in any other suitable way (e.g., by means of firmware) to perform the knowledge extraction method or training method of the knowledge extraction model described above.
Various implementations of the systems and techniques described here above may be realized in digital electronic circuitry, integrated circuitry, FPGAs (Field Programmable Gate arrays), ASICs (Application-Specific Integrated circuits), ASSPs (Application Specific Standard products), SOCs (System On Chip, system On a Chip), CPLDs (Complex Programmable Logic devices), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a RAM, a ROM, an EPROM (Electrically Programmable Read-Only-Memory) or flash Memory, an optical fiber, a CD-ROM (Compact Disc Read-Only-Memory), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a Display device (e.g., a CRT (Cathode Ray Tube) or LCD (Liquid Crystal Display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: LAN (Local Area Network), WAN (Wide Area Network), internet, and blockchain Network.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the drawbacks of difficult management and poor service scalability found in conventional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be noted that artificial intelligence is the discipline of studying how to make computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it involves both hardware and software technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technology, and the like.
Deep learning is a new research direction in the field of machine learning. It learns the inherent regularities and hierarchical representations of sample data, and the information obtained during learning is of great help in interpreting data such as text, images, and sound. Its ultimate goal is to enable machines to analyze and learn like humans, and to recognize data such as text, images, and sound.
Cloud computing refers to a technology system that accesses a flexibly scalable shared pool of physical or virtual resources through a network, where the resources may include servers, operating systems, networks, software, applications, storage devices, and the like, and may be deployed and managed in an on-demand, self-service manner. Cloud computing technology can provide efficient and powerful data processing capacity for technical applications and model training in artificial intelligence, blockchain, and other fields.
According to the technical solution of the embodiments of the present disclosure, the text to be extracted and the prompt information are obtained, wherein the prompt information is used for indicating the information to be extracted in the text to be extracted; the text to be extracted and the prompt information are spliced to obtain an input text; feature extraction is performed on the input text to obtain semantic features of the input text; and knowledge extraction models corresponding to multiple knowledge types are respectively adopted to perform knowledge extraction on the semantic features, so as to obtain knowledge points under at least one of the multiple knowledge types. In this way, knowledge points matching the prompt information can be extracted from the text to be extracted based on the knowledge extraction models corresponding to the known knowledge types, and the knowledge type of each extracted knowledge point can be determined at the same time. Moreover, knowledge points under multiple knowledge types can be extracted jointly, meeting the extraction requirements of different users.
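As an illustrative sketch only (not the patented implementation), the flow above — splicing the prompt information with the text to be extracted, then running one extraction head per knowledge type over the same input — can be outlined as follows. The names `build_input_text`, `SpanHead`, and `extract_knowledge` are hypothetical, and the keyword-matching `SpanHead` is a toy stand-in for the trained classification networks:

```python
from typing import Dict, List, Optional, Tuple

def build_input_text(prompt: str, text: str, sep: str = "[SEP]") -> str:
    """Splice the prompt information before the text to be extracted,
    with a separator added before the text's first character (cf. claim 6)."""
    return prompt + sep + text

class SpanHead:
    """Toy stand-in for one knowledge type's start/end classification
    networks: instead of scoring characters, it simply locates a known
    keyword and reports its start and end positions."""

    def __init__(self, keywords: List[str]):
        self.keywords = keywords

    def predict(self, text: str) -> Optional[Tuple[int, int]]:
        for kw in self.keywords:
            i = text.find(kw)
            if i != -1:
                return i, i + len(kw) - 1  # inclusive start/end positions
        return None

def extract_knowledge(text: str, heads: Dict[str, SpanHead]) -> Dict[str, str]:
    """Run every per-type extraction head over the same input; keep only
    the knowledge types whose head found both a start and an end."""
    results: Dict[str, str] = {}
    for ktype, head in heads.items():
        span = head.predict(text)
        if span is not None:
            start, end = span
            # "Intercept" the text from start to end (cf. claims 3 and 5).
            results[ktype] = text[start:end + 1]
    return results
```

With heads for a few hypothetical knowledge types, `extract_knowledge(build_input_text(prompt, text), heads)` returns one knowledge point per matched type, while types whose head finds no span are simply absent from the result.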
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders; this is not limited herein, as long as the desired results of the technical solutions proposed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (33)

1. A method of knowledge extraction, the method comprising:
acquiring a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted;
splicing the text to be extracted and the prompt information to obtain an input text;
performing feature extraction on the input text to obtain semantic features of the input text;
and respectively adopting knowledge extraction models corresponding to a plurality of knowledge types to extract the knowledge of the semantic features so as to obtain knowledge points under at least one knowledge type in the knowledge types.
2. The method of claim 1, wherein the knowledge extraction model comprises a first classification network and a second classification network,
the method for extracting knowledge from the semantic features by respectively adopting knowledge extraction models corresponding to a plurality of knowledge types to obtain knowledge points under at least one knowledge type in the plurality of knowledge types comprises the following steps:
for any knowledge type in the knowledge types, classifying the semantic features by adopting a first classification network corresponding to the knowledge type to obtain a first classification label, wherein the first classification label is used for indicating the initial position of a knowledge point in the text to be extracted;
classifying the semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label, wherein the second classification label is used for indicating the end position of the knowledge point in the text to be extracted;
and determining knowledge points under the knowledge types according to the first classification label and the second classification label.
3. The method of claim 2, wherein the determining knowledge points under the knowledge type according to the first classification label and the second classification label comprises:
and intercepting the text to be extracted from the initial position of the knowledge point indicated by the first classification label to the end position of the knowledge point indicated by the second classification label to obtain the knowledge point under the knowledge type.
4. The method of claim 1, wherein the knowledge extraction model comprises a sequence annotation model,
the method for extracting knowledge from the semantic features by respectively adopting knowledge extraction models corresponding to a plurality of knowledge types to obtain knowledge points under at least one knowledge type in the plurality of knowledge types comprises the following steps:
for any knowledge type among the multiple knowledge types, performing sequence labeling on the semantic features by using the sequence labeling model corresponding to the knowledge type to obtain a first annotation label and a second annotation label;
determining knowledge points under the knowledge type according to the first annotation label and the second annotation label;
wherein the first annotation label is used for indicating the starting position of the knowledge point in the text to be extracted, and the second annotation label is used for indicating the ending position of the knowledge point in the text to be extracted.
5. The method of claim 4, wherein the determining knowledge points under the knowledge type according to the first annotation label and the second annotation label comprises:
intercepting the text to be extracted from the starting position of the knowledge point indicated by the first annotation label to the ending position of the knowledge point indicated by the second annotation label to obtain the knowledge point under the knowledge type.
6. The method according to any one of claims 1-5, wherein the splicing the text to be extracted and the prompt message to obtain an input text comprises:
adding a separator before the first character of the text to be extracted;
concatenating the prompt information before the separator, wherein the last character of the prompt information is adjacent to the separator.
7. The method according to any one of claims 1-5, wherein the method further comprises:
for any knowledge type of the at least one knowledge type, acquiring a candidate resource matched with the knowledge type from a resource pool;
determining a target resource matched with the knowledge point under the knowledge type from each candidate resource;
and displaying the target resource.
8. A method of training a knowledge extraction model, the method comprising:
acquiring a training text and prompt information, wherein the prompt information is used for indicating information to be extracted in the training text, and labeling information of the training text is used for indicating a correspondence between at least one labeled knowledge type and labeled knowledge points;
splicing the training text and the prompt information to obtain an input text;
performing feature extraction on the input text to obtain semantic features of the input text;
respectively extracting knowledge from the semantic features by adopting knowledge extraction models corresponding to a plurality of knowledge types to obtain a predicted knowledge point under at least one predicted knowledge type in the plurality of knowledge types;
and performing joint training on the knowledge extraction models of the multiple knowledge types according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type.
9. The method of claim 8, wherein the knowledge extraction model comprises a first classification network and a second classification network,
the method for extracting knowledge from the semantic features by using knowledge extraction models corresponding to a plurality of knowledge types to obtain a predicted knowledge point under at least one predicted knowledge type in the plurality of knowledge types comprises the following steps:
for any knowledge type in the knowledge types, adopting a first classification network corresponding to the knowledge type to classify the semantic features so as to obtain a first classification label of each character in the training text;
classifying the semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label of each character;
under the condition that each first classification tag comprises a first target tag and each second classification tag comprises a second target tag, taking the knowledge type as the prediction knowledge type, and determining a prediction knowledge point under the prediction knowledge type according to the first target tag and the second target tag;
the first target label is used for indicating that the corresponding character is a first character of the prediction knowledge point, and the second target label is used for indicating that the corresponding character is a tail character of the prediction knowledge point.
10. The method of claim 8, wherein the knowledge extraction model comprises a sequence annotation model,
the method for extracting knowledge from the semantic features by using knowledge extraction models corresponding to a plurality of knowledge types to obtain a predicted knowledge point under at least one predicted knowledge type in the plurality of knowledge types comprises the following steps:
for any knowledge type among the multiple knowledge types, inputting the semantic features into the sequence labeling model corresponding to the knowledge type to obtain a prediction label of each character in the training text output by the sequence labeling model corresponding to the knowledge type;
under the condition that the prediction label of each character comprises a first prediction label and a second prediction label, taking the knowledge type as a prediction knowledge type, and determining a prediction knowledge point under the prediction knowledge type according to the first prediction label and the second prediction label;
the first prediction label is used for indicating that the corresponding character is a first character of the prediction knowledge point, and the second prediction label is used for indicating that the corresponding character is a tail character of the prediction knowledge point.
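A minimal sketch (hypothetical names, not the patented implementation) of the condition in claims 9 and 10: a knowledge type counts as a predicted knowledge type only when the per-character labels contain both a start label and an end label, and the span between them is the predicted knowledge point. The tag values `"B"` and `"E"` are assumptions:

```python
from typing import List, Optional

def decode_predicted_span(chars: str, labels: List[str],
                          start_tag: str = "B",
                          end_tag: str = "E") -> Optional[str]:
    """Return the predicted knowledge point, or None when either the
    start label or the end label is missing (the knowledge type is then
    not taken as a predicted knowledge type)."""
    if start_tag not in labels or end_tag not in labels:
        return None
    start = labels.index(start_tag)  # position of the first character
    end = labels.index(end_tag)      # position of the tail character
    if start > end:
        return None                  # inconsistent prediction, no span
    return chars[start:end + 1]
```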
11. The method of any one of claims 8-10, wherein jointly training knowledge extraction models for the plurality of knowledge types based on labeled knowledge points for each of the labeled knowledge types and predicted knowledge points for each of the predicted knowledge types comprises:
comparing each labeled knowledge type with each predicted knowledge type, so as to take, as a first knowledge type, a labeled knowledge type and a predicted knowledge type that are the same, and/or take, as a second knowledge type, a labeled knowledge type and a predicted knowledge type that are different;
generating a first loss value according to the difference between the predicted knowledge point and the labeled knowledge point belonging to the first knowledge type, and/or generating a second loss value according to the predicted knowledge point or the labeled knowledge point belonging to the second knowledge type;
and performing joint training on the knowledge extraction models of the plurality of knowledge types according to the first loss value and/or the second loss value.
12. The method of claim 11, wherein generating a first loss value based on a difference between predicted knowledge points and annotated knowledge points belonging to the first knowledge type comprises:
for any of the first knowledge types, determining a first difference between a first position in the training text of an initial character in the annotated knowledge point belonging to the first knowledge type and a second position in the training text of an initial character in the predicted knowledge point belonging to the first knowledge type;
determining a second difference between a third position of a tail character in the training text in the labeled knowledge point belonging to the first knowledge type and a fourth position of the tail character in the training text in the predicted knowledge point belonging to the first knowledge type;
determining a first loss value based on the first difference and the second difference.
13. The method of claim 12, wherein said determining a first loss value from said first difference and said second difference comprises:
determining a first sub-loss value according to the first difference and the second difference;
determining a second sub-loss value according to a third difference between the first text and the second text; wherein the first text is obtained by removing the first character and the last character of the labeled knowledge point belonging to the first knowledge type from the training text, and the second text is obtained by removing the first character and the last character of the predicted knowledge point belonging to the first knowledge type from the training text;
determining a type loss value of the first knowledge type according to the first sub loss value and the second sub loss value;
determining the first loss value according to the type loss value of each first knowledge type.
14. The method of claim 13, wherein generating a second sub-loss value based on a third difference between the first text and the second text comprises:
determining a character loss value of each character in the first text according to a third difference between the first text and the second text;
determining a target character from all characters of the first text according to a set proportion;
and determining the second sub-loss value according to the character loss value of each target character.
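Under one hypothetical reading of claim 14 (the claim specifies only that target characters are chosen from the first text by a set proportion), the second sub-loss can be sketched as keeping a fixed fraction of the per-character losses — here the largest ones, in the spirit of hard-example mining. The function name and the choice of "largest" are assumptions:

```python
from typing import List

def second_sub_loss(char_losses: List[float], keep_ratio: float = 0.5) -> float:
    """Select target characters by a set proportion of the per-character
    losses (assumed here: those with the largest loss values) and average
    their character loss values to form the second sub-loss value."""
    if not char_losses:
        return 0.0
    k = max(1, int(len(char_losses) * keep_ratio))
    kept = sorted(char_losses, reverse=True)[:k]  # hardest k characters
    return sum(kept) / k
```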
15. The method of any one of claims 8-10, wherein jointly training knowledge extraction models for the plurality of knowledge types based on labeled knowledge points for each of the labeled knowledge types and predicted knowledge points for each of the predicted knowledge types comprises:
determining a third loss value of the plurality of knowledge types according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type;
determining a fourth loss value according to the third loss values of the plurality of knowledge types;
and performing joint training on the knowledge extraction models of the plurality of knowledge types according to the fourth loss value.
16. A knowledge extraction apparatus, the apparatus comprising:
a first acquisition module, used for acquiring a text to be extracted and prompt information, wherein the prompt information is used for indicating information to be extracted in the text to be extracted;
the splicing module is used for splicing the text to be extracted and the prompt information to obtain an input text;
the extraction module is used for extracting the features of the input text to obtain the semantic features of the input text;
and the extraction module is used for extracting knowledge from the semantic features by respectively adopting knowledge extraction models corresponding to a plurality of knowledge types so as to obtain a knowledge point under at least one knowledge type in the plurality of knowledge types.
17. The apparatus of claim 16, wherein the knowledge extraction model comprises a first classification network and a second classification network, and the extraction module is specifically configured to:
for any knowledge type in the knowledge types, classifying the semantic features by adopting a first classification network corresponding to the knowledge type to obtain a first classification label, wherein the first classification label is used for indicating the initial position of a knowledge point in the text to be extracted;
classifying the semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label, wherein the second classification label is used for indicating the end position of the knowledge point in the text to be extracted;
and determining knowledge points under the knowledge types according to the first classification label and the second classification label.
18. The apparatus of claim 17, wherein the extraction module is specifically configured to:
and intercepting the text to be extracted from the initial position of the knowledge point indicated by the first classification label to the end position of the knowledge point indicated by the second classification label to obtain the knowledge point under the knowledge type.
19. The apparatus of claim 16, wherein the knowledge extraction model comprises a sequence annotation model, and the extraction module is specifically configured to:
for any knowledge type among the multiple knowledge types, performing sequence labeling on the semantic features by using the sequence labeling model corresponding to the knowledge type to obtain a first annotation label and a second annotation label;
determining knowledge points under the knowledge type according to the first annotation label and the second annotation label;
wherein the first annotation label is used for indicating the starting position of the knowledge point in the text to be extracted, and the second annotation label is used for indicating the ending position of the knowledge point in the text to be extracted.
20. The apparatus of claim 19, wherein the extraction module is specifically configured to:
intercepting the text to be extracted from the starting position of the knowledge point indicated by the first annotation label to the ending position of the knowledge point indicated by the second annotation label to obtain the knowledge point under the knowledge type.
21. The apparatus according to any one of claims 16-20, wherein the splicing module is specifically configured to:
adding a separator before the first character of the text to be extracted;
concatenating the prompt information before the separator, wherein the last character of the prompt information is adjacent to the separator.
22. The apparatus of any one of claims 16-20, wherein the apparatus further comprises:
a second obtaining module, configured to, for any one knowledge type of the at least one knowledge type, obtain a candidate resource matching the knowledge type from a resource pool;
the determining module is used for determining a target resource matched with the knowledge point under the knowledge type from each candidate resource;
and the display module is used for displaying the target resource.
23. A training apparatus for knowledge extraction models, the apparatus comprising:
the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring a training text and prompt information, the prompt information is used for indicating information to be extracted in the training text, and the marking information of the training text is used for indicating the corresponding relation between at least one marking knowledge type and marking knowledge points;
the splicing module is used for splicing the training text and the prompt information to obtain an input text;
the extraction module is used for performing feature extraction on the input text to obtain the semantic features of the input text;
the extraction module is used for extracting knowledge from the semantic features by adopting a knowledge extraction model corresponding to a plurality of knowledge types respectively so as to obtain a predicted knowledge point under at least one predicted knowledge type in the plurality of knowledge types;
and the training module is used for performing combined training on the knowledge extraction models of the multiple knowledge types according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type.
24. The apparatus of claim 23, wherein the knowledge extraction model comprises a first classification network and a second classification network, and the extraction module is specifically configured to:
for any knowledge type in the knowledge types, adopting a first classification network corresponding to the knowledge type to classify the semantic features so as to obtain a first classification label of each character in the training text;
classifying the semantic features by adopting a second classification network corresponding to the knowledge type to obtain a second classification label of each character;
under the condition that each first classification tag comprises a first target tag and each second classification tag comprises a second target tag, taking the knowledge type as the prediction knowledge type, and determining a prediction knowledge point under the prediction knowledge type according to the first target tag and the second target tag;
the first target label is used for indicating that the corresponding character is a first character of the prediction knowledge point, and the second target label is used for indicating that the corresponding character is a tail character of the prediction knowledge point.
25. The apparatus of claim 23, wherein the knowledge extraction model comprises a sequence annotation model, and the extraction module is specifically configured to:
for any knowledge type among the multiple knowledge types, inputting the semantic features into the sequence labeling model corresponding to the knowledge type to obtain a prediction label of each character in the training text output by the sequence labeling model corresponding to the knowledge type;
under the condition that the prediction label of each character comprises a first prediction label and a second prediction label, taking the knowledge type as a prediction knowledge type, and determining a prediction knowledge point under the prediction knowledge type according to the first prediction label and the second prediction label;
the first prediction label is used for indicating that the corresponding character is a first character of the prediction knowledge point, and the second prediction label is used for indicating that the corresponding character is a tail character of the prediction knowledge point.
26. The apparatus according to any one of claims 23-25, wherein the training module is specifically configured to:
comparing each labeled knowledge type with each predicted knowledge type, so as to take, as a first knowledge type, a labeled knowledge type and a predicted knowledge type that are the same, and/or take, as a second knowledge type, a labeled knowledge type and a predicted knowledge type that are different;
generating a first loss value according to the difference between the predicted knowledge point and the labeled knowledge point belonging to the first knowledge type, and/or generating a second loss value according to the predicted knowledge point or the labeled knowledge point belonging to the second knowledge type;
and performing joint training on the knowledge extraction models of the plurality of knowledge types according to the first loss value and/or the second loss value.
27. The apparatus of claim 26, wherein the training module is specifically configured to:
for any of the first knowledge types, determining a first difference between a first position in the training text of an initial character in the annotated knowledge point belonging to the first knowledge type and a second position in the training text of an initial character in the predicted knowledge point belonging to the first knowledge type;
determining a second difference between a third position of a tail character in the training text in the labeled knowledge point belonging to the first knowledge type and a fourth position of the tail character in the training text in the predicted knowledge point belonging to the first knowledge type;
determining a first loss value based on the first difference and the second difference.
28. The apparatus of claim 27, wherein the training module is specifically configured to:
determining a first sub-loss value according to the first difference and the second difference;
determining a second sub-loss value according to a third difference between the first text and the second text; wherein the first text is obtained by removing the first character and the last character of the labeled knowledge point belonging to the first knowledge type from the training text, and the second text is obtained by removing the first character and the last character of the predicted knowledge point belonging to the first knowledge type from the training text;
determining a type loss value of the first knowledge type according to the first sub loss value and the second sub loss value;
determining the first loss value according to the type loss value of each first knowledge type.
29. The apparatus of claim 28, wherein the training module is specifically configured to:
determining a character loss value of each character in the first text according to a third difference between the first text and the second text;
determining a target character from all characters of the first text according to a set proportion;
and determining the second sub-loss value according to the character loss value of each target character.
30. The apparatus according to any one of claims 23-25, wherein the training module is specifically configured to:
determining a third loss value of the plurality of knowledge types according to the labeled knowledge points under each labeled knowledge type and the predicted knowledge points under each predicted knowledge type;
determining a fourth loss value according to the third loss values of the plurality of knowledge types;
and performing joint training on the knowledge extraction models of the plurality of knowledge types according to the fourth loss value.
31. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of knowledge extraction of any one of claims 1-7 or the method of training a knowledge extraction model of any one of claims 8-15.
32. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to execute the knowledge extraction method of any one of claims 1-7 or the training method of the knowledge extraction model of any one of claims 8-15.
33. A computer program product comprising a computer program which, when being executed by a processor, carries out the steps of the knowledge extraction method according to any one of claims 1 to 7, or the training method of the knowledge extraction model according to any one of claims 8 to 15.
CN202211466616.9A 2022-11-22 2022-11-22 Knowledge extraction method, training device, training equipment and training medium for model Active CN115688920B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211466616.9A CN115688920B (en) 2022-11-22 2022-11-22 Knowledge extraction method, training device, training equipment and training medium for model


Publications (2)

Publication Number Publication Date
CN115688920A true CN115688920A (en) 2023-02-03
CN115688920B CN115688920B (en) 2023-08-25

Family

ID=85053438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211466616.9A Active CN115688920B (en) 2022-11-22 2022-11-22 Knowledge extraction method, training device, training equipment and training medium for model

Country Status (1)

Country Link
CN (1) CN115688920B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116244517A (en) * 2023-03-03 2023-06-09 北京航空航天大学 Multi-scene multi-task recommendation method based on hierarchical information extraction network
CN116483982A (en) * 2023-06-25 2023-07-25 北京中关村科金技术有限公司 Knowledge question-answering method, knowledge question-answering device, electronic equipment and readable storage medium
CN117349425A (en) * 2023-12-04 2024-01-05 北京仁科互动网络技术有限公司 Knowledge item generation method, device, equipment and storage medium
CN117421426A (en) * 2023-12-18 2024-01-19 广东信聚丰科技股份有限公司 Knowledge point labeling method and system based on artificial intelligence assistance

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100228693A1 (en) * 2009-03-06 2010-09-09 phiScape AG Method and system for generating a document representation
GB201615371D0 (en) * 2015-11-11 2016-10-26 Adobe Systems Inc Structured knowledge modeling and extraction from images
CN112948535A (en) * 2019-12-10 2021-06-11 复旦大学 Method and device for extracting knowledge triples of text and storage medium
CN111881298A (en) * 2020-08-04 2020-11-03 上海交通大学 Semi-structured text processing and analyzing method
CN113392182A (en) * 2021-05-11 2021-09-14 宜通世纪物联网研究院(广州)有限公司 Knowledge matching method, device, equipment and medium fusing context semantic constraints
CN113656556A (en) * 2021-08-20 2021-11-16 上海大参林医疗健康科技有限公司 Text feature extraction method and knowledge graph construction method
CN114417785A (en) * 2021-12-27 2022-04-29 科大讯飞股份有限公司 Knowledge point annotation method, model training method, computer device, and storage medium
CN114610894A (en) * 2022-01-26 2022-06-10 清华大学 Multi-task joint knowledge mining method and device based on discourse context
CN114595686A (en) * 2022-03-11 2022-06-07 北京百度网讯科技有限公司 Knowledge extraction method, and training method and device of knowledge extraction model
CN114969371A (en) * 2022-05-31 2022-08-30 北京智谱华章科技有限公司 Heat sorting method and device of combined knowledge graph

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Hu Daiwang, Jiao Yiyuan, Li Yanni: "A novel and efficient entity relation extraction algorithm for library knowledge graphs", Journal of Xidian University (Natural Science Edition), pages 75-83 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116244517A (en) * 2023-03-03 2023-06-09 北京航空航天大学 Multi-scene multi-task recommendation method based on hierarchical information extraction network
CN116244517B (en) * 2023-03-03 2023-11-28 北京航空航天大学 Multi-scene multi-task model training method based on hierarchical information extraction network
CN116483982A (en) * 2023-06-25 2023-07-25 北京中关村科金技术有限公司 Knowledge question-answering method, knowledge question-answering device, electronic equipment and readable storage medium
CN116483982B (en) * 2023-06-25 2023-10-13 北京中关村科金技术有限公司 Knowledge question-answering method, knowledge question-answering device, electronic equipment and readable storage medium
CN117349425A (en) * 2023-12-04 2024-01-05 北京仁科互动网络技术有限公司 Knowledge item generation method, device, equipment and storage medium
CN117349425B (en) * 2023-12-04 2024-03-22 北京仁科互动网络技术有限公司 Knowledge item generation method, device, equipment and storage medium
CN117421426A (en) * 2023-12-18 2024-01-19 广东信聚丰科技股份有限公司 Knowledge point labeling method and system based on artificial intelligence assistance
CN117421426B (en) * 2023-12-18 2024-03-05 广东信聚丰科技股份有限公司 Knowledge point labeling method and system based on artificial intelligence assistance

Also Published As

Publication number Publication date
CN115688920B (en) 2023-08-25

Similar Documents

Publication Publication Date Title
CN111625635A (en) Question-answer processing method, language model training method, device, equipment and storage medium
CN110717339A (en) Semantic representation model processing method and device, electronic equipment and storage medium
CN115688920B (en) Knowledge extraction method, model training method, device, equipment and medium
CN113722493B (en) Text classification data processing method, apparatus and storage medium
EP4113357A1 (en) Method and apparatus for recognizing entity, electronic device and storage medium
CN110599324A (en) Method and device for predicting refund rate
CN111046656A (en) Text processing method and device, electronic equipment and readable storage medium
CN113392209A (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN112148881A (en) Method and apparatus for outputting information
CN111859953A (en) Training data mining method and device, electronic equipment and storage medium
CN112507702A (en) Text information extraction method and device, electronic equipment and storage medium
EP4057283A2 (en) Method for detecting voice, method for training, apparatuses and smart speaker
CN113641830A (en) Model pre-training method and device, electronic equipment and storage medium
CN113553412A (en) Question and answer processing method and device, electronic equipment and storage medium
CN114218940B (en) Text information processing and model training method, device, equipment and storage medium
CN114547301A (en) Document processing method, document processing device, recognition model training equipment and storage medium
CN114416976A (en) Text labeling method and device and electronic equipment
CN113705192A (en) Text processing method, device and storage medium
CN112528658A (en) Hierarchical classification method and device, electronic equipment and storage medium
CN112541070A (en) Method and device for excavating slot position updating corpus, electronic equipment and storage medium
CN114239583B (en) Method, device, equipment and medium for training entity chain finger model and entity chain finger
CN114611521B (en) Entity identification method, device, equipment and storage medium
CN116467461A (en) Data processing method, device, equipment and medium applied to power distribution network
CN114417974B (en) Model training method, information processing device, electronic equipment and medium
CN112883183B (en) Method for constructing multi-classification model, intelligent customer service method, and related device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant