CN114742058A - Named entity extraction method and device, computer equipment and storage medium - Google Patents

Named entity extraction method and device, computer equipment and storage medium

Info

Publication number
CN114742058A
CN114742058A (application CN202210375268.8A)
Authority
CN
China
Prior art keywords
entity
entity extraction
extraction model
training
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210375268.8A
Other languages
Chinese (zh)
Other versions
CN114742058B (en)
Inventor
袁扬
朱运
乔建秀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202210375268.8A priority Critical patent/CN114742058B/en
Publication of CN114742058A publication Critical patent/CN114742058A/en
Application granted granted Critical
Publication of CN114742058B publication Critical patent/CN114742058B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of this application belong to the field of natural language processing within artificial intelligence and relate to a named entity extraction method, a named entity extraction device, a computer device, and a storage medium. The application also relates to blockchain technology, and a user's target entity extraction model can be stored in the blockchain. The method uses a neural network model to automatically extract entity nouns in the insurance field, overcoming the shortcomings of traditional approaches based on manual construction and rule template matching; and because the target entity extraction model is trained on corpora related to the training domain text, the method's entity extraction maintains high robustness, generalization capability, and execution capability.

Description

Named entity extraction method and device, computer equipment and storage medium
Technical Field
The present application relates to the technical field of natural language processing in artificial intelligence, and in particular, to a named entity extraction method, apparatus, computer device, and storage medium.
Background
Entity extraction, also commonly referred to as named entity extraction, covers entity detection and classification. It generally serves as foundational work for text information processing and has a wide range of application scenarios, such as knowledge graphs, information extraction, automatic summarization, automatic question answering, and recommendation systems.
In existing entity extraction methods, the entity vocabulary is either built manually from rules by domain experts, or constructed by searching or classifying knowledge bases such as semantic networks, thesauri, and lexicons.
However, the applicant has found that traditional entity extraction methods are generally not intelligent: because the vocabulary scale is limited, they depend heavily on expert knowledge or on the coverage of the vocabulary, so their coverage of entity words such as new words, rare words, short words, abbreviations, and alternative names is very limited, and a great amount of manpower, time, and resources must be invested in long-cycle update iterations. Traditional entity extraction methods therefore suffer from low robustness, generalization capability, and execution capability.
Disclosure of Invention
An embodiment of the present application aims to provide a named entity extraction method, a named entity extraction device, a computer device, and a storage medium, so as to solve the problem that conventional entity extraction methods have low robustness, generalization capability, and execution capability.
In order to solve the above technical problem, an embodiment of the present application provides a named entity extraction method, which adopts the following technical solutions:
acquiring a target entity category;
performing an entity category labeling operation on an existing domain vocabulary according to the target entity category to obtain training domain text;
performing a first parameter adjustment operation on a pre-trained entity extraction model according to the training domain text to obtain an intermediate entity extraction model, wherein the pre-trained entity extraction model is trained on a public named entity extraction corpus and consists of a BERT model, a BiLSTM layer, and a CRF layer;
performing an initial entity recognition operation on query sentences of a target database according to the intermediate entity extraction model to obtain an initial entity recognition result;
acquiring corrected corpus data, sent by a user terminal, corresponding to the initial entity recognition result;
performing a second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model;
and performing automatic named entity extraction according to the target entity extraction model.
In order to solve the above technical problem, an embodiment of the present application further provides a named entity extracting device, which adopts the following technical solutions:
a target entity category acquisition module, configured to acquire a target entity category;
an entity category labeling module, configured to perform an entity category labeling operation on an existing domain vocabulary according to the target entity category to obtain training domain text;
a first parameter adjustment module, configured to perform a first parameter adjustment operation on a pre-trained entity extraction model according to the training domain text to obtain an intermediate entity extraction model, wherein the pre-trained entity extraction model is trained on a public named entity extraction corpus and consists of a BERT model, a BiLSTM layer, and a CRF layer;
an initial entity recognition module, configured to perform an initial entity recognition operation on query sentences of a target database according to the intermediate entity extraction model to obtain an initial entity recognition result;
a corrected corpus acquisition module, configured to acquire corrected corpus data, sent by a user terminal, corresponding to the initial entity recognition result;
a second parameter adjustment module, configured to perform a second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model;
and a model application module, configured to perform automatic named entity extraction according to the target entity extraction model.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which adopts the following technical solutions:
comprising a memory in which computer readable instructions are stored, and a processor which, when executing the computer readable instructions, implements the steps of the named entity extraction method described above.
In order to solve the foregoing technical problem, an embodiment of the present application further provides a computer-readable storage medium, which adopts the following technical solutions:
the computer readable storage medium has stored thereon computer readable instructions which, when executed by a processor, implement the steps of the named entity extraction method as described above.
Compared with the prior art, the embodiment of the application mainly has the following beneficial effects:
the application provides a named entity extraction method, which comprises the following steps: acquiring a target entity type; performing entity category labeling operation on the word list of the existing field according to the target entity category to obtain a text of the training field; performing first parameter adjustment operation on a pre-trained entity extraction model according to the training field text to obtain an intermediate entity extraction model, wherein the pre-trained entity extraction model is obtained by training based on a published named entity extraction corpus, and the pre-trained entity extraction model consists of a Bert model, a BilSTM layer and a CRF layer; performing entity initial identification operation on the query statement of the target database according to the intermediate entity extraction model to obtain an entity initial identification result; acquiring corrected corpus data which is sent by a user terminal and corresponds to the entity initial identification result; performing second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model; and carrying out automatic extraction operation of the named entity according to the target entity extraction model. The method trains a pre-trained entity extraction model consisting of a Bert model, a BilSTM layer and a CRF layer through conventional named entity recognition public corpus resources, performs first parameter adjustment on the pre-trained entity extraction model according to a target entity category and a training field text constructed by a word list in the prior art, performs initial entity recognition operation according to an intermediate entity extraction model after the first parameter adjustment, and finally performs second parameter adjustment on the intermediate entity extraction model through corrected corpus data fed back by a user terminal according to an initial entity recognition result after the initial entity recognition operation to finally obtain a target entity extraction model according with the target entity category so as to perform automatic entity extraction work. Meanwhile, the target entity extraction model is obtained by training according to the corpus related to the text in the training field, so that the entity extraction of the application keeps higher robustness, generalization capability and execution capability.
Drawings
In order to more clearly illustrate the solution of the present application, the drawings needed for describing the embodiments of the present application will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
fig. 2 is a flowchart illustrating an implementation of a named entity extraction method according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of one embodiment of step S202 of FIG. 2;
FIG. 4 is a flowchart of an embodiment of obtaining a pre-trained entity extraction model according to an embodiment of the present disclosure;
FIG. 5 is a flowchart illustrating another embodiment of obtaining a pre-trained entity extraction model according to an embodiment of the present application;
FIG. 6 is a diagram illustrating a vector similarity calculation operation according to an embodiment of the present disclosure;
FIG. 7 is a flowchart of one embodiment of step S505 of FIG. 5;
FIG. 8 is a flowchart of another embodiment of step S505 of FIG. 5;
fig. 9 is a schematic structural diagram of a named entity extraction apparatus according to a second embodiment of the present application;
FIG. 10 is a block diagram illustrating one embodiment of the entity class labeling module 220 of FIG. 9;
FIG. 11 is a schematic block diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof in the description and claims of this application and the description of the figures above, are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. Network 104 is the medium used to provide communication links between terminal devices 101, 102, 103 and server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III, mpeg compression standard Audio Layer 3), MP4 players (Moving Picture Experts Group Audio Layer IV, mpeg compression standard Audio Layer 4), laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the named entity extraction method provided in the embodiment of the present application is generally executed by a server/terminal device, and accordingly, the named entity extraction apparatus is generally disposed in the server/terminal device.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for an implementation.
Continuing to refer to fig. 2, a flowchart of an implementation of the named entity extraction method provided in an embodiment of the present application is shown, and for convenience of description, only the portion related to the present application is shown.
The named entity extraction method comprises the following steps: step S201, step S202, step S203, step S204, step S205, step S206, and step S207.
Step S201: acquire the target entity category.
In the embodiments of the present application, the target entity category refers to the domain entity category to be extracted, formulated according to actual business requirements.
In the embodiments of the present application, the target entity category may be sent from a user terminal or entered at a device terminal; it should be understood that these examples of obtaining the target entity category are only for ease of understanding and are not intended to limit the present application.
Step S202: perform an entity category labeling operation on the existing domain vocabulary according to the target entity category to obtain training domain text.
In the embodiments of the present application, the existing domain vocabulary refers to an existing domain dictionary or word list.
In the embodiments of the present application, the entity category labeling operation may match entity words and automatically label their entity categories by means of a string matching method.
In the embodiments of the present application, the training domain text is mainly used to construct the training set and validation set for model training.
Step S203: perform a first parameter adjustment operation on the pre-trained entity extraction model according to the training domain text to obtain an intermediate entity extraction model, where the pre-trained entity extraction model is trained on a public named entity extraction corpus and consists of a BERT model, a BiLSTM layer, and a CRF layer.
In the embodiments of the present application, the training set and validation set constructed in step S202 are combined with the target entity category to fine-tune the pre-trained entity extraction model, and the model parameters are saved.
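The patent does not give source code for the BERT-BiLSTM-CRF stack; the following is a minimal sketch of how such a model is commonly assembled, assuming PyTorch, the HuggingFace transformers library, and the pytorch-crf package (the checkpoint name, layer sizes, and module names are illustrative assumptions, not taken from the patent):

```python
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF  # pytorch-crf package

class BertBiLSTMCRF(nn.Module):
    """BERT encoder -> BiLSTM -> CRF sequence tagger (illustrative sketch)."""
    def __init__(self, num_tags, bert_name="bert-base-chinese", lstm_hidden=256):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * lstm_hidden, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        # Contextual token embeddings from BERT
        hidden = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        feats, _ = self.lstm(hidden)      # BiLSTM over the token embeddings
        emissions = self.fc(feats)        # per-token tag scores
        mask = attention_mask.bool()
        if tags is not None:              # training: CRF negative log-likelihood
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        return self.crf.decode(emissions, mask=mask)  # inference: best tag paths
```

Both fine-tuning passes (steps S203 and S206) can reuse this same module, differing only in the corpus fed to it.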
Step S204: perform an initial entity recognition operation on the query sentences of the target database according to the intermediate entity extraction model to obtain an initial entity recognition result.
In the embodiments of the present application, the target database refers to the question-and-answer database used in the actual application. Entity recognition is performed on the query sentences of the target database; the recognized entity words are counted, classified, and sorted by category and word frequency, and compared against the existing standard category vocabulary; newly found words are extracted and submitted for manual proofreading to supplement the existing vocabulary.
In the embodiments of the present application, the extraction results over the queries can also be randomly sampled at intervals according to the sentence length distribution and submitted for manual verification to construct a standard domain entity extraction training corpus, while the actual performance of the model is evaluated manually.
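A minimal sketch of the new-word mining and length-interval sampling just described (the data formats, bin boundaries, and sample sizes are illustrative assumptions):

```python
import random
from collections import Counter

def mine_new_words(recognized, standard_vocab):
    """Count recognized (word, category) pairs and return those absent from
    the standard category vocabulary, sorted by frequency for manual proofreading."""
    freq = Counter(recognized)  # recognized: iterable of (word, category) pairs
    new = [(w, c, n) for (w, c), n in freq.items() if w not in standard_vocab]
    return sorted(new, key=lambda x: -x[2])

def sample_by_length(sentences, bins=((0, 10), (10, 20), (20, 40), (40, 10_000)),
                     k_per_bin=50, seed=0):
    """Interval random sampling by sentence length: draw up to k sentences from
    each length bucket so manual review covers short and long queries alike."""
    rng = random.Random(seed)
    picked = []
    for lo, hi in bins:
        bucket = [s for s in sentences if lo <= len(s) < hi]
        picked.extend(rng.sample(bucket, min(k_per_bin, len(bucket))))
    return picked
```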
Step S205: acquire the corrected corpus data, sent by the user terminal, corresponding to the initial entity recognition result.
In the embodiments of the present application, because the constructed training set and validation set are generated by direct matching against the vocabulary, they contain considerable errors and require manual proofreading.
Step S206: perform a second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain the target entity extraction model.
In the embodiments of the present application, since the constructed training set and validation set are generated by direct matching against the vocabulary and contain considerable errors, the intermediate entity extraction model must be fine-tuned a second time using the real domain entity word labeling corpus obtained from the feedback in step S205, and the domain entity extraction model parameters are updated and saved.
Step S207: perform automatic named entity extraction according to the target entity extraction model.
In an embodiment of the present application, a named entity extraction method is provided, comprising: acquiring a target entity category; performing an entity category labeling operation on an existing domain vocabulary according to the target entity category to obtain training domain text; performing a first parameter adjustment operation on a pre-trained entity extraction model according to the training domain text to obtain an intermediate entity extraction model, wherein the pre-trained entity extraction model is trained on a public named entity extraction corpus and consists of a BERT model, a BiLSTM layer, and a CRF layer; performing an initial entity recognition operation on query sentences of a target database according to the intermediate entity extraction model to obtain an initial entity recognition result; acquiring corrected corpus data, sent by a user terminal, corresponding to the initial entity recognition result; performing a second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model; and performing automatic named entity extraction according to the target entity extraction model. The method trains a pre-trained entity extraction model consisting of a BERT model, a BiLSTM layer, and a CRF layer on conventional public named entity recognition corpus resources; performs a first parameter adjustment on it using the target entity category and the training domain text constructed from the existing domain vocabulary; performs initial entity recognition with the intermediate entity extraction model obtained after the first parameter adjustment; and then performs a second parameter adjustment on the intermediate entity extraction model using the corrected corpus data fed back by the user terminal for the initial entity recognition result, finally obtaining a target entity extraction model that matches the target entity category and can perform automatic entity extraction. Because the target entity extraction model is trained on corpora related to the training domain text, the entity extraction of this application maintains high robustness, generalization capability, and execution capability.
Continuing to refer to fig. 3, a flowchart of one embodiment of step S202 of fig. 2 is shown, and for ease of illustration, only the portions relevant to the present application are shown.
In some optional implementation manners of this embodiment, step S202 specifically includes: step S301, step S302, and step S303.
Step S301: sort the existing domain vocabulary in descending order of word length to obtain a sorted domain vocabulary.
Step S302: perform an entity word matching operation on the sorted domain vocabulary using a string matching method to obtain vocabulary entity words.
In the embodiments of the present application, the string matching method refers to finding all positions at which a pattern string P occurs within a larger text string T, where T is called the text, P is called the pattern, and both T and P are defined over the same alphabet Σ. The string matching method may be, for example (a runnable KMP sketch follows this list):
1) the brute-force method, i.e., matching by the most intuitive, direct character-by-character comparison;
2) the Rabin-Karp algorithm, which uses hashing to perform string matching;
3) KMP, which builds a next array from the prefix-suffix properties of the pattern string and scans the text in a single pass using that array to obtain the matching results.
It should be understood that the above examples of string matching methods are only for ease of understanding and are not intended to limit the present application.
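As a concrete reference for option 3), here is a self-contained KMP implementation (the example strings are hypothetical insurance-domain terms, not taken from the patent):

```python
def kmp_search(text: str, pattern: str) -> list:
    """Return all start indices where pattern occurs in text, using KMP."""
    if not pattern:
        return []
    # nxt[i]: length of the longest proper prefix of pattern[:i+1]
    # that is also a suffix of it
    nxt = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = nxt[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        nxt[i] = k
    hits, k = [], 0
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = nxt[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):        # full match ending at position i
            hits.append(i - k + 1)
            k = nxt[k - 1]           # keep scanning for overlapping matches
    return hits

print(kmp_search("重疾险和重疾险种", "重疾险"))  # [0, 4]
```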
Step S303: perform an entity category labeling operation on the vocabulary entity words according to the target entity category to obtain the training domain text.
In the embodiments of the present application, because the ordering of the existing domain vocabulary is chaotic and irregular, different entity words may overlap at their boundaries within a sentence, which in turn affects the accuracy of the entity category labeling operation. Sorting the existing domain vocabulary in descending order of word length avoids such boundary overlaps between different entity words and effectively ensures the accuracy of the entity category labeling operation.
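The patent does not specify a labeling format; the following sketch of steps S301 to S303 combined assumes BIO tags and a word-to-category dictionary (both assumptions):

```python
def label_training_text(sentences, vocab):
    """Auto-label sentences with BIO tags by longest-first string matching.
    vocab: dict mapping entity word -> entity category (assumed format).
    Matching longer words first keeps shorter vocabulary entries from
    splitting longer entities at their boundaries (step S301)."""
    words = sorted(vocab, key=len, reverse=True)  # descending word length
    labeled = []
    for sent in sentences:
        tags = ["O"] * len(sent)
        for w in words:
            start = sent.find(w)
            while start != -1:
                span = range(start, start + len(w))
                if all(tags[i] == "O" for i in span):  # never overwrite a match
                    tags[start] = "B-" + vocab[w]
                    for i in span:
                        if i != start:
                            tags[i] = "I-" + vocab[w]
                start = sent.find(w, start + 1)
        labeled.append((sent, tags))
    return labeled
```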
Continuing to refer to fig. 4, a flowchart of a specific implementation of obtaining a pre-trained entity extraction model according to an embodiment of the present application is shown, and for convenience of illustration, only the relevant portions of the present application are shown.
In some optional implementations of this embodiment, before step S203, the method further includes: step S401 and step S402.
Step S401: pre-train the entity extraction model based on the BERT language model using general corpus samples and the entity labels corresponding to those samples.
Step S402: fine-tune the pre-trained BERT language model using specific entity corpus samples and the entity labels corresponding to those samples to obtain the pre-trained entity extraction model.
In the embodiments of the present application, the entity-related word vector sequences in the text are extracted by fine-tuning the pre-trained BERT language model.
Continuing to refer to fig. 5, a flow chart of another specific implementation of obtaining a pre-trained entity extraction model according to an embodiment of the present application is shown, and for convenience of explanation, only the relevant portions of the present application are shown.
In some optional implementations of this embodiment, before step S203, the method further includes: step S501, step S502, step S503, step S504, and step S505.
Step S501: read a training database and obtain a training text data set from it, where the training text data set includes at least a first positive sample, a second positive sample of the same category as the first positive sample, and a random sample of a different category from the first positive sample.
In the embodiments of the present application, the training text data set is a triplet data set of (sentence 1, sentence 2, sentence 3), where the categories of sentences 1, 2, and 3 are A, A, and B respectively. The first two elements are any two positive samples of the same category; the last element may be randomly drawn from other categories, drawn from categories that are hard to distinguish from category A, or drawn by a combination of the two approaches.
Step S502: input the first positive sample, the second positive sample, and the random sample respectively into the original BERT network for a feature transformation operation, obtaining a first feature vector, a second feature vector, and a random feature vector.
In the embodiments of the present application, the original BERT network refers to the original feature vector transformation model without any training. The input triplet passes through the same BERT layer; BERT acts as an encoder whose goal is to output a sentence vector characterizing the semantics.
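A minimal sketch of using BERT as the sentence encoder (the checkpoint name and the [CLS] pooling choice are assumptions; the patent only states that BERT outputs a semantic sentence vector):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")

def sentence_vector(sent: str) -> torch.Tensor:
    """Encode a sentence into a single vector via the [CLS] hidden state.
    (Mean pooling over token states is an equally common alternative.)"""
    inputs = tokenizer(sent, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden[0, 0]                               # [CLS] token vector
```

All three triplet elements are encoded with this same, weight-sharing encoder.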
Step S503: perform a vector similarity calculation operation on the first feature vector and the second feature vector to obtain the homogeneous vector similarity.
Step S504: perform a vector similarity calculation operation on the first feature vector and the random feature vector to obtain the non-homogeneous vector similarity.
In the embodiments of the present application, referring to fig. 6, which shows a schematic diagram of the vector similarity calculation operation, sent1, sent2, and sent3 are sentence vectors output by the shared BERT layer. Because each group of samples may contain noisy samples, the triplet loss can be calculated either from the mean of the similarities between all sample vectors of each group and sent1, or from the maximum similarity between the positive example bag and sent1 (indicating the sample most similar to sentence 1) together with the minimum similarity between the negative example bag and sent1 (indicating the sample least similar to sentence 1).
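A sketch of the two bag-reduction strategies just described, assuming cosine similarity (the patent does not fix the similarity measure):

```python
import torch
import torch.nn.functional as F

def bag_similarities(anchor, pos_bag, neg_bag, strategy="max_min"):
    """Reduce per-bag similarities to one positive and one negative score.
    anchor: (d,) sentence vector; pos_bag, neg_bag: (k, d) stacked vectors.
    'mean' averages over each bag; 'max_min' takes the positive sample most
    similar to the anchor and the negative sample least similar, which
    weakens the influence of mislabeled (noisy) bag members."""
    pos_sims = F.cosine_similarity(anchor.unsqueeze(0), pos_bag, dim=1)
    neg_sims = F.cosine_similarity(anchor.unsqueeze(0), neg_bag, dim=1)
    if strategy == "mean":
        return pos_sims.mean(), neg_sims.mean()
    return pos_sims.max(), neg_sims.min()
```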
Step S505: train the BERT network based on the homogeneous vector similarity, the non-homogeneous vector similarity, and the triplet loss function to obtain the pre-trained entity extraction model.
In the embodiments of the present application, assume each bag in multi-instance learning contains three sentences. Then the triplet (sentence 1, sentence 2, sentence 3) above becomes (sentence 1, [sentence 2[0], sentence 2[1], sentence 2[2]], [sentence 3[0], sentence 3[1], sentence 3[2]]), i.e., the input is one positive example, a bag of same-category positive samples, and a bag of different-category negative samples.
Training step 1: the 7 sentences pass simultaneously through the same weight-sharing BERT model, which outputs 7 vectors.
Training step 2: calculate the similarity of sentence 1 to each of the other 6 sentences.
Training step 3: from the positive example bag, take the sample most similar to sentence 1: max(Sim(sentence 1, sentence 2[0]), Sim(sentence 1, sentence 2[1]), Sim(sentence 1, sentence 2[2])) (suppose the highest of the three is the similarity between sentence 1 and sentence 2[1]). Since sentence 2[0] is actually noisy data with a label error, this calculation weakens its effect on the model.
Training step 4: from the negative example bag, take the sample most dissimilar to sentence 1: min(Sim(sentence 1, sentence 3[0]), Sim(sentence 1, sentence 3[1]), Sim(sentence 1, sentence 3[2])) (suppose the least similar of the three is sentence 3[1]).
Training step 5: triplet loss: make the similarity gap between the results of training step 3 and training step 4 as large as possible.
Training step 6: meanwhile, perform another task: in addition to the triplet loss above, which widens the inter-class gap, sentence 1 is also given a multi-class classification.
Training step 7: after the model is trained, the classification of sentence 1 can be obtained; in this process, with the effect of labeling noise weakened, the classes are pushed further apart. At this point only the model path from sentence 1 to its category prediction needs to be retained, and it can be used to predict the classification of any input sentence.
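A compact sketch of training steps 1 through 6 as a single optimization step (the `encode` and `classifier` modules, the margin value, and cosine similarity are assumptions; the patent specifies the steps but not the code):

```python
import torch
import torch.nn.functional as F

def train_step(anchor_sent, pos_bag_sents, neg_bag_sents, label,
               encode, classifier, optimizer, margin=0.3):
    """One multi-task step: bag-based triplet loss plus multi-class loss."""
    a = encode(anchor_sent)                                   # shared encoder
    pos = torch.stack([encode(s) for s in pos_bag_sents])     # (3, d)
    neg = torch.stack([encode(s) for s in neg_bag_sents])     # (3, d)
    pos_sim = F.cosine_similarity(a.unsqueeze(0), pos).max()  # step 3
    neg_sim = F.cosine_similarity(a.unsqueeze(0), neg).min()  # step 4
    triplet = F.relu(neg_sim - pos_sim + margin)              # step 5
    ce = F.cross_entropy(classifier(a).unsqueeze(0),
                         torch.tensor([label]))               # step 6
    loss = triplet + ce
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```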
Continuing to refer to fig. 7, a flowchart of one embodiment of step S505 of fig. 5 is shown, and for ease of illustration, only the portions relevant to the present application are shown.
In some optional implementation manners of this embodiment, the step S505 specifically includes: step S701, step S702, and step S703.
In step S701, the average of the homogeneous similarities is calculated to obtain the average homogeneous vector.
In practical application, if the similarity between sentence 1 and sentence 2[0] is 60, between sentence 1 and sentence 2[1] is 70, and between sentence 1 and sentence 2[2] is 80, the calculated average similarity is 70, so the average homogeneous vector corresponds to Sim(sentence 1, sentence 2[1]).
In step S702, the average of the non-homogeneous similarities is calculated to obtain the average non-homogeneous vector.
In the embodiments of the present application, the calculation of the average of the non-homogeneous similarities is implemented in the same way as the calculation of the average of the homogeneous similarities described above.
In step S703, a reverse update operation is performed on the BERT network based on the first feature vector, the average homogeneous vector, the average non-homogeneous vector, and the triplet loss function, so as to obtain the pre-trained entity extraction model.
In the embodiments of the present application, the reverse update operation is mainly used to dynamically update the representation parameters of the BERT network according to the changes in the average homogeneous vector and the average non-homogeneous vector.
Continuing to refer to fig. 8, a flowchart of another embodiment of step S505 in fig. 5 is shown, and for convenience of illustration, only the relevant portions of the present application are shown.
In some optional implementation manners of the first embodiment of the present application, step S505 specifically includes: step S801, step S802, and step S803.
In step S801, the maximum homogeneous vector with the greatest similarity is obtained from the second feature vectors based on the homogeneous similarity.
In the embodiments of the present application, if the similarity between sentence 1 and sentence 2[0] is 60, between sentence 1 and sentence 2[1] is 70, and between sentence 1 and sentence 2[2] is 80, the calculated maximum similarity is 80, so the maximum homogeneous vector is Sim(sentence 1, sentence 2[2]).
In step S802, the minimum random vector with the least similarity is obtained from the random feature vectors based on the non-homogeneous similarity.
In the embodiments of the present application, obtaining the minimum random vector with the least similarity is implemented in the same way as obtaining the maximum homogeneous vector with the greatest similarity.
In step S803, the BERT network is reversely updated based on the first feature vector, the maximum homogeneous vector, the minimum random vector, and the triplet loss function, so as to obtain the pre-trained entity extraction model.
In the embodiments of the present application, the reverse update operation is mainly used to dynamically update the representation parameters of the BERT network according to the changes in the maximum homogeneous vector and the minimum random vector.
In some optional implementations of the first embodiment of the present application, the triplet loss function is expressed as:

$$L = \sum_{i=1}^{N}\left[\left\|f(x_i^a)-f(x_i^p)\right\|_2^2-\left\|f(x_i^a)-f(x_i^n)\right\|_2^2+\alpha\right]_+$$

where $N$ represents the total number of triplets in the training set; $x_i^a$ represents the first positive sample; $f(x_i^a)$ represents the first feature vector; $x_i^p$ represents the second positive sample; $f(x_i^p)$ represents the second feature vector; $x_i^n$ represents the random sample; $f(x_i^n)$ represents the random feature vector; and $\alpha$ represents the minimum margin between the distance from the first positive sample to the second positive sample and the distance from the first positive sample to the random sample.
In the embodiments of the present application, $a$ refers to the anchor and denotes the first positive sample; $p$ refers to positive and denotes the second positive sample; $n$ refers to negative and denotes the random sample.
In the embodiments of the present application, the subscript $+$ means that when the value inside $[\,]$ is greater than zero, the loss takes that value, and when it is not greater than zero, the loss is zero.
When the distance between $f(x_i^a)$ and $f(x_i^n)$ is less than the sum of the distance between $f(x_i^a)$ and $f(x_i^p)$ and $\alpha$, the value inside $[\,]$ is greater than zero and a loss is incurred.
When the distance between $f(x_i^a)$ and $f(x_i^n)$ is greater than or equal to the sum of the distance between $f(x_i^a)$ and $f(x_i^p)$ and $\alpha$, the loss is zero.
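This hinge behavior matches PyTorch's built-in triplet loss; a minimal numerical check follows (all values illustrative; note that `triplet_margin_loss` uses non-squared L2 distance and a mean rather than a sum):

```python
import torch
import torch.nn.functional as F

anc = torch.randn(4, 128)               # anchor vectors   f(x^a)
pos = anc + 0.05 * torch.randn(4, 128)  # nearby positives f(x^p)
neg = torch.randn(4, 128)               # random negatives f(x^n)

# mean_i [ ||a - p||_2 - ||a - n||_2 + margin ]_+
loss = F.triplet_margin_loss(anc, pos, neg, margin=0.2)
print(loss.item())  # approaches 0 once each negative is at least `margin` farther away
```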
In summary, the present application provides a named entity extraction method, comprising: acquiring a target entity category; performing an entity category labeling operation on an existing domain vocabulary according to the target entity category to obtain training domain text; performing a first parameter adjustment operation on a pre-trained entity extraction model according to the training domain text to obtain an intermediate entity extraction model, wherein the pre-trained entity extraction model is trained on a public named entity extraction corpus and consists of a BERT model, a BiLSTM layer, and a CRF layer; performing an initial entity recognition operation on query sentences of a target database according to the intermediate entity extraction model to obtain an initial entity recognition result; acquiring corrected corpus data, sent by a user terminal, corresponding to the initial entity recognition result; performing a second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model; and performing automatic named entity extraction according to the target entity extraction model. The method trains a pre-trained entity extraction model consisting of a BERT model, a BiLSTM layer, and a CRF layer on conventional public named entity recognition corpus resources; performs a first parameter adjustment using the target entity category and the training domain text constructed from the existing domain vocabulary; performs initial entity recognition with the resulting intermediate entity extraction model; and then performs a second parameter adjustment using the corrected corpus data fed back by the user terminal for the initial entity recognition result, finally obtaining a target entity extraction model that matches the target entity category and can perform automatic entity extraction. Because the target entity extraction model is trained on corpora related to the training domain text, the entity extraction of this application maintains high robustness, generalization capability, and execution capability. Furthermore, sorting the existing domain vocabulary in descending order of word length avoids boundary overlaps between different entity words within sentences and effectively ensures the accuracy of the entity category labeling operation; and the entity-related word vector sequences in the text are extracted by fine-tuning the pre-trained BERT language model.
It should be emphasized that, to further ensure the privacy and security of the target entity extraction model, the target entity extraction model may also be stored in a node of a blockchain.
The blockchain referred to in this application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated by cryptographic methods, each containing the information of a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by instructing relevant hardware through computer readable instructions, which can be stored in a computer readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown sequentially as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and their execution order is not necessarily sequential; they may be performed in turns or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Example two
With further reference to fig. 9, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a named entity extracting apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be applied to various electronic devices.
As shown in fig. 9, the named entity extraction apparatus 200 of this embodiment includes: a target entity category acquisition module 210, an entity category labeling module 220, a first parameter adjustment module 230, an initial entity recognition module 240, a corrected corpus acquisition module 250, a second parameter adjustment module 260, and a model application module 270. Wherein:
a target entity category acquisition module 210, configured to acquire a target entity category;
an entity category labeling module 220, configured to perform an entity category labeling operation on an existing domain vocabulary according to the target entity category to obtain training domain text;
a first parameter adjustment module 230, configured to perform a first parameter adjustment operation on a pre-trained entity extraction model according to the training domain text to obtain an intermediate entity extraction model, wherein the pre-trained entity extraction model is trained on a public named entity extraction corpus and consists of a BERT model, a BiLSTM layer, and a CRF layer;
an initial entity recognition module 240, configured to perform an initial entity recognition operation on query sentences of a target database according to the intermediate entity extraction model to obtain an initial entity recognition result;
a corrected corpus acquisition module 250, configured to acquire corrected corpus data, sent by a user terminal, corresponding to the initial entity recognition result;
a second parameter adjustment module 260, configured to perform a second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model;
and a model application module 270, configured to perform automatic named entity extraction according to the target entity extraction model.
In the embodiments of the present application, the target entity category refers to the domain entity category to be extracted, formulated according to actual business requirements.
In the embodiments of the present application, the target entity category may be sent from a user terminal or entered at a device terminal; it should be understood that these examples of obtaining the target entity category are only for ease of understanding and are not intended to limit the present application.
In the embodiments of the present application, the existing domain vocabulary refers to an existing domain dictionary or word list.
In the embodiments of the present application, the entity category labeling operation may match entity words and automatically label their entity categories by means of a string matching method.
In the embodiments of the present application, the training domain text is mainly used to construct the training set and validation set for model training.
In the embodiments of the present application, the training set and validation set constructed by the entity category labeling module 220 are combined with the target entity category to fine-tune the pre-trained entity extraction model and save the model parameters.
In the embodiments of the present application, the target database refers to the question-and-answer database used in the actual application. Entity recognition is performed on the query sentences of the target database; the recognized entity words are counted, classified, and sorted by category and word frequency, and compared against the existing standard category vocabulary; newly found words are extracted and submitted for manual proofreading to supplement the existing vocabulary.
In the embodiments of the present application, the extraction results over the queries can also be randomly sampled at intervals according to the sentence length distribution and submitted for manual verification to construct a standard domain entity extraction training corpus, while the actual performance of the model is evaluated manually.
In the embodiments of the present application, because the constructed training set and validation set are generated by direct matching against the vocabulary, they contain considerable errors and require manual proofreading.
In the embodiments of the present application, since the constructed training set and validation set are generated by direct matching against the vocabulary and contain considerable errors, the intermediate entity extraction model must be fine-tuned a second time using the real domain entity word labeling corpus obtained from the feedback of the corrected corpus acquisition module 250, and the domain entity extraction model parameters are updated and saved.
In an embodiment of the present application, a named entity extraction apparatus 200 is provided, comprising: a target entity category acquisition module 210, configured to acquire a target entity category; an entity category labeling module 220, configured to perform an entity category labeling operation on an existing domain vocabulary according to the target entity category to obtain training domain text; a first parameter adjustment module 230, configured to perform a first parameter adjustment operation on a pre-trained entity extraction model according to the training domain text to obtain an intermediate entity extraction model, wherein the pre-trained entity extraction model is trained on a public named entity extraction corpus and consists of a BERT model, a BiLSTM layer, and a CRF layer; an initial entity recognition module 240, configured to perform an initial entity recognition operation on query sentences of a target database according to the intermediate entity extraction model to obtain an initial entity recognition result; a corrected corpus acquisition module 250, configured to acquire corrected corpus data, sent by a user terminal, corresponding to the initial entity recognition result; a second parameter adjustment module 260, configured to perform a second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model; and a model application module 270, configured to perform automatic named entity extraction according to the target entity extraction model. The apparatus trains a pre-trained entity extraction model consisting of a BERT model, a BiLSTM layer, and a CRF layer on conventional public named entity recognition corpus resources; performs a first parameter adjustment using the target entity category and the training domain text constructed from the existing domain vocabulary; performs initial entity recognition with the resulting intermediate entity extraction model; and then performs a second parameter adjustment using the corrected corpus data fed back by the user terminal for the initial entity recognition result, finally obtaining a target entity extraction model that matches the target entity category and can perform automatic entity extraction. Because the target entity extraction model is trained on corpora related to the training domain text, the entity extraction of this application maintains high robustness, generalization capability, and execution capability.
Continuing to refer to FIG. 10, a schematic block diagram of one embodiment of the entity category labeling module 220 of FIG. 9 is shown, and for ease of illustration, only the portions relevant to the present application are shown.
In some optional implementations of this embodiment, the entity category labeling module 220 includes: a sorting sub-module 221, an entity word matching sub-module 222, and an entity category labeling sub-module 223, wherein:
a sorting sub-module 221, configured to sort the existing domain vocabulary in descending order of word length to obtain a sorted domain vocabulary;
an entity word matching sub-module 222, configured to perform an entity word matching operation on the sorted domain vocabulary using a string matching method to obtain vocabulary entity words;
and an entity category labeling sub-module 223, configured to perform an entity category labeling operation on the vocabulary entity words according to the target entity category to obtain the training domain text.
In the embodiments of the present application, the string matching method refers to finding all positions at which a pattern string P occurs within a larger text string T, where T is called the text, P is called the pattern, and both T and P are defined over the same alphabet Σ. The string matching method may be, for example:
1) the brute-force method, i.e., matching by the most intuitive, direct character-by-character comparison;
2) the Rabin-Karp algorithm, which uses hashing to perform string matching;
3) KMP, which builds a next array from the prefix-suffix properties of the pattern string and scans the text in a single pass using that array to obtain the matching results.
It should be understood that the above examples of string matching methods are only for ease of understanding and are not intended to limit the present application.
In the embodiments of the present application, because the ordering of the existing domain vocabulary is chaotic and irregular, different entity words may overlap at their boundaries within a sentence, which in turn affects the accuracy of the entity category labeling operation. Sorting the existing domain vocabulary in descending order of word length avoids such boundary overlaps between different entity words and effectively ensures the accuracy of the entity category labeling operation.
In some optional implementations of this embodiment, the named entity extracting apparatus 200 further includes: pre-training module and fine-tuning module, wherein:
the pre-training module is used for pre-training the entity extraction model based on the BERT language model according to the universal corpus sample and the entity label corresponding to the universal corpus sample;
and the fine tuning module is used for fine tuning the pre-trained BERT language model according to the specific entity corpus sample and the entity label corresponding to the specific entity corpus sample to obtain a pre-trained entity extraction model.
In the embodiment of the present application, fine-tuning the pre-trained BERT language model enables the sequence of entity-related word vectors in the text to be extracted; a minimal fine-tuning sketch follows.
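A minimal sketch of such two-stage tuning, assuming the Hugging Face transformers library and a hypothetical, already label-aligned corpus iterable named specific_entity_corpus; this illustrates the idea and is not the application's actual code:

    import torch
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
    # num_labels is an assumed tag-set size (e.g. BIO tags over entity categories)
    model = AutoModelForTokenClassification.from_pretrained(
        "bert-base-chinese", num_labels=9)

    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    for texts, tag_ids in specific_entity_corpus:  # tag_ids aligned to word pieces
        enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
        loss = model(**enc, labels=tag_ids).loss   # token-classification loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()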
In some optional implementations of this embodiment, the named entity extraction apparatus 200 further includes: a training text acquisition module, a feature conversion training module, a homogeneous vector similarity calculation module, a non-homogeneous vector similarity calculation module, and a network training module, wherein:
the training text acquisition module is used for reading a training database and acquiring a training text data set from the training database, wherein the training text data set includes at least a first positive example sample, a second positive example sample of the same class as the first positive example sample, and a random sample of a different class from the first positive example sample;
the feature conversion training module is used for inputting the first positive example sample, the second positive example sample, and the random sample into an original BERT network respectively for a feature conversion operation, to obtain a first feature vector, a second feature vector, and a random feature vector;
the homogeneous vector similarity calculation module is used for performing a vector similarity calculation operation on the first feature vector and the second feature vector to obtain a homogeneous vector similarity;
the non-homogeneous vector similarity calculation module is used for performing a vector similarity calculation operation on the first feature vector and the random feature vector to obtain a non-homogeneous vector similarity;
and the network training module is used for training the original BERT network based on the homogeneous vector similarity, the non-homogeneous vector similarity, and a triplet loss function, to obtain the pre-trained entity extraction model; a sketch of this training step is given below.
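A hedged sketch of one such training step, assuming cosine similarity as the vector similarity and the [CLS] vector as each sample's feature vector (both are our assumptions; the application does not fix them):

    import torch
    import torch.nn.functional as F
    from transformers import AutoModel

    encoder = AutoModel.from_pretrained("bert-base-chinese")  # the "original BERT network"

    def triplet_training_step(anchor_enc, positive_enc, random_enc, margin=0.2):
        # feature conversion: [CLS] vector as the sentence representation
        f_a = encoder(**anchor_enc).last_hidden_state[:, 0]    # first feature vector
        f_p = encoder(**positive_enc).last_hidden_state[:, 0]  # second feature vector
        f_n = encoder(**random_enc).last_hidden_state[:, 0]    # random feature vector
        sim_pos = F.cosine_similarity(f_a, f_p)  # homogeneous vector similarity
        sim_neg = F.cosine_similarity(f_a, f_n)  # non-homogeneous vector similarity
        # triplet-style objective: homogeneous similarity should exceed
        # non-homogeneous similarity by at least the margin
        loss = torch.clamp(sim_neg - sim_pos + margin, min=0).mean()
        loss.backward()  # reverse updating of the BERT network
        return loss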
In some optional implementations of this embodiment, the network training module includes: a homogeneous average calculation submodule, a non-homogeneous average calculation submodule, and a first reverse updating submodule, wherein:
the homogeneous average calculation submodule is used for calculating the average value of the homogeneous similarities to obtain an average homogeneous vector;
the non-homogeneous average calculation submodule is used for calculating the average value of the non-homogeneous similarities to obtain an average non-homogeneous vector;
and the first reverse updating submodule is used for performing a reverse updating operation on the original BERT network based on the first feature vector, the average homogeneous vector, the average non-homogeneous vector, and the triplet loss function, to obtain the pre-trained entity extraction model.
In some optional implementation manners of this embodiment, the network training module further includes: a maximum value obtaining submodule, a minimum value obtaining submodule, and a second reverse updating submodule, wherein:
the maximum value obtaining submodule is used for obtaining, based on the homogeneous similarity, a maximum homogeneous vector having the greatest similarity from among the second feature vectors;
the minimum value obtaining submodule is used for obtaining, based on the non-homogeneous similarity, a minimum random vector having the smallest similarity from among the random feature vectors;
and the second reverse updating submodule is used for performing a reverse updating operation on the original BERT network based on the first feature vector, the maximum homogeneous vector, the minimum random vector, and the triplet loss function, to obtain the pre-trained entity extraction model. A sketch contrasting the two selection strategies follows.
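The two variants above differ only in which positive and negative representatives enter the triplet loss. Under our reading that "average of the similarities" selects centroid vectors (an assumption, since the text conflates similarities and vectors), and with torch tensors assumed (f_p_batch and f_n_batch holding a batch of second and random feature vectors):

    def average_representatives(f_p_batch, f_n_batch):
        # first variant: centroids of the second feature vectors and of
        # the random feature vectors enter the triplet loss
        return f_p_batch.mean(dim=0), f_n_batch.mean(dim=0)

    def extremum_representatives(sim_pos, sim_neg, f_p_batch, f_n_batch):
        # second variant, as described above: the second feature vector with
        # the greatest homogeneous similarity and the random feature vector
        # with the smallest non-homogeneous similarity
        return f_p_batch[sim_pos.argmax()], f_n_batch[sim_neg.argmin()]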
In some optional implementations of this embodiment, the triplet loss function is represented as:

$$L = \sum_{i=1}^{N} \max\left( \left\lVert f(x_i^a) - f(x_i^p) \right\rVert^2 - \left\lVert f(x_i^a) - f(x_i^n) \right\rVert^2 + \alpha,\ 0 \right)$$

wherein N represents the total number of triplets in the entire training set; $x_i^a$ represents the first positive example sample; $f(x_i^a)$ represents the first feature vector; $x_i^p$ represents the second positive example sample; $f(x_i^p)$ represents the second feature vector; $x_i^n$ represents the random sample; $f(x_i^n)$ represents the random feature vector; and α represents the minimum interval between the distance from the first positive example sample to the second positive example sample and the distance from the first positive example sample to the random sample.
In the embodiment of the present application, a refers to "anchor" and denotes the first positive example sample; p refers to "positive" and denotes the second positive example sample; and n refers to "negative" and denotes the random sample.
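Up to the choice of squared Euclidean distance, the loss above matches PyTorch's built-in triplet loss; a minimal sketch with assumed tensor names:

    import torch.nn.functional as F

    # f_a, f_p, f_n: first, second, and random feature vectors (batch x dim);
    # alpha is the margin α in the formula above
    loss = F.triplet_margin_loss(f_a, f_p, f_n, margin=alpha, p=2)
    loss.backward()  # the reverse updating operation on the BERT network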
In summary, the present application provides a named entity extraction apparatus 200 comprising the target entity category obtaining module 210, the entity category labeling module 220, the first parameter adjusting module 230, the entity initial identification module 240, the corrected corpus obtaining module 250, the second parameter adjusting module 260, and the model application module 270 described above. The apparatus trains a pre-trained entity extraction model consisting of a BERT model, a BiLSTM layer, and a CRF layer on public named entity recognition corpus resources; performs a first parameter adjustment on the pre-trained entity extraction model using the training domain text constructed from the target entity category and the existing-domain word list; performs an initial entity recognition operation with the intermediate entity extraction model obtained after the first parameter adjustment; and performs a second parameter adjustment on the intermediate entity extraction model using the corrected corpus data fed back by the user terminal for the initial entity recognition result, finally obtaining a target entity extraction model that matches the target entity category and can perform automatic entity extraction. Because the target entity extraction model is trained on corpus data related to the training domain text, the entity extraction of the present application maintains high robustness, generalization capability, and execution performance. Furthermore, sorting the existing-domain word list in descending order of word length prevents different entity words from overlapping at their boundaries within a sentence, effectively ensuring the accuracy of the entity category labeling operation; and fine-tuning the pre-trained BERT language model enables the sequence of entity-related word vectors in the text to be extracted.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device. Referring to FIG. 11, FIG. 11 is a block diagram of the basic structure of the computer device according to this embodiment.
The computer device 300 includes a memory 310, a processor 320, and a network interface 330 communicatively coupled to each other via a system bus. It is noted that only a computer device 300 having components 310-330 is shown, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
The computer device may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another computing device. The computer device may interact with a user through a keyboard, a mouse, a remote controller, a touch panel, a voice control device, or the like.
The memory 310 includes at least one type of readable storage medium, including flash memory, hard disks, multimedia cards, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, and the like. In some embodiments, the memory 310 may be an internal storage unit of the computer device 300, such as a hard disk or memory of the computer device 300. In other embodiments, the memory 310 may also be an external storage device of the computer device 300, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 300. Of course, the memory 310 may also include both an internal storage unit and an external storage device of the computer device 300. In this embodiment, the memory 310 is generally used to store the operating system and various application software installed on the computer device 300, such as the computer-readable instructions of the named entity extraction method. In addition, the memory 310 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 320 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 320 is generally used to control the overall operation of the computer device 300. In this embodiment, the processor 320 is configured to execute the computer-readable instructions stored in the memory 310 or to process data, for example, to execute the computer-readable instructions of the named entity extraction method.
The network interface 330 may include a wireless network interface or a wired network interface, and the network interface 330 is generally used to establish a communication connection between the computer device 300 and other electronic devices.
The present application provides a computer device that trains a pre-trained entity extraction model consisting of a BERT model, a BiLSTM layer, and a CRF layer on public named entity recognition corpus resources; performs a first parameter adjustment on the pre-trained entity extraction model using the training domain text constructed from the target entity category and the existing-domain word list; performs an initial entity recognition operation with the intermediate entity extraction model obtained after the first parameter adjustment; and performs a second parameter adjustment on the intermediate entity extraction model using the corrected corpus data fed back by the user terminal for the initial entity recognition result, finally obtaining a target entity extraction model that matches the target entity category and can perform automatic entity extraction. Because the present application uses a neural network model for automatic entity extraction in the insurance field, it overcomes the shortcomings of traditional methods based on manual construction and rule-template matching. Meanwhile, because the target entity extraction model is trained on corpus data related to the training domain text, the entity extraction of the present application maintains high robustness, generalization capability, and execution performance.
The present application further provides another embodiment, which is to provide a computer-readable storage medium storing computer-readable instructions executable by at least one processor to cause the at least one processor to perform the steps of the named entity extraction method as described above.
The present application provides a computer-readable storage medium whose stored instructions train a pre-trained entity extraction model consisting of a BERT model, a BiLSTM layer, and a CRF layer on public named entity recognition corpus resources; perform a first parameter adjustment on the pre-trained entity extraction model using the training domain text constructed from the target entity category and the existing-domain word list; perform an initial entity recognition operation with the intermediate entity extraction model obtained after the first parameter adjustment; and perform a second parameter adjustment on the intermediate entity extraction model using the corrected corpus data fed back by the user terminal for the initial entity recognition result, finally obtaining a target entity extraction model that matches the target entity category and can perform automatic entity extraction. Because the present application uses a neural network model for automatic named entity extraction in the insurance field, it overcomes the shortcomings of traditional methods based on manual construction and rule-template matching. Meanwhile, because the target entity extraction model is trained on corpus data related to the training domain text, the entity extraction of the present application maintains high robustness, generalization capability, and execution performance.
Through the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and certainly also by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and including instructions that enable a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the methods of the embodiments of the present application.
It is to be understood that the above-described embodiments are merely illustrative of some, but not all, embodiments of the present application, and that the appended drawings illustrate preferred embodiments without limiting the scope of the application. The present application may be embodied in many different forms; these embodiments are provided so that the disclosure of the application will be thorough and complete. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their features. Any equivalent structure made using the contents of the specification and drawings of the present application, applied directly or indirectly in other related technical fields, falls within the protection scope of the present application.

Claims (10)

1. A named entity extraction method is characterized by comprising the following steps:
acquiring a target entity category;
performing an entity category labeling operation on an existing-domain word list according to the target entity category to obtain a training domain text;
performing a first parameter adjustment operation on a pre-trained entity extraction model according to the training domain text to obtain an intermediate entity extraction model, wherein the pre-trained entity extraction model is obtained by training on a public named entity extraction corpus and consists of a BERT model, a BiLSTM layer, and a CRF layer;
performing entity initial identification operation on the query statement of the target database according to the intermediate entity extraction model to obtain an entity initial identification result;
acquiring corrected corpus data which is sent by a user terminal and corresponds to the entity initial identification result;
performing second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model;
and carrying out automatic named entity extraction operation according to the target entity extraction model.
2. The named entity extraction method according to claim 1, wherein the step of performing the entity category labeling operation on the existing-domain word list according to the target entity category to obtain the training domain text specifically comprises the following steps:
sorting the existing-domain word list in descending order of word length to obtain a sorted-domain word list;
performing an entity word matching operation on the sorted-domain word list according to a character string matching method to obtain word list entity words;
and performing an entity category labeling operation on the word list entity words according to the target entity category to obtain the training domain text.
3. The method of claim 1, wherein before the step of performing the first parameter adjustment operation on the pre-trained entity extraction model according to the training domain text to obtain the intermediate entity extraction model, the method further comprises:
pre-training an entity extraction model based on a BERT language model according to a general corpus sample and an entity label corresponding to the general corpus sample;
and fine-tuning the pre-trained BERT language model according to the specific entity corpus sample and the entity label corresponding to the specific entity corpus sample to obtain a pre-trained entity extraction model.
4. The method of claim 1, wherein before the step of performing the first parameter adjustment operation on the pre-trained entity extraction model according to the training domain text to obtain the intermediate entity extraction model, the method further comprises:
reading a training database, and acquiring a training text data set in the training database, wherein the training text data set at least comprises a first positive example sample, a second positive example sample with the same type as the first positive example sample, and a random sample with the different type from the first positive example sample;
inputting the first positive example sample, the second positive example sample, and the random sample into an original BERT network respectively to perform a feature conversion operation, so as to obtain a first feature vector, a second feature vector, and a random feature vector;
performing a vector similarity calculation operation on the first feature vector and the second feature vector to obtain a homogeneous vector similarity;
performing a vector similarity calculation operation on the first feature vector and the random feature vector to obtain a non-homogeneous vector similarity;
and training the original BERT network based on the homogeneous vector similarity, the non-homogeneous vector similarity, and a triplet loss function to obtain the pre-trained entity extraction model.
5. The method according to claim 4, wherein the step of training the original BERT network based on the homogeneous vector similarity, the non-homogeneous vector similarity, and the triplet loss function to obtain the pre-trained entity extraction model specifically comprises the following steps:
calculating the average value of the homogeneous similarities to obtain an average homogeneous vector;
calculating the average value of the non-homogeneous similarities to obtain an average non-homogeneous vector;
and performing a reverse updating operation on the original BERT network based on the first feature vector, the average homogeneous vector, the average non-homogeneous vector, and the triplet loss function to obtain the pre-trained entity extraction model.
6. The method according to claim 4, wherein the step of training the original BERT network based on the homogeneous vector similarity, the non-homogeneous vector similarity, and the triplet loss function to obtain the pre-trained entity extraction model specifically comprises the following steps:
acquiring, based on the homogeneous similarity, a maximum homogeneous vector having the greatest similarity from among the second feature vectors;
acquiring, based on the non-homogeneous similarity, a minimum random vector having the smallest similarity from among the random feature vectors;
and performing a reverse updating operation on the original BERT network based on the first feature vector, the maximum homogeneous vector, the minimum random vector, and the triplet loss function to obtain the pre-trained entity extraction model.
7. A named entity extraction apparatus, comprising:
a target entity category obtaining module, used for obtaining a target entity category;
an entity category labeling module, used for performing an entity category labeling operation on an existing-domain word list according to the target entity category to obtain a training domain text;
a first parameter adjusting module, used for performing a first parameter adjustment operation on a pre-trained entity extraction model according to the training domain text to obtain an intermediate entity extraction model, wherein the pre-trained entity extraction model is obtained by training on a public named entity extraction corpus and consists of a BERT model, a BiLSTM layer, and a CRF layer;
the entity initial identification module is used for carrying out entity initial identification operation on the query statement of the target database according to the intermediate entity extraction model to obtain an entity initial identification result;
the corrected corpus acquiring module is used for acquiring corrected corpus data which is sent by the user terminal and corresponds to the entity initial identification result;
the second parameter adjusting module is used for performing second parameter adjusting operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model;
and the model application module is used for carrying out automatic named entity extraction operation according to the target entity extraction model.
8. The named entity extraction device of claim 7, wherein the entity class labeling module comprises:
a sorting submodule, used for sorting the existing-domain word list in descending order of word length to obtain a sorted-domain word list;
an entity word matching submodule, used for performing an entity word matching operation on the sorted-domain word list according to a character string matching method to obtain word list entity words;
and an entity category labeling submodule, used for performing an entity category labeling operation on the word list entity words according to the target entity category to obtain the training domain text.
9. A computer device, comprising a memory and a processor, the memory having computer-readable instructions stored therein, wherein the processor, when executing the computer-readable instructions, implements the steps of the named entity extraction method of any one of claims 1 to 6.
10. A computer-readable storage medium, having computer-readable instructions stored thereon, which, when executed by a processor, implement the steps of the named entity extraction method of any one of claims 1 to 6.
CN202210375268.8A 2022-04-11 2022-04-11 Named entity extraction method, named entity extraction device, computer equipment and storage medium Active CN114742058B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210375268.8A CN114742058B (en) 2022-04-11 2022-04-11 Named entity extraction method, named entity extraction device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210375268.8A CN114742058B (en) 2022-04-11 2022-04-11 Named entity extraction method, named entity extraction device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114742058A true CN114742058A (en) 2022-07-12
CN114742058B CN114742058B (en) 2023-06-02

Family

ID=82281311

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210375268.8A Active CN114742058B (en) 2022-04-11 2022-04-11 Named entity extraction method, named entity extraction device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114742058B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200372350A1 (en) * 2019-05-22 2020-11-26 Electronics And Telecommunications Research Institute Method of training image deep learning model and device thereof
CN112784051A (en) * 2021-02-05 2021-05-11 北京信息科技大学 Patent term extraction method
CN113408290A (en) * 2021-06-29 2021-09-17 山东亿云信息技术有限公司 Intelligent marking method and system for Chinese text
CN114091427A (en) * 2021-11-19 2022-02-25 海信电子科技(武汉)有限公司 Image text similarity model training method and display equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115204120A (en) * 2022-07-25 2022-10-18 平安科技(深圳)有限公司 Insurance field triple extraction method and device, electronic equipment and storage medium
CN115204120B (en) * 2022-07-25 2023-05-30 平安科技(深圳)有限公司 Insurance field triplet extraction method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114742058B (en) 2023-06-02

Similar Documents

Publication Publication Date Title
US11232140B2 (en) Method and apparatus for processing information
CN112101041B (en) Entity relationship extraction method, device, equipment and medium based on semantic similarity
US11586817B2 (en) Word vector retrofitting method and apparatus
CN114780727A (en) Text classification method and device based on reinforcement learning, computer equipment and medium
WO2022095354A1 (en) Bert-based text classification method and apparatus, computer device, and storage medium
CN112686022A (en) Method and device for detecting illegal corpus, computer equipment and storage medium
CN112287069A (en) Information retrieval method and device based on voice semantics and computer equipment
CN112528654A (en) Natural language processing method and device and electronic equipment
CN112686053A (en) Data enhancement method and device, computer equipment and storage medium
CN112084752A (en) Statement marking method, device, equipment and storage medium based on natural language
CN114398477A (en) Policy recommendation method based on knowledge graph and related equipment thereof
CN113505601A (en) Positive and negative sample pair construction method and device, computer equipment and storage medium
CN115438149A (en) End-to-end model training method and device, computer equipment and storage medium
CN112417887A (en) Sensitive word and sentence recognition model processing method and related equipment thereof
CN115730597A (en) Multi-level semantic intention recognition method and related equipment thereof
CN113887237A (en) Slot position prediction method and device for multi-intention text and computer equipment
CN113987125A (en) Text structured information extraction method based on neural network and related equipment thereof
CN112084779A (en) Entity acquisition method, device, equipment and storage medium for semantic recognition
CN114817478A (en) Text-based question and answer method and device, computer equipment and storage medium
CN110674635A (en) Method and device for text paragraph division
CN114090792A (en) Document relation extraction method based on comparison learning and related equipment thereof
CN114742058A (en) Named entity extraction method and device, computer equipment and storage medium
CN113420161A (en) Node text fusion method and device, computer equipment and storage medium
US20230004715A1 (en) Method and apparatus for constructing object relationship network, and electronic device
CN115730603A (en) Information extraction method, device, equipment and storage medium based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant