CN114742058B - Named entity extraction method, named entity extraction device, computer equipment and storage medium

Info

Publication number: CN114742058B
Application number: CN202210375268.8A
Authority: CN
Inventors: 袁扬; 朱运; 乔建秀
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2022-04-11
Filing date: 2022-04-11
Publication date: 2023-06-02
Anticipated expiration: 2042-04-11
Also published as: CN114742058A

Abstract

The embodiments of the present application belong to the technical field of natural language processing in artificial intelligence and relate to a named entity extraction method, a named entity extraction device, computer equipment and a storage medium. The present application also relates to blockchain technology: the user's target entity extraction model can be stored in a blockchain. The method uses a neural network model to automatically extract entity nouns in the insurance field, which overcomes the shortcomings of traditional manual construction and rule-template matching. Because the target entity extraction model is trained on corpora related to the training field text, entity extraction retains high robustness, generalization capability and execution capability.

Description

Named entity extraction method, named entity extraction device, computer equipment and storage medium

Technical Field

The present disclosure relates to the field of natural language processing in artificial intelligence, and in particular, to a named entity extraction method, device, computer device, and storage medium.

Background

Entity extraction, also commonly referred to as named entity extraction, comprises entity detection and entity classification. As a basic task of text information processing, it has a wide range of application scenarios, such as knowledge graphs, information extraction, automatic summarization, automatic question answering and recommendation systems.

In existing entity extraction methods, the entity vocabulary is built either by domain experts who construct rules manually, or by retrieval or classification over knowledge bases such as semantic networks, vocabularies and lexicons.

However, the applicant has found that conventional entity extraction methods are generally not intelligent. Because the vocabulary size is limited, they rely heavily on expert knowledge or on the scope of what the vocabulary records, and their coverage of new words, uncommon words, abbreviations and other kinds of entity words is very limited, so a great deal of manpower, time and resources must be invested in long update cycles. Conventional entity extraction methods therefore suffer from low robustness, generalization capability and execution capability.

Disclosure of Invention

An objective of the embodiments of the present application is to provide a named entity extraction method, apparatus, computer device, and storage medium, so as to solve the problems of low robustness, generalization capability, and execution capability of the conventional entity extraction method.

In order to solve the above technical problems, the embodiments of the present application provide a named entity extraction method, which adopts the following technical schemes:

obtaining a target entity class;

performing entity category labeling operation on the existing field word list according to the target entity category to obtain training field text;

performing first parameter adjustment operation on a pre-training entity extraction model according to the training field text to obtain an intermediate entity extraction model, wherein the pre-training entity extraction model is obtained by training based on a public named entity extraction corpus, and the pre-training entity extraction model consists of a BERT model, a BiLSTM layer and a CRF layer;

performing entity primary identification operation on the query statement of the target database according to the intermediate entity extraction model to obtain an entity primary identification result;

acquiring correction corpus data which is sent by a user terminal and corresponds to the entity primary identification result;

performing second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model;

and carrying out automatic extraction operation of the named entity according to the target entity extraction model.

In order to solve the above technical problems, the embodiments of the present application further provide a named entity extraction device, which adopts the following technical scheme:

the target entity category acquisition module is used for acquiring the target entity category;

the entity category labeling module is used for carrying out entity category labeling operation on the existing field word list according to the target entity category to obtain training field text;

the first parameter adjustment module is used for performing first parameter adjustment operation on the pre-training entity extraction model according to the training field text to obtain an intermediate entity extraction model, wherein the pre-training entity extraction model is obtained by training based on a public named entity extraction corpus, and the pre-training entity extraction model consists of a BERT model, a BiLSTM layer and a CRF layer;

the entity primary identification module is used for carrying out entity primary identification operation on the query statement of the target database according to the intermediate entity extraction model to obtain an entity primary identification result;

the correction corpus acquisition module is used for acquiring correction corpus data which is sent by the user terminal and corresponds to the entity initial recognition result;

the second parameter adjustment module is used for carrying out second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model;

and the model application module is used for carrying out automatic extraction operation of the named entity according to the target entity extraction model.

In order to solve the above technical problems, the embodiments of the present application further provide a computer device, which adopts the following technical schemes:

comprising a memory having stored therein computer readable instructions which when executed by a processor implement the steps of the named entity extraction method as described above.

In order to solve the above technical problems, embodiments of the present application further provide a computer readable storage medium, which adopts the following technical solutions:

the computer readable storage medium has stored thereon computer readable instructions which when executed by a processor implement the steps of the named entity extraction method as described above.

Compared with the prior art, the embodiment of the application has the following main beneficial effects:

the application provides a named entity extraction method, which comprises the following steps: obtaining a target entity class; performing entity category labeling operation on the existing field word list according to the target entity category to obtain training field text; performing first parameter adjustment operation on a pre-training entity extraction model according to the training field text to obtain an intermediate entity extraction model, wherein the pre-training entity extraction model is obtained by training based on a public named entity extraction corpus, and the pre-training entity extraction model consists of a Bert model, a BiLSTM layer and a CRF layer; performing entity primary identification operation on the query statement of the target database according to the intermediate entity extraction model to obtain an entity primary identification result; acquiring correction corpus data which is sent by a user terminal and corresponds to the entity primary identification result; performing second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model; and carrying out automatic extraction operation of the named entity according to the target entity extraction model. 
In this method, a pre-training entity extraction model consisting of a BERT model, a BiLSTM layer and a CRF layer is first trained on conventional public named entity recognition corpora. A first parameter adjustment is then performed on this model using training field text constructed from the target entity categories and the existing domain vocabulary, and the resulting intermediate entity extraction model performs an entity initial recognition operation. Finally, a second parameter adjustment is performed on the intermediate entity extraction model using the correction corpus data that the user terminal feeds back for the entity initial recognition result, yielding a target entity extraction model that conforms to the target entity categories and is used for automatic entity extraction.

Drawings

For a clearer description of the solution in the present application, a brief description will be given below of the drawings that are needed in the description of the embodiments of the present application, it being obvious that the drawings in the following description are some embodiments of the present application, and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.

FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;

FIG. 2 is a flowchart illustrating an implementation of a named entity extraction method according to an embodiment of the present disclosure;

FIG. 3 is a flow chart of one embodiment of step S202 of FIG. 2;

FIG. 4 is a flowchart of one embodiment of obtaining a pre-trained entity extraction model according to one embodiment of the present application;

FIG. 5 is a flowchart of another embodiment of obtaining a pre-trained entity extraction model according to one embodiment of the present application;

FIG. 6 is a schematic diagram of a vector similarity calculation operation according to an embodiment of the present disclosure;

FIG. 7 is a flow chart of one embodiment of step S505 of FIG. 5;

FIG. 8 is a flowchart of another embodiment of step S505 of FIG. 5;

FIG. 9 is a schematic diagram of a named entity extraction device according to a second embodiment of the present disclosure;

FIG. 10 is a schematic diagram of an embodiment of the entity class labeling module 220 of FIG. 9;

FIG. 11 is a structural schematic diagram of one embodiment of a computer device according to the present application.

Detailed Description

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the description of the figures above are intended to cover non-exclusive inclusions. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.

In order to better understand the technical solutions of the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings.

As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client and social platform software, may be installed on the terminal devices 101, 102, 103.

The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, electronic book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.

The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.

It should be noted that, the named entity extraction method provided in the embodiments of the present application is generally executed by a server/terminal device, and accordingly, the named entity extraction device is generally disposed in the server/terminal device.

It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

With continued reference to fig. 2, a flowchart of an implementation of a named entity extraction method according to an embodiment of the present application is shown, and for convenience of explanation, only a portion relevant to the present application is shown.

The named entity extraction method comprises the following steps: step S201, step S202, step S203, step S204, step S205, step S206, and step S207.

Step S201: obtaining the target entity category.

In the embodiment of the application, the target entity category refers to a domain entity category which needs to be extracted according to actual service requirements.

In the embodiment of the present application, the target entity category may be sent by the user terminal or entered at the device terminal. It should be understood that these examples of obtaining the target entity category are given only for ease of understanding and do not limit the present application.

Step S202: performing an entity category labeling operation on the existing domain vocabulary according to the target entity category to obtain training domain text.

In the embodiment of the application, the existing domain vocabulary refers to existing domain dictionaries and word lists.

In the embodiment of the application, the entity category labeling operation may match entity words and automatically label their entity categories by a string matching method.

In the embodiment of the application, the training domain text is mainly used to construct the training set and the validation set for model training.

Step S203: performing a first parameter adjustment operation on the pre-training entity extraction model according to the training domain text to obtain an intermediate entity extraction model, wherein the pre-training entity extraction model is obtained by training based on a public named entity extraction corpus, and the pre-training entity extraction model consists of a BERT model, a BiLSTM layer and a CRF layer.

In the embodiment of the present application, the pre-training entity extraction model is fine-tuned using the training set and the validation set constructed in step S202 together with the target entity category, and the resulting model parameters are saved.

Step S204: performing an entity initial recognition operation on the query statements of the target database according to the intermediate entity extraction model to obtain an entity initial recognition result.

In the embodiment of the application, the target database refers to the question-answer database in actual use. Entity recognition is performed on the query statements of the target database; the recognized entity words are counted, then classified and sorted by category and word frequency; they are compared with the existing standard category vocabulary to extract newly added words; and the new words are submitted for manual correction to supplement the existing vocabulary.
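The counting, sorting and comparison described for step S204 can be sketched as follows. This is only an illustration under stated assumptions: the function name `rank_new_entities`, the `(word, category)` pair format of the recognition output and the sample insurance terms are all hypothetical and are not taken from the application.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def rank_new_entities(
    recognized: Iterable[Tuple[str, str]],
    known_vocab: Dict[str, Set[str]],
) -> Dict[str, List[Tuple[str, int]]]:
    """Group recognized entity words by category, keep only words missing from
    the existing vocabulary, and sort each group by descending word frequency."""
    counts = Counter(recognized)  # (word, category) -> frequency
    new_words: Dict[str, List[Tuple[str, int]]] = {}
    for (word, category), freq in counts.items():
        if word not in known_vocab.get(category, set()):
            new_words.setdefault(category, []).append((word, freq))
    for candidates in new_words.values():
        # Most frequent first; ties broken alphabetically for a stable order.
        candidates.sort(key=lambda wf: (-wf[1], wf[0]))
    return dict(sorted(new_words.items()))


# Hypothetical recognition output from the intermediate model.
recognized = [
    ("critical illness rider", "PRODUCT"),
    ("critical illness rider", "PRODUCT"),
    ("term life", "PRODUCT"),
    ("waiting period", "CLAUSE"),
    ("whole life", "PRODUCT"),
]
known = {"PRODUCT": {"term life"}, "CLAUSE": set()}
candidates_for_review = rank_new_entities(recognized, known)
```

The returned dictionary is what would be handed to a human reviewer; words already present in the vocabulary (here "term life") are filtered out before review.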

In the embodiment of the application, the extraction results for the query statements may also be randomly sampled at intervals according to the sentence length distribution and then submitted for manual verification, both to construct a standard training corpus for domain entity extraction and to evaluate manually the actual performance of the model.
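One possible reading of sampling at intervals according to the sentence length distribution is stratified sampling: sentences are grouped into length intervals and the same fraction is drawn at random from every interval, so that short and long queries are both represented in the manually verified set. The sketch below follows that reading; the interval width, the sampling rate and the name `sample_by_length` are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def sample_by_length(sentences: Sequence[str], bucket_width: int,
                     rate: float, seed: int = 0) -> List[str]:
    """Draw roughly `rate` of the sentences from every length interval."""
    if bucket_width <= 0 or not 0.0 < rate <= 1.0:
        raise ValueError("bucket_width must be positive and rate must be in (0, 1]")
    rng = random.Random(seed)
    buckets: Dict[int, List[str]] = defaultdict(list)
    for sentence in sentences:
        # Interval index: lengths 0..width-1 -> 0, width..2*width-1 -> 1, ...
        buckets[len(sentence) // bucket_width].append(sentence)
    picked: List[str] = []
    for key in sorted(buckets):
        group = buckets[key]
        # At least one sentence per non-empty interval, never more than it holds.
        k = min(len(group), math.ceil(len(group) * rate))
        picked.extend(rng.sample(group, k))
    return picked


# Hypothetical query statements of lengths 3, 4, 5, 12, 13, 14, 15 and 27.
queries = ["q" * n for n in (3, 4, 5, 12, 13, 14, 15, 27)]
for_review = sample_by_length(queries, bucket_width=10, rate=0.5)
```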

Step S205: obtaining correction corpus data which is sent by the user terminal and corresponds to the entity initial recognition result.

In the embodiment of the application, because the constructed training set and validation set are generated by direct matching against the vocabulary, they contain large errors, so manual correction is needed.

Step S206: performing a second parameter adjustment operation on the intermediate entity extraction model according to the correction corpus data to obtain a target entity extraction model.

In the embodiment of the present application, because the constructed training set and validation set are generated by direct matching against the vocabulary and therefore contain large errors, the domain entity word annotation corpus obtained through feedback in step S205 is used to fine-tune the intermediate entity extraction model a second time, and the parameters of the domain entity extraction model are updated and saved.

Step S207: performing an automatic named entity extraction operation according to the target entity extraction model.

In an embodiment of the present application, a named entity extraction method is provided, including: obtaining a target entity class; performing entity category labeling operation on the existing domain word list according to the target entity category to obtain training domain text; performing first parameter adjustment operation on a pre-training entity extraction model according to training field texts to obtain an intermediate entity extraction model, wherein the pre-training entity extraction model is obtained by training based on a public named entity extraction corpus, and the pre-training entity extraction model consists of a Bert model, a BiLSTM layer and a CRF layer; performing entity primary identification operation on the query statement of the target database according to the intermediate entity extraction model to obtain an entity primary identification result; acquiring correction corpus data corresponding to an entity initial recognition result, which is sent by a user terminal; performing second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model; and carrying out automatic extraction operation of the named entity according to the target entity extraction model. 
In this method, a pre-training entity extraction model consisting of a BERT model, a BiLSTM layer and a CRF layer is first trained on conventional public named entity recognition corpora. A first parameter adjustment is then performed on this model using training field text constructed from the target entity categories and the existing domain vocabulary, and the resulting intermediate entity extraction model performs an entity initial recognition operation. Finally, a second parameter adjustment is performed on the intermediate entity extraction model using the correction corpus data that the user terminal feeds back for the entity initial recognition result, yielding a target entity extraction model that conforms to the target entity categories and is used for automatic entity extraction.

With continued reference to fig. 3, a flowchart of one embodiment of step S202 of fig. 2 is shown, only portions relevant to the present application being shown for ease of illustration.

In some optional implementations of the present embodiment, step S202 specifically includes: step S301, step S302, and step S303.

Step S301: sorting the existing domain vocabulary in descending order of word length to obtain a sorted domain vocabulary.

Step S302: performing an entity word matching operation on the sorted domain vocabulary according to the string matching method to obtain vocabulary entity words.

In the embodiment of the present application, the string matching method refers to searching for all positions at which a string P occurs in a larger string T, where T is called the text, P is called the pattern, and T and P are both defined over the same alphabet Σ. The string matching method may be:

1) the brute force method, which matches in the most intuitive and direct way, trying each position of the text in turn;

2) the Rabin-Karp algorithm, which performs string matching using hashing;

3) the KMP algorithm, which builds a next array from the prefix and suffix properties of the pattern string and then uses the next array to obtain the matching result in a single one-way scan of the original string.

It should be understood that the foregoing examples of the string matching method are for convenience of understanding only and are not intended to limit the present application.
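Of the three options listed, KMP is the least self-explanatory, so a small sketch may help. It follows the standard textbook formulation of the algorithm; the function names `build_next` and `kmp_find_all` are illustrative and do not come from the application.

```python
from typing import List


def build_next(pattern: str) -> List[int]:
    """nxt[i] is the length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it."""
    nxt = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = nxt[k - 1]  # fall back to the next shorter border
        if pattern[i] == pattern[k]:
            k += 1
        nxt[i] = k
    return nxt


def kmp_find_all(text: str, pattern: str) -> List[int]:
    """Return every start index at which pattern occurs in text,
    scanning the text once from left to right."""
    if not pattern:
        return []
    nxt = build_next(pattern)
    hits: List[int] = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = nxt[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = nxt[k - 1]  # keep going so overlapping matches are found
    return hits
```

The scan never moves backwards in the text, which is the "one-way scanning" property mentioned in item 3).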

Step S303: performing an entity category labeling operation on the vocabulary entity words according to the target entity categories to obtain training domain text.

In the embodiment of the application, the existing domain vocabulary is unordered, so different entity words can overlap in a sentence (for example, a short entity word may be contained in a longer one), which reduces the accuracy of the entity category labeling operation. Sorting the vocabulary from longest to shortest before matching lets the longer entity words be matched first.
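Steps S301 to S303 can be sketched as a longest-first dictionary labeler. The sketch makes several assumptions that are not stated in the application: labels are written per character in the BIO scheme, the vocabulary is a plain mapping from entity word to category, and Python's built-in `str.find` stands in for whichever string matching method is chosen. The name `label_sentence` and the sample vocabulary are hypothetical.

```python
from typing import Dict, List


def label_sentence(sentence: str, vocab: Dict[str, str]) -> List[str]:
    """Tag every character of `sentence` with a BIO label by matching
    vocabulary words from longest to shortest."""
    tags = ["O"] * len(sentence)
    # S301: sort the vocabulary in descending order of word length.
    for word in sorted(vocab, key=len, reverse=True):
        if not word:
            continue
        # S302: locate every occurrence of the word in the sentence.
        start = sentence.find(word)
        while start != -1:
            end = start + len(word)
            # Skip spans that a longer entity word has already claimed.
            if all(tag == "O" for tag in tags[start:end]):
                # S303: label the span with the word's entity category.
                tags[start] = "B-" + vocab[word]
                for i in range(start + 1, end):
                    tags[i] = "I-" + vocab[word]
            start = sentence.find(word, start + 1)
    return tags


vocab = {"insurance": "TERM", "life insurance": "PRODUCT"}
tags = label_sentence("buy life insurance", vocab)
```

Because "life insurance" is processed before "insurance", the shorter word cannot split the longer entity; without the sort, the result would depend on the arbitrary order of the vocabulary.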

With continued reference to fig. 4, a flowchart of a specific implementation of obtaining a pre-training entity extraction model according to the first embodiment of the present application is shown, and for convenience of explanation, only a portion relevant to the present application is shown.

In some optional implementations of the present embodiment, before step S203, further includes: step S401 and step S402.

Step S401: pre-training the entity extraction model based on the BERT language model according to general corpus samples and the entity labels corresponding to the general corpus samples.

Step S402: fine-tuning the pre-trained BERT language model according to specific entity corpus samples and the entity labels corresponding to the specific entity corpus samples to obtain the pre-training entity extraction model.

In the embodiment of the application, fine-tuning the pre-trained BERT language model enables extraction of the entity-related word vector sequence in the text.

With continued reference to fig. 5, a flowchart of another embodiment of obtaining a pre-training entity extraction model according to the first embodiment of the present application is shown, and for convenience of explanation, only a portion relevant to the present application is shown.

In some optional implementations of the present embodiment, before step S203, further includes: step S501, step S502, step S503, step S504, and step S505.

Step S501: reading the training database and obtaining a training text data set from the training database, wherein the training text data set at least comprises a first positive sample, a second positive sample of the same category as the first positive sample, and a random sample of a different category from the first positive sample.

In the embodiment of the application, the training text data set is a data set of triplets (sentence 1, sentence 2, sentence 3), where the categories of sentences 1, 2 and 3 are A, A and B respectively. The first two elements are any two positive examples of the same class. The last element may be drawn at random from a different class, drawn from a class that is difficult to distinguish from class A, or drawn by a combination of these two approaches.
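The simplest of these sampling options (two positives from one class, one negative drawn at random from any other class) can be sketched as follows. The data layout, a mapping from class label to its sentences, and the name `build_triplets` are assumptions for illustration; drawing the negative from a hard-to-distinguish class would need a notion of class confusability that the text does not specify, so it is not shown.

```python
import random
from typing import Dict, List, Tuple


def build_triplets(by_class: Dict[str, List[str]], n: int,
                   seed: int = 0) -> List[Tuple[str, str, str]]:
    """Return n triplets (sentence 1, sentence 2, sentence 3) whose
    categories follow the pattern A, A, B with B different from A."""
    rng = random.Random(seed)
    non_empty = [c for c, items in by_class.items() if items]
    anchor_classes = [c for c in non_empty if len(by_class[c]) >= 2]
    if not anchor_classes or len(non_empty) < 2:
        raise ValueError("need one class with two samples and a second non-empty class")
    triplets: List[Tuple[str, str, str]] = []
    for _ in range(n):
        pos_class = rng.choice(anchor_classes)
        # Two distinct positive examples of the same class.
        first, second = rng.sample(by_class[pos_class], 2)
        # One example from a randomly chosen different class.
        neg_class = rng.choice([c for c in non_empty if c != pos_class])
        triplets.append((first, second, rng.choice(by_class[neg_class])))
    return triplets


# Hypothetical labelled sentences.
by_class = {"A": ["a1", "a2", "a3"], "B": ["b1", "b2"], "C": ["c1"]}
triplets = build_triplets(by_class, 25, seed=1)
```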

Step S502: inputting the first positive sample, the second positive sample and the random sample into an original BERT network respectively to perform a feature transformation operation, obtaining a first feature vector, a second feature vector and a random feature vector.

In the embodiment of the present application, the original BERT network refers to an original feature vector transformation model that has not undergone any training. All elements of the input triplet pass through the same BERT layer, which functions as an encoder and outputs sentence vectors that characterize semantics.

Step S503: performing a vector similarity calculation operation on the first feature vector and the second feature vector to obtain a homogeneous vector similarity.

Step S504: performing a vector similarity calculation operation on the first feature vector and the random feature vector to obtain a non-homogeneous vector similarity.

In the embodiment of the application, referring to fig. 6, a schematic diagram of the vector similarity calculation operation is shown, in which sent 1, all sent 2 and all sent 3 are sentence vectors output by the shared BERT layer. Because each group of samples may contain noise samples, the triplet loss can be calculated either from the average similarity between sentence 1 and all sample vectors of each group, or from the maximum similarity between sentence 1 and the positive example vectors (the positive example most similar to sentence 1) together with the minimum similarity between sentence 1 and the negative example vectors (the negative example least similar to sentence 1).

Step S505: training the BERT network based on the homogeneous vector similarity, the non-homogeneous vector similarity and the triplet loss function to obtain the pre-training entity extraction model.

In the embodiment of the application, multi-instance learning is used, and each BAG is assumed to contain three sentences. The triplet (sentence 1, sentence 2, sentence 3) described above then becomes (sentence 1, [sentence 2[0], sentence 2[1], sentence 2[2]], [sentence 3[0], sentence 3[1], sentence 3[2]]); that is, the input consists of one positive example sample, a group of positive example samples of the same class, and a group of negative example samples of a different class.

Training step 1: the 7 sentences are passed through the same weight-sharing BERT model, which outputs 7 vectors.

Training step 2: the similarity between sentence 1 and each of the other 6 sentences is calculated.

Training step 3: from the positive example BAG, take the sentence most similar to sentence 1: max(Sim(sentence 1, sentence 2[0]), Sim(sentence 1, sentence 2[1]), Sim(sentence 1, sentence 2[2])) (suppose the most similar of the three is sentence 2[1]).

Training step 4: from the negative example BAG, take the sentence least similar to sentence 1: min(Sim(sentence 1, sentence 3[0]), Sim(sentence 1, sentence 3[1]), Sim(sentence 1, sentence 3[2])) (suppose the least similar of the three is sentence 3[1]).

Training step 5: the difference between the similarity obtained in training step 3 and the similarity obtained in training step 4 is made as large as possible.

Training step 6: a second task is performed at the same time: in addition to the triplet loss task, which widens the gap between classes, sentence 1 is also given a multi-class classification.

Training step 7: after model training, the classification of sentence 1 can be obtained, and during training the classes are separated from one another while the effect of labeling noise is weakened. At this point only the part of the model that maps sentence 1 to its predicted category needs to be kept, so that the category of any input sentence can be predicted.
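Training steps 2 to 5 can be sketched on plain vectors as follows. The text does not say which similarity function or margin is used, so cosine similarity and a margin of 0.2 are assumptions here, as are the names `cosine` and `bag_triplet_loss`; the BERT encoder and the classification task of training step 6 are left out.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bag_triplet_loss(anchor: Vector, pos_bag: Sequence[Vector],
                     neg_bag: Sequence[Vector], margin: float = 0.2) -> float:
    """Hinge loss on the gap between the best positive similarity
    and the lowest negative similarity."""
    # Training step 3: the positive most similar to sentence 1.
    pos_sim = max(cosine(anchor, p) for p in pos_bag)
    # Training step 4: the negative least similar to sentence 1.
    neg_sim = min(cosine(anchor, n) for n in neg_bag)
    # Training step 5: loss is zero once the gap reaches the margin,
    # so minimising it pushes the two similarities apart.
    return max(0.0, margin - (pos_sim - neg_sim))


# Hypothetical 2-dimensional sentence vectors.
anchor = [1.0, 0.0]
well_separated = bag_triplet_loss(
    anchor,
    pos_bag=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    neg_bag=[[0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]],
)
badly_separated = bag_triplet_loss(anchor, pos_bag=[[0.0, 1.0]], neg_bag=[[1.0, 0.0]])
```

Taking only the best positive means a mislabeled sentence inside the positive BAG does not dominate the loss, which is one way to read the remark about weakening labeling noise.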

With continued reference to fig. 7, a flowchart of one embodiment of step S505 of fig. 5 is shown, only the portions relevant to the present application being shown for ease of illustration.

In some optional implementations of this embodiment, the step S505 specifically includes: step S701, step S702, and step S703.

In step S701, the average of the homogeneous similarities is calculated to obtain an average homogeneous vector.

In practical application, if the homogeneous similarity between sentence 1 and sentence 2[0] is 60, the homogeneous similarity between sentence 1 and sentence 2[1] is 70, and the homogeneous similarity between sentence 1 and sentence 2[2] is 80, then the average homogeneous similarity is 70, and the average homogeneous vector is equal to Sim(sentence 1, sentence 2[1]).

In step S702, the average of the non-homogeneous similarities is calculated to obtain an average non-homogeneous vector.

In the embodiment of the present application, the average of the non-homogeneous similarities is calculated in the same manner as the average of the homogeneous similarities.

In step S703, a back update operation is performed on the BERT network based on the first feature vector, the average homogeneous vector, the average non-homogeneous vector, and the triplet loss function, so as to obtain the pre-training entity extraction model.

In the embodiment of the application, the back update operation is mainly used for dynamically updating the characterization parameters of the BERT network according to the changes of the average homogeneous vector and the average non-homogeneous vector.
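The averaging variant can be checked with the figures from the worked example (60, 70 and 80). In the sketch below the negative similarities (20, 30, 40) and the margin are invented purely for illustration, and the hinge form of the loss is an assumption; `mean_bag_triplet_loss` is a hypothetical name.

```python
from statistics import fmean
from typing import Sequence


def mean_bag_triplet_loss(pos_sims: Sequence[float], neg_sims: Sequence[float],
                          margin: float) -> float:
    """Hinge loss on the gap between the average positive similarity
    and the average negative similarity."""
    avg_pos = fmean(pos_sims)  # step S701
    avg_neg = fmean(neg_sims)  # step S702
    # Step S703 would back-propagate this value through the BERT network.
    return max(0.0, margin - (avg_pos - avg_neg))


pos_sims = [60, 70, 80]  # from the worked example in the text
neg_sims = [20, 30, 40]  # hypothetical values
loss_wide_margin = mean_bag_triplet_loss(pos_sims, neg_sims, margin=50)
loss_narrow_margin = mean_bag_triplet_loss(pos_sims, neg_sims, margin=30)
```

With an average gap of 40, a required margin of 50 leaves a loss of 10, while a required margin of 30 is already satisfied and gives zero loss.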

With continued reference to fig. 8, a flowchart of another embodiment of step S505 of fig. 5 is shown, only the portions relevant to the present application being shown for ease of illustration.

In some optional implementations of the first embodiment of the present application, the step S505 specifically includes: step S801, step S802, and step S803.

In step S801, the maximum similar vector having the maximum similarity is acquired from the second feature vectors based on the homogeneous similarity.

In the embodiment of the present application, if the similarity between sentence 1 and sentence 2[0] is 60, the similarity between sentence 1 and sentence 2[1] is 70, and the similarity between sentence 1 and sentence 2[2] is 80, then the maximum similarity is 80, and the maximum similar vector is the one corresponding to Sim(sentence 1, sentence 2[2]).

In step S802, the minimum random vector having the minimum similarity is acquired from the random feature vectors based on the non-homogeneous similarity.

In the embodiment of the present application, the minimum random vector having the minimum similarity is obtained in the same manner as the maximum similar vector having the maximum similarity.

In step S803, a reverse update operation is performed on the BERT network based on the first feature vector, the maximum similar vector, the minimum random vector, and the triplet loss function, so as to obtain the pre-training entity extraction model.

In the embodiment of the application, the reverse updating operation is mainly used for dynamically updating the characterization parameters of the BERT network according to the changes of the maximum similar vector and the minimum random vector.
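The selection in steps S801 and S802 can be sketched as follows. The function name `pick_extremes` and the non-homogeneous similarity values are hypothetical and serve only as an illustration.

```python
def pick_extremes(homogeneous_sims, non_homogeneous_sims):
    """Return the index of the same-class candidate with the maximum
    similarity and the index of the random candidate with the minimum
    similarity."""
    max_idx = max(range(len(homogeneous_sims)),
                  key=homogeneous_sims.__getitem__)
    min_idx = min(range(len(non_homogeneous_sims)),
                  key=non_homogeneous_sims.__getitem__)
    return max_idx, min_idx

# Homogeneous similarities from the example above; the non-homogeneous
# similarities are made-up values.
max_idx, min_idx = pick_extremes([60, 70, 80], [30, 10, 20])
```

The feature vectors at the two returned indices would then be passed, together with the first feature vector, to the triplet loss in step S803.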

In some optional implementations of the first embodiment of the present application, the triplet loss function is expressed as:

$$L = \sum_{i=1}^{N} \left[ \left\| f(x_i^a) - f(x_i^p) \right\|_2^2 - \left\| f(x_i^a) - f(x_i^n) \right\|_2^2 + \alpha \right]_+$$

wherein N represents the total number of the entire training set; $x_i^a$ represents a first positive sample; $f(x_i^a)$ represents a first feature vector; $x_i^p$ represents a second positive sample; $f(x_i^p)$ represents a second feature vector; $x_i^n$ represents a random sample; $f(x_i^n)$ represents a random feature vector; and $\alpha$ represents the minimum separation between the distance from the first positive sample to the second positive sample and the distance from the first positive sample to the random sample.

In the embodiment of the present application, a refers to the anchor and represents the first positive sample tuple; p refers to positive and represents the second positive sample tuple; n refers to negative and represents the random sample tuple.

In the embodiment of the present application, the subscript + indicates that when the value inside [ ] is greater than zero, that value is taken as the loss, and when the value is less than or equal to zero, the loss is zero.

When the distance between the first feature vector $f(x_i^a)$ and the random feature vector $f(x_i^n)$ is less than the sum of $\alpha$ and the distance between the first feature vector $f(x_i^a)$ and the second feature vector $f(x_i^p)$, the value inside [ ] is greater than zero, resulting in a loss.

When the distance between $f(x_i^a)$ and $f(x_i^n)$ is greater than or equal to the sum of $\alpha$ and the distance between $f(x_i^a)$ and $f(x_i^p)$, the loss is zero.
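The behaviour of the triplet loss described above can be checked with a small numeric sketch. It assumes the squared Euclidean distance as the distance measure, which is a common choice but is not confirmed by the text. The function names and the toy vectors are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchors, positives, negatives, alpha):
    """Sum over all triplets of [d(a, p) - d(a, n) + alpha]_+ ."""
    total = 0.0
    for a, p, n in zip(anchors, positives, negatives):
        value = squared_distance(a, p) - squared_distance(a, n) + alpha
        total += max(value, 0.0)  # the [.]_+ operation
    return total

# The random sample is far from the anchor: the margin is satisfied, no loss.
far = triplet_loss([[0.0, 0.0]], [[1.0, 0.0]], [[3.0, 0.0]], alpha=1.0)
# The random sample is closer than the positive sample: a loss is produced.
near = triplet_loss([[0.0, 0.0]], [[1.0, 0.0]], [[0.5, 0.0]], alpha=1.0)
```

In the first call the distance to the random sample exceeds the distance to the positive sample plus alpha, so the loss is zero. In the second call it does not, so the loss is positive.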

In summary, the present application provides a named entity extraction method, including: obtaining a target entity class; performing entity category labeling operation on the existing domain word list according to the target entity category to obtain training domain text; performing first parameter adjustment operation on a pre-training entity extraction model according to training field texts to obtain an intermediate entity extraction model, wherein the pre-training entity extraction model is obtained by training based on a public named entity extraction corpus, and the pre-training entity extraction model consists of a Bert model, a BiLSTM layer and a CRF layer; performing entity primary identification operation on the query statement of the target database according to the intermediate entity extraction model to obtain an entity primary identification result; acquiring correction corpus data corresponding to an entity initial recognition result, which is sent by a user terminal; performing second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data to obtain a target entity extraction model; and carrying out automatic extraction operation of the named entity according to the target entity extraction model. 
According to the method, a pre-training entity extraction model consisting of a BERT model, a BiLSTM layer and a CRF layer is trained on conventional public named entity recognition corpus resources. A first parameter adjustment is performed on the pre-training entity extraction model according to the training field text constructed from the target entity categories and the existing domain vocabulary, and an entity initial recognition operation is performed according to the intermediate entity extraction model obtained after the first parameter adjustment. Finally, a second parameter adjustment is performed on the intermediate entity extraction model using the correction corpus data fed back by the user terminal for the entity initial recognition result, so that a target entity extraction model conforming to the target entity categories is obtained for automatic entity extraction. Furthermore, the existing domain vocabulary is sorted in descending order of word length, which avoids overlapping boundaries between different entity words in a sentence and effectively ensures the accuracy of the entity category labeling operation; the extraction of the entity-related word vector sequences in the text is achieved by fine-tuning the pre-trained BERT language model.

It should be emphasized that, to further ensure the privacy and security of the target entity extraction model, the target entity extraction model may also be stored in a node of a blockchain.

The blockchain referred to in the application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another by cryptographic methods, each data block containing a batch of network transaction information that is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.

The subject application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

Those skilled in the art will appreciate that all or part of the processes of the methods of the embodiments described above may be implemented by computer readable instructions stored on a computer readable storage medium, which, when executed, may comprise the processes of the embodiments of the methods described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk or a read-only memory (Read-Only Memory, ROM), or may be a random access memory (Random Access Memory, RAM).

It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.

Example two

With further reference to fig. 9, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a named entity extracting apparatus, where an embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.

As shown in fig. 9, the named entity extraction apparatus 200 of the present embodiment includes: a target entity category obtaining module 210, an entity category labeling module 220, a first parameter adjustment module 230, an entity primary identification module 240, a corrected corpus acquisition module 250, a second parameter adjustment module 260 and a model application module 270. Wherein:

a target entity category obtaining module 210, configured to obtain a target entity category;

the entity category labeling module 220 is configured to perform entity category labeling operation on the existing domain vocabulary according to the target entity category, so as to obtain training domain text;

the first parameter adjustment module 230 is configured to perform a first parameter adjustment operation on a pre-training entity extraction model according to a training field text to obtain an intermediate entity extraction model, where the pre-training entity extraction model is obtained by training based on a public named entity extraction corpus, and the pre-training entity extraction model is composed of a Bert model, a BiLSTM layer and a CRF layer;

The entity primary identification module 240 is configured to perform an entity primary identification operation on the query statement of the target database according to the intermediate entity extraction model, so as to obtain an entity primary identification result;

the corrected corpus acquisition module 250 is configured to acquire corrected corpus data corresponding to the entity initial recognition result sent by the user terminal;

the second parameter adjustment module 260 is configured to perform a second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data, so as to obtain a target entity extraction model;

the model application module 270 is configured to perform an automatic named entity extraction operation according to the target entity extraction model.

In the embodiment of the present application, the training set and the validation set constructed by the entity category labeling module 220 from the existing domain vocabulary and the target entity categories are used to fine-tune the pre-training entity extraction model, and the resulting model parameters are saved.

In the embodiment of the present application, because the constructed training set and validation set are generated by direct matching against the vocabulary, they contain large errors. The actual domain entity word annotation corpus fed back through the corrected corpus acquisition module 250 is therefore used to perform a second fine-tuning on the intermediate entity extraction model, and the domain entity extraction model parameters are updated and saved.
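The two-stage adjustment described above can be sketched as a small orchestration function. Everything in this sketch is an assumption made for illustration: the function `adapt_model`, its callable parameters, and the toy stand-ins in which a "model" is merely a word-to-category dictionary. A real implementation would fine-tune the BERT, BiLSTM and CRF network instead.

```python
def adapt_model(pretrained, fine_tune, predict, domain_corpus,
                queries, get_corrections):
    """Run the first adjustment, the initial recognition, and the second
    adjustment in order, returning the target model."""
    intermediate = fine_tune(pretrained, domain_corpus)   # first parameter adjustment
    initial = [(q, predict(intermediate, q)) for q in queries]  # entity initial recognition
    corrections = get_corrections(initial)                # feedback from the user terminal
    return fine_tune(intermediate, corrections)           # second parameter adjustment

# Toy stand-ins (hypothetical): "fine-tuning" merges labeled words into a dict.
def toy_fine_tune(model, corpus):
    updated = dict(model)
    updated.update(corpus)
    return updated

def toy_predict(model, query):
    return {word: cat for word, cat in model.items() if word in query}

target = adapt_model(
    pretrained={},
    fine_tune=toy_fine_tune,
    predict=toy_predict,
    domain_corpus={"premium": "TERM"},
    queries=["what is the premium of this annuity"],
    get_corrections=lambda initial: {"annuity": "PRODUCT"},
)
```

Passing the training, prediction and feedback steps in as callables keeps the sketch independent of any particular deep learning framework.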

In an embodiment of the present application, there is provided a named entity extraction apparatus 200, including: a target entity category obtaining module 210, configured to obtain a target entity category; the entity category labeling module 220 is configured to perform entity category labeling operation on the existing domain vocabulary according to the target entity category, so as to obtain training domain text; the first parameter adjustment module 230 is configured to perform a first parameter adjustment operation on a pre-training entity extraction model according to a training field text to obtain an intermediate entity extraction model, where the pre-training entity extraction model is obtained by training based on a public named entity extraction corpus, and the pre-training entity extraction model is composed of a Bert model, a BiLSTM layer and a CRF layer; the entity primary identification module 240 is configured to perform an entity primary identification operation on the query statement of the target database according to the intermediate entity extraction model, so as to obtain an entity primary identification result; the corrected corpus acquisition module 250 is configured to acquire corrected corpus data corresponding to the entity initial recognition result sent by the user terminal; the second parameter adjustment module 260 is configured to perform a second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data, so as to obtain a target entity extraction model; the model application module 270 is configured to perform an automatic named entity extraction operation according to the target entity extraction model. 
According to the apparatus, a pre-training entity extraction model consisting of a BERT model, a BiLSTM layer and a CRF layer is trained on conventional public named entity recognition corpus resources. A first parameter adjustment is performed on the pre-training entity extraction model according to the training field text constructed from the target entity categories and the existing domain vocabulary, and an entity initial recognition operation is performed according to the intermediate entity extraction model obtained after the first parameter adjustment. Finally, a second parameter adjustment is performed on the intermediate entity extraction model using the correction corpus data fed back by the user terminal for the entity initial recognition result, so that a target entity extraction model conforming to the target entity categories is obtained for automatic entity extraction.

With continued reference to fig. 10, a schematic diagram of an embodiment of the entity class labeling module 220 of fig. 9 is shown, and for ease of illustration, only portions relevant to the present application are shown.

In some optional implementations of this embodiment, the entity category labeling module 220 includes: a sorting sub-module 221, an entity word matching sub-module 222, and an entity category labeling sub-module 223, wherein:

the sorting sub-module 221 is configured to perform a sorting operation on the existing domain vocabulary in descending order of word length, so as to obtain a sorted domain vocabulary;

the entity word matching sub-module 222 is configured to perform an entity word matching operation on the sorted domain vocabulary according to a character string matching method, so as to obtain vocabulary entity words;

the entity category labeling sub-module 223 is configured to perform entity category labeling operation on the vocabulary entity words according to the target entity category, and obtain training field text.
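The cooperation of the three sub-modules can be illustrated with a minimal sketch of longest-first string matching. The function name `label_entities`, the span output format, and the sample vocabulary are hypothetical. Skipping a match that overlaps an already labeled span is one possible way to realize the boundary handling that the sorting is meant to support.

```python
def label_entities(sentence, vocab):
    """Match vocabulary words against a sentence, longest word first.

    vocab maps an entity word to its target entity category. Returns a
    list of (start, end, word, category) spans sorted by start offset.
    """
    occupied = [False] * len(sentence)
    spans = []
    # Sorting sub-module: descending order of word length.
    for word in sorted(vocab, key=len, reverse=True):
        # Entity word matching sub-module: plain substring search.
        start = sentence.find(word)
        while start != -1:
            end = start + len(word)
            if not any(occupied[start:end]):
                occupied[start:end] = [True] * len(word)
                # Entity category labeling sub-module: attach the category.
                spans.append((start, end, word, vocab[word]))
            start = sentence.find(word, start + 1)
    return sorted(spans)

vocab = {"critical illness insurance": "PRODUCT", "illness": "DISEASE"}
spans = label_entities("buy critical illness insurance for illness", vocab)
```

Because the longer word is matched first, the inner "illness" of "critical illness insurance" is not labeled separately, while the standalone "illness" at the end of the sentence still is.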

In some optional implementations of this embodiment, the named entity extraction device 200 further includes a pre-training module and a fine-tuning module, wherein:

the pre-training module is used for pre-training the entity extraction model based on the BERT language model according to the universal corpus sample and the entity label corresponding to the universal corpus sample;

and the fine-tuning module is used for fine-tuning the pre-trained BERT language model according to the specific entity corpus sample and the entity label corresponding to the specific entity corpus sample to obtain the pre-training entity extraction model.

In some optional implementations of this embodiment, the named entity extraction device 200 further includes: a training text acquisition module, a feature transformation training module, a homogeneous vector similarity calculation module, a non-homogeneous vector similarity calculation module and a network training module, wherein:

the training text acquisition module is used for reading a training database and acquiring a training text data set in the training database, wherein the training text data set at least comprises a first positive sample, a second positive sample of the same category as the first positive sample, and a random sample of a different category from the first positive sample;

the feature transformation training module is used for respectively inputting the first positive sample, the second positive sample and the random sample into the original BERT network to perform a feature transformation operation, so as to obtain a first feature vector, a second feature vector and a random feature vector;

the homogeneous vector similarity calculation module is used for carrying out a vector similarity calculation operation on the first feature vector and the second feature vector to obtain the homogeneous vector similarity;

the non-homogeneous vector similarity calculation module is used for carrying out a vector similarity calculation operation on the first feature vector and the random feature vector to obtain the non-homogeneous vector similarity;

and the network training module is used for training the BERT network based on the homogeneous vector similarity, the non-homogeneous vector similarity and the triplet loss function to obtain the pre-training entity extraction model.
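The data flow through these modules can be sketched as follows. The text does not say how the triplets are drawn or which similarity measure is used, so uniform random sampling and cosine similarity are assumptions here, as are the function names and the toy data set. The BERT feature transformation itself is omitted.

```python
import math
import random

def sample_triplet(dataset, rng):
    """Draw (first positive, second positive, random) texts from a list of
    (text, category) pairs. Assumes every category has at least two texts
    and that at least two categories exist."""
    anchor_text, anchor_cat = rng.choice(dataset)
    same = [t for t, c in dataset if c == anchor_cat and t != anchor_text]
    other = [t for t, c in dataset if c != anchor_cat]
    return anchor_text, rng.choice(same), rng.choice(other)

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical labeled texts standing in for the training text data set.
dataset = [
    ("term life policy", "PRODUCT"),
    ("critical illness cover", "PRODUCT"),
    ("diabetes", "DISEASE"),
    ("hypertension", "DISEASE"),
]
labels = dict(dataset)
anchor, positive, negative = sample_triplet(dataset, random.Random(0))
```

In a full pipeline the three texts would be encoded by the BERT network, and `cosine_similarity` would be applied to the resulting feature vectors to obtain the homogeneous and non-homogeneous vector similarities.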

In some optional implementations of this embodiment, the network training module includes: a homogeneous average value calculation sub-module, a non-homogeneous average value calculation sub-module and a first reverse updating sub-module. Wherein:

the homogeneous average value calculation sub-module is used for calculating the average value of the homogeneous similarities to obtain an average homogeneous vector;

the non-homogeneous average value calculation sub-module is used for calculating the average value of the non-homogeneous similarities to obtain an average non-homogeneous vector;

and the first reverse updating sub-module is used for carrying out a reverse updating operation on the BERT network based on the first feature vector, the average homogeneous vector, the average non-homogeneous vector and the triplet loss function to obtain the pre-training entity extraction model.

In some optional implementations of this embodiment, the network training module further includes: a maximum value acquisition sub-module, a minimum value acquisition sub-module and a second reverse updating sub-module. Wherein:

the maximum value acquisition sub-module is used for acquiring the maximum similar vector having the maximum similarity from the second feature vectors based on the homogeneous similarity;

the minimum value acquisition sub-module is used for acquiring the minimum random vector having the minimum similarity from the random feature vectors based on the non-homogeneous similarity;

and the second reverse updating sub-module is used for carrying out a reverse updating operation on the BERT network based on the first feature vector, the maximum similar vector, the minimum random vector and the triplet loss function to obtain the pre-training entity extraction model.

In some alternative implementations of the present embodiment, the triplet loss function described above is expressed as:

$$L = \sum_{i=1}^{N} \left[ \left\| f(x_i^a) - f(x_i^p) \right\|_2^2 - \left\| f(x_i^a) - f(x_i^n) \right\|_2^2 + \alpha \right]_+$$

wherein N represents the total number of the entire training set; $x_i^a$ represents a first positive sample; $f(x_i^a)$ represents a first feature vector; $x_i^p$ represents a second positive sample; $f(x_i^p)$ represents a second feature vector; $x_i^n$ represents a random sample; $f(x_i^n)$ represents a random feature vector; and $\alpha$ represents the minimum separation between the distance from the first positive sample to the second positive sample and the distance from the first positive sample to the random sample.

In summary, the present application provides a named entity extraction device 200, including: a target entity category obtaining module 210, configured to obtain a target entity category; the entity category labeling module 220 is configured to perform entity category labeling operation on the existing domain vocabulary according to the target entity category, so as to obtain training domain text; the first parameter adjustment module 230 is configured to perform a first parameter adjustment operation on a pre-training entity extraction model according to a training field text to obtain an intermediate entity extraction model, where the pre-training entity extraction model is obtained by training based on a public named entity extraction corpus, and the pre-training entity extraction model is composed of a Bert model, a BiLSTM layer and a CRF layer; the entity primary identification module 240 is configured to perform an entity primary identification operation on the query statement of the target database according to the intermediate entity extraction model, so as to obtain an entity primary identification result; the corrected corpus acquisition module 250 is configured to acquire corrected corpus data corresponding to the entity initial recognition result sent by the user terminal; the second parameter adjustment module 260 is configured to perform a second parameter adjustment operation on the intermediate entity extraction model according to the corrected corpus data, so as to obtain a target entity extraction model; the model application module 270 is configured to perform an automatic named entity extraction operation according to the target entity extraction model. 
According to the apparatus, a pre-training entity extraction model consisting of a BERT model, a BiLSTM layer and a CRF layer is trained on conventional public named entity recognition corpus resources. A first parameter adjustment is performed on the pre-training entity extraction model according to the training field text constructed from the target entity categories and the existing domain vocabulary, and an entity initial recognition operation is performed according to the intermediate entity extraction model obtained after the first parameter adjustment. Finally, a second parameter adjustment is performed on the intermediate entity extraction model using the correction corpus data fed back by the user terminal for the entity initial recognition result, so that a target entity extraction model conforming to the target entity categories is obtained for automatic entity extraction. Furthermore, the existing domain vocabulary is sorted in descending order of word length, which avoids overlapping boundaries between different entity words in a sentence and effectively ensures the accuracy of the entity category labeling operation; the extraction of the entity-related word vector sequences in the text is achieved by fine-tuning the pre-trained BERT language model.

In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 11, fig. 11 is a basic structural block diagram of a computer device according to the present embodiment.

The computer device 300 includes a memory 310, a processor 320, and a network interface 330 communicatively coupled to each other via a system bus. It should be noted that only the computer device 300 having components 310-330 is shown in the figures, but it should be understood that not all of the illustrated components need be implemented, and that more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device herein is a device capable of automatically performing numerical calculations and/or information processing in accordance with predetermined or stored instructions, the hardware of which includes, but is not limited to, microprocessors, application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), digital signal processors (Digital Signal Processor, DSP), embedded devices, etc.

The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.

The memory 310 includes at least one type of readable storage medium including flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 310 may be an internal storage unit of the computer device 300, such as a hard disk or a memory of the computer device 300. In other embodiments, the memory 310 may also be an external storage device of the computer device 300, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card provided on the computer device 300. Of course, the memory 310 may also include both internal storage units and external storage devices of the computer device 300. In this embodiment, the memory 310 is typically used to store an operating system and various application software installed on the computer device 300, such as computer readable instructions of a named entity extraction method. In addition, the memory 310 may also be used to temporarily store various types of data that have been output or are to be output.

The processor 320 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 320 is generally used to control the overall operation of the computer device 300. In this embodiment, the processor 320 is configured to execute computer readable instructions stored in the memory 310 or process data, such as computer readable instructions for executing the named entity extraction method.

The network interface 330 may include a wireless network interface or a wired network interface, the network interface 330 typically being used to establish communication connections between the computer device 300 and other electronic devices.

According to the computer equipment, a pre-training entity extraction model consisting of a BERT model, a BiLSTM layer and a CRF layer is trained on conventional public named entity recognition corpus resources. A first parameter adjustment is performed on the pre-training entity extraction model according to the training field text constructed from the target entity categories and the existing domain vocabulary, and an entity initial recognition operation is performed according to the intermediate entity extraction model obtained after the first parameter adjustment. Finally, a second parameter adjustment is performed on the intermediate entity extraction model using the correction corpus data fed back by the user terminal for the entity initial recognition result, so that a target entity extraction model conforming to the target entity categories is obtained for automatic entity extraction.

The present application also provides another embodiment, namely, a computer-readable storage medium storing computer-readable instructions executable by at least one processor to cause the at least one processor to perform the steps of a named entity extraction method as described above.

According to the storage medium, a pre-training entity extraction model consisting of a BERT model, a BiLSTM layer and a CRF layer is trained on conventional public named entity recognition corpus resources. A first parameter adjustment is performed on the pre-training entity extraction model according to the training field text constructed from the target entity categories and the existing domain vocabulary, and an entity initial recognition operation is performed according to the intermediate entity extraction model obtained after the first parameter adjustment. Finally, a second parameter adjustment is performed on the intermediate entity extraction model using the correction corpus data fed back by the user terminal for the entity initial recognition result, so that a target entity extraction model conforming to the target entity categories is obtained for automatic entity extraction. Entity nouns in the insurance field are thus extracted automatically by a neural network model, which overcomes the defects of the traditional manual construction and rule template matching methods. Meanwhile, because the target entity extraction model is trained on corpus related to the training field text, the entity extraction maintains high robustness, generalization capability and execution capability.

From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method described in the embodiments of the present application.

It is apparent that the embodiments described above are only some, not all, of the embodiments of the present application. The drawings show preferred embodiments of the present application but do not limit its patent scope. This application may be embodied in many different forms; these embodiments are provided so that the present disclosure will be understood more thoroughly. Although the present application has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described in the foregoing, or equivalents may be substituted for some of their technical features. Any equivalent structure made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise falls within the protection scope of the present application.

Claims

1. A named entity extraction method, comprising the steps of:

obtaining a target entity class;

carrying out automatic extraction operation of named entities according to the target entity extraction model;

before the step of performing the first parameter adjustment operation on the pre-training entity extraction model according to the training field text to obtain the intermediate entity extraction model, the method further includes:

Reading a training database, and acquiring a training text data set in the training database, wherein the training text data set at least comprises a first positive sample, a second positive sample with the same category as the first positive sample and a random sample with different categories from the first positive sample;

the first positive example sample, the second positive example sample and the random sample are respectively input into an original BERT network to perform the feature conversion operation, so as to obtain a first feature vector, a second feature vector and a random feature vector;

performing a vector similarity calculation operation on the first feature vector and the second feature vector to obtain a similar vector similarity;

performing a vector similarity calculation operation on the first feature vector and the random feature vector to obtain a non-similar vector similarity;

training the original BERT network based on the similar vector similarity, the non-similar vector similarity and a triplet loss function to obtain the pre-training entity extraction model;

the step of training the original BERT network based on the similar vector similarity, the non-similar vector similarity and the triplet loss function to obtain the pre-training entity extraction model specifically comprises the following steps:

obtaining the maximum similar vector with the maximum similarity from the second feature vector based on the similar vector similarity;

acquiring a minimum random vector with the minimum similarity from the random feature vector based on the non-similar vector similarity;

and carrying out a reverse updating operation on the original BERT network based on the first feature vector, the maximum similar vector, the minimum random vector and the triplet loss function to obtain the pre-training entity extraction model.
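
The selection and loss steps recited in claim 1 can be illustrated by the following minimal Python sketch. It is not the implementation of the application: the claim does not fix the similarity measure or the exact form of the triplet loss, so cosine similarity, the margin value 0.2 and the function names `cosine`, `select_triplet` and `triplet_loss` are all assumptions made for illustration. The feature vectors, which the claim obtains from the BERT network, are represented here by plain lists of floats.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_triplet(first, second_vectors, random_vectors):
    """Pick the vectors named in the claim.

    Returns the second feature vector with the maximum similarity to `first`
    and the random feature vector with the minimum similarity to `first`.
    """
    max_similar = max(second_vectors, key=lambda v: cosine(first, v))
    min_random = min(random_vectors, key=lambda v: cosine(first, v))
    return max_similar, min_random


def triplet_loss(first, similar, random_vec, margin=0.2):
    """Similarity-based triplet loss (assumed form).

    The loss is zero once the similar vector is closer to `first` than the
    random vector by at least `margin`.
    """
    return max(0.0, cosine(first, random_vec) - cosine(first, similar) + margin)


if __name__ == "__main__":
    first = [1.0, 0.0]
    second_vectors = [[1.0, 0.0], [0.0, 1.0]]
    random_vectors = [[1.0, 1.0], [-1.0, 0.0]]
    similar, random_vec = select_triplet(first, second_vectors, random_vectors)
    print(similar, random_vec, triplet_loss(first, similar, random_vec))
```

In an actual training loop the loss value would be back-propagated through the network that produced the vectors, which corresponds to the reverse updating operation of the claim; that part is omitted here because it depends on the chosen deep learning framework.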

2. The named entity extraction method according to claim 1, wherein the step of performing entity class labeling operation on the existing domain vocabulary according to the target entity class to obtain training domain text specifically comprises the following steps:

sorting the existing domain vocabulary in descending order of word length to obtain a sorted domain vocabulary;

performing an entity word matching operation on the sorted domain vocabulary according to a character string matching method to obtain vocabulary entity words;

and carrying out the entity category labeling operation on the vocabulary entity words according to the target entity categories to obtain the training field text.
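
The labeling steps of claim 2 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the method of the application: the claim does not state what the sorted vocabulary is matched against, so the sketch assumes it is matched against a raw domain sentence using exact substring search. The function name `label_entities`, the dictionary mapping each vocabulary word to an entity category, and the category names in the example are hypothetical.

```python
def label_entities(text, vocabulary):
    """Label vocabulary words found in `text`, longest words first.

    `vocabulary` maps each domain word to its entity category (hypothetical
    structure). Returns a list of (start, end, word, category) tuples sorted
    by position; spans never overlap.
    """
    # Step 1: sort the vocabulary in descending order of word length.
    ordered = sorted(vocabulary, key=len, reverse=True)

    taken = [False] * len(text)  # characters already covered by a longer match
    spans = []
    for word in ordered:
        if not word:
            continue
        # Step 2: plain string matching of the word against the text.
        start = text.find(word)
        while start != -1:
            end = start + len(word)
            if not any(taken[start:end]):
                for i in range(start, end):
                    taken[i] = True
                # Step 3: attach the entity category to the matched word.
                spans.append((start, end, word, vocabulary[word]))
            start = text.find(word, start + 1)
    return sorted(spans)


if __name__ == "__main__":
    vocab = {"illness": "DISEASE", "critical illness insurance": "PRODUCT"}
    sentence = "critical illness insurance covers illness"
    for span in label_entities(sentence, vocab):
        print(span)
```

Matching the longest words first keeps a short word such as "illness" from splitting a longer entity such as "critical illness insurance" that contains it, which is presumably why the claim sorts the vocabulary by length before matching.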

3. The named entity extraction method according to claim 1, wherein before the step of performing a first parameter adjustment operation on a pre-trained entity extraction model according to the training field text to obtain an intermediate entity extraction model, the method further comprises:

pre-training an entity extraction model based on a BERT language model according to a general corpus sample and an entity label corresponding to the general corpus sample;

and fine-tuning the pre-trained BERT language model according to a specific entity corpus sample and an entity label corresponding to the specific entity corpus sample to obtain the pre-training entity extraction model.

4. A named entity extraction device, comprising:

the model application module is used for carrying out automatic extraction operation of the named entity according to the target entity extraction model;

the device further comprises a training text acquisition module, a feature transformation training module, a similar vector similarity calculation module, a non-similar vector similarity calculation module and a network training module, wherein:

the training text acquisition module is used for reading a training database, and acquiring a training text data set in the training database, wherein the training text data set at least comprises a first positive sample, a second positive sample of the same category as the first positive sample, and a random sample whose category differs from that of the first positive sample;

the feature transformation training module is configured to input the first positive sample, the second positive sample, and the random sample to an original BERT network respectively to perform the feature transformation operation, so as to obtain a first feature vector, a second feature vector, and a random feature vector;

the non-similar vector similarity calculation module is used for performing a vector similarity calculation operation on the first feature vector and the random feature vector to obtain a non-similar vector similarity;

the network training module is used for performing training operation on the original BERT network based on the similar vector similarity, the non-similar vector similarity and a triplet loss function to obtain the pre-training entity extraction model;

the network training module comprises a maximum value acquisition sub-module, a minimum value acquisition sub-module and a second reverse updating sub-module, wherein:

the maximum value acquisition sub-module is used for obtaining the maximum similar vector with the maximum similarity from the second feature vector based on the similar vector similarity;

the minimum value acquisition sub-module is used for obtaining a minimum random vector with the minimum similarity from the random feature vector based on the non-similar vector similarity;

and the second reverse updating sub-module is used for carrying out a reverse updating operation on the original BERT network based on the first feature vector, the maximum similar vector, the minimum random vector and the triplet loss function to obtain the pre-training entity extraction model.

5. The named entity extraction device of claim 4, wherein the entity class labeling module comprises:

the sorting sub-module is used for sorting the existing domain vocabulary in descending order of word length to obtain a sorted domain vocabulary;

the entity word matching sub-module is used for carrying out an entity word matching operation on the sorted domain vocabulary according to a character string matching method to obtain vocabulary entity words;

and the entity category labeling sub-module is used for carrying out the entity category labeling operation on the vocabulary entity words according to the target entity category to obtain the training field text.

6. A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, implement the steps of the named entity extraction method of any one of claims 1 to 3.

7. A computer readable storage medium having stored thereon computer readable instructions which, when executed by a processor, implement the steps of the named entity extraction method of any one of claims 1 to 3.