CN112347782A - Entity identification method and system

Info

Publication number
CN112347782A
Authority
CN
China
Prior art keywords
entity
entity recognition
mapping layer
model
recognition model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011056229.9A
Other languages
Chinese (zh)
Inventor
李国才
谢佳雨
陈伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd filed Critical 4Paradigm Beijing Technology Co Ltd
Priority to CN202011056229.9A priority Critical patent/CN112347782A/en
Publication of CN112347782A publication Critical patent/CN112347782A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

An entity identification method and system are provided. The entity identification method comprises: acquiring a first entity recognition model, wherein the first entity recognition model is trained in advance based on entity recognition training data of a field other than a target field, and comprises a semantic understanding layer, a mapping layer and a sequence labeling layer, the mapping layer comprising at least one sub-mapping layer; reconstructing the mapping layer of the first entity recognition model to obtain a second entity recognition model; training the second entity recognition model based on entity recognition training data of the target field; and performing entity recognition on text in the target field by using the trained second entity recognition model, and outputting an entity recognition result.

Description

Entity identification method and system
Technical Field
The present application relates to the art of machine learning model training, and more particularly, to an entity recognition method and system.
Background
Named Entity Recognition (NER) is a basic and important lexical analysis task in Natural Language Processing (NLP), and often serves, explicitly or implicitly, as a foundational task for information extraction, question answering systems, machine translation, and the like.
For an entity recognition task in a target field, a dedicated entity recognition model needs to be trained on a corpus of the target field. When the training corpus of the target field is small, the accuracy of the trained entity recognition model rarely meets expectations. Moreover, training an entity recognition model for the target field from scratch consumes considerable cost and is difficult to carry out, and the recognition performance of the resulting model cannot be estimated in advance; if the performance falls short of expectations, the cost is wasted.
Disclosure of Invention
Exemplary embodiments of the present disclosure may address at least the above-mentioned problems, although they are not required to overcome any of the disadvantages described above.
In one aspect, there is provided an entity identification method, including: acquiring a first entity recognition model, wherein the first entity recognition model is trained in advance based on entity recognition training data of a field other than a target field, and comprises a semantic understanding layer, a mapping layer and a sequence labeling layer, the mapping layer comprising at least one sub-mapping layer; reconstructing the mapping layer of the first entity recognition model to obtain a second entity recognition model; training the second entity recognition model based on entity recognition training data of the target field; and performing entity recognition on text in the target field by using the trained second entity recognition model, and outputting an entity recognition result.
Optionally, the step of reconstructing the mapping layer of the first entity recognition model includes: adjusting the weight and the structure of each sub-mapping layer in the mapping layer.
Optionally, the step of adjusting the weight and the structure of each sub-mapping layer in the mapping layer comprises: initializing the weight of each sub-mapping layer in the mapping layer.
Optionally, the step of reconstructing the mapping layer of the first entity recognition model further includes: adjusting the number of hidden units of each sub-mapping layer in the mapping layer.
Optionally, the step of reconstructing the mapping layer of the first entity recognition model further includes: adding, in the mapping layer, at least one sub-mapping layer on the basis of the original sub-mapping layers.
Optionally, the entity recognition training data of fields other than the target field is obtained by fusing entity recognition training data of at least one field other than the target field; the method further comprises: when the expression of the same entity type is inconsistent across different fields, performing normalization processing on the same entity type.
Optionally, the number of entity recognition training data used for training the first entity recognition model is greater than the number of entity recognition training data used for training the second entity recognition model.
In another aspect, an entity recognition system is provided, which includes a first model acquisition module, a second model acquisition module, a model training module, and an entity recognition module;
the first model acquisition module is configured to: acquiring a first entity recognition model, wherein the first entity recognition model is trained in advance based on entity recognition training data of a field other than a target field, the first entity recognition model comprises a semantic understanding layer, a mapping layer and a sequence labeling layer, and the mapping layer comprises at least one sub-mapping layer; the second model acquisition module is configured to: reconstructing a mapping layer of the first entity identification model to obtain a second entity identification model; the model training module is configured to: training a second entity recognition model based on the entity recognition training data of the target field; the entity identification module is configured to: and performing entity recognition on the text in the target field by using the trained second entity recognition model, and outputting an entity recognition result.
Optionally, the second model acquisition module is configured to: adjust the weight and the structure of each sub-mapping layer in the mapping layer.
Optionally, the second model acquisition module is configured to: initialize the weight of each sub-mapping layer in the mapping layer.
Optionally, the second model acquisition module is configured to: adjust the number of hidden units of each sub-mapping layer in the mapping layer.
Optionally, the second model acquisition module is configured to: add, in the mapping layer, at least one sub-mapping layer on the basis of the original sub-mapping layers.
Optionally, the entity recognition training data of fields other than the target field is obtained by fusing entity recognition training data of at least one field other than the target field; the first model acquisition module is configured to: when the expression of the same entity type is inconsistent across different fields, perform normalization processing on the same entity type.
Optionally, the number of entity recognition training data used for training the first entity recognition model is greater than the number of entity recognition training data used for training the second entity recognition model.
In another aspect, a computer-readable storage medium storing instructions that, when executed by at least one computing device, cause the at least one computing device to perform the entity identification method described above is provided.
In another aspect, a system is provided that includes at least one computing device and a storage device storing instructions, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to perform the entity identification method described above.
According to the entity recognition method and system provided by the exemplary embodiments of the present invention, the mapping layer of a first entity recognition model, trained in advance on entity recognition data of fields other than the target field, is reconstructed to obtain an initial second entity recognition model, and the initial second entity recognition model is trained with a small amount of training data of the target field; the trained second entity recognition model can then be applied to entity recognition services in the target field and achieve high accuracy. This model training process for the target field can quickly produce an entity recognition model of the expected accuracy from a small training corpus, significantly simplifying the training process and reducing cost.
Drawings
These and/or other aspects and advantages of the invention will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 shows a flowchart of an entity identification method provided by an exemplary embodiment of the present invention.
Fig. 2 shows a block diagram of an entity identification system provided by an exemplary embodiment of the present invention.
Detailed Description
The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of embodiments of the invention defined by the claims and their equivalents. Various specific details are included to aid understanding, but these are merely to be considered exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. In addition, descriptions of well-known functions and constructions are omitted for clarity and conciseness.
Herein, the expression "at least one of the items" covers three parallel cases: "any one of the items", "a combination of any plurality of the items", and "all of the items". For example, "includes at least one of A and B" covers the following three parallel cases: (1) includes A; (2) includes B; (3) includes A and B. Likewise, "at least one of step one and step two is performed" covers: (1) step one is performed; (2) step two is performed; (3) both step one and step two are performed.
The first entity recognition model and the second entity recognition model provided by the exemplary embodiments of the present invention are both machine learning models. Machine learning is a natural product of the development of artificial intelligence research, aimed at improving the performance of a system itself by computational means using experience. In a computer system, "experience" usually exists in the form of "data", from which a "model" can be generated by a machine learning algorithm: supplying empirical data to the algorithm produces a model that, when faced with a new situation, provides a corresponding judgment, i.e., a prediction result. Whether a machine learning model is being trained or used for prediction, the data needs to be converted into machine learning samples comprising various features. Machine learning may be implemented as "supervised learning", "unsupervised learning", or "semi-supervised learning"; it should be noted that the exemplary embodiments of the present invention impose no particular limitation on the specific machine learning algorithm. It should also be noted that other means, such as statistical algorithms, may be incorporated during the training and application of the model.
In an exemplary embodiment of the present invention, the first entity recognition model may perform entity recognition on texts in a field other than the target field, and the second entity recognition model may perform entity recognition on texts in the target field, where the second entity recognition model is obtained by reconstructing and retraining the pre-trained first entity recognition model. The following describes a flow of the entity recognition method provided by the exemplary embodiment of the present invention, where the flow includes a process of obtaining a second entity recognition model and a process of performing entity recognition on a text in a target field by using the trained second entity recognition model.
Fig. 1 shows a flowchart of an entity identification method provided by an exemplary embodiment of the present invention.
Referring to fig. 1, in step S110, a first entity recognition model is obtained.
Here, the first entity recognition model is trained in advance based on entity recognition training data of a field other than the target field; it includes a semantic understanding layer, a mapping layer, and a sequence labeling layer, and the mapping layer includes at least one sub-mapping layer.
Optionally, the first entity recognition model may be pre-trained based on entity recognition training data of one field other than the target field, or of a plurality of fields other than the target field. The specific training process of the first entity recognition model is described by example below; it should be understood that the first entity recognition model can perform entity recognition on text in the fields to which its training data belongs, and can achieve the desired accuracy.
As an example, if the target field is the science and technology field, the first entity recognition model may be trained in advance based on entity recognition training data of the movie and television field, in which case the first entity recognition model can perform entity recognition on texts in the movie and television field and achieve the desired accuracy.
Optionally, the type of the semantic understanding layer may include, but is not limited to, a BERT (Bidirectional Encoder Representations from Transformers) model, a MASS (Masked Sequence to Sequence Pre-training) model, an MT-DNN (Multi-Task Deep Neural Networks) model, and a UniLM (Unified pre-trained Language Model) model, and the type of semantic understanding layer applied in the first entity recognition model may be determined according to actual needs.
Optionally, the mapping layer may include one sub-mapping layer or a plurality of sub-mapping layers. The types of the sub-mapping layers may include a linear layer, an LSTM (Long Short-Term Memory) layer, an RNN (Recurrent Neural Network) layer, and a Transformer layer. The number and types of the sub-mapping layers included in the mapping layer may be determined according to actual needs.
Optionally, the type of the sequence labeling layer may include a CRF (Conditional Random Field) model, an HMM (Hidden Markov Model), an MEMM (Maximum Entropy Markov Model), and the like, but is not limited thereto, and the type of sequence labeling layer applied in the first entity recognition model may be determined according to actual needs.
It should be appreciated that the first entity recognition model may combine any of the above types of semantic understanding, mapping, and sequence labeling layers. For example, the structure of the first entity recognition model may take forms such as: Bert + linear + CRF, MASS + linear + CRF, Bert + linear + HMM, Bert + linear + lstm + CRF, MASS + linear + lstm + CRF, and Bert + linear + lstm + MEMM. Of course, the structural form of the first entity recognition model is not limited thereto.
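As an illustration of this layered structure, the following is a minimal sketch of a Bert + linear + CRF model in PyTorch, using the Hugging Face transformers library for the semantic understanding layer and the pytorch-crf package for the sequence labeling layer. This sketch is an assumption added for clarity, not code from the patent, and the class and parameter names are illustrative.

```python
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF  # from the pytorch-crf package

class EntityRecognitionModel(nn.Module):
    """Semantic understanding layer + mapping layer + sequence labeling layer."""

    def __init__(self, num_tags: int, hidden_size: int = 768):
        super().__init__()
        # Semantic understanding layer
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        # Mapping layer: here a single linear sub-mapping layer
        self.mapping = nn.Linear(hidden_size, num_tags)
        # Sequence labeling layer
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.mapping(hidden)
        if tags is not None:
            # Training: negative log-likelihood of the gold tag sequence
            return -self.crf(emissions, tags, mask=attention_mask.bool())
        # Prediction: Viterbi decoding of the most likely tag sequence
        return self.crf.decode(emissions, mask=attention_mask.bool())
```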
In step S120, the mapping layer of the first entity identification model is reconstructed to obtain a second entity identification model.
Optionally, reconstructing the mapping layer may include adjusting the parameters of the sub-mapping layers of the mapping layer and adjusting the number of sub-mapping layers in the mapping layer, but the reconstruction method is not limited thereto.
It should be noted that step S120 does not change the semantic understanding layer or the sequence labeling layer of the first entity recognition model; for example, the weights of the semantic understanding layer and the sequence labeling layer are left unchanged.
In step S130, a second entity recognition model is trained based on the entity recognition training data of the target domain.
It is understood that the training process of the second entity recognition model is substantially identical to that of the first entity recognition model; for specifics, refer to the training process of the first entity recognition model described below.
It should be noted that the number of entity recognition training data used for training the first entity recognition model is greater than the number used for training the second entity recognition model. Specifically, while achieving similar or identical accuracy, the amount of target-field entity recognition training data needed to train the second entity recognition model may be smaller than the amount of out-of-field entity recognition training data used to train the first entity recognition model.
In step S140, entity recognition is performed on the text in the target field by using the trained second entity recognition model, and an entity recognition result is output.
After the training of the second entity recognition model is completed, the text of the target field can be input into the second entity recognition model, and the second entity recognition model can perform entity recognition on the text of the target field and output an entity recognition result. Optionally, the entity recognition result may include at least one of a location of at least one entity in the text, a type of the entity, and a content of the entity.
According to the entity identification method provided by the exemplary embodiments of the present invention, the mapping layer of a first entity recognition model, trained in advance on entity recognition data of fields other than the target field, is reconstructed to obtain an initial second entity recognition model, and the initial second entity recognition model is trained with a small amount of training data of the target field; the trained second entity recognition model can then be applied to entity recognition services in the target field and achieve high accuracy. This model training process for the target field can quickly produce an entity recognition model of the expected accuracy from a small training corpus, significantly simplifying the training process and reducing cost.
Optionally, the step of reconstructing the mapping layer of the first entity recognition model in step S120 includes: adjusting the weight and the structure of each sub-mapping layer in the mapping layer.
As an example, the weights of the sub-mapping layers in the mapping layer may all be adjusted in the same manner. For example, the step of adjusting the weight of each sub-mapping layer comprises: initializing the weight of each sub-mapping layer in the mapping layer, or scaling the weight of each sub-mapping layer according to a preset ratio, but is not limited thereto.
As an example, the weights of the sub-mapping layers in the mapping layer may instead be adjusted in different manners. For example, the step of adjusting the weight of each sub-mapping layer comprises: initializing the weights of one part of the sub-mapping layers in the mapping layer, and scaling the weights of the other part according to a preset ratio, but is not limited thereto.
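The two weight-adjustment options above could be sketched as follows; this is a hedged illustration (the function names are assumptions), applied to a linear sub-mapping layer such as the mapping attribute of the sketch model above:

```python
import torch
import torch.nn as nn

def initialize_weights(layer: nn.Linear) -> None:
    # Option 1: reset the sub-mapping layer to a fresh random initialization
    layer.reset_parameters()

def scale_weights(layer: nn.Linear, ratio: float) -> None:
    # Option 2: scale the existing weights by a preset ratio
    with torch.no_grad():
        layer.weight.mul_(ratio)
        if layer.bias is not None:
            layer.bias.mul_(ratio)
```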
Optionally, each sub-mapping layer comprises at least one hidden unit. As an example, the structure of a sub-mapping layer may be adjusted by adjusting the number of hidden units in the sub-mapping layer, where the number of hidden units may be determined according to the amount of entity recognition training data used for training the second entity recognition model.
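The patent does not fix a sizing rule, so the following only sketches the idea of choosing a sub-mapping layer's hidden-unit count from the amount of target-field training data; the heuristic is an illustrative assumption:

```python
import torch.nn as nn

def build_sub_mapping_layer(hidden_size: int, num_train_samples: int) -> nn.Linear:
    # Assumed heuristic: fewer training samples -> fewer hidden units
    num_hidden_units = min(hidden_size, max(64, num_train_samples * 8))
    return nn.Linear(hidden_size, num_hidden_units)
```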
Optionally, the step of reconstructing the mapping layer of the first entity recognition model in step S120 may further include: adding, in the mapping layer, at least one sub-mapping layer on the basis of the original sub-mapping layers.
It should be understood that each newly added sub-mapping layer is a weight-adjusted sub-mapping layer. The number of newly added sub-mapping layers can be determined according to actual needs: one sub-mapping layer may be added on the basis of the original sub-mapping layers, or a plurality of sub-mapping layers may be added.
Optionally, the original sub-mapping layers include a single type of sub-mapping layer, and the type of each newly added sub-mapping layer is the same as that of the original sub-mapping layers.
For example, if the original sub-mapping layer only includes a linear layer, each newly added sub-mapping layer is a weight-adjusted linear layer.
As an example, the first entity recognition model has the structural form Bert + linear + CRF; after the weight of the linear layer is initialized, a new weight-initialized linear layer is added, so that the second entity recognition model has the structural form Bert + linear + linear + CRF.
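Continuing the illustrative PyTorch sketch above, this reconstruction could look as follows; the names and dimensions are assumptions, and the semantic understanding and sequence labeling layers keep their trained weights:

```python
import torch.nn as nn

def reconstruct_mapping_layer(model, num_tags: int, hidden_size: int = 768) -> None:
    # Re-initialized original linear sub-mapping layer
    first_linear = nn.Linear(hidden_size, hidden_size)
    # Newly added, weight-initialized linear sub-mapping layer
    second_linear = nn.Linear(hidden_size, num_tags)
    # Bert + linear + CRF becomes Bert + linear + linear + CRF;
    # model.bert and model.crf are left untouched
    model.mapping = nn.Sequential(first_linear, second_linear)
```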
Optionally, the original sub-mapping layers include more than two different types of sub-mapping layers, and the type of each newly added sub-mapping layer belongs to the type set of the original sub-mapping layers; that is, each newly added sub-mapping layer may be of any of the types of the original sub-mapping layers.
For example, if the original sub-mapping layers include both a linear layer and an lstm layer, the newly added sub-mapping layers may include at least one of a weight-adjusted linear layer and a weight-adjusted lstm layer.
As an example, the first entity recognition model has the structural form Bert + linear + lstm + CRF. After the linear and lstm layers are reinitialized, either a weight-initialized linear layer is newly added, giving the second entity recognition model the structural form Bert + linear + lstm + linear + CRF, or a weight-initialized linear layer and a weight-initialized lstm layer are newly added, giving the second entity recognition model the structural form Bert + linear + lstm + linear + lstm + CRF.
The following is an exemplary description of a specific training process of the first entity recognition model.
For ease of description, fields other than the target field are referred to as reference fields. In order to obtain entity recognition training data for training the first entity recognition model, the content of each corpus of the reference field is first converted into numerical form.
Taking the movie and television field as the reference field, one corpus of this field reads (in translation): {"text": "How to play one's own role? Read 'The Actor's Self-Cultivation', the secret of Zhou Xingchi's (周星驰) rise in 'The King of Comedy' (喜剧之王)", "entity_list": [{"entity_index": {"begin": 21, "end": 25}, "entity_type": "movie and television", "entity": "The King of Comedy"}, {"entity_index": {"begin": 26, "end": 29}, "entity_type": "person", "entity": "Zhou Xingchi"}]}.
In the corpus, text is the text content, and entity_list is a list storing the information of all tagged entities in the text. The information of one entity includes the location of the entity in the text (entity_index), the entity type (entity_type), and the entity content (entity).
As described above, the first entity recognition model is trained in advance based on entity recognition training data of fields other than the target field; this data may be obtained by fusing entity recognition training data of at least one field other than the target field. When the expression of the same entity type is inconsistent across different fields, normalization processing is performed on that entity type. For example, the type of person names in text may be expressed differently in different fields (e.g., as "person" or as "person name"), and the type may then be uniformly specified as "person name".
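A minimal sketch of this normalization step; the alias table below is an illustrative assumption:

```python
# Map inconsistent entity-type spellings from different fields to one canonical name
TYPE_ALIASES = {
    "person": "person name",
    "人名": "person name",
}

def normalize_entity_type(entity_type: str) -> str:
    return TYPE_ALIASES.get(entity_type, entity_type)
```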
The entity recognition training data comprises a text input and a label input, where the text input is derived from the text content in the corpus and the label input is derived from the entities in the corpus.
The text content in the corpus is converted into numerical form to obtain the text input. Specifically, [CLS] is added at the beginning of the text and [SEP] at the end, so that the text content becomes "[CLS]How to play one's own role? Read 'The Actor's Self-Cultivation', the secret of Zhou Xingchi's rise in 'The King of Comedy'[SEP]". The text content is then converted into numbers based on the vocabulary; for example, if a character appears on row 1064 of the vocabulary, that character is mapped to 1064. On this basis, the above text content can be converted into [101,1964,863,4029,1963,5633,2347,4639,6236,5683,8025,6436,6439,518,4029,1448,5633,2770,935,1076,519,518,1600,1197,723,4375,519,1454,3216,7721,2308,6630,755,4957,1738,4058,949,723,705,4639,4325,7306,4909,5008,102].
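This conversion matches what a BERT-style tokenizer performs; a sketch under the assumption that the vocabulary is the bert-base-chinese one, in which [CLS] and [SEP] map to 101 and 102:

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
# add_special_tokens=True prepends [CLS] and appends [SEP] before mapping
# each character to its row number in the vocabulary
encoded = tokenizer("如何演好自己的角色……", add_special_tokens=True)
print(encoded["input_ids"])  # e.g., [101, ..., 102]
```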
The entities in the corpus are likewise converted into numerical form to obtain the label input. The label input is processed in B(Begin) E(End) I(Inter) O(Other) format. Specifically, the entity "The King of Comedy" is converted into a sequence of the same length as the text input in which the positions covered by the entity are marked 1, 2, 2, 3 and every other position is 0; the entity "Zhou Xingchi" is converted into a sequence in which its positions are marked 1, 2, 3 and every other position is 0.
In such a sequence, 1 represents Begin, the beginning of the entity content; 3 represents End, the end part of the entity; 2 represents Inter, the part between the beginning and the end of the entity; and 0 represents Other, everything in the text other than the current entity's content. For example, the "1", "2", "2", "3" in the first sequence above respectively represent the four characters 喜, 剧, 之, 王 of "The King of Comedy", and the "0"s in the first sequence represent everything in the text other than "The King of Comedy".
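A sketch of BEIO label construction for a single entity span; the helper name and the inclusive-end convention are assumptions:

```python
def beio_labels(text_length: int, begin: int, end: int) -> list:
    """Mark one entity span: 1 = Begin, 2 = Inter, 3 = End, 0 = Other."""
    labels = [0] * text_length           # Other everywhere by default
    labels[begin] = 1                    # Begin
    for i in range(begin + 1, end):      # Inter positions
        labels[i] = 2
    labels[end] = 3                      # End (inclusive-end convention assumed)
    return labels

# A four-character entity such as "The King of Comedy" yields ..., 1, 2, 2, 3, ...
```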
The text input and label input obtained above may be used as entity recognition training data for the first entity recognition model. When performing model training, the training program may split all entity recognition training data into a training set and a test set according to a preset ratio (e.g., 9:1), train the first entity recognition model using the training set, and test the accuracy of the first entity recognition model using the test set. In addition, relevant training parameters can be set as needed; for example, the maximum length of sentences in the text is set to 128 and the learning rate to 0.00001.
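A sketch of this training setup, assuming samples holds the (text input, label input) pairs and model is the illustrative model defined earlier:

```python
import random
import torch

random.shuffle(samples)                        # samples: list of (input_ids, tags) pairs
split_point = int(len(samples) * 0.9)          # preset 9:1 ratio
train_set, test_set = samples[:split_point], samples[split_point:]

MAX_SENTENCE_LENGTH = 128                      # maximum length of sentences in the text
optimizer = torch.optim.Adam(model.parameters(), lr=0.00001)  # learning rate
```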
After the trained first entity recognition model is obtained, its mapping layer is reconstructed. As an example, the first entity recognition model has the structural form Bert + linear + CRF; after the trained first entity recognition model is obtained, the linear layer in the first entity recognition model is replaced by linear + linear, with the weight and the structure of each linear layer adjusted, so that the second entity recognition model has the structural form Bert + linear + linear + CRF.
After the second entity recognition model with the structural form Bert + linear + linear + CRF is obtained, the second entity recognition model is trained based on entity recognition training data of the target field.
It is understood that the training process of the second entity recognition model is substantially identical to that of the first entity recognition model and is not repeated here. The difference between the two training processes is that the number of entity recognition training data used to train the first entity recognition model is greater than the number used to train the second entity recognition model. As an example, 10,000 corpora of entity recognition training data are used to train the first entity recognition model, while several tens of corpora (e.g., 20) are used to train the second entity recognition model.
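A hedged sketch of this fine-tuning on the small target-field set, reusing the illustrative model and optimizer above; num_epochs and target_field_loader are assumed to be defined, and batching details are omitted:

```python
model.train()
for epoch in range(num_epochs):
    for input_ids, attention_mask, tags in target_field_loader:
        loss = model(input_ids, attention_mask, tags)  # negative log-likelihood from the CRF
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```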
After the second entity recognition model is trained, entity recognition is performed on text in the target field using the trained second entity recognition model, and an entity recognition result is output.
Specifically, the text in the target field is input into the trained second entity recognition model, which performs entity recognition on the text and outputs an entity recognition result. Taking the science and technology field as the target field, the input text is as follows: {"text": "Chomp was created in 2010, initially serving only the apple App Store, and in 2011 extended its service scope to the google Android platform. TechCrunch states that Chomp previously obtained 2.5 million dollars in funding, its angel investors including Ron Conway and others. The social news website Digg's founder Kevin Rose, the American actor Ashton Kutcher, and others are Chomp consultants. Chomp currently has around 20 employees, all of whom will join apple. Regarding the report that apple has acquired Chomp, Chomp declined to comment, and apple has not yet commented. (Source: Tencent Technology; by Zhong Tao)"}.
The output entity recognition result repeats the text together with an entity_list of the recognized entities; only the clearly legible entries of entity_list are reproduced here: {"text": "Chomp was created in 2010 ... (Source: Tencent Technology; by Zhong Tao)", "entity_list": [{"entity_index": {"begin": 19, "end": 30}, "entity_type": "product_name", "entity": "apple App Store"}, {"entity_index": {"begin": 159, "end": 161}, "entity_type": "location", "entity": "U.S."}, {"entity_index": {"begin": 196, "end": 201}, "entity_type": "company_name", "entity": "Chomp"}, {"entity_index": {"begin": 221, "end": 223}, "entity_type": "company_name", "entity": "apple"}, {"entity_index": {"begin": 226, "end": 228}, "entity_type": "company_name", "entity": "apple"}, {"entity_index": {"begin": 273, "end": 275}, "entity_type": "person_name", "entity": "Zhong Tao"}, ...]}; the remaining entries follow the same form and mark the other person_name, company_name, product_name, and location mentions in the text (Ron Conway, Kevin Rose, Ashton Kutcher, and so on).
In the output result, text is the text content, and entity_list is a list storing the information of all labeled entities in the text. The information of one entity includes the location of the entity in the text (entity_index), the entity type (entity_type), and the entity content (entity).
Taking the entity information {"entity_index": {"begin": 19, "end": 30}, "entity_type": "product_name", "entity": "apple App Store"} as an example: the location of the entity in the text is the 19th to 30th bytes, the type of the entity is product_name (product name), and the content of the entity is apple App Store.
In order to test the performance of the second entity recognition model provided by the present invention, the inventors of the present application performed a comparative test between the second entity recognition model and a third entity recognition model, where the second entity recognition model is obtained from the first entity recognition model (i.e., by reconstructing the mapping layer of the first entity recognition model) and the third entity recognition model is built by a conventional model construction method.
TABLE 1 (F1 scores of the second and third entity recognition models in each field; the table content appears only as images in the original publication)
The second entity recognition model and the third entity recognition model were trained using the same number (e.g., 20) of entity recognition training data from the same source, and F1 scores (F1-score) were calculated for both models; the scores are given in Table 1. It should be noted that the F1 score is a metric for classification problems: it is the harmonic mean of precision and recall, with a maximum of 1 and a minimum of 0. The larger the F1 score, the higher the precision and recall of the machine learning model.
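For reference, the F1 score is computed as the harmonic mean of precision and recall:

```python
def f1_score(precision: float, recall: float) -> float:
    # Harmonic mean of precision and recall; ranges from 0 to 1
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```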
As can be seen from Table 1, when the second recognition model and the third recognition model are each trained using 20 entity recognition training data, the second recognition model obtains a higher F1 score in every field; that is, within the same field, the F1 score of the second recognition model is consistently higher than that of the third recognition model.
Fig. 2 shows a block diagram of an entity identification system provided by an exemplary embodiment of the present invention.
Referring to fig. 2, the entity recognition system includes a first model acquisition module, a second model acquisition module, a model training module, and an entity recognition module.
The first model acquisition module is configured to: acquire a first entity recognition model, wherein the first entity recognition model is trained in advance based on entity recognition training data of a field other than a target field, and comprises a semantic understanding layer, a mapping layer and a sequence labeling layer, the mapping layer comprising at least one sub-mapping layer.
The second model acquisition module is configured to: reconstruct the mapping layer of the first entity recognition model to obtain a second entity recognition model.
The model training module is configured to: train the second entity recognition model based on the entity recognition training data of the target field.
The entity recognition module is configured to: perform entity recognition on text in the target field by using the trained second entity recognition model, and output an entity recognition result.
Optionally, the second model acquisition module is configured to: adjust the weight and the structure of each sub-mapping layer in the mapping layer.
Optionally, the second model acquisition module is configured to: initialize the weight of each sub-mapping layer in the mapping layer.
Optionally, the second model acquisition module is configured to: adjust the number of hidden units of each sub-mapping layer in the mapping layer.
Optionally, the second model acquisition module is configured to: add, in the mapping layer, at least one sub-mapping layer on the basis of the original sub-mapping layers.
Optionally, the original sub-mapping layers include more than two different types of sub-mapping layers, and the type of each newly added sub-mapping layer belongs to the type set of the original sub-mapping layers.
Optionally, the entity recognition training data of fields other than the target field is obtained by fusing entity recognition training data of at least one field other than the target field; the first model acquisition module is configured to: when the expression of the same entity type is inconsistent across different fields, perform normalization processing on the same entity type.
Optionally, the number of entity recognition training data used for training the first entity recognition model is greater than the number of entity recognition training data used for training the second entity recognition model.
The entity identification method and system according to the exemplary embodiments of the present disclosure have been described above with reference to fig. 1 to 2.
According to the entity recognition method and system provided by the exemplary embodiments of the present invention, the mapping layer of a first entity recognition model, trained in advance on entity recognition data of fields other than the target field, is reconstructed to obtain an initial second entity recognition model, and the initial second entity recognition model is trained with a small amount of training data of the target field; the trained second entity recognition model can then be applied to entity recognition services in the target field and achieve high accuracy. This model training process for the target field can quickly produce an entity recognition model of the expected accuracy from a small training corpus, significantly simplifying the training process and reducing cost.
The various elements of the entity identification system shown in fig. 2 may be configured as software, hardware, firmware, or any combination thereof that performs a particular function. For example, each unit may correspond to an application-specific integrated circuit, to pure software code, or to a module combining software and hardware. Furthermore, one or more functions implemented by the respective units may also be uniformly executed by components in a physical entity device (e.g., a processor, a client, a server, or the like).
Further, the entity identification method described with reference to fig. 1 may be implemented by a program (or instructions) recorded on a computer-readable storage medium. For example, according to an exemplary embodiment of the present disclosure, a computer-readable storage medium storing instructions may be provided, wherein the instructions, when executed by at least one computing device, cause the at least one computing device to perform an entity identification method according to the present disclosure.
The computer program in the computer-readable storage medium may be executed in an environment deployed on a computer device such as a client, a host, a proxy device, or a server. It should be noted that the computer program may also be used to perform additional steps beyond those described above, or to perform more specific processing within those steps; the content of these additional steps and further processing has already been mentioned in the description of the related method with reference to fig. 1, and is not repeated here to avoid repetition.
It should be noted that each unit in the entity recognition system according to the exemplary embodiments of the present disclosure may depend entirely on the execution of a computer program to realize its corresponding function; that is, each unit corresponds to a step in the functional architecture of the computer program, so that the entire system is invoked through a special software package (e.g., a lib library) to realize the corresponding functions.
Alternatively, the various elements shown in FIG. 2 may be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the corresponding operations may be stored in a computer-readable medium such as a storage medium so that a processor may perform the corresponding operations by reading and executing the corresponding program code or code segments.
For example, exemplary embodiments of the present disclosure may also be implemented as a computing device including a storage component having stored therein a set of computer-executable instructions that, when executed by a processor, perform an entity identification method according to exemplary embodiments of the present disclosure.
In particular, computing devices may be deployed in servers or clients, as well as on node devices in a distributed network environment. Further, the computing device may be a PC computer, tablet device, personal digital assistant, smart phone, web application, or other device capable of executing the set of instructions.
The computing device need not be a single computing device; it can be any device or collection of circuits capable of executing the above instructions (or instruction sets), individually or in combination. The computing device may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
In a computing device, a processor may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
Some of the operations described in the entity identification method according to the exemplary embodiment of the present disclosure may be implemented by software, some of the operations may be implemented by hardware, and further, the operations may be implemented by a combination of software and hardware.
The processor may execute instructions or code stored in one of the memory components, which may also store data. The instructions and data may also be transmitted or received over a network via a network interface device, which may employ any known transmission protocol.
The storage component may be integrated with the processor, e.g., RAM or flash memory disposed within an integrated circuit microprocessor or the like. Alternatively, the storage component may comprise a stand-alone device, such as an external disk drive, a storage array, or any other storage device usable by a database system. The storage component and the processor may be operatively coupled, or may communicate with each other through, e.g., an I/O port or a network connection, so that the processor can read files stored in the storage component.
In addition, the computing device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the computing device may be connected to each other via a bus and/or a network.
The entity identification method according to the exemplary embodiments of the present disclosure may be described in terms of various interconnected or coupled functional blocks or functional diagrams. However, these functional blocks or functional diagrams may equally be integrated into a single logic device or divided along different boundaries.
Accordingly, the entity identification method described with reference to fig. 1 may be implemented by a system comprising at least one computing device and at least one storage device storing instructions.
According to an exemplary embodiment of the present disclosure, the at least one computing device is a computing device for performing the entity identification method according to an exemplary embodiment of the present disclosure, and the storage device stores a set of computer-executable instructions that, when executed by the at least one computing device, cause the at least one computing device to perform the entity identification method described with reference to fig. 1.
While various exemplary embodiments of the present disclosure have been described above, it should be understood that the above description is exemplary only, and not exhaustive, and that the present disclosure is not limited to the disclosed exemplary embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. Therefore, the protection scope of the present disclosure should be subject to the scope of the claims.

Claims (10)

1. An entity identification method, comprising:
acquiring a first entity recognition model, wherein the first entity recognition model is trained in advance based on entity recognition training data of a field other than a target field, and comprises a semantic understanding layer, a mapping layer and a sequence labeling layer, the mapping layer comprising at least one sub-mapping layer;
reconstructing a mapping layer of the first entity identification model to obtain a second entity identification model;
training the second entity recognition model based on the entity recognition training data of the target domain;
and performing entity recognition on the text in the target field by using the trained second entity recognition model, and outputting an entity recognition result.
2. The method of claim 1, wherein,
the step of reconstructing the mapping layer of the first entity recognition model comprises: adjusting the weight and the structure of each sub-mapping layer in the mapping layer.
3. The method of claim 2, wherein,
the step of adjusting the weight and structure of each sub-mapping layer in the mapping layer comprises: initializing the weight of each sub-mapping layer in the mapping layer.
4. The method of claim 2, wherein,
the step of adjusting the weight and structure of each sub-mapping layer in the mapping layer comprises: adjusting the number of hidden units of each sub-mapping layer in the mapping layer.
5. The method of claim 2, wherein,
the step of reconstructing the mapping layer of the first entity recognition model further comprises: adding, in the mapping layer, at least one sub-mapping layer on the basis of the original sub-mapping layers.
6. The method of claim 1, wherein,
the entity recognition training data of fields other than the target field is obtained by fusing entity recognition training data of at least one field other than the target field;
the method further comprises: when the expression of the same entity type is inconsistent across different fields, performing normalization processing on the same entity type.
7. The method of claim 1, wherein a number of entity recognition training data used to train the first entity recognition model is greater than a number of entity recognition training data used to train the second entity recognition model.
8. An entity identification system comprising:
a first model acquisition module configured to: acquire a first entity recognition model, wherein the first entity recognition model is trained in advance based on entity recognition training data of a field other than a target field, and comprises a semantic understanding layer, a mapping layer and a sequence labeling layer, the mapping layer comprising at least one sub-mapping layer;
a second model acquisition module configured to: reconstruct the mapping layer of the first entity recognition model to obtain a second entity recognition model;
a model training module configured to: train the second entity recognition model based on the entity recognition training data of the target field; and
an entity recognition module configured to: perform entity recognition on text in the target field by using the trained second entity recognition model, and output an entity recognition result.
9. A computer-readable storage medium storing instructions that, when executed by at least one computing device, cause the at least one computing device to perform the entity identification method of any of claims 1 to 7.
10. A system comprising at least one computing device and a storage device storing at least one instruction, wherein the instruction, when executed by the at least one computing device, causes the at least one computing device to perform the entity identification method of any of claims 1 to 7.
CN202011056229.9A 2020-09-29 2020-09-29 Entity identification method and system Pending CN112347782A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011056229.9A CN112347782A (en) 2020-09-29 2020-09-29 Entity identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011056229.9A CN112347782A (en) 2020-09-29 2020-09-29 Entity identification method and system

Publications (1)

Publication Number Publication Date
CN112347782A true CN112347782A (en) 2021-02-09

Family

ID=74361425

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011056229.9A Pending CN112347782A (en) 2020-09-29 2020-09-29 Entity identification method and system

Country Status (1)

Country Link
CN (1) CN112347782A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150199333A1 (en) * 2014-01-15 2015-07-16 Abbyy Infopoisk Llc Automatic extraction of named entities from texts
CN109918644A (en) * 2019-01-26 2019-06-21 华南理工大学 A kind of Chinese medicine health consultation text name entity recognition method based on transfer learning
CN111666766A (en) * 2019-03-05 2020-09-15 阿里巴巴集团控股有限公司 Data processing method, device and equipment
CN110807328A (en) * 2019-10-25 2020-02-18 华南师范大学 Named entity identification method and system oriented to multi-strategy fusion of legal documents

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
小M同学: "预训练模型&迁移学习" ("Pre-trained Models & Transfer Learning"), pages 1 - 4, Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/139712757> *

Similar Documents

Publication Publication Date Title
US20220050967A1 (en) Extracting definitions from documents utilizing definition-labeling-dependent machine learning background
TW201917602A (en) Semantic encoding method and device for text capable of enabling mining of semantic relationships of text and of association between text and topics, and realizing fixed semantic encoding of text data having an indefinite length
US11900064B2 (en) Neural network-based semantic information retrieval
US11250839B2 (en) Natural language processing models for conversational computing
US11475227B2 (en) Intelligent routing services and systems
US11397892B2 (en) Method of and system for training machine learning algorithm to generate text summary
US11455148B2 (en) Software programming assistant
CN113392209B (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN113987169A (en) Text abstract generation method, device and equipment based on semantic block and storage medium
US11681876B2 (en) Cascaded fact-based summarization
US11874798B2 (en) Smart dataset collection system
CN112948676A (en) Training method of text feature extraction model, and text recommendation method and device
CN114896983A (en) Model training method, text processing device and computer equipment
US20230124572A1 (en) Translation of text depicted in images
CN114416995A (en) Information recommendation method, device and equipment
Dai et al. The state of the art in implementing machine learning for mobile apps: A survey
US11893990B2 (en) Audio file annotation
CN111368554B (en) Statement processing method, device, computer equipment and storage medium
CN115269768A (en) Element text processing method and device, electronic equipment and storage medium
CN113569017A (en) Model processing method and device, electronic equipment and storage medium
CN112307738A (en) Method and device for processing text
CN116502645A (en) Identifying regulatory data corresponding to an executable rule
US11822893B2 (en) Machine learning models for detecting topic divergent digital videos
US20230281400A1 (en) Systems and Methods for Pretraining Image Processing Models
US20230062307A1 (en) Smart document management

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination