CN113919357A - Method, device and equipment for training address entity recognition model and storage medium

Info

Publication number
CN113919357A
CN113919357A
Authority
CN
China
Prior art keywords
data, training, label, address entity, entity recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111277134.4A
Other languages
Chinese (zh)
Inventor
魏万顺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd
Priority to CN202111277134.4A
Publication of CN113919357A
Status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Remote Sensing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to artificial intelligence, and in particular to the technical field of natural language processing, and provides a training method, apparatus, device and storage medium for an address entity recognition model. The method comprises the following steps: acquiring a plurality of pieces of artificial label data; generating a plurality of groups of training data from the plurality of pieces of artificial label data; training a first address entity recognition model based on each group of training data to obtain a second address entity recognition model corresponding to each group of training data; inputting a plurality of pieces of label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data; and performing model training on the plurality of second address entity recognition models according to the artificial label data and the machine label data to obtain a plurality of trained second address entity recognition models, thereby improving the generalization capability of the address entity recognition model. The application also relates to blockchain technology; the artificial label data and the machine label data may be stored in blockchain nodes.

Description

Method, device and equipment for training address entity recognition model and storage medium
Technical Field
The present application relates to the field of natural language processing technology in the field of artificial intelligence, and in particular, to a method, an apparatus, a device, and a storage medium for training an address entity recognition model.
Background
Address entity recognition is a Natural Language Processing (NLP) technology for address entity information extraction: address entity information can be extracted from text by a corresponding address entity recognition model. At present, the sample data for training an address entity recognition model is generated by manually labeling label-free data, which is costly, and the sample data remains scarce, so the generalization capability of the resulting address entity recognition model is weak.
Therefore, how to improve the generalization capability of the address entity recognition model becomes an urgent problem to be solved.
Disclosure of Invention
The application provides a training method, apparatus, device and storage medium for an address entity recognition model, aiming to improve the generalization capability of the address entity recognition model.
In order to achieve the above object, the present application provides a method for training an address entity recognition model, the method comprising:
acquiring a plurality of pieces of artificial label data;
generating a plurality of groups of training data from the plurality of pieces of artificial label data;
training a first address entity recognition model based on each group of training data to obtain a second address entity recognition model corresponding to each group of training data;
inputting a plurality of pieces of label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data;
and performing model training on the plurality of second address entity recognition models according to the artificial label data and the machine label data to obtain a plurality of trained second address entity recognition models.
In addition, to achieve the above object, the present application further provides a training apparatus for an address entity recognition model, comprising:
an acquisition module, used for acquiring a plurality of pieces of artificial label data;
a generating module, used for generating a plurality of groups of training data from the plurality of pieces of artificial label data;
a first training module, used for training a first address entity recognition model based on each group of training data to obtain a second address entity recognition model corresponding to each group of training data;
a label prediction module, used for inputting a plurality of pieces of label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data;
and a second training module, used for performing model training on the plurality of second address entity recognition models according to the artificial label data and the machine label data to obtain a plurality of trained second address entity recognition models.
In addition, to achieve the above object, the present application also provides a computer device comprising a memory and a processor;
the memory for storing a computer program;
the processor is configured to execute the computer program and implement the above-mentioned training method of the address entity recognition model when executing the computer program.
In addition, to achieve the above object, the present application further provides a computer readable storage medium storing a computer program, which when executed by a processor, implements the steps of the above training method for an address entity recognition model.
The application discloses a training method, apparatus, device and storage medium for an address entity recognition model. A plurality of pieces of artificial label data are obtained, and a plurality of groups of training data are generated from them; each group of training data is input into a first address entity recognition model for model training, obtaining a corresponding second address entity recognition model; a plurality of pieces of label-free data are then input into the plurality of second address entity recognition models for label prediction respectively, obtaining machine label data corresponding to the label-free data; and model training is performed on the plurality of second address entity recognition models according to the artificial label data and the machine label data, obtaining a plurality of trained second address entity recognition models. That is, machine label data is used as training samples alongside the artificial label data, which increases the number of training samples and improves the generalization capability of the trained second address entity recognition models. Moreover, because the machine label data is obtained by jointly predicting labels for the label-free data with the plurality of second address entity recognition models, its accuracy is assured, which ensures the reliability of the training samples and further improves the generalization capability of the trained models.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present application; other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a schematic flow chart diagram illustrating steps of a method for training an address entity recognition model according to an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram illustrating steps for generating a plurality of sets of training data from the plurality of artificial label data according to an embodiment of the present application;
fig. 3 is a schematic diagram of obtaining a plurality of second address entity recognition models by training on a plurality of groups of training data according to an embodiment of the present application;
fig. 4 is a schematic flowchart of the step of inputting a plurality of pieces of label-free data into a plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data, according to an embodiment of the present application;
FIG. 5 is a schematic diagram of obtaining machine tag data through multiple second address entity recognition models according to an embodiment of the present application;
FIG. 6 is a flowchart illustrating steps of another method for training an address entity recognition model according to an embodiment of the present application;
FIG. 7 is a schematic block diagram of a training apparatus for an address entity recognition model provided in an embodiment of the present application;
fig. 8 is a schematic block diagram of a structure of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Embodiments of the present application provide a method, an apparatus, a device and a storage medium for training an address entity recognition model, which are used to improve generalization capability of the address entity recognition model.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a method for training an address entity recognition model according to an embodiment of the present application. The method can be applied to a computer device; the application does not limit its application scenario. The training method of the address entity recognition model is described in detail below, taking its application to a computer device as an example.
As shown in fig. 1, the method for training the address entity recognition model specifically includes steps S101 to S105.
S101, acquiring a plurality of pieces of artificial label data.
The artificial label data is text data obtained by manually labeling label-free text data. The artificial label data contains address entity information, and the address entity information can be extracted by an address entity recognition model.
In some embodiments, after the artificial label data is generated by manual labeling, it is stored in a corresponding database, and the plurality of pieces of artificial label data can be obtained by querying the database.
Illustratively, a plurality of pieces of artificial label data meeting a preset query condition are retrieved from the database. The preset query condition includes, but is not limited to, information such as a text name or a text generation time period; that is, the artificial label data corresponding to the text name, the text generation time period or other such information is retrieved from the database.
It should be noted that the amount of artificial label data is small, and an address entity recognition model trained only on this small amount of artificial label data has weak generalization capability.
And S102, generating a plurality of groups of training data according to the plurality of artificial label data.
Each group of training data comprises a training set and a verification set, and the training set contains more artificial label data than the verification set.
One part of the artificial label data is selected from the plurality of pieces of artificial label data as the training set and another part as the verification set; the training set and the verification set form one group of training data. Repeating this operation a plurality of times generates a corresponding plurality of groups of training data.
In some embodiments, as shown in fig. 2, step S102 may include sub-step S1021 and sub-step S1022.
And S1021, dividing the plurality of pieces of artificial label data into N equal parts to generate N data sets.
For example, the multiple pieces of artificial tag data are divided into 5 data sets: data set 1, data set 2, data set 3, data set 4, and data set 5.
It should be noted that the specific value of N can be flexibly set according to actual situations, and is not limited specifically herein.
And S1022, sequentially taking each of the N data sets as the verification set and the remaining data sets as the training set, generating one group of training data from each such verification set and training set, thereby obtaining a plurality of groups of training data with different verification sets and training sets.
For example, taking the division into 5 data sets listed above, with data set 1 as the verification set and data sets 2, 3, 4 and 5 as the training set, a group of training data is generated, denoted training data A = {training set: data sets 2, 3, 4, 5; verification set: data set 1}.
With data set 2 as the verification set and data sets 1, 3, 4 and 5 as the training set, training data B = {training set: data sets 1, 3, 4, 5; verification set: data set 2} is generated.
In the same way, training data C = {training set: data sets 1, 2, 4, 5; verification set: data set 3}, training data D = {training set: data sets 1, 2, 3, 5; verification set: data set 4} and training data E = {training set: data sets 1, 2, 3, 4; verification set: data set 5} are generated.
That is, 5 groups of training data are obtained: training data A, B, C, D and E.
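The N-fold construction of steps S1021 and S1022 can be sketched in Python as follows. This is a minimal illustrative sketch rather than code from the patent; the function name, the shuffle step and the dictionary layout are assumptions made for readability.

```python
import random

def make_fold_groups(labeled_data, n_folds=5, seed=42):
    """Divide the artificial label data into N equal parts and build N
    (training set, verification set) pairs, one per held-out fold."""
    data = list(labeled_data)
    random.Random(seed).shuffle(data)      # assumed: shuffle before splitting
    folds = [data[i::n_folds] for i in range(n_folds)]  # N near-equal data sets
    groups = []
    for i in range(n_folds):
        verification_set = folds[i]        # data set i serves as the verification set
        training_set = [x for j, fold in enumerate(folds) if j != i for x in fold]
        groups.append({"train": training_set, "valid": verification_set})
    return groups  # groups[0] corresponds to training data A, groups[1] to B, and so on
```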
In still other embodiments, generating the plurality of sets of training data from the plurality of artificial label data may include:
and randomly selecting at least one artificial label data from the plurality of artificial label data according to a preset proportion value as a verification set, using the rest artificial label data as a training set, generating a group of training data by the verification set and the training set, and repeating the operation of generating a group of training data to obtain a plurality of groups of training data.
That is, according to a preset proportion value, part of the artificial label data is randomly selected from the plurality of artificial label data to serve as a verification set, the rest of the artificial label data serves as a training set, the training set and the data amount corresponding to the verification set are in a certain proportion, and a set of training data is formed by the determined verification set and the training set.
For example, if the preset ratio value is set to 1/5 in advance, then 1/5 of manual label data is randomly selected from the manual label data as a verification set, and the rest of the manual label data of 4/5 is used as a training set, so that the ratio of the data amount corresponding to the training set and the verification set is 4: and 1, selecting the determined verification set and the training set to form a group of training data.
Repeating the operation for multiple times according to the mode of generating a group of training data to obtain multiple groups of corresponding training data.
It should be noted that the preset ratio value can be flexibly set according to actual situations, and is not particularly limited herein.
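The random-proportion variant admits an equally small sketch; the 1/5 default and the repeated-seed idiom for generating several groups are illustrative assumptions, not requirements of the method.

```python
import random

def random_split(labeled_data, valid_ratio=0.2, seed=None):
    """Randomly hold out a preset proportion (e.g. 1/5) of the artificial
    label data as the verification set; the remainder is the training set."""
    data = list(labeled_data)
    random.Random(seed).shuffle(data)
    cut = max(1, int(len(data) * valid_ratio))  # keep at least one verification item
    return {"train": data[cut:], "valid": data[:cut]}

labeled_data = [...]  # placeholder for the acquired artificial label data
# Repeating the operation with different seeds yields multiple groups of training data.
groups = [random_split(labeled_data, valid_ratio=0.2, seed=s) for s in range(5)]
```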
S103, training the first address entity recognition model based on each group of training data to obtain a second address entity recognition model corresponding to each group of training data.
It is to be understood that the first address entity recognition model may be an initially built address entity recognition model.
For example, the initial address entity recognition model may use a pretrained language model such as RoFormer-char-base, RoBERTa or BERT upstream, a token-by-token fully connected layer downstream, and a cross-entropy loss function as the model's loss.
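As one concrete reading of that architecture, the sketch below pairs a pretrained encoder with a token-by-token linear classification layer and a cross-entropy loss. It assumes the HuggingFace transformers library; the checkpoint name bert-base-chinese and the label count of 9 (e.g. BIO tags over a few address entity types) are stand-ins, not choices stated in the patent.

```python
import torch
import torch.nn as nn
from transformers import AutoModel  # assumes the HuggingFace transformers library

class AddressEntityRecognizer(nn.Module):
    """Pretrained encoder upstream, a token-by-token fully connected layer
    downstream, cross entropy as the training loss."""
    def __init__(self, encoder_name="bert-base-chinese", num_labels=9):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.classifier = nn.Linear(self.encoder.config.hidden_size, num_labels)
        self.loss_fn = nn.CrossEntropyLoss(ignore_index=-100)  # -100 masks padding tokens

    def forward(self, input_ids, attention_mask, labels=None):
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        logits = self.classifier(hidden)    # one label distribution per token
        if labels is None:
            return logits
        loss = self.loss_fn(logits.view(-1, logits.size(-1)), labels.view(-1))
        return loss, logits
```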
For example, if N groups of training data are obtained through the above steps, model training is performed on the first address entity recognition model based on each group of training data, obtaining a second address entity recognition model trained on that group. N second address entity recognition models are thus obtained.
For example, as shown in fig. 3, the training data A obtained above is input into the first address entity recognition model for model training, obtaining a corresponding address entity recognition Model1.
The training data B is input into the first address entity recognition model for model training, obtaining a corresponding address entity recognition Model2.
The training data C is input into the first address entity recognition model for model training, obtaining a corresponding address entity recognition Model3.
The training data D is input into the first address entity recognition model for model training, obtaining a corresponding address entity recognition Model4.
The training data E is input into the first address entity recognition model for model training, obtaining a corresponding address entity recognition Model5.
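Combining the earlier sketches, training one model per group of training data could look like this; train_one_model is a hypothetical helper standing in for an ordinary fine-tuning loop over the group's training set with validation on its verification set.

```python
models = []
for group in make_fold_groups(labeled_data, n_folds=5):
    model = AddressEntityRecognizer()
    train_one_model(model, group["train"], group["valid"])  # hypothetical fine-tuning loop
    models.append(model)  # models[0] is Model1, ..., models[4] is Model5
```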
S104, inputting a plurality of pieces of label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data.
That is, each piece of label-free data is input into the plurality of second address entity recognition models for label prediction, and the prediction results of the models are combined to obtain the machine label data corresponding to that piece of label-free data. Because the machine label data is obtained by jointly predicting labels for the label-free data with a plurality of second address entity recognition models, its accuracy is assured; using the machine label data as training samples for model training of the plurality of second address entity recognition models therefore improves the generalization capability of the trained models.
In some embodiments, as shown in fig. 4, step S104 may include sub-step S1041 and sub-step S1042.
S1041, inputting first label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain a corresponding plurality of label prediction results, wherein the first label-free data is any one piece of the plurality of label-free data.
Taking any piece of first label-free data as an example, the first label-free data is input into the plurality of second address entity recognition models for label prediction, obtaining a plurality of label prediction results, one per model. Each label prediction result includes, but is not limited to, a label and the prediction probability value corresponding to that label.
S1042, determining machine label data corresponding to the first label-free data according to the plurality of label prediction results.
Based on the plurality of label prediction results of the plurality of second address entity recognition models, the results are combined to determine the predicted label corresponding to the first label-free data, and the machine label data corresponding to the first label-free data is obtained from the predicted label.
For example, as shown in fig. 5, if the plurality of second address entity recognition models comprises address entity recognition Model1, Model2, Model3, Model4 and Model5, any piece of first label-free data is input into Model1, Model2, Model3, Model4 and Model5 respectively to obtain the corresponding label prediction results, and these results are combined to determine the machine label data corresponding to the first label-free data.
In some embodiments, determining the machine label data corresponding to the first label-free data according to the plurality of label prediction results may include:
calculating the average of the plurality of prediction probability values, and determining the machine label data corresponding to the first label-free data based on the average probability value.
For example, again taking Model1 through Model5, the first label-free data is input into address entity recognition Model1, Model2, Model3, Model4 and Model5, obtaining prediction probability values a, b, c, d and e for a label. The average of a, b, c, d and e is calculated, the predicted label corresponding to the first label-free data is determined from the average probability value, and the machine label data corresponding to the first label-free data is obtained from the predicted label.
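A soft-voting sketch of steps S1041 and S1042: each model's per-token probability distributions are averaged, and the highest-probability label per token becomes the machine label. The function name and the returned confidence value are illustrative additions; the models are assumed to be in eval mode.

```python
import torch

@torch.no_grad()
def predict_machine_labels(models, input_ids, attention_mask):
    """Average the per-token probability distributions of the trained second
    models and take the highest-probability label per token."""
    probs = [torch.softmax(m(input_ids, attention_mask), dim=-1) for m in models]
    avg = torch.stack(probs).mean(dim=0)          # average probability value per label
    confidence, machine_labels = avg.max(dim=-1)  # per-token label and its probability
    return machine_labels, confidence
```

Keeping the confidence alongside the label also makes it easy to discard low-confidence machine label data, although the patent does not require such filtering.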
And S105, performing model training on the plurality of second address entity recognition models according to the artificial label data and the machine label data to obtain a plurality of trained second address entity recognition models.
In some embodiments, corresponding training samples are formed from the artificial label data and the machine label data, and the plurality of second address entity recognition models are model-trained on these samples. A training sample comprises a training set and a verification set; the training set comprises artificial label data and/or machine label data, while the verification set comprises artificial label data only.
That is, a training set is determined from the small amount of artificial label data and the large amount of machine label data obtained by the plurality of second address entity recognition models, a verification set is determined from artificial label data, training samples are formed from the training set and the verification set, and the training samples are input into the plurality of second address entity recognition models for model training, obtaining a plurality of trained second address entity recognition models. Because the number of training samples is large, the generalization capability of the trained models is strong.
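A sketch of this sample construction, under the stated constraint that machine label data may appear only in the training set; the 1/5 hold-out ratio is an assumption.

```python
import random

def build_retraining_sample(artificial, machine, valid_ratio=0.2, seed=0):
    """Training set: artificial and/or machine label data.
    Verification set: artificial label data only."""
    art = list(artificial)
    random.Random(seed).shuffle(art)
    cut = max(1, int(len(art) * valid_ratio))
    return {"train": art[cut:] + list(machine),  # mixed training samples
            "valid": art[:cut]}                  # artificial label data only
```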
Subsequently, address entity information is extracted from text by the trained plurality of second address entity recognition models, which improves the accuracy of the address entity information obtained for the text.
In some embodiments, as shown in fig. 6, step S103 may be followed by step S106, and step S104 may comprise step S1043.
And S106, combining the plurality of second address entity recognition models to generate a corresponding machine learning integration model.
And S1043, inputting the label-free data into the machine learning integration model for label prediction, and obtaining machine label data corresponding to the label-free data.
Illustratively, the plurality of second address entity recognition models are combined by bagging to generate the corresponding machine learning integration model, and label prediction is performed on the plurality of pieces of label-free data by the machine learning integration model to obtain the machine label data corresponding to the label-free data.
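One way to realize this bagging-style combination is a thin wrapper module that averages the member outputs; the class name is illustrative, and whether logits or probabilities are averaged is a design choice the patent leaves open.

```python
import torch
import torch.nn as nn

class MachineLearningIntegrationModel(nn.Module):
    """Combine the N second address entity recognition models into a single
    integration model whose prediction averages the member outputs."""
    def __init__(self, members):
        super().__init__()
        self.members = nn.ModuleList(members)

    def forward(self, input_ids, attention_mask):
        outputs = [m(input_ids, attention_mask) for m in self.members]
        return torch.stack(outputs).mean(dim=0)  # bagging-style aggregation
```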
For example, the performing model training on a plurality of second address entity recognition models according to the artificial tag data and the machine tag data to obtain a plurality of trained second address entity recognition models may include:
and inputting the artificial label data and the machine label data into the machine learning integration model for model training to obtain the trained machine learning integration model, wherein a training set corresponding to the machine learning integration model training is generated by the artificial label data and/or the machine label data, and a verification set corresponding to the machine learning integration model training is generated by the artificial label data.
That is, the training set for training the machine learning integration model is determined from the small amount of artificial label data and the large amount of machine label data, the verification set is determined from artificial label data, and the machine learning integration model is model-trained on the generated training set and verification set, obtaining a trained machine learning integration model.
Because the amount of training data for the machine learning integration model is large, the trained machine learning integration model has strong generalization capability. Subsequently, address entity information is extracted from text by the trained machine learning integration model, which improves the accuracy of the address entity information obtained for the text.
In this embodiment, a plurality of pieces of artificial label data are obtained, and a plurality of groups of training data are generated from them; each group of training data is input into a first address entity recognition model for model training, obtaining a corresponding second address entity recognition model; a plurality of pieces of label-free data are input into the plurality of second address entity recognition models for label prediction respectively, obtaining machine label data corresponding to the label-free data; and model training is performed on the plurality of second address entity recognition models according to the artificial label data and the machine label data, obtaining a plurality of trained second address entity recognition models. That is, machine label data is used as training samples alongside the artificial label data, which increases the number of training samples and improves the generalization capability of the trained second address entity recognition models. Because the machine label data is obtained by jointly predicting labels for the label-free data with the plurality of second address entity recognition models, its accuracy is assured, which ensures the reliability of the training samples and further improves the generalization capability of the trained models. Subsequently, address entity information is extracted from text by the trained plurality of second address entity recognition models, which improves the accuracy of the address entity information obtained for the text.
Referring to fig. 7, fig. 7 is a schematic block diagram of an address entity recognition model training apparatus according to an embodiment of the present application, which may be configured in a computer device for executing the aforementioned address entity recognition model training method.
As shown in fig. 7, the apparatus 1000 for training an address entity recognition model includes: an acquisition module 1001, a generation module 1002, a first training module 1003, a label prediction module 1004, and a second training module 1005.
the acquisition module 1001 is configured to acquire a plurality of pieces of artificial label data;
the generation module 1002 is configured to generate a plurality of groups of training data from the plurality of pieces of artificial label data;
the first training module 1003 is configured to train a first address entity recognition model based on each group of training data, to obtain a second address entity recognition model corresponding to each group of training data;
the label prediction module 1004 is configured to input a plurality of pieces of label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data;
the second training module 1005 is configured to perform model training on the plurality of second address entity recognition models according to the artificial label data and the machine label data, to obtain a plurality of trained second address entity recognition models.
In one embodiment, each group of training data comprises a training set and a verification set, and the generation module 1002 is further configured to:
divide the plurality of pieces of artificial label data into N equal parts to generate N data sets; and sequentially take each of the N data sets as the verification set and the remaining data sets as the training set, generating one group of training data from each such verification set and training set, thereby obtaining a plurality of groups of training data with different verification sets and training sets.
In one embodiment, each group of training data comprises a training set and a verification set, and the generation module 1002 is further configured to:
randomly select, according to a preset proportion value, at least one piece of artificial label data from the plurality of pieces of artificial label data as a verification set and the remaining artificial label data as a training set, the verification set and the training set forming one group of training data; and repeat the operation of generating one group of training data to obtain a plurality of groups of training data.
In one embodiment, the label prediction module 1004 is further configured to:
input first label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain a corresponding plurality of label prediction results, wherein the first label-free data is any one piece of the plurality of label-free data; and determine the machine label data corresponding to the first label-free data according to the plurality of label prediction results.
In one embodiment, each label prediction result includes a prediction probability value corresponding to a label, and the label prediction module 1004 is further configured to:
calculate the average of the plurality of prediction probability values, and determine the machine label data corresponding to the first label-free data based on the average probability value.
In one embodiment, the apparatus 1000 for training the address entity recognition model further includes:
the model processing module is used for combining a plurality of second address entity recognition models to generate a corresponding machine learning integration model;
the label prediction module 1004 is further configured to:
input the label-free data into the machine learning integration model for label prediction, to obtain the machine label data corresponding to the label-free data.
In one embodiment, the second training module 1005 is further configured to:
and inputting the artificial label data and the machine label data into the machine learning integration model for model training to obtain the trained machine learning integration model, wherein a training set corresponding to the machine learning integration model training is generated by the artificial label data and/or the machine label data, and a verification set corresponding to the machine learning integration model training is generated by the artificial label data.
Each module in the device for training the address entity recognition model corresponds to each step in the embodiment of the method for training the address entity recognition model, and the functions and the implementation process are not described in detail herein.
The methods, apparatus, and devices of the present application may be deployed in numerous general-purpose or special-purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
For example, the method and apparatus described above may be implemented in the form of a computer program that can be run on a computer device as shown in fig. 8.
Referring to fig. 8, fig. 8 is a schematic block diagram of a computer device according to an embodiment of the present disclosure.
Referring to fig. 8, the computer device includes a processor and a memory connected by a system bus, wherein the memory may include a nonvolatile storage medium and an internal memory.
The processor is used for providing calculation and control capability and supporting the operation of the whole computer equipment.
The internal memory provides an environment for the execution of a computer program in the non-volatile storage medium; when executed by the processor, the computer program causes the processor to perform the training method of the address entity recognition model described above.
It should be understood that the processor may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor.
In one embodiment, the processor is configured to execute the computer program stored in the memory to implement the following steps:
acquiring a plurality of pieces of artificial label data;
generating a plurality of groups of training data from the plurality of pieces of artificial label data;
training a first address entity recognition model based on each group of training data to obtain a second address entity recognition model corresponding to each group of training data;
inputting a plurality of pieces of label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data;
and performing model training on the plurality of second address entity recognition models according to the artificial label data and the machine label data to obtain a plurality of trained second address entity recognition models.
In one embodiment, each group of training data comprises a training set and a verification set, and when generating the plurality of groups of training data from the plurality of pieces of artificial label data, the processor is configured to perform:
dividing the plurality of pieces of artificial label data into N equal parts to generate N data sets; and sequentially taking each of the N data sets as the verification set and the remaining data sets as the training set, generating one group of training data from each such verification set and training set, thereby obtaining a plurality of groups of training data with different verification sets and training sets.
In one embodiment, each group of training data comprises a training set and a verification set, and when generating the plurality of groups of training data from the plurality of pieces of artificial label data, the processor is configured to perform:
randomly selecting, according to a preset proportion value, at least one piece of artificial label data from the plurality of pieces of artificial label data as a verification set and the remaining artificial label data as a training set, the verification set and the training set forming one group of training data; and repeating the operation of generating one group of training data to obtain a plurality of groups of training data.
In one embodiment, when inputting a plurality of pieces of label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data, the processor is configured to implement:
inputting first label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain a corresponding plurality of label prediction results, wherein the first label-free data is any one piece of the plurality of label-free data; and determining the machine label data corresponding to the first label-free data according to the plurality of label prediction results.
In one embodiment, each label prediction result includes a prediction probability value corresponding to a label, and when determining the machine label data corresponding to the first label-free data according to the plurality of label prediction results, the processor is configured to implement:
calculating the average of the plurality of prediction probability values, and determining the machine label data corresponding to the first label-free data based on the average probability value.
In one embodiment, after training the first address entity recognition model based on each group of training data and obtaining the second address entity recognition model corresponding to each group of training data, the processor is configured to implement:
combining the plurality of second address entity recognition models to generate a corresponding machine learning integration model;
when inputting a plurality of pieces of label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data, the processor is configured to:
input the label-free data into the machine learning integration model for label prediction, to obtain the machine label data corresponding to the label-free data.
In one embodiment, when performing model training on the plurality of second address entity recognition models according to the artificial label data and the machine label data to obtain a plurality of trained second address entity recognition models, the processor is configured to implement:
and inputting the artificial label data and the machine label data into the machine learning integration model for model training to obtain the trained machine learning integration model, wherein a training set corresponding to the machine learning integration model training is generated by the artificial label data and/or the machine label data, and a verification set corresponding to the machine learning integration model training is generated by the artificial label data.
The embodiment of the application also provides a computer readable storage medium.
The computer readable storage medium of the present application has stored thereon a computer program which, when being executed by a processor, carries out the steps of the method of training an address entity recognition model as described above.
The computer-readable storage medium may be an internal storage unit of the computer device or of the training apparatus of the address entity recognition model described in the foregoing embodiments, for example a hard disk or memory of the apparatus or device. The computer-readable storage medium may also be an external storage device of the training apparatus or the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card, or the like.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
The blockchain referred to in this application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, each data block containing the information of a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention.

Claims (10)

1. A method for training an address entity recognition model, characterized in that the method comprises the following steps:
acquiring a plurality of pieces of artificial label data;
generating a plurality of groups of training data from the plurality of pieces of artificial label data;
training a first address entity recognition model based on each group of training data to obtain a second address entity recognition model corresponding to each group of training data;
inputting a plurality of pieces of label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data;
and performing model training on the plurality of second address entity recognition models according to the artificial label data and the machine label data to obtain a plurality of trained second address entity recognition models.
2. The method for training an address entity recognition model according to claim 1, wherein each group of training data comprises a training set and a verification set, and generating a plurality of groups of training data from the plurality of pieces of artificial label data comprises:
dividing the plurality of pieces of artificial label data into N equal parts to generate N data sets;
and sequentially taking each of the N data sets as the verification set and the remaining data sets as the training set, generating one group of training data from each such verification set and training set, thereby obtaining a plurality of groups of training data with different verification sets and training sets.
3. The method for training an address entity recognition model according to claim 1, wherein each group of training data comprises a training set and a verification set, and generating a plurality of groups of training data from the plurality of pieces of artificial label data comprises:
randomly selecting, according to a preset proportion value, at least one piece of artificial label data from the plurality of pieces of artificial label data as a verification set and the remaining artificial label data as a training set, the verification set and the training set forming one group of training data; and repeating the operation of generating one group of training data to obtain a plurality of groups of training data.
4. The method for training an address entity recognition model according to claim 1, wherein inputting a plurality of pieces of label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data, comprises:
inputting first label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain a corresponding plurality of label prediction results, wherein the first label-free data is any one piece of the plurality of label-free data;
and determining the machine label data corresponding to the first label-free data according to the plurality of label prediction results.
5. The method for training an address entity recognition model according to claim 4, wherein each label prediction result comprises a prediction probability value corresponding to a label, and determining the machine label data corresponding to the first label-free data according to the plurality of label prediction results comprises:
calculating the average of the plurality of prediction probability values, and determining the machine label data corresponding to the first label-free data based on the average probability value.
6. The method for training an address entity recognition model according to any one of claims 1 to 5, wherein after training the first address entity recognition model based on each group of training data and obtaining the second address entity recognition model corresponding to each group of training data, the method comprises:
combining a plurality of the second address entity recognition models to generate a corresponding machine learning integration model;
the inputting a plurality of label-free data into a plurality of second address entity recognition models for label prediction respectively to obtain machine label data corresponding to the label-free data includes:
and inputting the label-free data into the machine learning integration model for label prediction to obtain machine label data corresponding to the label-free data.
7. The method for training address entity recognition models according to claim 6, wherein the performing model training on a plurality of second address entity recognition models according to the artificial label data and the machine label data to obtain a plurality of trained second address entity recognition models comprises:
and inputting the artificial label data and the machine label data into the machine learning integration model for model training to obtain the trained machine learning integration model, wherein a training set corresponding to the machine learning integration model training is generated by the artificial label data and/or the machine label data, and a verification set corresponding to the machine learning integration model training is generated by the artificial label data.
8. An apparatus for training an address entity recognition model, the apparatus comprising:
the acquisition module is used for acquiring a plurality of pieces of artificial label data;
the generating module is used for generating a plurality of groups of training data according to the plurality of artificial label data;
the first training module is used for training a first address entity recognition model based on each group of training data to obtain a second address entity recognition model corresponding to each group of training data;
the label prediction module is used for inputting a plurality of pieces of label-free data into the plurality of second address entity recognition models for label prediction respectively, to obtain machine label data corresponding to the label-free data;
and the second training module is used for carrying out model training on the plurality of second address entity recognition models according to the artificial label data and the machine label data to obtain a plurality of trained second address entity recognition models.
9. A computer device, wherein the computer device comprises a memory and a processor;
the memory for storing a computer program;
the processor for executing the computer program and implementing the method of training an address entity recognition model according to any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when being executed by a processor, carries out the steps of the method for training an address entity recognition model according to any one of claims 1 to 7.
CN202111277134.4A 2021-10-29 2021-10-29 Method, device and equipment for training address entity recognition model and storage medium Pending CN113919357A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111277134.4A CN113919357A (en) 2021-10-29 2021-10-29 Method, device and equipment for training address entity recognition model and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111277134.4A CN113919357A (en) 2021-10-29 2021-10-29 Method, device and equipment for training address entity recognition model and storage medium

Publications (1)

Publication Number Publication Date
CN113919357A 2022-01-11

Family

ID=79243747

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111277134.4A Pending CN113919357A (en) 2021-10-29 2021-10-29 Method, device and equipment for training address entity recognition model and storage medium

Country Status (1)

Country Link
CN (1) CN113919357A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117456416A (en) * 2023-11-03 2024-01-26 北京饼干科技有限公司 Method and system for intelligently generating material labels


Similar Documents

Publication Publication Date Title
CN111859960B (en) Semantic matching method, device, computer equipment and medium based on knowledge distillation
CN111667067A (en) Recommendation method and device based on graph neural network and computer equipment
CN110019216B (en) Intellectual property data storage method, medium and computer device based on block chain
CN112732899A (en) Abstract statement extraction method, device, server and computer readable storage medium
CN109726664B (en) Intelligent dial recommendation method, system, equipment and storage medium
CN112686049A (en) Text auditing method, device, equipment and storage medium
CN112732741A (en) SQL statement generation method, device, server and computer readable storage medium
CN115859302A (en) Source code vulnerability detection method, device, equipment and storage medium
CN113592605A (en) Product recommendation method, device, equipment and storage medium based on similar products
CN113919357A (en) Method, device and equipment for training address entity recognition model and storage medium
CN111625567A (en) Data model matching method, device, computer system and readable storage medium
Martínez et al. Efficient model similarity estimation with robust hashing
CN113628043A (en) Complaint validity judgment method, device, equipment and medium based on data classification
CN116402630B (en) Financial risk prediction method and system based on characterization learning
CN117114909A (en) Method, device, equipment and storage medium for constructing accounting rule engine
CN117093619A (en) Rule engine processing method and device, electronic equipment and storage medium
CN111950623A (en) Data stability monitoring method and device, computer equipment and medium
CN116795978A (en) Complaint information processing method and device, electronic equipment and medium
CN115114073A (en) Alarm information processing method and device, storage medium and electronic equipment
CN113656466A (en) Policy data query method, device, equipment and storage medium
CN113780454A (en) Model training and calling method and device, computer equipment and storage medium
CN114492374A (en) Text processing method, device, equipment and storage medium
CN103761247B (en) A kind of processing method and processing device of error file
CN113821418A (en) Fault tracking analysis method and device, storage medium and electronic equipment
CN117574899A (en) Named entity recognition method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination