CN112487814A - Entity classification model training method, entity classification device and electronic equipment - Google Patents


Info

Publication number
CN112487814A
CN112487814A (application CN202011356458.2A)
Authority
CN
China
Prior art keywords: training, entity, model, sample, entity classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011356458.2A
Other languages
Chinese (zh)
Other versions
CN112487814B (en)
Inventor
杨虎
汪琦
冯知凡
柴春光
朱勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011356458.2A
Publication of CN112487814A
Application granted
Publication of CN112487814B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

The application discloses an entity classification model training method, an entity classification device, and electronic equipment, relating to artificial intelligence technical fields such as knowledge graphs, natural language processing, and deep learning. The specific implementation scheme is as follows: training a pre-training model based on general training samples to obtain a first entity classification model; training the pre-training model based on a first labeled sample of a first industry field to obtain a second entity classification model; extracting a second training sample of the first industry field from the general training samples according to the second entity classification model; and training the first entity classification model according to the second training sample to obtain a target classification model. In the training process, a large amount of data in the first industry field does not need to be labeled before training, so model training efficiency can be improved.

Description

Entity classification model training method, entity classification device and electronic equipment
Technical Field
The application relates to artificial intelligence fields within computer technology, such as knowledge graphs, natural language processing, and deep learning, and in particular to an entity classification model training method, an entity classification device, and electronic equipment.
Background
The video semantic tag technology uses computer technology to understand the content of a video and automatically labels tags of different dimensions (such as entities, topics, and points of interest) to express the video's core content. Video semantic tags can be applied in multiple scenarios such as video recommendation, search, and media asset management. Among the different types of video tags, the entity is one of the most important dimensions for understanding video content, and is also an important dependency of tags such as topics and points of interest.
However, the entity types of interest in videos differ across industry fields. Currently, after migrating from one industry field (such as military) to another (such as education), a large amount of data in the migrated-to field needs to be labeled, and the entity classification model is trained with that large amount of labeled data.
Disclosure of Invention
The application provides an entity classification model training method, an entity classification device and electronic equipment.
In a first aspect, an embodiment of the present application provides a method for training an entity classification model, where the method includes:
training the pre-training model based on the general training sample to obtain a first entity classification model;
training the pre-training model based on a first labeling sample of a first industry field to obtain a second entity classification model;
extracting a second training sample of the first industry field from the general training samples according to the second entity classification model;
and training the first entity classification model according to the second training sample to obtain a target classification model.
In the method of this embodiment, the pre-training model is first trained on the general training samples to obtain a first entity classification model, and on the first labeled data of the first industry field to obtain a second entity classification model; the second entity classification model is then used to extract second training samples of the first industry field from the general training samples, and the first entity classification model is retrained on them to obtain the target classification model. In the training process, a large amount of data in the first industry field does not need to be labeled before training, so model training efficiency can be improved.
In a second aspect, an embodiment of the present application provides an entity classification model training apparatus, including:
the first training module is used for training the pre-training model based on the general training sample to obtain a first entity classification model;
the second training module is used for training the pre-training model based on the first labeling sample in the first industry field to obtain a second entity classification model;
the first extraction module is used for extracting a second training sample of the first industry field from the general training samples according to the second entity classification model;
and the third training module is used for training the first entity classification model according to the second training sample to obtain a target classification model.
In a third aspect, an embodiment of the present application provides an entity classification method, including:
acquiring an object to be classified in a first industry field;
classifying the object to be classified based on a target classification model, and determining a first entity classification result in the object to be classified;
extracting a second entity classification result corresponding to the entity type from the entity identification result based on the preset entity type of the first industry field; wherein the entity identification result is an entity determined by performing entity recognition on the object to be classified based on a named entity recognition (NER) model;
and combining the first entity classification result and the second entity classification result to obtain a target entity classification result of the object to be classified.
In the entity classification method of this embodiment, an object to be classified may be classified by using a target classification model, a first entity classification result in the object to be classified is determined, a second entity classification result corresponding to an entity type is extracted from an entity identification result based on a preset entity type in a first industry field, and the first entity classification result and the second entity classification result are combined to obtain a target entity classification result of the object to be classified. The target entity classification result of the object to be classified is determined by combining the first entity classification result determined by the target classification model and the second entity classification result corresponding to the entity type extracted from the entity identification result, so that the entity classification accuracy can be improved.
In a fourth aspect, an embodiment of the present application provides a classification apparatus, including:
the first acquisition module is used for acquiring an object to be classified in a first industry field;
the first classification module is used for classifying the objects to be classified based on a target classification model and determining a first entity classification result in the objects to be classified;
the second extraction module is used for extracting a second entity classification result corresponding to the entity type from the entity identification result based on the preset entity type of the first industry field; wherein the entity identification result is an entity determined by performing entity recognition on the object to be classified based on a named entity recognition (NER) model;
and the first merging module is used for merging the first entity classification result and the second entity classification result to obtain a target entity classification result of the object to be classified.
In a fifth aspect, an embodiment of the present application further provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform an entity classification model training method or an entity classification method provided by embodiments of the present application.
In a sixth aspect, an embodiment of the present application further provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the entity classification model training method or the entity classification method provided by the embodiments of the present application.
In a seventh aspect, an embodiment of the present application further provides a computer program product, which includes a computer program, and the computer program is configured to enable the computer to execute the entity classification model training method or the entity classification method provided in the embodiments of the present application.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a schematic flow chart diagram illustrating a method for training an entity classification model according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart diagram illustrating an entity classification method according to an embodiment of the present disclosure;
FIG. 3 is an application scenario diagram of a video semantic tag technology;
FIG. 4 is a diagram of video semantic tag results obtained by the video semantic tag technology;
FIG. 5 is a schematic diagram of video semantic tag migration provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of entity classification model migration provided by an embodiment of the present application;
FIG. 7 is a block diagram of a classification model training apparatus according to an embodiment provided herein;
FIG. 8 is a block diagram of a sorting apparatus according to one embodiment provided herein;
FIG. 9 is a block diagram of an electronic device for implementing an entity classification model training method or an entity classification method according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding, and these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
As shown in fig. 1, according to an embodiment of the present application, there is provided an entity classification model training method, including:
step S101: and training the pre-training model based on the general training sample to obtain a first entity classification model.
The general training samples may be general corpus samples, which can be understood as samples common to various industry fields: they are universal and meet the requirements of industry fields in general. The pre-training model is a pre-trained entity classification model, which can be understood as a language model that has been trained in advance on a sample set (which may differ from the samples used for training in this embodiment). In this embodiment, the pre-training model is trained with the general training samples to obtain the first entity classification model.
Step S102: and training the pre-training model based on the first labeled sample in the first industry field to obtain a second entity classification model.
The first industry field can be understood as the industry field migrated to; the entity classification model needs to be applied in the first industry field to perform entity classification, so as to improve the accuracy of entity classification in that field. Therefore, in the model training process of this embodiment, the pre-training model can be trained with the first labeled sample of the first industry field to obtain the second entity classification model, so that the obtained second entity classification model is better targeted at that industry field, can better perform entity classification on objects to be classified in the first industry field, and can meet the requirements of a specific industry. As an example, the number of first labeled samples is within a preset magnitude range, for example, hundreds to thousands; that is, the first labeled sample is a small amount of labeled data in the first industry field.
Step S103: And extracting a second training sample of the first industry field from the general training samples according to the second entity classification model.
The second entity classification model is obtained by training according to the first labeled sample of the first industry field, and the second entity classification model can be used for extracting the second training sample of the first industry field from the universal training samples, so that the relevance between the obtained second training sample and the first industry field is enhanced, and the accuracy of the second training sample is improved.
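The extraction step above can be sketched as a confidence filter: the industry-tuned model scores each general-corpus sample, and only high-scoring samples are kept as second training samples. The patent does not specify the scoring mechanism, so the classifier below is a toy stand-in; all names and the threshold are illustrative.

```python
class IndustryClassifier:
    """Toy stand-in for the trained second entity classification model."""

    def __init__(self, industry_terms):
        self.industry_terms = set(industry_terms)

    def predict_proba(self, text):
        # Toy confidence score: fraction of words that are industry terms.
        words = text.split()
        if not words:
            return 0.0
        return sum(w in self.industry_terms for w in words) / len(words)


def extract_industry_samples(general_samples, model, threshold=0.5):
    """Keep only general-corpus samples the industry model scores confidently."""
    return [s for s in general_samples if model.predict_proba(s) >= threshold]
```

In a real system the stand-in would be replaced by the fine-tuned classification model, and the threshold tuned on held-out industry data.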
Step S104: and training the first entity classification model according to the second training sample to obtain a target classification model.
And retraining the first entity classification model by using a second training sample related to the first industry field to obtain a target classification model, so that the target classification model can better meet the requirements of the first industry field, and the performance of the target classification model in the first industry field can be improved. After the target classification model is obtained, the target classification model can be applied to the actual first industry field to obtain the object to be classified in the first industry field, and the target classification model is utilized to perform entity classification on the object to be classified in the first industry field, so that the entity classification accuracy is improved.
In the method of this embodiment, the pre-training model is first trained on the general training samples to obtain a first entity classification model, and on the first labeled data of the first industry field to obtain a second entity classification model; the second entity classification model is then used to extract second training samples of the first industry field from the general training samples, and the first entity classification model is retrained on them to obtain the target classification model. In the training process, a large amount of data in the first industry field does not need to be labeled before training, so model training efficiency can be improved. Meanwhile, retraining the first entity classification model (obtained from the general training samples) on the second training samples improves the performance of the resulting target classification model.
In one embodiment, before the training of the pre-training model based on the first labeled sample of the first industry field to obtain the second entity classification model, the method further includes: obtaining a first unmarked sample of a first industry field; and labeling the first unmarked sample to obtain a first labeled sample.
Before the pre-training model is trained with the first labeled sample, the first labeled sample needs to be obtained: a first unlabeled sample of the first industry field is obtained first, and the first unlabeled sample is then labeled to obtain the first labeled sample. There are various labeling methods, which are not limited in the embodiments of the present application; for example, labeling may be performed by a labeling algorithm or by an expert in the first industry field.
In this embodiment, a first unlabeled sample is obtained first, and then labeled to obtain a first labeled sample, and the pre-training model is trained through the first labeled sample to improve the performance of the obtained second entity classification model.
In one embodiment, after obtaining the first unlabeled sample, the method further includes:
and performing semi-supervised training on the pre-training model based on the first label-free sample to obtain a third entity classification model.
In this embodiment, not only can the second entity classification model be obtained by training on the first labeled sample (produced by labeling the first unlabeled sample), but a third entity classification model (which can be understood as a semi-supervised model for the first industry field) can also be obtained by semi-supervised training of the pre-training model on the first unlabeled sample of the first industry field. The second entity classification model can thus be used to obtain the target classification model, while the third entity classification model remains available for subsequent classification. On the basis of the first entity classification model and the target classification model, adding a third entity classification model trained on the first unlabeled sample of the first industry field can improve the performance of the models obtained after training.
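The patent does not detail the semi-supervised procedure; one common form is self-training with pseudo-labels, sketched below under that assumption. `TinyModel` and its decision rule are purely illustrative stand-ins, not the patent's method.

```python
class TinyModel:
    """Toy stand-in for the pre-training model being fine-tuned."""

    def __init__(self):
        self.trained_on = []

    def predict(self, text):
        # Toy rule: texts mentioning "school" are confidently 'education'.
        if "school" in text:
            return "education", 0.95
        return "other", 0.40

    def fit(self, labeled_pairs):
        self.trained_on.extend(labeled_pairs)


def pseudo_label_round(model, unlabeled_samples, confidence=0.8):
    """One self-training round: keep only the model's confident predictions
    as pseudo-labels, then fine-tune the model on them."""
    pseudo = [(s, model.predict(s)[0]) for s in unlabeled_samples
              if model.predict(s)[1] >= confidence]
    model.fit(pseudo)
    return model
```

In practice several such rounds would be run, optionally mixing in the augmented second unlabeled samples described below.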
In one embodiment, semi-supervised training of the pre-training model based on the first label-free sample to obtain a third entity classification model includes:
performing data enhancement on the first unmarked sample to obtain a second unmarked sample;
and performing semi-supervised training on the pre-training model through the first label-free sample and the second label-free sample to obtain a third entity classification model.
In this embodiment, the data enhancement is performed on the first unlabeled sample, and the training is performed in combination with the first unlabeled sample and the second unlabeled sample after the data enhancement, that is, the training amount is increased, so that the performance of the third entity classification model is improved.
As an example, there are various ways of performing data enhancement on the first unlabeled sample, which are not limited in the embodiments of the present application. For example, synonym pairs in a synonym library (comprising multiple words and the synonyms corresponding to them) may be used to replace words in the first unlabeled sample with their synonyms to obtain the second unlabeled sample. For example, if the first unlabeled sample includes "A goes to market B in the morning today", and the synonym library records market C as a synonym of market B, then market B can be replaced by market C to obtain the sample "A goes to market C in the morning today". For another example, data enhancement may be performed by back-translating the first unlabeled sample: the first unlabeled sample is translated into a first language different from its own language, and the translated sample is then translated back into a second language, which is the language of the first unlabeled sample, to obtain the second unlabeled sample.
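The synonym-replacement variant above can be sketched as follows. This is a minimal illustration; a real synonym library (and the back-translation alternative, which needs an external translation service) would come from outside resources.

```python
def synonym_augment(sentence, synonym_library):
    """Generate augmented (second unlabeled) samples by replacing each word
    that appears in the synonym library with its recorded synonym."""
    augmented = []
    for word, synonym in synonym_library.items():
        if word in sentence:
            augmented.append(sentence.replace(word, synonym))
    return augmented
```

Each entry in the library yields at most one augmented variant here; richer schemes could combine several replacements per sentence.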
In one embodiment, training the pre-training model based on the first labeled sample of the first industry field to obtain a second entity classification model comprises:
based on the knowledge graph, performing data enhancement on the first labeled sample to obtain a second labeled sample;
and training the pre-training model through the first labeled sample and the second labeled sample to obtain a second entity classification model.
The knowledge graph is a semantic network revealing the relationships among entities, i.e., it may comprise a plurality of entities and the relationships among them; entity replacement can be performed on the first labeled sample through the knowledge graph, thereby realizing data enhancement. For example, for a labeled sample "the wife of D is F", the entities include D, wife, and F; the knowledge graph records aliases for these entities (for example, "spouse" as an alias of "wife"), so the sample obtained after data enhancement may be "the spouse of D is F", and so on.
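A minimal sketch of this knowledge-graph-based enhancement, assuming the graph exposes an alias map from entity mentions to lists of aliases (the map used below is illustrative, not a real knowledge-graph API):

```python
def kg_alias_augment(sentence, alias_map):
    """Generate augmented (second labeled) samples by swapping each entity
    mention in the sentence for each of its knowledge-graph aliases."""
    augmented = []
    for entity, aliases in alias_map.items():
        if entity in sentence:
            for alias in aliases:
                augmented.append(sentence.replace(entity, alias))
    return augmented
```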
In this embodiment, the knowledge graph is used to perform data enhancement on the first labeled sample, and the first labeled sample and the data-enhanced second labeled sample are combined to perform training, that is, the training amount is increased, so that the performance of the second entity classification model is improved.
As an example, the knowledge graph may be a knowledge graph in the first industry field, so that the obtained second labeled sample has a stronger association with the first industry field, and the obtained second entity classification model has a stronger association with the first industry field, which may improve the classification performance of the second entity classification model in the first industry field.
In one embodiment, the first entity classification model comprises a first entity annotation model and a first core entity classification model, the second entity classification model comprises a second entity annotation model and a second core entity classification model, and the pre-trained model comprises a first pre-trained model and a second pre-trained model;
the first entity labeling model is obtained by training a first pre-training model through a general training sample, the first core entity classification model is obtained by training a second pre-training model through a general training sample, the second entity labeling model is obtained by training the first pre-training model through the first labeling sample, and the second core entity classification model is obtained by training the second pre-training model through the first labeling sample.
The entity labeling model is used for entity labeling and can be understood as a sequence labeling model; the core entity classification model performs core-entity recognition on the entities labeled by the entity labeling model, that is, it identifies which of the labeled entities are core entities. That is, the input of the first core entity classification model includes the output of the first entity labeling model, and the input of the second core entity classification model includes the output of the second entity labeling model. A second training sample of the first industry field is extracted from the general training samples through the second entity classification model, and the first entity classification model is trained according to the second training sample to obtain the target classification model, so that the performance of the target classification model can be improved.
It should be noted that the first pre-training model and the second pre-training model are based on the same language model (e.g., the ERNIE language model) but have different output layers.
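The two-stage structure described above (sequence labeling, then core-entity classification over the labeled spans) can be sketched as below. Both stage functions are toy stand-ins, since in the patent they are built from a shared pretrained language model:

```python
def toy_labeler(text):
    # Toy stand-in for the entity labeling (sequence labeling) model:
    # treat every capitalized word as a labeled entity.
    return [w for w in text.split() if w[:1].isupper()]


def toy_core_classifier(text, entity):
    # Toy stand-in for the core entity classification model:
    # call an entity "core" if it appears in the first half of the text.
    return text.index(entity) < len(text) // 2


def classify_core_entities(text, labeler, core_classifier):
    """Stage 1 labels entity spans; stage 2 keeps only the core entities."""
    entities = labeler(text)
    return [e for e in entities if core_classifier(text, e)]
```

The point of the sketch is the data flow: the classifier consumes the labeler's output, matching the statement that each core entity classification model takes the corresponding labeling model's output as input.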
As shown in fig. 2, according to an embodiment of the present application, the present application provides an entity classification method, including:
step S201: acquiring an object to be classified in a first industry field;
step S202: classifying the object to be classified based on the target classification model, and determining a first entity classification result in the object to be classified;
it should be noted that the target classification model is a model obtained by training the first entity classification model according to a second training sample, the second training sample is a training sample of the first industry field extracted from the first general training sample according to the second entity classification model, the second entity classification model is obtained by training the pre-training model based on the first labeled sample of the first industry field, and the first entity classification model is obtained by training the pre-training model based on the first general training sample. It can be understood that the target classification model in this embodiment is the target classification model in the embodiment of the entity classification model training method, and details are not repeated here.
Step S203: extracting a second entity classification result corresponding to the entity type from the entity identification result based on the preset entity type of the first industry field;
and the entity identification result is an entity which is determined by entity identification of the object to be classified based on the named entity identification NER model.
Before step S203, entity recognition may be performed on the object to be classified through a NER (Named Entity Recognition) model to obtain an entity identification result. Also before step S203, the preset entity types of the first industry field are configured in advance. For example, if the first industry field is the education industry, whose entities include schools, teachers, and so on, then the preset entity types may include a teacher type, a school type, etc. Then, based on the preset entity types, a second entity classification result corresponding to those entity types is extracted from the entity identification result; that is, the second entity classification result includes the entities corresponding to the preset entity types. For example, for a preset entity type of teacher, entities of the teacher type may be recalled from the entity identification result.
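Step S203 amounts to a type filter over the NER output. A minimal sketch (the entity/type pairs and the preset type set are illustrative):

```python
def recall_by_type(ner_results, preset_types):
    """Keep only recognized entities whose type is in the industry's preset
    entity-type set (yielding the second entity classification result)."""
    return [(entity, etype) for entity, etype in ner_results
            if etype in preset_types]
```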
Step S204: and combining the first entity classification result and the second entity classification result to obtain a target entity classification result of the object to be classified.
Different entities may exist between the first entity classification result and the second entity classification result, and a target entity classification result of the object to be classified can be obtained by combining the first entity classification result and the second entity classification result, so that the accuracy of the target entity classification result is improved. As an example, the merging the first entity classification result and the second entity classification result may be a union of the first entity classification result and the second entity classification result, and the target entity classification result of the object to be classified is obtained by eliminating duplication.
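The union-with-deduplication merge described above can be sketched as:

```python
def merge_results(first_result, second_result):
    """Union of the two entity classification results with duplicates
    removed, preserving first-seen order."""
    merged, seen = [], set()
    for entity in first_result + second_result:
        if entity not in seen:
            seen.add(entity)
            merged.append(entity)
    return merged
```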
In the entity classification method of this embodiment, an object to be classified may be classified by using a target classification model, a first entity classification result in the object to be classified is determined, a second entity classification result corresponding to an entity type is extracted from an entity identification result based on a preset entity type in a first industry field, and the first entity classification result and the second entity classification result are combined to obtain a target entity classification result of the object to be classified. The target entity classification result of the object to be classified is determined by combining the first entity classification result determined by the target classification model and the second entity classification result corresponding to the entity type extracted from the entity identification result, so that the entity classification accuracy can be improved.
As an example, the merging the first entity classification result and the second entity classification result to obtain the target entity classification result of the object to be classified may include: and under the condition that the first entity classification result meets the preset requirement, combining the first entity classification result and the second entity classification result to obtain a target entity classification result of the object to be classified so as to improve the accuracy of the classification result. As an example, the preset requirement may be that the accuracy of the first entity classification result is greater than a preset accuracy, or that the classification error rate of the first entity classification result is less than a preset error rate, and the like.
In one embodiment, after classifying the object to be classified based on the target classification model and determining the first entity classification result in the object to be classified, the method further includes:
under the condition that the first entity classification result does not meet the preset requirement, classifying the objects to be classified based on the second entity classification model or the third entity classification model, and determining a third entity classification result in the objects to be classified;
and combining the third entity classification result and the second entity classification result to obtain a target entity classification result of the object to be classified.
If the first entity classification result does not meet the preset requirement, classification through the target classification model performs poorly. In that case, the second entity classification model or the third entity classification model can classify the object to be classified to determine a third entity classification result, and the third and second entity classification results are merged to obtain the target entity classification result of the object to be classified, thereby improving the classification effect.
As an example, merging the third entity classification result and the second entity classification result to obtain the target entity classification result of the object to be classified may include: taking the union of the third entity classification result and the second entity classification result and removing duplicates.
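The merge-with-fallback logic described above can be sketched as follows. This is a minimal illustration only: the function names, the `meets_requirement` predicate, and the model interface are assumptions, not the patent's implementation.

```python
# Illustrative sketch of merging entity classification results with a fallback model.
# All names here (merge_results, classify, meets_requirement) are hypothetical.

def merge_results(first, second):
    """Union of two entity classification results with duplicates removed,
    preserving first-seen order."""
    seen, merged = set(), []
    for entity in list(first) + list(second):
        if entity not in seen:
            seen.add(entity)
            merged.append(entity)
    return merged

def classify(obj, target_model, fallback_model, ner_entities, meets_requirement):
    """If the target model's result meets the preset requirement, merge it with
    the type-filtered NER entities; otherwise fall back to the second or third
    entity classification model."""
    first = target_model.classify(obj)        # first entity classification result
    if meets_requirement(first):              # e.g. accuracy above a preset threshold
        return merge_results(first, ner_entities)
    third = fallback_model.classify(obj)      # third entity classification result
    return merge_results(third, ner_entities)
```

In use, `meets_requirement` could compare the model's estimated accuracy against the preset accuracy mentioned above.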
The process of training the entity classification model is described in detail below with an embodiment.
Video semantic tagging uses computer technology to understand the content of a video and represents its core content by automatically attaching tags of different dimensions (such as entities, topics, and points of interest), as shown in fig. 3. As shown in fig. 4, the input video is processed by the video semantic tagging technique to obtain the semantic labeling result of the video. Video semantic tags can be applied to multiple scenarios such as video recommendation, search, and media asset management. Among the different types of video tags, the entity is one of the most important dimensions for understanding video content, and tags such as topics and points of interest also depend on it. However, different industries are concerned with different entity types in videos, so a video semantic tag scheme that supports migration is needed to move from one industry field (such as military) to another (such as education), reducing the cost of data annotation, model training, and the like, and rapidly and efficiently supporting video tag understanding in the migrated-to industry field.
As shown in fig. 5, the schematic diagram of video semantic tag migration is shown, where the video semantic tag technology includes a bottom dependency layer, a base policy layer, and a core entity policy layer.
Wherein the bottom dependent layer comprises:
Automatic Speech Recognition (ASR) module: recognizes speech in the video and converts it into text;
Optical Character Recognition (OCR) module: recognizes text appearing in video frames and converts it into characters;
FACE module: recognizes faces in the video;
video classification module: classifies the video into its category.
The basic strategy layer comprises:
Named Entity Recognition (NER) module: identifies entities from the inputs of the different modalities of the video.
The core entity policy layer includes:
Rule/SCHEMA-based scheme: recalls entities of the types of interest by configuring a SCHEMA.
Inference-based scheme: verifies and expands entities using the knowledge graph.
End-to-end (E2E) scheme: fuses entity extraction and core-degree judgment (judging the core entity) of the text into one set of models; an entity annotation model and a core entity classification model are trained on the basis of the pre-trained language model ERNIE 2.0.
PIPELINE scheme: first trains a model to extract entities, then trains a core entity classification model to judge the core entity.
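The PIPELINE scheme above can be sketched as a simple two-stage function. The stub tagger and core classifier below are toy stand-ins for the ERNIE-based models and are purely illustrative.

```python
# Hypothetical sketch of the two-stage PIPELINE scheme: stage 1 extracts entity
# spans, stage 2 judges which extracted entities are core entities.

def pipeline_core_entities(text, entity_tagger, core_classifier):
    """Sequence labeling first, core-degree judgment second."""
    entities = entity_tagger(text)                             # stage 1: extraction
    return [e for e in entities if core_classifier(text, e)]   # stage 2: core filter

# Toy stand-ins for the trained models (not the real ERNIE-based components):
toy_tagger = lambda text: [w for w in text.split() if w.istitle()]
toy_core = lambda text, e: text.startswith(e)  # toy rule: core if sentence-initial
```

In the E2E scheme, by contrast, both stages would be fused into a single model rather than chained as above.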
In the migration scheme, the NER module migration is developed on the basis of a language model, which is not limited in the embodiments of the present application.
For the core entity classification migration scheme, as shown in fig. 6, the scheme marked ① in fig. 6 trains on the existing generic training samples (i.e., the existing samples in fig. 6) to obtain the original model (i.e., the first entity classification model). This scheme involves training two models: a first core entity classification model (i.e., a first entity core-degree decision model) and a first entity annotation model (i.e., a first core entity extraction model or a first sequence annotation model). The first core entity classification model is a classification model based on the ERNIE (Enhanced Language Representation with Informative Entities) language model, and the first entity annotation model is an annotation model based on ERNIE + CRF (conditional random field). Both models are trained on corpora from the general field and are therefore generic: they meet the requirements of general industries, and industry-field data is added when a specific industry is targeted.
Additionally, the scheme marked ② in fig. 6 may be enabled. An annotator or industry expert labels a small number (hundreds to thousands) of first unlabeled samples of the first industry field (i.e., the unlabeled in-field data in fig. 6); the labeled samples are data-enhanced, and training is performed with the first labeled samples (i.e., the labeled small in-field samples in fig. 6) and the data-enhanced second labeled samples to obtain a field few-sample model (i.e., the second entity classification model). The scheme marked ②.1 in fig. 6 is then initiated: weakly supervised corrective learning is performed on the existing generic training samples with the field few-sample model (i.e., the few-sample model extracts samples from the generic training samples) to obtain second training samples, and the original model is retrained with the second training samples to obtain an updated original model (i.e., the target classification model).
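The weak supervision and correction learning described above can be sketched as a confidence filter over the generic corpus. The scoring interface and the 0.9 threshold below are assumptions for illustration only, not values from the patent.

```python
# Hedged sketch: use the field few-sample model to extract, from the generic
# training corpus, samples that look like the target industry field.
# few_sample_model(text) -> (label, confidence) is an assumed interface.

def extract_domain_samples(general_samples, few_sample_model, threshold=0.9):
    """Keep generic samples the domain model labels with high confidence and
    relabel them with the model's prediction (weak supervision)."""
    second_training_samples = []
    for text, _old_label in general_samples:
        label, confidence = few_sample_model(text)
        if confidence >= threshold:   # trust only confident in-field predictions
            second_training_samples.append((text, label))
    return second_training_samples
```

The retained `second_training_samples` would then be used to retrain the original model into the target classification model.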
Optionally, the scheme marked ③ in fig. 6 is started: the first unlabeled samples are data-enhanced in combination with a knowledge graph to obtain second unlabeled samples, and the pre-training model is trained with the first and second unlabeled samples to obtain a third entity classification model, i.e., the field semi-supervised model in fig. 6. The knowledge-graph-based data enhancement expands the data by disambiguating the unlabeled in-field input and obtaining information such as the aliases and hypernyms of the entities in the input. Compared with common synonym- or translation-based data enhancement, this keeps the input semantically sound. For example, for the input "The wife of [D] is [F]", the entity [D] is disambiguated; its alias recorded in the knowledge graph is [d], so a data-enhanced sample can be "The wife of [d] is [F]". A traditional synonym-replacement scheme might instead replace [D] with [E], so the consistency of the semantic facts before and after data enhancement cannot be guaranteed.
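The knowledge-graph data enhancement described above can be sketched minimally as alias substitution. The toy alias table stands in for a real knowledge-graph lookup, and all names are illustrative assumptions.

```python
# Hypothetical sketch of knowledge-graph-based data enhancement: after
# disambiguating an entity mention, substitute its aliases from the graph,
# which preserves the factual meaning (unlike generic synonym swapping).

def augment_with_aliases(sample, entity, alias_table):
    """Generate extra samples by replacing an entity mention with each of its
    knowledge-graph aliases."""
    aliases = alias_table.get(entity, [])
    return [sample.replace(entity, alias) for alias in aliases]

toy_kg = {"D": ["d"]}  # toy stand-in: entity "D" has alias "d" in the graph
```

A real implementation would first run entity disambiguation so that the correct graph node (and hence the correct alias set) is chosen.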
SCHEMA/dictionary configuration scheme: controls which entities are recalled by configuring a video SCHEMA for each industry. For example, the military industry is concerned with military-type programs, military television shows, military movies, military figures, military activities, military weapons, and the like.
The target classification model obtained by the entity classification model training method can be applied to application scenes such as video content understanding and the like facing different industrial fields (such as the industrial fields of education, military, medical treatment, finance and the like), and can also be applied to application scenes such as video recommendation, search, media resource management and the like.
In summary, in the method of the embodiments of the application, the field migration technique for the entity classification model requires no manual review and only small-batch in-field data labeling, which reduces migration cost, improves migration efficiency, and yields strong migration performance; and the knowledge-graph-based data enhancement technique ensures semantic and logical consistency before and after enhancement, avoiding the semantic inconsistency introduced by traditional synonym-replacement data enhancement.
As shown in fig. 7, according to an embodiment of the present application, there is also provided an entity classification model training apparatus 700, where the apparatus 700 includes:
a first training module 701, configured to train a pre-training model based on a general training sample to obtain a first entity classification model;
a second training module 702, configured to train the pre-training model based on the first labeled sample in the first industry field to obtain a second entity classification model;
a first extraction module 703, configured to extract a second training sample of the first industry field from the general training samples according to the second entity classification model;
and a third training module 704, configured to train the first entity classification model according to the second training sample, so as to obtain a target classification model.
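The four modules above can be condensed into one training flow. In this sketch, `train` and `extract` are placeholders for the ERNIE-based training and sample-extraction routines; nothing here is the patent's actual implementation.

```python
# Condensed, hypothetical sketch of the four training steps (modules 701-704).

def train_target_model(pretrain, general_samples, domain_labeled, train, extract):
    first_model = train(pretrain, general_samples)           # module 701
    second_model = train(pretrain, domain_labeled)           # module 702
    second_samples = extract(general_samples, second_model)  # module 703
    return train(first_model, second_samples)                # module 704
```

The key point is that module 704 continues training the first entity classification model on the extracted in-field samples rather than starting again from the pre-training model.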
In one embodiment, the apparatus 700, further comprises:
the sample acquisition module is configured to acquire a first unlabeled sample of the first industry field before the second training module trains the pre-training model based on the first labeled sample of the first industry field to obtain the second entity classification model;
and the marking module is used for marking the first unmarked sample to obtain a first marked sample.
In one embodiment, the apparatus 700, further comprises:
and the fourth training module is used for performing semi-supervised training on the pre-training model based on the first label-free sample after the sample acquisition module acquires the first label-free sample, so as to obtain a third entity classification model.
In one embodiment, a fourth training module, comprising:
the first data enhancement module is used for performing data enhancement on the first unmarked sample to obtain a second unmarked sample;
and the first training submodule is used for carrying out semi-supervised training on the pre-training model through the first label-free sample and the second label-free sample to obtain a third entity classification model.
In one embodiment, the second training module comprises:
the second data enhancement module is used for carrying out data enhancement on the first labeled sample based on the knowledge graph to obtain a second labeled sample;
and the second training submodule is used for training the pre-training model through the first labeled sample and the second labeled sample to obtain a second entity classification model.
In one embodiment, the first entity classification model comprises a first entity annotation model and a first core entity classification model, the second entity classification model comprises a second entity annotation model and a second core entity classification model, and the pre-trained model comprises a first pre-trained model and a second pre-trained model;
the first entity labeling model is obtained by training a first pre-training model through a general training sample, the first core entity classification model is obtained by training a second pre-training model through a general training sample, the second entity labeling model is obtained by training the first pre-training model through the first labeling sample, and the second core entity classification model is obtained by training the second pre-training model through the first labeling sample.
The entity classification model training device in each of the embodiments is a device for implementing the entity classification model training method in each of the embodiments, and has corresponding technical features and technical effects, which are not described herein again.
As shown in fig. 8, the present application further provides a sorting apparatus 800 according to an embodiment of the present application, the apparatus 800 includes:
a first obtaining module 801, configured to obtain an object to be classified in a first industry field;
a first classification module 802, configured to classify an object to be classified based on a target classification model, and determine a first entity classification result in the object to be classified;
a second extraction module 803, configured to extract, based on a preset entity type in the first industry field, a second entity classification result corresponding to the entity type from the entity identification result; the entity identification result is an entity which is determined by entity identification of the object to be classified based on a named entity identification NER model;
the first merging module 804 is configured to merge the first entity classification result and the second entity classification result to obtain a target entity classification result of the object to be classified.
In one embodiment, the apparatus 800, further comprises:
the second classification module is configured to, after the object to be classified is classified based on the target classification model and the first entity classification result is determined, classify the object to be classified based on the second entity classification model or the third entity classification model when the first entity classification result does not meet the preset requirement, and determine a third entity classification result of the object to be classified;
and the second merging module is used for merging the third entity classification result and the second entity classification result to obtain a target entity classification result of the object to be classified.
The classification device of each embodiment is a device for implementing the entity classification method of each embodiment, and the technical features and technical effects correspond to each other, which are not described herein again.
There is also provided, in accordance with an embodiment of the present application, an electronic device, a readable storage medium, and a computer program product.
Fig. 9 is a block diagram of an electronic device for an entity classification model training method or an entity classification method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 9, the electronic apparatus includes: one or more processors 901, memory 902, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). Fig. 9 illustrates an example with one processor 901.
Memory 902 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by at least one processor to cause the at least one processor to perform the entity classification model training method or the entity classification method provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the entity classification model training method or the entity classification method provided herein.
The computer program product of the embodiments of the present application includes a computer program, and the computer program is used to enable a computer to execute the entity classification model training method or the entity classification method provided by the embodiments of the present application.
The memory 902, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the entity classification model training method in the embodiments of the present application (e.g., the first training module 701, the second training module 702, the first extraction module 703, and the third training module 704 shown in fig. 7), or program instructions/modules corresponding to the entity classification method in the embodiments of the present application (e.g., the first obtaining module 801, the first classification module 802, the second extraction module 803, and the first merging module 804 shown in fig. 8). The processor 901 executes various functional applications of the server and data processing, i.e., an entity classification model training method or an entity classification method in the above method embodiments, by executing non-transitory software programs, instructions, and modules stored in the memory 902.
The memory 902 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the electronic device, and the like. Further, the memory 902 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 902 may optionally include memory located remotely from the processor 901, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the entity classification model training method or the entity classification method may further include: an input device 903 and an output device 904. The processor 901, the memory 902, the input device 903 and the output device 904 may be connected by a bus or other means, and fig. 9 illustrates the connection by a bus as an example.
The input device 903 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, and may be, for example, a touch screen, keypad, mouse, track pad, touch pad, pointing stick, one or more mouse buttons, track ball, or joystick. The output device 904 may include a display device, auxiliary lighting devices (e.g., LEDs), tactile feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computing programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using procedural and/or object oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the entity classification model training scheme, a pre-training model is trained based on a general training sample to obtain a first entity classification model; the pre-training model is trained based on a first labeled sample of the first industry field to obtain a second entity classification model; a second training sample of the first industry field is extracted from the general training samples according to the second entity classification model; and the first entity classification model is trained according to the second training sample to obtain a target classification model. In the training process, large amounts of first-industry-field data need not be labeled before training, and therefore model training efficiency can be improved.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present invention is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (19)

1. A method of entity classification model training, the method comprising:
training the pre-training model based on the general training sample to obtain a first entity classification model;
training the pre-training model based on a first labeling sample of a first industry field to obtain a second entity classification model;
extracting a second training sample of the first industry field from the general training samples according to the second entity classification model;
and training the first entity classification model according to the second training sample to obtain a target classification model.
2. The method of claim 1, wherein before the training the pre-trained model based on the first labeled sample of the first industry field to obtain the second entity classification model, the method further comprises:
obtaining a first unmarked sample of the first industry field;
and labeling the first unlabeled sample to obtain the first labeled sample.
3. The method of claim 2, after obtaining the first unlabeled sample, further comprising:
and performing semi-supervised training on the pre-training model based on the first label-free sample to obtain a third entity classification model.
4. The method of claim 3, wherein semi-supervised training of the pre-trained model based on the first label-free sample results in a third entity classification model, comprising:
performing data enhancement on the first unmarked sample to obtain a second unmarked sample;
and performing semi-supervised training on the pre-training model through the first label-free sample and the second label-free sample to obtain the third entity classification model.
5. The method of claim 1, wherein training the pre-trained model based on the first labeled sample of the first industry segment to obtain a second entity classification model comprises:
based on a knowledge graph, performing data enhancement on the first labeled sample to obtain a second labeled sample;
and training the pre-training model through the first labeled sample and the second labeled sample to obtain the second entity classification model.
6. The method of claim 1, the first entity classification model comprising a first entity annotation model and a first core entity classification model, the second entity classification model comprising a second entity annotation model and a second core entity classification model, the pre-trained model comprising a first pre-trained model and a second pre-trained model;
the first entity labeling model is obtained by training the first pre-training model through the universal training sample, the first core entity classification model is obtained by training the second pre-training model through the universal training sample, the second entity labeling model is obtained by training the first pre-training model through the first labeling sample, and the second core entity classification model is obtained by training the second pre-training model through the first labeling sample.
7. A method of entity classification, the method comprising:
acquiring an object to be classified in a first industry field;
classifying the object to be classified based on a target classification model, and determining a first entity classification result in the object to be classified;
extracting a second entity classification result corresponding to the entity type from the entity identification result based on the preset entity type of the first industry field; the entity identification result is an entity which is determined by entity identification of the object to be classified based on a named entity identification NER model;
and combining the first entity classification result and the second entity classification result to obtain a target entity classification result of the object to be classified.
8. The method of claim 7, after the classifying the object to be classified based on the target classification model and determining the first entity classification result in the object to be classified, further comprising:
under the condition that the first entity classification result does not meet preset requirements, classifying the objects to be classified based on the second entity classification model or a third entity classification model, and determining a third entity classification result in the objects to be classified;
and combining the third entity classification result and the second entity classification result to obtain a target entity classification result of the object to be classified.
9. An entity classification model training apparatus, the apparatus comprising:
the first training module is used for training the pre-training model based on the general training sample to obtain a first entity classification model;
the second training module is used for training the pre-training model based on the first labeling sample in the first industry field to obtain a second entity classification model;
the first extraction module is used for extracting a second training sample of the first industry field from the general training samples according to the second entity classification model;
and the third training module is used for training the first entity classification model according to the second training sample to obtain a target classification model.
10. The apparatus of claim 9, further comprising:
the sample acquisition module is configured to acquire a first unlabeled sample of the first industry field before the second training module trains the pre-training model based on the first labeled sample of the first industry field to obtain the second entity classification model;
and the marking module is used for marking the first unmarked sample to obtain the first marked sample.
11. The apparatus of claim 10, further comprising:
and the fourth training module is used for performing semi-supervised training on the pre-training model based on the first label-free sample after the sample acquisition module acquires the first label-free sample, so as to obtain a third entity classification model.
12. The apparatus of claim 11, wherein the fourth training module comprises:
a first data enhancement module configured to perform data enhancement on the first unlabeled sample to obtain a second unlabeled sample;
and a first training submodule configured to perform semi-supervised training on the pre-trained model using the first unlabeled sample and the second unlabeled sample to obtain the third entity classification model.
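The semi-supervised step of claims 11-12 can be sketched as follows. The synonym table and the pseudo-labeler are invented stand-ins; a real system might use back-translation for data enhancement and the pre-trained model's own predictions for pseudo-labels.

```python
# Hedged sketch of claims 11-12: data-enhance the first unlabeled samples into
# a second unlabeled set, then build a training set from model pseudo-labels.

SYNONYMS = {"medicine": "drug", "doctor": "physician"}

def augment(sample):
    """Data enhancement: substitute known synonyms word by word."""
    return " ".join(SYNONYMS.get(word, word) for word in sample.split())

def pseudo_label(text):
    """Stand-in for the pre-trained model's prediction on unlabeled text."""
    return "MEDICAL" if "drug" in text or "medicine" in text else "OTHER"

first_unlabeled = ["the medicine helped", "the doctor arrived"]
second_unlabeled = [augment(s) for s in first_unlabeled]   # first data enhancement module
training_pairs = [(s, pseudo_label(s))                     # first training submodule
                  for s in first_unlabeled + second_unlabeled]
print(training_pairs[2])  # ('the drug helped', 'MEDICAL')
```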
13. The apparatus of claim 9, wherein the second training module comprises:
a second data enhancement module configured to perform data enhancement on the first labeled sample based on a knowledge graph to obtain a second labeled sample;
and a second training submodule configured to train the pre-trained model using the first labeled sample and the second labeled sample to obtain the second entity classification model.
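Knowledge-graph data enhancement as in claim 13 can be sketched by swapping a labeled entity for a same-type sibling drawn from a toy graph. The graph contents and entity types below are invented purely for illustration.

```python
# Hedged sketch of claim 13: derive a second labeled sample from the first by
# replacing its entity with another entity of the same type in a knowledge graph.

KG = {"DRUG": ["aspirin", "ibuprofen"], "DISEASE": ["flu", "migraine"]}

def enhance(text, entity, etype):
    """Swap `entity` for a same-type sibling from the knowledge graph."""
    for sibling in KG[etype]:
        if sibling != entity:
            return text.replace(entity, sibling), sibling, etype
    return text, entity, etype             # no sibling available: keep as-is

first_sample = ("aspirin relieves migraine", "aspirin", "DRUG")
second_sample = enhance(*first_sample)
print(second_sample)  # ('ibuprofen relieves migraine', 'ibuprofen', 'DRUG')
```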
14. The apparatus of claim 9, wherein the first entity classification model comprises a first entity annotation model and a first core entity classification model, the second entity classification model comprises a second entity annotation model and a second core entity classification model, and the pre-trained model comprises a first pre-trained model and a second pre-trained model;
wherein the first entity annotation model is obtained by training the first pre-trained model on the general training samples, the first core entity classification model is obtained by training the second pre-trained model on the general training samples, the second entity annotation model is obtained by training the first pre-trained model on the first labeled sample, and the second core entity classification model is obtained by training the second pre-trained model on the first labeled sample.
15. An entity classification apparatus, the apparatus comprising:
a first acquisition module configured to acquire an object to be classified in a first industry field;
a first classification module configured to classify the object to be classified based on a target classification model and determine a first entity classification result in the object to be classified;
a second extraction module configured to extract, based on preset entity types of the first industry field, a second entity classification result corresponding to those entity types from an entity recognition result, wherein the entity recognition result comprises entities determined by performing named entity recognition (NER) on the object to be classified;
and a first merging module configured to merge the first entity classification result and the second entity classification result to obtain a target entity classification result of the object to be classified.
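The inference flow of claim 15 can be sketched as below: classify with the target model, filter the NER model's entities down to the industry's preset types, and merge. Both "models" are toy lexicon lookups, purely illustrative of the data flow.

```python
# Hedged sketch of claim 15: target-model result merged with NER entities
# restricted to the first industry field's preset entity types.

PRESET_TYPES = {"DRUG"}                      # preset entity types of the domain

def target_model(text):
    """Stand-in target classifier: only knows one drug."""
    return {("aspirin", "DRUG")} if "aspirin" in text else set()

def ner_model(text):
    """Stand-in NER model: tags organisations and drugs it recognises."""
    lexicon = {"Bayer": "ORG", "aspirin": "DRUG", "paracetamol": "DRUG"}
    return {(w, t) for w, t in lexicon.items() if w in text}

text = "Bayer sells aspirin and paracetamol"
first_result = target_model(text)                                  # first classification module
second_result = {(e, t) for e, t in ner_model(text)                # second extraction module
                 if t in PRESET_TYPES}
target_result = first_result | second_result                       # first merging module
print(sorted(target_result))  # [('aspirin', 'DRUG'), ('paracetamol', 'DRUG')]
```

Note that the NER result contributes an entity (`paracetamol`) the target model missed, while out-of-type entities (`Bayer`, `ORG`) are filtered away before the merge.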
16. The apparatus of claim 15, further comprising:
a second classification module configured to, after the first classification module classifies the object to be classified based on the target classification model and determines the first entity classification result, classify the object to be classified based on the second entity classification model or a third entity classification model to determine a third entity classification result in the object to be classified, in the case that the first entity classification result does not meet a preset requirement;
and a second merging module configured to merge the third entity classification result and the second entity classification result to obtain a target entity classification result of the object to be classified.
17. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the entity classification model training method of any one of claims 1-6 or the entity classification method of any one of claims 7-8.
18. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the entity classification model training method of any one of claims 1-6 or the entity classification method of any one of claims 7-8.
19. A computer program product comprising a computer program for causing a computer to perform the entity classification model training method of any one of claims 1-6 or the entity classification method of any one of claims 7-8.
CN202011356458.2A 2020-11-27 2020-11-27 Entity classification model training method, entity classification device and electronic equipment Active CN112487814B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011356458.2A CN112487814B (en) 2020-11-27 2020-11-27 Entity classification model training method, entity classification device and electronic equipment


Publications (2)

Publication Number Publication Date
CN112487814A true CN112487814A (en) 2021-03-12
CN112487814B CN112487814B (en) 2024-04-02

Family

ID=74936178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011356458.2A Active CN112487814B (en) 2020-11-27 2020-11-27 Entity classification model training method, entity classification device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112487814B (en)


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018072563A1 (en) * 2016-10-18 2018-04-26 中兴通讯股份有限公司 Knowledge graph creation method, device, and system
CN110705293A (en) * 2019-08-23 2020-01-17 中国科学院苏州生物医学工程技术研究所 Electronic medical record text named entity recognition method based on pre-training language model
CN111079445A (en) * 2019-12-27 2020-04-28 南京三百云信息科技有限公司 Training method and device based on semantic model and electronic equipment
CN111144115A (en) * 2019-12-23 2020-05-12 北京百度网讯科技有限公司 Pre-training language model obtaining method and device, electronic equipment and storage medium
CN111680145A (en) * 2020-06-10 2020-09-18 北京百度网讯科技有限公司 Knowledge representation learning method, device, equipment and storage medium
CN111783981A (en) * 2020-06-29 2020-10-16 百度在线网络技术(北京)有限公司 Model training method and device, electronic equipment and readable storage medium
CN111859951A (en) * 2020-06-19 2020-10-30 北京百度网讯科技有限公司 Language model training method and device, electronic equipment and readable storage medium
CN111985240A (en) * 2020-08-19 2020-11-24 腾讯云计算(长沙)有限责任公司 Training method of named entity recognition model, named entity recognition method and device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHANG, Xiao; LI, Yegang; WANG, Dong; SHI, Shumin: "Named Entity Recognition Based on ERNIE", Intelligent Computer and Applications, no. 03, 1 March 2020 (2020-03-01) *
CHENG, Zhonghui; CHEN, Ke; CHEN, Gang; XU, Shize; FU, Dingli: "A Named Entity Recognition Method Based on Co-Training with Reinforcement Learning", Software Engineering, no. 01, 5 January 2020 (2020-01-05) *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221565A (en) * 2021-05-07 2021-08-06 北京百度网讯科技有限公司 Entity recognition model training method and device, electronic equipment and storage medium
CN113378570A (en) * 2021-06-01 2021-09-10 车智互联(北京)科技有限公司 Entity recognition model generation method, computing device and readable storage medium
CN113378570B (en) * 2021-06-01 2023-12-12 车智互联(北京)科技有限公司 Entity identification model generation method, computing device and readable storage medium
CN113590822A (en) * 2021-07-28 2021-11-02 北京百度网讯科技有限公司 Document title processing method, device, equipment, storage medium and program product
CN113590822B (en) * 2021-07-28 2023-08-08 北京百度网讯科技有限公司 Method, device, equipment, storage medium and program product for processing document title
CN113887227B (en) * 2021-09-15 2023-05-02 北京三快在线科技有限公司 Model training and entity identification method and device
CN113887227A (en) * 2021-09-15 2022-01-04 北京三快在线科技有限公司 Model training and entity recognition method and device
CN113961705A (en) * 2021-10-29 2022-01-21 聚好看科技股份有限公司 Text classification method and server
CN114218951B (en) * 2021-12-16 2023-03-24 北京百度网讯科技有限公司 Entity recognition model training method, entity recognition method and device
CN114218951A (en) * 2021-12-16 2022-03-22 北京百度网讯科技有限公司 Entity recognition model training method, entity recognition method and device
CN115545088A (en) * 2022-02-22 2022-12-30 北京百度网讯科技有限公司 Model construction method, classification method and device and electronic equipment
CN115545088B (en) * 2022-02-22 2023-10-24 北京百度网讯科技有限公司 Model construction method, classification method, device and electronic equipment
CN114595686A (en) * 2022-03-11 2022-06-07 北京百度网讯科技有限公司 Knowledge extraction method, and training method and device of knowledge extraction model
CN114548109B (en) * 2022-04-24 2022-09-23 阿里巴巴达摩院(杭州)科技有限公司 Named entity recognition model training method and named entity recognition method
CN114548109A (en) * 2022-04-24 2022-05-27 阿里巴巴达摩院(杭州)科技有限公司 Named entity recognition model training method and named entity recognition method
CN114841274A (en) * 2022-05-12 2022-08-02 百度在线网络技术(北京)有限公司 Language model training method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112487814B (en) 2024-04-02

Similar Documents

Publication Publication Date Title
CN112487814B (en) Entity classification model training method, entity classification device and electronic equipment
CN112560912B (en) Classification model training method and device, electronic equipment and storage medium
CN112507715B (en) Method, device, equipment and storage medium for determining association relation between entities
CN111325020B (en) Event argument extraction method and device and electronic equipment
CN111625635A (en) Question-answer processing method, language model training method, device, equipment and storage medium
CN111221984A (en) Multimodal content processing method, device, equipment and storage medium
CN111104514B (en) Training method and device for document tag model
CN111966890B (en) Text-based event pushing method and device, electronic equipment and storage medium
CN111507104A (en) Method and device for establishing label labeling model, electronic equipment and readable storage medium
CN111144115A (en) Pre-training language model obtaining method and device, electronic equipment and storage medium
CN111191428B (en) Comment information processing method and device, computer equipment and medium
JP7106802B2 (en) Resource sorting method, method for training a sorting model and corresponding apparatus
CN112541359B (en) Document content identification method, device, electronic equipment and medium
US20210319262A1 (en) Model training, image processing method, device, storage medium, and program product
CN111522967A (en) Knowledge graph construction method, device, equipment and storage medium
CN111339268A (en) Entity word recognition method and device
CN114860913B (en) Intelligent question-answering system construction method, question-answering processing method and device
CN111783861A (en) Data classification method, model training device and electronic equipment
CN111859953A (en) Training data mining method and device, electronic equipment and storage medium
CN111738015A (en) Method and device for analyzing emotion polarity of article, electronic equipment and storage medium
CN111639234B (en) Method and device for mining core entity attention points
CN111339314A (en) Method and device for generating triple-group data and electronic equipment
CN111767380A (en) Model adaptive retraining method and device, electronic equipment and storage medium
CN111125445A (en) Community theme generation method and device, electronic equipment and storage medium
CN111385188A (en) Recommendation method and device for dialog elements, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant