CN111858905B - Model training method, information identification device, electronic equipment and storage medium


Info

Publication number
CN111858905B
Authority
CN
China
Prior art keywords
text
sample
quality
network model
paragraph
Prior art date
Legal status
Active
Application number
CN202010697599.4A
Other languages
Chinese (zh)
Other versions
CN111858905A (en)
Inventor
白亚楠
刘子航
林荣逸
欧阳宇
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010697599.4A
Publication of CN111858905A
Application granted
Publication of CN111858905B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G06F16/35 Clustering; Classification
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a model training method, an information identification method and apparatus, an electronic device, and a storage medium, and relates to the technical field of deep learning. The specific implementation scheme is as follows: acquire a first sample, where the first sample includes feature information of paragraphs of a first text and labeling information of the text quality of the first text; and train a basic network model with the first sample to obtain a target network model for text quality recognition. Because the basic network model is trained with the first sample to obtain the target network model for identifying text quality, the target network model can be used to audit text information and identify its quality, which improves recognition efficiency.

Description

Model training method, information identification device, electronic equipment and storage medium
Technical Field
The present application relates to deep learning in the field of data processing technologies, and in particular, to a model training method, an information identifying apparatus, an electronic device, and a storage medium.
Background
With the development of technology, more and more people query information through search tools, for example, searching for papers, merchandise information, medical information, and the like. Because existing information bases contain a large amount of text information of uneven quality, the text information needs to be identified according to its content; for example, the text information in an information base is audited manually, with the quality judged by human reviewers.
Disclosure of Invention
The disclosure provides a model training method, an information identification device, electronic equipment and a storage medium.
According to a first aspect of the present disclosure, there is provided a model training method, comprising:
Acquiring a first sample, wherein the first sample comprises characteristic information of a paragraph of a first text and labeling information of the text quality of the first text;
and training the basic network model by using the first sample to obtain a target network model for text quality recognition.
According to a second aspect of the present disclosure, there is provided a model training apparatus comprising:
an acquisition module, configured to acquire a first sample, where the first sample comprises feature information of a paragraph of a first text and labeling information of the text quality of the first text;
And the training module is used for training the basic network model by using the first sample to acquire a target network model for text quality recognition.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
At least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the first aspects.
According to a fourth aspect of the present disclosure there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of the first aspects.
According to a fifth aspect of the present disclosure, there is provided an information identifying method including:
Acquiring a text to be identified;
Identifying the text to be identified through a target network model to obtain a quality identification result of the text to be identified; the target network model is obtained by training a basic network model through a first sample, wherein the first sample comprises characteristic information of a paragraph of a first text and labeling information of the text quality of the first text.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are included to provide a better understanding of the present application and are not to be construed as limiting the application. Wherein:
FIG. 1 is a flow chart of a model training method provided by an embodiment of the present application;
FIG. 2 is another flow chart of a model training method provided by an embodiment of the present application;
FIG. 3 is a block diagram of a deep learning sequence model provided by an embodiment of the present application;
FIG. 4 is a flowchart of an information identification method provided by an embodiment of the present application;
FIG. 5 is a block diagram of a model training apparatus provided by an embodiment of the present application;
Fig. 6 is a block diagram of an information identifying apparatus according to an embodiment of the present application;
FIG. 7 is a block diagram of an electronic device for implementing a model training method of an embodiment of the application.
Detailed Description
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Referring to fig. 1, fig. 1 is a flowchart of a model training method provided by an embodiment of the present application, and as shown in fig. 1, the embodiment provides a model training method applied to an electronic device, including the following steps:
Step 101, acquiring a first sample, wherein the first sample comprises characteristic information of a paragraph of a first text and labeling information of the text quality of the first text.
Generally, an article is composed of four elements: a title, subtitles, general paragraphs, and pictures; typically one title covers a plurality of subtitles, and one subtitle covers a plurality of paragraphs and pictures. In this embodiment, the first text includes a title and one or more of subtitles and paragraphs. The text quality of the first text may be set according to actual conditions; for example, it may be set to three levels (high quality, normal, and low quality), or to 10 grades, where each grade corresponds to a score interval and a higher score indicates better quality. The labeling information, i.e., the quality level or score of the first text, can be labeled manually. The first sample includes feature information of the paragraphs of the first text; if the first text includes a plurality of paragraphs, the first sample includes feature information of each paragraph, that is, the feature information is acquired in units of paragraphs.
And 102, training the basic network model by using the first sample to acquire a target network model for text quality recognition.
The basic network model may be a neural network model, preferably a deep learning sequence model. The basic network model is trained with the first sample to obtain the target network model, which can then be used to identify text quality.
In this embodiment, a first sample is acquired, where the first sample includes feature information of paragraphs of a first text and labeling information of the text quality of the first text, and the basic network model is trained with the first sample to obtain a target network model for text quality recognition. Because the basic network model is trained with the first sample to obtain a target network model for identifying text quality, the target network model can be used to audit text information and identify its quality, improving recognition efficiency.
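For illustration only, a minimal sketch of steps 101 and 102 might look as follows, assuming each first sample is a sequence of per-paragraph feature vectors with a manually labeled quality grade; the class, shapes, and hyperparameters are assumptions, not taken from the patent.

```python
# Minimal sketch of steps 101-102 (an assumed PyTorch realization; the
# patent publishes no code). A first sample is modeled as per-paragraph
# feature vectors plus a quality label: 0 = low, 1 = normal, 2 = high.
from dataclasses import dataclass
from typing import List

import torch
import torch.nn as nn


@dataclass
class FirstSample:
    paragraph_features: torch.Tensor  # (num_paragraphs, feature_dim)
    quality_label: int                # manual labeling information


def train_base_model(model: nn.Module, samples: List[FirstSample],
                     epochs: int = 5, lr: float = 1e-3) -> nn.Module:
    """Train the basic network model with first samples (step 102)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for s in samples:
            logits = model(s.paragraph_features.unsqueeze(0))  # batch of 1
            loss = loss_fn(logits, torch.tensor([s.quality_label]))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model  # the target network model for text quality recognition
```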
In one embodiment of the present application, the training the basic network model with the first sample to obtain the target network model includes:
training a basic network model by using the first sample to obtain an intermediate network model;
Predicting a second sample by adopting the intermediate network model to obtain a prediction result of the text quality of the second sample;
And if the prediction result meets the preset condition, training the basic network model by adopting the first sample and the second sample to obtain a target network model.
In this embodiment, the intermediate network model is the network model obtained after training the basic network model with the first sample. The second sample may be regarded as a sample for prediction; it is input into the intermediate network model to obtain a prediction result of its text quality. The prediction result includes a plurality of predicted values and corresponding probability values. For example, if the text quality of the first text is set to three levels (high quality, normal, and low quality), the predicted values are high quality, normal, and low quality, and the prediction result includes a probability value for each of the three levels, where the three probability values sum to 100%.
In this embodiment, the second samples whose prediction results satisfy the preset condition are manually labeled, and the basic network model is then trained with the first sample and the labeled second samples to obtain the target network model. In this way, the training data labeled in each round is exactly the data that the intermediate network model handles poorly, which effectively improves the learning efficiency of the basic network model while avoiding manual labeling of excessive sample data, thereby improving labeling efficiency.
In one embodiment of the present application, the prediction result includes at least two prediction values, and a probability of each of the at least two prediction values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, and the at least two predicted values comprise the first predicted value and the second predicted value, wherein the probability of the first predicted value is the maximum probability in the predicted result.
The preset threshold may be set according to practical situations, for example, 2%, 4%, etc., which is not limited herein. When the absolute value of the difference between the probability of the first predicted value and the probability of the second predicted value is smaller than the preset threshold, the probabilities of the two predicted values are close, and one of them is the maximum probability in the prediction result, so the intermediate network model is likely to misidentify the second sample. For example, if the probabilities predicted for high quality, normal, and low quality are 49%, 45%, and 6%, the probabilities of the first two grades are close, and the second sample may plausibly be either high quality or normal. To further improve the recognition accuracy of the intermediate network model, the second samples whose prediction results satisfy the preset condition are manually labeled, and the first sample and the labeled second samples are then used to train the basic network model to obtain the target network model.
In this embodiment, the preset condition requires that the absolute value of the difference between the probability of the first predicted value and the probability of the second predicted value be smaller than the preset threshold, so the data that the intermediate network model handles poorly can be manually labeled, and the basic network model is then trained with the first sample and the labeled second samples to obtain the target network model. The training data labeled in each round is thus exactly the data the intermediate network model handles poorly, which effectively improves the learning efficiency of the basic network model while avoiding manual labeling of excessive sample data, thereby improving labeling efficiency. The selection rule is sketched below.
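A minimal sketch of this selection rule, assuming softmax class probabilities over the quality grades; the 4% threshold is one of the example values above.

```python
# Sketch of the preset condition: a second sample is sent for manual
# labeling when its top-two class probabilities are close.
import torch
import torch.nn.functional as F


def needs_manual_labeling(logits: torch.Tensor,
                          threshold: float = 0.04) -> bool:
    """True if |P(first predicted value) - P(second predicted value)| is
    below the preset threshold, i.e. the intermediate model is uncertain."""
    probs = F.softmax(logits, dim=-1)      # e.g. [0.49, 0.45, 0.06]
    top2 = torch.topk(probs, k=2).values
    return (top2[0] - top2[1]).abs().item() < threshold
```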
In one embodiment of the application, the feature information includes inter-element relationship features and intra-element features of paragraphs;
wherein the inter-element relationship feature is determined from at least one of the following relationships:
a relationship between a title of the first text and a title of the paragraph;
a relationship between a title of the first text and the paragraph;
a relationship between the paragraph and other paragraphs of the first text;
The element intrinsic features include structural features and text features;
Wherein the structural features include an organization form of the paragraph and key information, and the text features include a quality of a title of the first text and a quality of a text of the paragraph.
The feature information includes inter-element relationship features and intra-element features of the paragraphs. The inter-element relationship features are defined as follows.
Relationships among the elements of an article (also called a text) reflect properties of the content such as satisfaction, richness, and logic, so element relationship features are extracted from the relationships between elements. These relationships fall largely into three categories: the relationship between the title (i.e., the title of the article) and a paragraph, the relationship between the title and a subtitle (i.e., the title of a paragraph), and the relationship between paragraphs.
The relationship between the article title and a paragraph is the first relationship. The first relationship tags are defined to include: quoted, satisfied, summarized, related, unrelated, etc., and the relationship between the title and the paragraph is characterized by a trained model. For example, the "satisfied" tag indicates that the current paragraph is a direct answer to the question in the title and is used to characterize the satisfaction, richness, text logic, answer integrity, etc. of the article.
The relationship between the title and a subtitle is the second relationship. The second relationship tags are defined to include: repeated, satisfied, related, etc., and the relationship between the title and the subtitle is characterized by a trained model. For example, the "repeated" tag indicates that the current subtitle is a restatement of the title and is used to locate the portion of the article that directly satisfies the title.
The relationship between paragraphs is the third relationship. The third relationship includes repetition, i.e., whether the current paragraph repeats other paragraphs; repetition reduces the amount of information in the content.
The intra-element features include structural features and text features, which characterize properties inherent to the article elements.
Structural features: an article has rich structural features, which reflect how conveniently information can be obtained from the content. The structural features mainly comprise two types of information: one is the article organization form, such as whether subtitles are contained, whether pictures are contained, and whether paragraphs are organized in list or ordered form; the other is key information processing, such as whether key information is bolded or highlighted.
Text features: these include the text quality of the title and the text quality of the paragraphs and are used to characterize user-experience factors. The text quality of a title mainly includes: whether it contains an intent, whether it contains a malicious intent, whether it has wrongly written words or phrases, and the like. The text quality of a paragraph mainly includes: content fluency, whether the text is truncated, whether it includes wrongly written words, and the like.
In this embodiment, the feature information includes the inter-element relationship features and the intra-element features of the paragraphs, so the feature information of the first sample is more comprehensive, and the target network model trained on the first sample identifies text quality more accurately. When the target network model is applied to the search field, search results can be filtered with it, so that accurate, rich, and friendly high-quality content is displayed to the user while low-quality content is kept out of the search results, further improving user search satisfaction. One possible organization of this feature information is sketched below.
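For illustration, the per-paragraph feature information described above might be organized as follows; all field names and types are assumptions, not taken from the patent.

```python
# Illustrative container for the per-paragraph feature information
# described above (field names are assumptions, not from the patent).
from dataclasses import dataclass


@dataclass
class ParagraphFeatures:
    # inter-element relationship features
    title_paragraph_relation: str    # quoted / satisfied / summarized / ...
    title_subtitle_relation: str     # repeated / satisfied / related / ...
    repeats_other_paragraph: bool
    # intra-element structural features
    has_subtitle: bool
    has_picture: bool
    list_or_ordered_form: bool
    key_info_highlighted: bool
    # intra-element text features
    title_text_quality: float        # e.g. a score in [0, 1]
    paragraph_text_quality: float    # fluency, truncation, typos, ...
```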
In one embodiment of the application, the structural features are obtained by applying preset rules to the paragraphs;
a first feature in the feature information other than the structural features is extracted from the first text with a feature extraction model, where the feature extraction model is obtained by training a basic extraction model with labeled third samples; each first feature corresponds to one feature extraction model.
Specifically, when the feature information of the first sample is obtained, the structural features in the feature information, for example, whether a subtitle is included, whether a picture is included, whether paragraphs are organized in list or ordered form, and whether key information is highlighted, may be determined by applying preset rules to the first sample, as sketched below.
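A minimal sketch of such preset rules, under the assumption that the first text is available as simple HTML (the patent does not specify the input format):

```python
# Rule-based structural feature extraction (illustrative; assumes the
# text is simple HTML, which the patent does not specify).
import re


def extract_structural_features(html: str) -> dict:
    return {
        "has_subtitle": bool(re.search(r"<h[2-6]\b", html)),
        "has_picture": bool(re.search(r"<img\b", html)),
        "list_or_ordered_form": bool(re.search(r"<(ul|ol)\b", html)),
        "key_info_highlighted": bool(re.search(r"<(b|strong|em)\b", html)),
    }
```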
The first features in the feature information other than the structural features are extracted with feature extraction models. Each first feature corresponds to one feature extraction model; the feature extraction model is trained with manually labeled third samples, and the trained feature extraction model is then used to extract the corresponding feature of the first text.
The inter-element relationship features and text features require partial semantic understanding and cannot be exhausted by constructing rules. For example, recognizing content fluency cannot be done by enumeration; a better classification effect can be achieved by fine-tuning ERNIE/BERT with a small number of samples. For example, ERNIE/BERT fine-tuning is used to train the relationship between the title and a paragraph, yielding a title-paragraph relation extraction model.
For example, to extract the truncation feature of the text, truncated samples and non-truncated samples need to be manually labeled; the basic extraction model is then trained with the labeled samples to obtain a trained truncation-feature extraction model, which is then used to extract the truncation feature of the first text. The basic extraction model may employ a neural network model.
For example, to extract the text-quality feature of the title, the quality of title samples needs to be manually labeled; the basic extraction model is then trained with the labeled title samples to obtain a title text-quality feature extraction model, which is then used to extract the text-quality feature of the title of the first text.
That is, for the first features other than the structural features in the feature information, relevant samples are obtained according to each first feature and labeled, and the basic extraction model is trained with the labeled samples to obtain the feature extraction model corresponding to that first feature, where each first feature corresponds to one feature extraction model. When the feature information of the first sample is obtained, the extraction model of each first feature performs feature extraction on the first sample to obtain that first feature of the first sample. One such extraction model is sketched below.
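As an illustration of one such extraction model, the sketch below classifies the title-paragraph relation with a BERT-style sequence-pair classifier, in the spirit of the ERNIE/BERT fine-tuning mentioned above; the checkpoint name and the five relation labels are assumptions, and the model is presumed to have been fine-tuned on the labeled third samples before use.

```python
# Sketch of a title-paragraph relation extraction model in the spirit of
# the ERNIE/BERT finetuning above. Checkpoint and labels are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["quoted", "satisfied", "summarized", "related", "unrelated"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=len(LABELS))  # fine-tune on third samples


def predict_title_paragraph_relation(title: str, paragraph: str) -> str:
    # Encode (title, paragraph) as a sequence pair and classify.
    inputs = tokenizer(title, paragraph, truncation=True,
                       max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]
```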
In this embodiment, the structural features are obtained by applying preset rules to the paragraphs, and the first features other than the structural features are extracted from the first text with feature extraction models, so the feature information of the first sample is more comprehensive and the target network model trained on the first sample identifies text quality more accurately. When the target network model is applied to the search field, search results can be filtered with it, so that accurate, rich, and friendly high-quality content is displayed to the user while low-quality content is kept out of the search results, further improving user search satisfaction.
Fig. 2 is another flowchart of the model training method provided by an embodiment of the present application. As shown in fig. 2, the inter-element relationship features include the relationship between the title of the text and the title of a paragraph, the relationship between the title of the text and a paragraph, and the relationship between a paragraph and other paragraphs of the text. The inter-element relationship features may be obtained by relationship modeling: for example, as described above, the third samples are manually labeled and the basic extraction model is trained to obtain a relation extraction model, which is then used to extract the inter-element relationship features of the text.
The intra-element features include structural features and text features; the structural features include the organization form of the paragraph and key information, and the text features include the quality of the title of the text and the quality of the text of the paragraphs. The quality of the title and the quality of the paragraph text may be determined by text modeling: for example, as described above, the third samples are manually labeled and the basic extraction model is trained to obtain a title text-quality extraction model, which is then used to extract the title text quality of the text.
The basic network model is trained with the inter-element relationship features and the intra-element features to obtain the deep learning sequence model, which is used to identify text quality and output a quality signal.
Fig. 3 is a schematic diagram of the deep learning sequence model (a bidirectional RNN structure) according to an embodiment of the present application. As shown in fig. 3, the deep learning sequence model includes an input layer, an output layer, a forward layer, and a backward layer. The samples input into the deep learning sequence model (hereinafter, the sequence model) are feature information in units of paragraphs of the text; for example, if the text includes 10 paragraphs, 10 groups of data are input into the sequence model, each group corresponding to the feature information of one paragraph. In fig. 3, the first, second, and third paragraphs are different paragraphs of the same text.
The maximum number of input groups of the sequence model is set to n, where n is a positive integer. If the number of feature groups of the currently acquired paragraphs is greater than n, truncation is performed so that the number of groups input into the sequence model is n; if it is smaller than n, padding is used to fill in the missing features so that the number of input groups is n. A sketch of this step, together with the sequence model, follows.
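A sketch of the truncate/pad step and of the bidirectional sequence model of fig. 3, assuming an LSTM realization of the forward and backward layers (the patent shows a bidirectional RNN but does not fix the cell type):

```python
# Truncate/pad the paragraph feature groups to exactly n, then classify
# with a bidirectional sequence model (LSTM cells are an assumption).
import torch
import torch.nn as nn


def fit_to_n_groups(features: torch.Tensor, n: int) -> torch.Tensor:
    """features: (num_paragraphs, feature_dim) -> (n, feature_dim)."""
    if features.size(0) >= n:
        return features[:n]                       # truncation
    pad = torch.zeros(n - features.size(0), features.size(1))
    return torch.cat([features, pad], dim=0)      # padding


class QualitySequenceModel(nn.Module):
    def __init__(self, feature_dim: int, hidden: int = 64,
                 num_classes: int = 3):
        super().__init__()
        # Forward and backward layers of fig. 3 as a bidirectional LSTM.
        self.rnn = nn.LSTM(feature_dim, hidden,
                           batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_classes)  # output layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, feature_dim) -> quality logits (batch, num_classes)
        h, _ = self.rnn(x)
        return self.out(h[:, -1])  # last step of the concatenated states
```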
Referring to fig. 4, fig. 4 is a flowchart of an information identification method provided in an embodiment of the present application. As shown in fig. 4, this embodiment provides an information identification method applied to an electronic device, including:
Step 201, acquiring a text to be identified, where the text to be identified is a text whose text quality needs to be identified through the target network model.
Step 202, identifying the text to be identified through a target network model to obtain a quality identification result of the text to be identified; the target network model is obtained by training a basic network model through a first sample, wherein the first sample comprises characteristic information of a paragraph of a first text and labeling information of the text quality of the first text.
Generally, an article is composed of four elements: a title, subtitles, general paragraphs, and pictures; typically one title covers a plurality of subtitles, and one subtitle covers a plurality of paragraphs and pictures. In this embodiment, the first text includes a title and one or more of subtitles and paragraphs. The text quality of the first text can be set according to the actual situation; for example, it may be set to three grades (high quality, normal, and low quality), or to 10 grades, where each grade corresponds to a score interval and a higher score indicates better quality. The labeling information, i.e., the quality level or score of the first text, can be labeled manually.
The first sample includes feature information of the paragraphs of the first text; if the first text includes a plurality of paragraphs, the first sample includes feature information of each paragraph, that is, the feature information is acquired in units of paragraphs.
The basic network model may be a neural network model, preferably a deep learning sequence model. The basic network model is trained with the first sample to obtain the target network model, which can then be used to identify text quality.
In this embodiment, a text to be identified is acquired and identified through the target network model to obtain a quality recognition result of the text to be identified, where the target network model is obtained by training the basic network model with a first sample that includes feature information of paragraphs of a first text and labeling information of the text quality of the first text. Identifying the text quality of the text to be identified with the target network model saves labor cost and improves recognition efficiency. A sketch of this flow follows.
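Putting the illustrative pieces above together, steps 201 and 202 might look as follows; the quality grade names and the value of n are assumptions.

```python
# Sketch of steps 201-202, reusing the illustrative helpers above.
import torch

QUALITY_NAMES = ["low quality", "normal", "high quality"]  # assumed order


def identify_text_quality(model: torch.nn.Module,
                          paragraph_features: torch.Tensor,
                          n: int = 32) -> str:
    """Identify the quality of a text to be identified from its
    per-paragraph feature groups of shape (num_paragraphs, feature_dim)."""
    model.eval()
    x = fit_to_n_groups(paragraph_features, n).unsqueeze(0)
    with torch.no_grad():
        logits = model(x)
    return QUALITY_NAMES[int(logits.argmax(dim=-1))]
```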
In one embodiment of the present application, the process of training the basic network model through the first sample to obtain the target network model includes:
training a basic network model by using the first sample to obtain an intermediate network model;
Predicting a second sample by adopting the intermediate network model to obtain a prediction result of the text quality of the second sample;
And if the prediction result meets the preset condition, training the basic network model by adopting the first sample and the second sample to obtain a target network model.
In this embodiment, the intermediate network model is the network model obtained after training the basic network model with the first sample. The second sample may be regarded as a sample for prediction; it is input into the intermediate network model to obtain a prediction result of its text quality. The prediction result includes a plurality of predicted values and corresponding probability values. For example, if the text quality of the first text is set to three levels (high quality, normal, and low quality), the predicted values are high quality, normal, and low quality, and the prediction result includes a probability value for each of the three levels, where the three probability values sum to 100%.
Through interaction between the model and the labeling personnel, a small number of labeled samples are used to train a model with the best generalization effect.
During training of the intermediate network model, a batch of fixed-set data (i.e., second samples) is screened. After each round of training, the intermediate network model predicts the fixed-set data; the data whose prediction results satisfy the preset condition are labeled and then added to the training set. The labeled training data is thus exactly the data the current model handles poorly, which effectively improves the learning efficiency of the model and avoids excessive invalid data.
In this embodiment, the second samples whose prediction results satisfy the preset condition are manually labeled, and the basic network model is then trained with the first sample and the labeled second samples to obtain the target network model. In this way, the training data labeled in each round is exactly the data that the intermediate network model handles poorly, which effectively improves the learning efficiency of the basic network model while avoiding manual labeling of excessive sample data, thereby improving labeling efficiency.
In one embodiment of the present application, the prediction result includes at least two predicted values, and a probability of each predicted value of the at least two predicted values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, and the at least two predicted values comprise the first predicted value and the second predicted value, wherein the probability of the first predicted value is the maximum probability in the predicted result.
The preset threshold may be set according to practical situations, for example, 2%, 4%, etc., which is not limited herein. When the absolute value of the difference between the probability of the first predicted value and the probability of the second predicted value is smaller than the preset threshold, the probabilities of the two predicted values are close, and one of them is the maximum probability in the prediction result, so the intermediate network model is likely to misidentify the second sample. For example, if the probabilities predicted for high quality, normal, and low quality are 49%, 45%, and 6%, the probabilities of the first two grades are close, and the second sample may plausibly be either high quality or normal. To further improve the recognition accuracy of the intermediate network model, the second samples whose prediction results satisfy the preset condition are manually labeled, and the first sample and the labeled second samples are then used to train the basic network model to obtain the target network model.
In this embodiment, the preset condition requires that the absolute value of the difference between the probability of the first predicted value and the probability of the second predicted value be smaller than the preset threshold, so the data that the intermediate network model handles poorly can be manually labeled, and the basic network model is then trained with the first sample and the labeled second samples to obtain the target network model. The training data labeled in each round is thus exactly the data the intermediate network model handles poorly, which effectively improves the learning efficiency of the basic network model while avoiding manual labeling of excessive sample data, thereby improving labeling efficiency.
In one embodiment of the application, the feature information includes inter-element relationship features and intra-element features of paragraphs;
wherein the inter-element relationship feature is determined from at least one of the following relationships:
a relationship between a title of the first text and a title of the paragraph;
a relationship between a title of the first text and the paragraph;
a relationship between the paragraph and other paragraphs of the first text;
The element intrinsic features include structural features and text features;
Wherein the structural features include an organization form of the paragraph and key information, and the text features include a quality of a title of the first text and a quality of a text of the paragraph.
The feature information includes inter-element relationship features and intra-element features of the paragraphs. The inter-element relationship features are defined as follows.
Relationships among the elements of an article (also called a text) reflect properties of the content such as satisfaction, richness, and logic, so element relationship features are extracted from the relationships between elements. These relationships fall largely into three categories: the relationship between the title (i.e., the title of the article) and a paragraph, the relationship between the title and a subtitle (i.e., the title of a paragraph), and the relationship between paragraphs.
The relationship between the article title and a paragraph is the first relationship. The first relationship tags are defined to include: quoted, satisfied, summarized, related, unrelated, etc., and the relationship between the title and the paragraph is characterized by a trained model. For example, the "satisfied" tag indicates that the current paragraph is a direct answer to the question in the title and is used to characterize the satisfaction, richness, text logic, answer integrity, etc. of the article.
The relationship between the title and a subtitle is the second relationship. The second relationship tags are defined to include: repeated, satisfied, related, etc., and the relationship between the title and the subtitle is characterized by a trained model. For example, the "repeated" tag indicates that the current subtitle is a restatement of the title and is used to locate the portion of the article that directly satisfies the title.
The relationship between paragraphs is the third relationship. The third relationship includes repetition, i.e., whether the current paragraph repeats other paragraphs; repetition reduces the amount of information in the content.
The intra-element features include structural features and text features, which characterize properties inherent to the article elements.
Structural features: an article has rich structural features, which reflect how conveniently information can be obtained from the content. The structural features mainly comprise two types of information: one is the article organization form, such as whether subtitles are contained, whether pictures are contained, and whether paragraphs are organized in list or ordered form; the other is key information processing, such as whether key information is bolded or highlighted.
Text features: these include the text quality of the title and the text quality of the paragraphs and are used to characterize user-experience factors. The text quality of a title mainly includes: whether it contains an intent, whether it contains a malicious intent, whether it has wrongly written words or phrases, and the like. The text quality of a paragraph mainly includes: content fluency, whether the text is truncated, whether it includes wrongly written words, and the like.
In this embodiment, the feature information includes the inter-element relationship features and the intra-element features of the paragraphs, so the feature information of the first sample is more comprehensive, and the target network model trained on the first sample identifies text quality more accurately. When the target network model is applied to the search field, search results can be filtered with it, so that accurate, rich, and friendly high-quality content is displayed to the user while low-quality content is kept out of the search results, further improving user search satisfaction.
In one embodiment of the application, the structural features are obtained by applying preset rules to the paragraphs;
a first feature in the feature information other than the structural features is extracted from the first text with a feature extraction model, where the feature extraction model is obtained by training a basic extraction model with labeled third samples;
wherein each first feature corresponds to a feature extraction model.
Specifically, when the feature information of the first sample is obtained, the structural features in the feature information, for example, whether a subtitle is included, whether a picture is included, whether paragraphs are organized in list or ordered form, and whether key information is highlighted, may be determined by applying preset rules to the first sample.
The first features in the feature information other than the structural features are extracted with feature extraction models. Each first feature corresponds to one feature extraction model; the feature extraction model is trained with manually labeled third samples, and the trained feature extraction model is then used to extract the corresponding feature of the first text.
The inter-element relationship features and text features require partial semantic understanding and cannot be exhausted by constructing rules. For example, recognizing content fluency cannot be done by enumeration; a better classification effect can be achieved by fine-tuning ERNIE/BERT with a small number of samples. For example, ERNIE/BERT fine-tuning is used to train the relationship between the title and a paragraph, yielding a title-paragraph relation extraction model.
For example, to extract the truncation feature of the text, truncated samples and non-truncated samples need to be manually labeled; the basic extraction model is then trained with the labeled samples to obtain a trained truncation-feature extraction model, which is then used to extract the truncation feature of the first text. The basic extraction model may employ a neural network model.
For example, to extract the text-quality feature of the title, the quality of title samples needs to be manually labeled; the basic extraction model is then trained with the labeled title samples to obtain a title text-quality feature extraction model, which is then used to extract the text-quality feature of the title of the first text.
That is, for the first features other than the structural features in the feature information, relevant samples are obtained according to each first feature and labeled, and the basic extraction model is trained with the labeled samples to obtain the feature extraction model corresponding to that first feature, where each first feature corresponds to one feature extraction model. When the feature information of the first sample is obtained, the extraction model of each first feature performs feature extraction on the first sample to obtain that first feature of the first sample.
In this embodiment, the structural features are obtained by applying preset rules to the paragraphs, and the first features other than the structural features are extracted from the first text with feature extraction models, so the feature information of the first sample is more comprehensive and the target network model trained on the first sample identifies text quality more accurately. When the target network model is applied to the search field, search results can be filtered with it, so that accurate, rich, and friendly high-quality content is displayed to the user while low-quality content is kept out of the search results, further improving user search satisfaction.
As shown in fig. 2, the inter-element relationship features include: the relationship between the title of the text and the title of a paragraph, the relationship between the title of the text and a paragraph, and the relationship between a paragraph and other paragraphs of the text. The inter-element relationship features may be obtained by relationship modeling: for example, as described above, the third samples are manually labeled and the basic extraction model is trained to obtain a relation extraction model, which is then used to extract the inter-element relationship features of the text.
The intra-element features include structural features and text features; the structural features include the organization form of the paragraph and key information, and the text features include the quality of the title of the text and the quality of the text of the paragraphs. The quality of the title and the quality of the paragraph text may be determined by text modeling: for example, as described above, the third samples are manually labeled and the basic extraction model is trained to obtain a title text-quality extraction model, which is then used to extract the title text quality of the text.
The basic network model is trained with the inter-element relationship features and the intra-element features to obtain the deep learning sequence model, which is used to identify text quality and output a quality signal.
Fig. 3 is a structural diagram of the deep learning sequence model according to an embodiment of the present application. As shown in fig. 3, the deep learning sequence model includes an input layer, an output layer, a forward layer, and a backward layer. The samples input into the deep learning sequence model (hereinafter, the sequence model) are feature information in units of paragraphs of the text; for example, if the text includes 10 paragraphs, 10 groups of data are input into the sequence model, each group corresponding to the feature information of one paragraph. The first, second, and third paragraphs in fig. 3 are different paragraphs of the same text.
The maximum number of input groups of the sequence model is set to n, where n is a positive integer. If the number of feature groups of the currently acquired paragraphs is greater than n, truncation is performed so that the number of groups input into the sequence model is n; if it is smaller than n, padding is used to fill in the missing features so that the number of input groups is n.
Furthermore, when quality recognition is performed on a long text, an abstract can be generated from the long text, and text quality recognition is then performed on the abstract through the deep learning sequence model, realizing quality recognition with the abstract as the subject. A long text is a text whose number of characters is greater than a preset value. A sketch of this variant follows.
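A sketch of this long-text variant; the summarizer and the feature extractor are passed in as hypothetical placeholders, since the patent does not specify how the abstract is generated or how features are extracted at inference time.

```python
# Long-text variant: texts longer than a preset value are first reduced
# to an abstract, and quality recognition is run on the abstract.
# summarize and extract_features are hypothetical callables, not
# components named in the patent.
from typing import Callable

import torch


def identify_long_text(model: torch.nn.Module, text: str,
                       summarize: Callable[[str], str],
                       extract_features: Callable[[str], torch.Tensor],
                       preset_length: int = 5000) -> str:
    body = summarize(text) if len(text) > preset_length else text
    # Reuse the illustrative identify_text_quality sketch from above.
    return identify_text_quality(model, extract_features(body))
```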
Referring to fig. 5, fig. 5 is a block diagram of a model training apparatus according to an embodiment of the present application. As shown in fig. 5, this embodiment provides a model training apparatus 500, including:
An obtaining module 501, configured to obtain a first sample, where the first sample includes feature information of a paragraph of a first text, and labeling information of text quality of the first text;
The training module 502 is configured to train the basic network model by using the first sample, and obtain a target network model for performing text quality recognition.
In one embodiment of the present application, the training module 502 includes:
the first acquisition submodule is used for training the basic network model by utilizing the first sample to acquire an intermediate network model;
The second obtaining submodule is used for predicting a second sample by adopting the intermediate network model to obtain a prediction result of the text quality of the second sample;
And the training sub-module is used for training the basic network model by adopting the first sample and the second sample if the prediction result meets the preset condition, so as to obtain a target network model.
In one embodiment of the present application, the prediction result includes at least two predicted values, and a probability of each predicted value of the at least two predicted values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, and the at least two predicted values comprise the first predicted value and the second predicted value, wherein the probability of the first predicted value is the maximum probability in the predicted result.
In one embodiment of the application, the feature information includes inter-element relationship features and intra-element features of paragraphs;
wherein the inter-element relationship feature is determined from at least one of the following relationships:
a relationship between a title of the first text and a title of the paragraph;
a relationship between a title of the first text and the paragraph;
a relationship between the paragraph and other paragraphs of the first text;
The element intrinsic features include structural features and text features;
Wherein the structural features include an organization form of the paragraph and key information, and the text features include a quality of a title of the first text and a quality of a text of the paragraph.
In one embodiment of the application, the structural features are obtained by applying preset rules to the paragraphs;
a first feature in the feature information other than the structural features is extracted from the first text with a feature extraction model, where the feature extraction model is obtained by training a basic extraction model with labeled third samples;
wherein each first feature corresponds to a feature extraction model.
The model training apparatus 500 can implement each process implemented by the electronic device in the method embodiment shown in fig. 1, and in order to avoid repetition, a description is omitted here.
According to the model training apparatus 500 provided by the embodiment of the application, a first sample is acquired, where the first sample includes feature information of paragraphs of a first text and labeling information of the text quality of the first text, and the basic network model is trained with the first sample to obtain a target network model for text quality recognition. Because the basic network model is trained with the first sample to obtain the target network model for identifying text quality, the target network model can be used to audit text information and identify its quality, improving recognition efficiency.
Referring to fig. 6, fig. 6 is a block diagram of an information identifying apparatus according to an embodiment of the present application. As shown in fig. 6, this embodiment provides an information identifying apparatus 600, including:
an obtaining module 601, configured to obtain a text to be identified;
The recognition module 602 is configured to recognize the text to be recognized through a target network model, and obtain a quality recognition result of the text to be recognized; the target network model is obtained by training a basic network model through a first sample, wherein the first sample comprises characteristic information of a paragraph of a first text and labeling information of the text quality of the first text.
In one embodiment of the present application, the process of training the basic network model through the first sample to obtain the target network model includes:
training a basic network model by using the first sample to obtain an intermediate network model;
Predicting a second sample by adopting the intermediate network model to obtain a prediction result of the text quality of the second sample;
And if the prediction result meets the preset condition, training the basic network model by adopting the first sample and the second sample to obtain a target network model.
In one embodiment of the present application, the prediction result includes at least two predicted values, and a probability of each predicted value of the at least two predicted values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, and the at least two predicted values comprise the first predicted value and the second predicted value, wherein the probability of the first predicted value is the maximum probability in the predicted result.
In one embodiment of the application, the feature information includes inter-element relationship features and intra-element features of paragraphs;
wherein the inter-element relationship feature is determined from at least one of the following relationships:
a relationship between a title of the first text and a title of the paragraph;
a relationship between a title of the first text and the paragraph;
a relationship between the paragraph and other paragraphs of the first text;
The element intrinsic features include structural features and text features;
Wherein the structural features include an organization form of the paragraph and key information, and the text features include a quality of a title of the first text and a quality of a text of the paragraph.
In one embodiment of the application, the structural features are obtained by applying preset rules to the paragraphs;
for each first feature in the feature information other than the structural features, the first feature is extracted from the first text by a feature extraction model, where the feature extraction model is obtained by training a basic extraction model with a labeled third sample;
wherein each first feature corresponds to its own feature extraction model.
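A toy illustration of such preset rules (the actual rules are not disclosed, so both heuristics below are placeholders) might be:

```python
import re

def extract_structural_features(paragraph: str) -> dict:
    """Apply simple preset rules to one paragraph; placeholder heuristics only."""
    lines = [ln for ln in paragraph.splitlines() if ln.strip()]
    # Rule 1: if every non-empty line looks like a bullet or a numbered step,
    # call the organization form a list; otherwise treat it as plain prose.
    if lines and all(re.match(r"^\s*(\d+[.)]|[-*])\s+", ln) for ln in lines):
        organization_form = "list"
    else:
        organization_form = "plain prose"
    # Rule 2: crudely treat the lead line as the paragraph's key information.
    key_information = lines[:1]
    return {"organization_form": organization_form,
            "key_information": key_information}
```

In practice the preset rules would encode whatever organization forms and key-information cues the labeled data actually distinguishes; only the features other than these structural ones go through trained extraction models.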
The information identification apparatus 600 can implement each process implemented by the electronic device in the method embodiment shown in fig. 4; to avoid repetition, the description is not repeated here.
The information identification apparatus 600 of this embodiment of the present application acquires a text to be identified and identifies it through a target network model to obtain a quality recognition result of the text, where the target network model is obtained by training a basic network model with a first sample that includes the feature information of a paragraph of a first text and the labeling information of the text quality of the first text. Using the target network model to identify the text quality of the text to be identified saves labor cost and improves recognition efficiency.
According to an embodiment of the present application, the present application also provides an electronic device and a readable storage medium.
As shown in fig. 7, fig. 7 is a block diagram of an electronic device for the model training method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be exemplary only and are not meant to limit the implementations of the application described and/or claimed herein. The block diagram shown in fig. 7 may also serve as a block diagram of an electronic device for the information identification method.
As shown in fig. 7, the electronic device includes: one or more processors 701, a memory 702, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other ways as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to an interface. In other implementations, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Likewise, multiple electronic devices may be connected, with each device providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 701 is taken as an example in fig. 7.
Memory 702 is a non-transitory computer readable storage medium provided by the present application. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method of model training or information identification provided by the present application. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the method of model training or information recognition provided by the present application.
The memory 702, as a non-transitory computer readable storage medium, can be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as the program instructions/modules corresponding to the model training or information identification method in the embodiments of the present application (e.g., the acquisition module 501 and the training module 502 shown in fig. 5, and the acquisition module 601 and the recognition module 602 shown in fig. 6). By running the non-transitory software programs, instructions, and modules stored in the memory 702, the processor 701 executes the various functional applications and data processing of the server, that is, implements the model training or information identification method in the above method embodiments.
The memory 702 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created according to the use of the electronic device for model training or information identification, and the like. In addition, the memory 702 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some embodiments, the memory 702 optionally includes memory located remotely from the processor 701, and such remote memory may be connected over a network to the electronic device for model training or information identification. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for the model training or information identification method may further include an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703, and the output device 704 may be connected by a bus or in other ways; connection by a bus is taken as an example in fig. 7.
The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for model training or information identification; examples include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, and a joystick. The output device 704 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor and which can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiment of the application, a first sample is acquired, where the first sample includes the feature information of a paragraph of a first text and the labeling information of the text quality of the first text, and the basic network model is trained with the first sample to obtain a target network model for text quality recognition. Because the basic network model is trained with the first sample to obtain a target network model for identifying text quality, the target network model can be used to audit text information and identify its quality, which improves recognition efficiency.
A second sample whose prediction result meets the preset condition is labeled manually, and the basic network model is then trained with the first sample and the labeled second sample to obtain the target network model. Because the data labeled in each round is exactly the data that the intermediate network model handles poorly, the learning efficiency of the basic network model is effectively improved; at the same time, manually labeling excessive sample data is avoided, which improves labeling efficiency.
Because the preset condition is that the absolute value of the difference between the probability of the first predicted value and the probability of the second predicted value is smaller than a preset threshold, the data that the intermediate network model handles poorly can be labeled manually, after which the basic network model is trained with the first sample and the labeled second sample to obtain the target network model.
Because the feature information includes both the inter-element relationship features and the element-intrinsic features of the paragraphs, the feature information of the first sample is more comprehensive, and the target network model trained on the first sample recognizes text quality more accurately. When the target network model is applied to the search field, search results can be filtered with it, so that accurate, rich, and friendly high-quality content is presented to the user while low-quality content is kept out of the search results, further improving user search satisfaction.
The structural features are obtained by applying preset rules to the paragraphs, and the first features in the feature information other than the structural features are extracted from the first text by feature extraction models. This also makes the feature information of the first sample more comprehensive, so the target network model trained on the first sample recognizes text quality more accurately, with the same benefits for filtering search results as described above.
It should be understood that steps may be reordered, added, or deleted in the various flows shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (9)

1. A model training method, comprising:
acquiring a first sample, wherein the first sample comprises feature information of a paragraph of a first text and annotation information of the text quality of the first text; wherein the feature information comprises inter-element relationship features and element-intrinsic features of the paragraph, the inter-element relationship features are used for characterizing relationships between different element types, the element types comprise text titles, paragraph titles, and paragraphs, the element-intrinsic features comprise structural features and text features, the annotation information is used for indicating a quality level or score information of the text quality of the first text, and the text quality is determined based on the title text quality and the paragraph text quality of the first text;
training the basic network model by using the first sample to obtain a target network model for text quality recognition;
wherein the training the basic network model by using the first sample to obtain the target network model for text quality recognition comprises:
training a basic network model by using the first sample to obtain an intermediate network model;
predicting a second sample by using the intermediate network model to obtain a prediction result of the text quality of the second sample;
if the prediction result meets a preset condition, training the basic network model by adopting the first sample and the second sample to obtain a target network model;
the prediction result comprises at least two predicted values and a probability of each of the at least two predicted values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, wherein the at least two predicted values comprise the first predicted value and the second predicted value, and the probability of the first predicted value is the maximum probability in the prediction result.
2. The method of claim 1, wherein the inter-element relationship feature is determined from at least one of the following relationships:
a relationship between a title of the first text and a title of the paragraph;
a relationship between a title of the first text and the paragraph;
a relationship between the paragraph and other paragraphs of the first text;
wherein the structural features include the organization form of the paragraph and its key information, and the text features include the quality of the title of the first text and the quality of the text of the paragraph.
3. The method of claim 2, wherein the structural feature is obtained by applying a preset rule to the paragraph;
extracting, for each first feature in the feature information other than the structural features, the first feature from the first text by a feature extraction model, wherein the feature extraction model is obtained by training a basic extraction model with a labeled third sample;
wherein each first feature corresponds to a feature extraction model.
4. A model training apparatus comprising:
an acquisition module, configured to acquire a first sample, wherein the first sample comprises feature information of a paragraph of a first text and annotation information of the text quality of the first text; wherein the feature information comprises inter-element relationship features and element-intrinsic features of the paragraph, the inter-element relationship features are used for characterizing relationships between different element types, the element types comprise text titles, paragraph titles, and paragraphs, the element-intrinsic features comprise structural features and text features, the annotation information is used for indicating a quality level or score information of the text quality of the first text, and the text quality is determined based on the title text quality and the paragraph text quality of the first text;
a training module, configured to train the basic network model by using the first sample to obtain a target network model for text quality recognition;
the training module comprises:
a first acquisition submodule, configured to train the basic network model by using the first sample to obtain an intermediate network model;
a second acquisition submodule, configured to predict a second sample by using the intermediate network model to obtain a prediction result of the text quality of the second sample;
a training submodule, configured to train the basic network model by using the first sample and the second sample if the prediction result meets a preset condition, to obtain the target network model;
the prediction result comprises at least two predicted values and a probability of each of the at least two predicted values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, wherein the at least two predicted values comprise the first predicted value and the second predicted value, and the probability of the first predicted value is the maximum probability in the prediction result.
5. The apparatus of claim 4, wherein the feature information comprises inter-element relationship features and element-intrinsic features of the paragraph;
wherein the inter-element relationship feature is determined from at least one of the following relationships:
a relationship between a title of the first text and a title of the paragraph;
a relationship between a title of the first text and the paragraph;
a relationship between the paragraph and other paragraphs of the first text;
wherein the structural features include the organization form of the paragraph and its key information, and the text features include the quality of the title of the first text and the quality of the text of the paragraph.
6. The apparatus of claim 5, wherein the structural feature is obtained by applying a preset rule to the paragraph;
extracting, for each first feature in the feature information other than the structural features, the first feature from the first text by a feature extraction model, wherein the feature extraction model is obtained by training a basic extraction model with a labeled third sample;
wherein each first feature corresponds to a feature extraction model.
7. An electronic device, comprising:
at least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-3.
8. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-3.
9. An information identification method, comprising:
acquiring a text to be identified;
identifying the text to be identified through a target network model to obtain a quality recognition result of the text to be identified;
wherein the target network model is obtained by training a basic network model with a first sample, and the first sample comprises feature information of a paragraph of a first text and annotation information of the text quality of the first text; wherein the feature information comprises inter-element relationship features and element-intrinsic features of the paragraph, the inter-element relationship features are used for characterizing relationships between different element types, the element types comprise text titles, paragraph titles, and paragraphs, the element-intrinsic features comprise structural features and text features, the annotation information is used for indicating a quality level or score information of the text quality of the first text, and the text quality is determined based on the title text quality and the paragraph text quality of the first text;
The training process of the target network model comprises the following steps:
training a basic network model by using the first sample to obtain an intermediate network model;
predicting a second sample by using the intermediate network model to obtain a prediction result of the text quality of the second sample;
if the prediction result meets a preset condition, training the basic network model by adopting the first sample and the second sample to obtain a target network model;
the prediction result comprises at least two predicted values and a probability of each of the at least two predicted values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, wherein the at least two predicted values comprise the first predicted value and the second predicted value, and the probability of the first predicted value is the maximum probability in the prediction result.
CN202010697599.4A 2020-07-20 2020-07-20 Model training method, information identification device, electronic equipment and storage medium Active CN111858905B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010697599.4A CN111858905B (en) 2020-07-20 2020-07-20 Model training method, information identification device, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN111858905A CN111858905A (en) 2020-10-30
CN111858905B true CN111858905B (en) 2024-05-07

Family

ID=73001312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010697599.4A Active CN111858905B (en) 2020-07-20 2020-07-20 Model training method, information identification device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111858905B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464083A (en) * 2020-11-16 2021-03-09 北京达佳互联信息技术有限公司 Model training method, work pushing method, device, electronic equipment and storage medium
CN112507931B (en) * 2020-12-16 2023-12-22 华南理工大学 Deep learning-based information chart sequence detection method and system
CN114417974B (en) * 2021-12-22 2023-06-20 北京百度网讯科技有限公司 Model training method, information processing device, electronic equipment and medium
CN114462531A (en) * 2022-01-30 2022-05-10 支付宝(杭州)信息技术有限公司 Model training method and device and electronic equipment


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150199913A1 (en) * 2014-01-10 2015-07-16 LightSide Labs, LLC Method and system for automated essay scoring using nominal classification
US20200184016A1 (en) * 2018-12-10 2020-06-11 Government Of The United States As Represetned By The Secretary Of The Air Force Segment vectors

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107193805A (en) * 2017-06-06 2017-09-22 北京百度网讯科技有限公司 Article Valuation Method, device and storage medium based on artificial intelligence
CN110674414A (en) * 2019-09-20 2020-01-10 北京字节跳动网络技术有限公司 Target information identification method, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Application of a BERT-feature-based bidirectional LSTM neural network to input recommendation for Chinese electronic medical records; 赵璐偲; 岁波; 罗海琼; 陈旭; 宋晓霞; 洪平; China Digital Medicine (中国数字医学); 2020-04-15 (No. 04) *
Information extraction from financial announcements based on document structure and deep learning; 黄胜; 王博博; 朱菁; Computer Engineering and Design (计算机工程与设计); 2020-01-16 (No. 01) *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant