CN111858905A - Model training method, information identification method, device, electronic equipment and storage medium - Google Patents

Model training method, information identification method, device, electronic equipment and storage medium

Info

Publication number
CN111858905A
Authority
CN
China
Prior art keywords
text
sample
network model
quality
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010697599.4A
Other languages
Chinese (zh)
Inventor
白亚楠
刘子航
林荣逸
欧阳宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010697599.4A priority Critical patent/CN111858905A/en
Publication of CN111858905A publication Critical patent/CN111858905A/en
Pending legal-status Critical Current

Classifications

    • G06F 16/335 Information retrieval of unstructured textual data; querying; filtering based on additional data, e.g. user or group profiles
    • G06F 16/35 Information retrieval of unstructured textual data; clustering; classification
    • G06N 3/045 Neural networks; architecture; combinations of networks
    • G06N 3/08 Neural networks; learning methods

Abstract

The application discloses a model training method, an information identification method, an apparatus, an electronic device, and a storage medium, and relates to the technical field of deep learning. The specific implementation scheme is as follows: acquiring a first sample, wherein the first sample comprises feature information of a paragraph of a first text and labeling information of the text quality of the first text; and training a basic network model by using the first sample to obtain a target network model for text quality recognition. Because the target network model is trained on the first sample, it can be used to review text information and identify its quality, improving recognition efficiency.

Description

Model training method, information identification method, device, electronic equipment and storage medium
Technical Field
The present application relates to deep learning technologies in the field of data processing, and in particular to a model training method, an information recognition method, an apparatus, an electronic device, and a storage medium.
Background
With the development of technology, more and more people use search tools to query information, for example to search for papers, for information related to merchandise, or for medical information. Because existing information bases include large amounts of text information of uneven quality, the quality of the text must be identified from its content; for example, the text information in the information base is checked by manual review, and its quality is judged by hand.
Disclosure of Invention
The disclosure provides a model training method, an information identification method, an apparatus, an electronic device, and a storage medium.
According to a first aspect of the present disclosure, there is provided a model training method, comprising:
acquiring a first sample, wherein the first sample comprises feature information of a paragraph of a first text and labeling information of text quality of the first text;
and training the basic network model by using the first sample to obtain a target network model for text quality recognition.
According to a second aspect of the present disclosure, there is provided a model training apparatus comprising:
the acquisition module is used for acquiring a first sample, wherein the first sample comprises feature information of a paragraph of a first text and labeling information of text quality of the first text;
and the training module is used for training the basic network model by using the first sample to obtain a target network model for text quality recognition.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the first aspect.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of the first aspects.
According to a fifth aspect of the present disclosure, there is provided an information identifying method including:
acquiring a text to be identified;
identifying the text to be identified through a target network model to obtain a quality identification result of the text to be identified; the target network model is obtained by training a basic network model through a first sample, wherein the first sample comprises feature information of a paragraph of a first text and labeling information of text quality of the first text.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a flow chart of a model training method provided by an embodiment of the present application;
FIG. 2 is another flow chart of a model training method provided by an embodiment of the present application;
Fig. 3 is a structural diagram of a deep learning sequence model provided in an embodiment of the present application;
FIG. 4 is a flow chart of an information identification method provided by an embodiment of the present application;
FIG. 5 is a block diagram of a model training apparatus according to an embodiment of the present disclosure;
FIG. 6 is a block diagram of an information recognition apparatus provided in an embodiment of the present application;
FIG. 7 is a block diagram of an electronic device for implementing a model training method according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details to aid understanding; these details are to be considered exemplary only. Those of ordinary skill in the art will recognize that various changes and modifications may be made to the described embodiments without departing from the scope and spirit of the present application. Descriptions of well-known functions and constructions are omitted in the following for clarity and conciseness.
Referring to fig. 1, fig. 1 is a flowchart of a model training method provided in an embodiment of the present application, and as shown in fig. 1, the embodiment provides a model training method applied to an electronic device, including the following steps:
step 101, obtaining a first sample, wherein the first sample comprises feature information of a paragraph of a first text and labeling information of text quality of the first text.
Generally, an article is composed of four types of elements: a title, subtitles, ordinary paragraphs, and pictures. A title usually has several subtitles under it, and each subtitle has several paragraphs and pictures under it. In this embodiment, the first text includes one or more of a title, a subtitle, and one or more paragraphs. The text quality of the first text can be graded as the application requires: for example, into three levels (high quality, normal, and low quality), or into 10 levels, each corresponding to a score interval, where a higher score means better quality. The labeling information is the quality level or score of the first text and may be labeled manually. The first sample includes the feature information of the paragraphs of the first text; if the first text includes multiple paragraphs, the first sample includes the feature information of each paragraph, that is, the feature information in the first sample is obtained in units of paragraphs.
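As a concrete illustration, such a first sample might be represented as below. This is a hypothetical sketch: the field names ("paragraph_features", "quality_label") and the individual feature keys are assumptions chosen for illustration, not part of the disclosure.

```python
# Hypothetical shape of a first sample: one feature group per paragraph,
# plus a manually labeled text-level quality label.
first_sample = {
    "paragraph_features": [
        # Feature group for paragraph 1 (keys are illustrative only)
        {"title_paragraph_relation": "satisfy", "has_subtitle": True, "is_truncated": False},
        # Feature group for paragraph 2
        {"title_paragraph_relation": "related", "has_subtitle": False, "is_truncated": False},
    ],
    "quality_label": "high",  # one of "high", "normal", "low"
}
```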
And 102, training the basic network model by using the first sample to obtain a target network model for text quality recognition.
The base network model may be a neural network model, preferably a deep learning sequence model. And training the basic network model by using the first sample to obtain a target network model, wherein the target network model can be used for text quality recognition.
In this embodiment, a first sample is obtained, where the first sample includes feature information of a paragraph of a first text and labeling information of the text quality of the first text, and the basic network model is trained with the first sample to obtain a target network model for text quality recognition. Because the target network model is trained on the first sample, it can be used to review text information and identify its quality, improving recognition efficiency.
In an embodiment of the present application, the training the basic network model by using the first sample to obtain the target network model includes:
training a basic network model by using the first sample to obtain an intermediate network model;
predicting a second sample by adopting the intermediate network model to obtain a prediction result of the text quality of the second sample;
and if the prediction result meets a preset condition, training the basic network model by using the first sample and the second sample to obtain the target network model.
In this embodiment, the intermediate network model is the network model obtained by training the basic network model with the first sample. The second sample can be regarded as a sample for prediction: it is input into the intermediate network model to obtain a prediction result for its text quality. The prediction result includes several predicted values and their corresponding probabilities. For example, if the text quality of the first text is set to three levels (high quality, normal, and low quality), the predicted values are high quality, normal, and low quality, and the prediction result includes a probability for each, with the three probabilities summing to 100%.
In this embodiment, second samples whose prediction results meet the preset condition are manually labeled, and the basic network model is then trained with the first sample and the labeled second samples to obtain the target network model. Each round of labeling therefore targets exactly the data that the intermediate network model handles poorly, which improves the learning efficiency of the basic network model while avoiding manual labeling of excessive sample data, improving labeling efficiency.
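The train-predict-relabel cycle described above can be sketched as a single active-learning round. This is a minimal sketch under assumptions: the injected callables (`train_fn`, `predict_fn`, `label_fn`) stand in for the actual training, prediction, and manual-labeling steps, none of which are specified in detail by the disclosure.

```python
def active_learning_round(train_fn, predict_fn, label_fn,
                          first_samples, pool, meets_condition):
    """One round: train an intermediate model, find uncertain second samples,
    label only those, and retrain on the enlarged training set."""
    # Train the intermediate network model on the labeled first samples
    intermediate = train_fn(first_samples)
    # Predict the held-out pool (the second samples); keep those whose
    # prediction result meets the preset condition (i.e. the model is unsure)
    uncertain = [s for s in pool if meets_condition(predict_fn(intermediate, s))]
    # Manually label only the uncertain samples, then retrain the basic model
    newly_labeled = [label_fn(s) for s in uncertain]
    return train_fn(first_samples + newly_labeled)
```

In practice this round would be repeated, with the retrained model becoming the next round's intermediate model.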
In one embodiment of the present application, the prediction result includes at least two prediction values and a probability of each of the at least two prediction values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, the at least two predicted values comprise the first predicted value and the second predicted value, and the probability of the first predicted value is the maximum probability in the prediction results.
The preset threshold may be set as required, for example to 2% or 4%; it is not limited here. When the absolute value of the difference between the probability of the first predicted value and the probability of the second predicted value is smaller than the preset threshold, the probabilities of the two predicted values are close, and one of them is the maximum probability in the prediction result, so the intermediate network model is likely to misidentify the second sample. For example, if the prediction result over high quality, normal, and low quality has probabilities of 49%, 45%, and 6%, the probabilities of high quality and normal are close, and the second sample could plausibly be either. To further improve the recognition accuracy of the intermediate network model, second samples whose prediction results meet the preset condition are manually labeled, and the basic network model is then trained with the first sample and the labeled second samples to obtain the target network model.
In this embodiment, the preset condition is that the absolute value of the difference between the probability of the first predicted value and the probability of the second predicted value is smaller than the preset threshold. This lets the data that the intermediate network model handles poorly be labeled manually, after which the basic network model is trained with the first sample and the labeled second samples to obtain the target network model.
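The preset condition itself reduces to comparing the two largest probabilities in the prediction result. A minimal sketch follows; the function name and the dict-based prediction format are assumptions:

```python
def needs_relabel(prediction, threshold=0.04):
    """Return True when the absolute difference between the largest and
    second-largest probabilities is below the preset threshold, i.e. the
    intermediate model cannot confidently separate the top two labels."""
    ranked = sorted(prediction.values(), reverse=True)
    return abs(ranked[0] - ranked[1]) < threshold
```

With the example above, the result {"high": 0.49, "normal": 0.45, "low": 0.06} meets the condition for a threshold of 5%, so that second sample would be sent for manual labeling.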
In one embodiment of the present application, the feature information includes an inter-element relationship feature and an intra-element feature of the paragraph;
wherein the inter-element relationship characteristic is determined according to at least one of the following relationships:
a relationship between a title of the first text and a title of the passage;
a relationship between a title of the first text and the passage;
relationships between the passage and other passages of the first text;
the element intrinsic features comprise structural features and textual features;
wherein the structural features include an organization form and key information of the passage, and the text features include a quality of a title of the first text and a quality of text of the passage.
The feature information includes inter-element relationship features and intra-element features of the paragraph. The inter-element relationship features are defined as follows.
The relationships between the elements of an article (also called a text) reflect how well the content satisfies the title, how rich it is, how logically it is organized, and so on; the element-relationship features are extracted from these relationships. The relationships fall into three categories: between the title (i.e., the article title) and a paragraph, between the title and a subtitle (i.e., a paragraph title), and between paragraphs.
The relationship between the article title and a paragraph is the first relationship. Its labels include quote, satisfy, summarize, related, unrelated, and so on; a trained model characterizes the relationship between the title and each paragraph. For example, the "satisfy" label indicates that the current paragraph directly answers the question posed in the title, characterizing the article's satisfaction, richness, logical structure, answer completeness, and so on.
The relationship between the title and a subtitle is the second relationship. Its labels include repeat, satisfy, related, and so on; a trained model characterizes the relationship between the title and each subtitle. For example, the "repeat" label indicates that the current subtitle restates the title, which helps locate the portion of the article that directly satisfies the title.
The relationship between paragraphs is the third relationship. It captures whether the current paragraph duplicates other paragraphs; duplication reduces the amount of information in the content.
The intra-element features include structural features and text features, which characterize properties inherent to the article's elements.
Structural features: an article has rich structural characteristics, which reflect how conveniently information can be obtained from the content. Structural features mainly include two types of information. One is the article's organization, such as whether it contains subtitles, whether it contains pictures, and whether paragraphs are organized as lists or ordered sequences. The other is key-information processing, such as whether key information is bolded or highlighted.
Text features: the text quality of the title and the text quality of each paragraph, which characterize factors affecting the user experience. Title text quality mainly covers whether the title expresses an intent, whether it is malicious, and whether it contains wrongly written characters or ungrammatical sentences. Paragraph text quality mainly covers content fluency, whether the text is truncated, whether it contains wrongly written characters, and so on.
In this embodiment, the feature information includes the inter-element relationship features and the intra-element features of the paragraphs, so the feature information of the first sample is more comprehensive and the target network model trained on the first sample recognizes text quality more accurately. When the target network model is applied to search, it can filter search results so that accurate, rich, and friendly high-quality content is shown to the user while problematic low-quality content is kept out of the results, improving the user's satisfaction with search.
In one embodiment of the present application, the structural features are obtained by applying preset rules to the paragraph.
The first features, i.e., the features in the feature information other than the structural features, are extracted from the first text by feature extraction models. Each feature extraction model is obtained by training a basic extraction model with a labeled third sample, and each first feature corresponds to one feature extraction model.
Specifically, when the feature information of the first sample is obtained, the structural features, such as whether subtitles or pictures are included, whether paragraphs are organized as lists or ordered sequences, the text length, and whether information is highlighted, can be determined for the first sample by applying preset rules.
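Rule-based structural features of this kind need no trained model. A minimal sketch, assuming the paragraph is available as an HTML fragment; the tag checks are illustrative rules, not the disclosure's actual ones:

```python
def structural_features(paragraph_html):
    """Apply simple preset rules to one paragraph's HTML fragment."""
    return {
        "has_subtitle": "<h2" in paragraph_html or "<h3" in paragraph_html,
        "has_picture": "<img" in paragraph_html,
        "is_list": "<ul" in paragraph_html or "<ol" in paragraph_html,
        "is_highlighted": "<b>" in paragraph_html or "<strong>" in paragraph_html,
        "text_length": len(paragraph_html),  # crude proxy; real rules would strip tags first
    }
```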
The first features, i.e., those other than the structural features, are extracted with feature extraction models. Each first feature corresponds to one feature extraction model; the model is first trained with a manually labeled third sample, and the trained feature extraction model then extracts the corresponding feature from the first text.
The element-relationship features and text features require a degree of semantic understanding and cannot be captured by exhaustively constructed rules. For example, content fluency cannot be distinguished by enumeration; fine-tuning ERNIE/BERT with a small number of samples achieves a good classification effect. For instance, the title-paragraph relationship is trained by fine-tuning ERNIE/BERT to obtain a title-paragraph relationship extraction model.
For example, to extract the feature of whether the text is truncated, truncated and non-truncated samples are manually labeled; the labeled samples are then used to train a basic extraction model, yielding a trained truncation feature extraction model, which is then used to extract the truncation feature from the first text. The basic extraction model may be a neural network model.
For example, to extract the title text quality feature, the quality of title samples is manually labeled; the labeled title samples are then used to train a basic extraction model, yielding a title text quality feature extraction model, which is then used to extract the title text quality feature from the first text.
That is, for each first feature (a feature other than a structural feature), related samples are obtained and labeled according to that feature, and the labeled samples are used to train a basic extraction model, yielding the feature extraction model for that first feature; each first feature corresponds to one feature extraction model. When the feature information of the first sample is obtained, the extraction model for a first feature performs feature extraction on the first sample to obtain that first feature.
In this embodiment, the structural features are obtained by applying preset rules to the paragraphs, and the first features other than the structural features are extracted from the first text with feature extraction models, so the feature information of the first sample is more comprehensive and the target network model trained on the first sample recognizes text quality more accurately. When the target network model is applied to search, it can filter search results so that accurate, rich, and friendly high-quality content is shown to the user while problematic low-quality content is kept out of the results, improving the user's satisfaction with search.
Fig. 2 shows the framework of a model training method according to an embodiment of the present application. As shown in fig. 2, the inter-element relationship features include the relationship between the text title and a paragraph title, the relationship between the text title and a paragraph, and the relationship between a paragraph and the other paragraphs of the text. The inter-element relationship features can be obtained by relationship modeling: for example, as described above, a basic extraction model is trained with manually labeled third samples to obtain a relationship extraction model, which then extracts the inter-element relationship features of the text.
The intra-element features include structural features and text features, where the structural features include the organization form and key information of the paragraph, and the text features include the quality of the text's title and the quality of the paragraph's text. The title quality and paragraph text quality can be determined by text modeling: for example, as described above, a basic extraction model is trained with manually labeled third samples to obtain a title text quality extraction model, which then extracts the title text quality of the text.
The basic network model is then trained with the inter-element relationship features and the intra-element features to obtain a deep learning sequence model, which performs text quality recognition and outputs a quality signal.
Fig. 3 is a structural diagram of the deep learning sequence model (i.e., of a bidirectional RNN) provided in an embodiment of the present application. As shown in fig. 3, the deep learning sequence model includes an input layer, an output layer, a forward layer, and a backward layer. The samples input into the deep learning sequence model (hereinafter, the sequence model) are feature information in units of the paragraphs of a text; for example, if the text includes 10 paragraphs, 10 groups of data are input into the sequence model, each corresponding to the feature information of one paragraph. In fig. 3, the first, second, and third paragraphs are different paragraphs of the same text.
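The forward/backward structure can be illustrated with a toy pure-Python pass. This is a sketch only: it uses a single scalar feature per paragraph and fixed scalar weights (`w_x`, `w_h`), whereas the actual model operates on full feature vectors with learned parameters.

```python
import math

def rnn_pass(inputs, w_x, w_h, reverse=False):
    """One direction of the RNN: h_t = tanh(w_x * x_t + w_h * h_{t-1})."""
    seq = list(reversed(inputs)) if reverse else inputs
    h, states = 0.0, []
    for x in seq:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    # Re-align backward states with the original paragraph order
    return list(reversed(states)) if reverse else states

def bidirectional_rnn(inputs, w_x=0.5, w_h=0.3):
    """Input layer -> forward layer + backward layer -> paired output states."""
    forward = rnn_pass(inputs, w_x, w_h)
    backward = rnn_pass(inputs, w_x, w_h, reverse=True)
    return list(zip(forward, backward))  # one (forward, backward) pair per paragraph
```

Because each output pairs a forward state (context from earlier paragraphs) with a backward state (context from later paragraphs), every paragraph's representation sees the whole text, which is the point of the bidirectional design.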
Let the maximum number of input groups of the sequence model be n, where n is a positive integer. If the number of paragraph feature groups currently acquired is greater than n, they are truncated so that exactly n groups are input into the sequence model; if fewer than n, the missing features are padded so that n groups are input.
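The truncation/padding step is straightforward. A minimal sketch follows; the function name and the use of `None` as the fill value are assumptions, since the disclosure does not specify the padding value:

```python
def fit_to_length(feature_groups, n, pad_value=None):
    """Force the per-paragraph feature groups to exactly n entries:
    truncate when there are too many, pad when there are too few."""
    if len(feature_groups) >= n:
        return feature_groups[:n]              # truncation processing
    padding = [pad_value] * (n - len(feature_groups))
    return feature_groups + padding            # fill up to n groups
```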
Referring to fig. 4, fig. 4 is a flowchart of an information identification method provided in an embodiment of the present application. As shown in fig. 4, this embodiment provides an information identification method applied to an electronic device, including:
step 201, obtaining a text to be recognized, wherein the text to be recognized is a text needing text quality recognition through a target network model.
Step 202, identifying the text to be identified through a target network model to obtain a quality identification result of the text to be identified; the target network model is obtained by training a basic network model through a first sample, wherein the first sample comprises feature information of a paragraph of a first text and labeling information of text quality of the first text.
Generally, an article is composed of four types of elements: a title, subtitles, ordinary paragraphs, and pictures. A title usually has several subtitles under it, and each subtitle has several paragraphs and pictures under it. In this embodiment, the first text includes one or more of a title, a subtitle, and one or more paragraphs. The text quality of the first text can be graded as the application requires: for example, into three levels (high quality, normal, and low quality), or into 10 levels, each corresponding to a score interval, where a higher score means better quality. The labeling information is the quality level or score of the first text and may be labeled manually.
The first sample includes the feature information of the paragraphs of the first text; if the first text includes multiple paragraphs, the first sample includes the feature information of each paragraph, that is, the feature information in the first sample is obtained in units of paragraphs.
The base network model may be a neural network model, preferably a deep learning sequence model. And training the basic network model by using the first sample to obtain a target network model, wherein the target network model can be used for text quality recognition.
In this embodiment, a text to be recognized is obtained and recognized by the target network model to obtain a quality recognition result for it; the target network model is obtained by training a basic network model with a first sample that includes feature information of a paragraph of a first text and labeling information of the text quality of the first text. Using the target network model to recognize the text quality of the text to be recognized saves labor cost and improves recognition efficiency.
In an embodiment of the present application, the process of training a base network model through a first sample to obtain the target network model includes:
training a basic network model by using the first sample to obtain an intermediate network model;
predicting a second sample by adopting the intermediate network model to obtain a prediction result of the text quality of the second sample;
and if the prediction result meets a preset condition, training the basic network model by adopting the first sample and the second sample to obtain a target network model.
In this embodiment, the intermediate network model is the network model obtained by training the basic network model with the first sample. The second sample can be regarded as a sample for prediction: it is input into the intermediate network model to obtain a prediction result for its text quality. The prediction result includes several predicted values and their corresponding probabilities. For example, if the text quality of the first text is set to three levels (high quality, normal, and low quality), the predicted values are high quality, normal, and low quality, and the prediction result includes a probability for each, with the three probabilities summing to 100%.
Through interaction between the model and the labeling personnel, a model with the best generalization effect can be trained using only a small number of labeled samples.
During the training of the intermediate network model, a fixed batch of data (i.e., the second sample) is screened out. After each round of training ends, the intermediate network model predicts this fixed data set, the data whose prediction results meet the preset condition are labeled, and the labeled data are added to the training set. In this way, the data labeled in each round are exactly the data that the current model handles poorly, which effectively improves the learning efficiency of the model and avoids labeling too much invalid data.
In this embodiment, the second samples whose prediction results meet the preset condition are manually labeled, and the basic network model is then trained with the first sample and the labeled second samples to obtain the target network model. The training data labeled in each round are thus data that the intermediate network model handles poorly, which effectively improves the learning efficiency of the basic network model while avoiding the manual labeling of excessive sample data, thereby improving labeling efficiency.
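One round of this labeling loop can be sketched as follows. The function names (`train`, `predict`, `manually_label`) are stand-ins for the training, prediction, and human-annotation steps described above, not APIs from the patent:

```python
def active_learning_round(first_sample, fixed_pool, meets_condition,
                          train, predict, manually_label):
    """One round: train, predict the fixed pool, label the hard cases,
    and retrain on the enlarged training set."""
    # Step 1: train the basic network model on the labeled first sample.
    intermediate_model = train(first_sample)
    # Step 2: predict the fixed pool (second samples) with the intermediate model
    # and keep only the samples whose prediction results meet the preset condition.
    hard_cases = [s for s in fixed_pool
                  if meets_condition(predict(intermediate_model, s))]
    # Step 3: only the poorly handled samples go to the labeling personnel,
    # then the labeled data are merged into the training set for retraining.
    labeled = [manually_label(s) for s in hard_cases]
    return train(first_sample + labeled)
```

Because only the hard cases reach step 3, the labeling personnel never annotate samples the model already classifies confidently.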
In one embodiment of the present application, the prediction result includes at least two prediction values and a probability of each of the at least two prediction values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, the at least two predicted values comprise the first predicted value and the second predicted value, and the probability of the first predicted value is the maximum probability in the prediction results.
The preset threshold may be set according to the actual situation, for example 2% or 4%, and is not limited here. When the absolute value of the difference between the probability of the first predicted value and the probability of the second predicted value is smaller than the preset threshold, the probabilities of the two predicted values are close; since one of them is the maximum probability in the prediction result, the intermediate network model is likely to misidentify the second sample. For example, if the prediction result over high quality, normal, and low quality has probabilities of 49%, 45%, and 6% respectively, the probabilities of high quality and normal are close, so the second sample may well be either high quality or normal. To further improve the identification accuracy of the intermediate network model, such second samples are manually labeled, and the basic network model is then trained with the first sample and the labeled second samples to obtain the target network model.
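Under this preset condition, a sample is sent for manual labeling when the gap between its two most probable quality levels is small (margin-based uncertainty sampling). A minimal sketch, using the 49%/45%/6% example from the text and an assumed 5% threshold:

```python
def needs_manual_label(prediction, threshold=0.05):
    """Return True when the two largest probabilities are too close,
    i.e. the intermediate network model is likely to misidentify."""
    probs = sorted(prediction.values(), reverse=True)
    first, second = probs[0], probs[1]  # maximum probability and runner-up
    return abs(first - second) < threshold

result = {"high quality": 0.49, "normal": 0.45, "low quality": 0.06}
selected = needs_manual_label(result)  # 49% and 45% differ by only 4%
```

With a 2% threshold the same sample would not be selected, so the threshold directly controls how many samples reach the labeling personnel.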
In this embodiment, the preset condition is that the absolute value of the difference between the probability of the first predicted value and the probability of the second predicted value is smaller than the preset threshold. Data that the intermediate network model handles poorly can thus be singled out for manual labeling, and the basic network model is then trained with the first sample and the labeled second samples to obtain the target network model.
In one embodiment of the present application, the feature information includes an inter-element relationship feature and an intra-element feature of the paragraph;
wherein the inter-element relationship characteristic is determined according to at least one of the following relationships:
a relationship between a title of the first text and a title of the passage;
a relationship between a title of the first text and the passage;
relationships between the passage and other passages of the first text;
the element intrinsic features comprise structural features and textual features;
wherein the structural features include an organization form and key information of the passage, and the text features include a quality of a title of the first text and a quality of text of the passage.
The feature information includes the inter-element relationship features and intra-element features of the paragraph. The inter-element relationship features are defined as follows:
the relation between the elements of the article (also called as text) can better reflect the satisfaction degree, richness, logicality and the like of the content, and the element relation characteristics are extracted through the relation between the elements. The relationships between elements are mainly classified into three categories: the relationship between a title (i.e., the title of an article) and a paragraph, the relationship between a title and a subtitle (i.e., the title of a paragraph), and the relationship between paragraphs.
The relationship between the article title and a paragraph is the first relationship. The first relationship labels include: quote, satisfy, summarize, correlate, not correlate, etc., and a model is trained to characterize the relationship between the title and the paragraph. For example, the "satisfy" label indicates that the current paragraph is a direct answer to the question in the title, characterizing the satisfaction degree, richness, literary logic, answer completeness, etc. of the article.
The relationship between the title and a subtitle is the second relationship. The second relationship labels include: repeat, satisfy, correlate, etc., and a model is trained to characterize the relationship between the title and the subtitle. For example, the "repeat" label indicates that the current subtitle is a repeated expression of the title and is used to locate the portion of the article that directly satisfies the title.
The relationship between paragraphs is the third relationship. The third relationship indicates whether duplication exists between the current paragraph and other paragraphs; duplicated paragraphs reduce the amount of information in the content.
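The duplication signal of the third relationship can be approximated with a simple set-similarity check; the character-bigram Jaccard measure and the 0.8 threshold below are illustrative choices, not the patent's actual method:

```python
def char_bigrams(text):
    """All overlapping two-character substrings of the text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}

def is_duplicate(paragraph, other, threshold=0.8):
    """Flag the current paragraph as a duplicate of another paragraph
    when their character-bigram sets are highly similar (Jaccard)."""
    a, b = char_bigrams(paragraph), char_bigrams(other)
    if not a or not b:
        return False
    jaccard = len(a & b) / len(a | b)
    return jaccard >= threshold
```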
The intra-element features include structural features and text features, which characterize the properties inherent to the article elements.
Structural features: an article has rich structural characteristics, which reflect how conveniently information can be obtained from the content. The structural features mainly cover two types of information. One is the article's organization form, such as whether it contains subtitles, whether it contains pictures, and whether paragraphs are organized as lists or in sequence; the other is key-information processing, such as whether key information is bolded or highlighted.
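Because these signals are purely structural, they can be computed with preset rules rather than a trained model. A minimal sketch; the paragraph fields (`is_subtitle`, `has_image`, `has_bold`) and the list-detection rule are illustrative assumptions:

```python
def structural_features(paragraphs):
    """Rule-based structural features of an article, where each
    paragraph is a dict describing one paragraph of the text."""
    return {
        "has_subtitle": any(p.get("is_subtitle") for p in paragraphs),
        "has_image": any(p.get("has_image") for p in paragraphs),
        # Treat two or more paragraphs starting with list markers
        # as list/sequence organization.
        "is_list_organized": sum(
            p["text"].lstrip().startswith(("-", "1.", "2.", "3."))
            for p in paragraphs) >= 2,
        "has_bold_key_info": any(p.get("has_bold") for p in paragraphs),
    }
```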
Text features: these include the text quality of the title and the text quality of a paragraph, and are used to characterize user-experience factors. The text quality of the title mainly covers: whether an intention is contained, whether the title is malicious, and whether it contains wrongly written characters or ungrammatical sentences. The text quality of a paragraph mainly covers: content fluency, whether the text is truncated, and whether it contains wrongly written characters.
In this embodiment, the feature information includes the inter-element relationship features and intra-element features of the paragraphs, so the feature information of the first sample is more comprehensive and the target network model trained on the first sample identifies text quality more accurately. When the target network model is applied to the search field, it can be used to filter search results, so that accurate, rich, and friendly high-quality content is displayed to users while problematic low-quality content is kept out of the search results, further improving users' search satisfaction.
In one embodiment of the present application, the structural feature is obtained by applying a preset rule to the paragraph;
for the first features in the feature information other than the structural features, the first text is extracted by using feature extraction models, where each feature extraction model is obtained by training a basic extraction model with a labeled third sample;
wherein each first feature corresponds to a feature extraction model.
Specifically, when obtaining the feature information of the first sample, the structural features in the feature information, such as whether subtitles are included, whether pictures are included, whether paragraphs are organized as lists or in sequence, the text length, and whether information is highlighted, can be determined by applying preset rules to the first sample.
The first features other than the structural features are extracted with feature extraction models. Each first feature corresponds to one feature extraction model; the feature extraction model is trained with a manually labeled third sample, and the trained model then extracts the corresponding feature from the first text.
The inter-element relationship features and the text features require a degree of semantic understanding and cannot be captured by exhaustively constructed rules. For example, content fluency cannot be distinguished by enumeration; fine-tuning ERNIE/BERT with a small number of samples achieves a better classification effect. For instance, the relationship between the title and a paragraph is trained by fine-tuning ERNIE/BERT to obtain a title-paragraph relation extraction model.
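For such fine-tuning, each training example is commonly built by pairing the title with a paragraph and attaching a manually labeled relation. A data-preparation sketch; the `[SEP]` pairing convention follows common BERT/ERNIE practice and is an assumption here, while the label set is the first-relationship label set defined above:

```python
# Labels of the first relationship (title vs. paragraph), as defined above.
RELATION_LABELS = ["quote", "satisfy", "summarize", "correlate", "not correlate"]

def build_relation_examples(title, paragraphs, annotations):
    """Pair the article title with each paragraph and attach the manually
    labeled relation, yielding (input_text, label_id) fine-tuning examples."""
    examples = []
    for paragraph, relation in zip(paragraphs, annotations):
        examples.append((f"{title} [SEP] {paragraph}",
                         RELATION_LABELS.index(relation)))
    return examples
```

The resulting pairs would then be tokenized and fed to the pretrained encoder for classification fine-tuning.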
For example, for extracting the feature of whether the text is truncated, a truncated sample and a non-truncated sample need to be manually labeled, then the labeled truncated sample and the non-truncated sample are adopted to train the basic extraction model to obtain a trained truncated feature extraction model, and then the trained truncated feature extraction model is adopted to perform truncated feature extraction on the first text. The basic extraction model may employ a neural network model.
For example, for extracting the feature of the text quality of the title, the quality of a title sample needs to be manually labeled, then the labeled title sample is adopted to train a basic extraction model to obtain a text quality feature extraction model of the title, and then the text quality feature extraction model of the title is adopted to extract the text quality feature of the title for the first text.
That is, for each first feature in the feature information other than the structural features, related samples are obtained according to the first feature and labeled, and the labeled samples are used to train the basic extraction model to obtain the feature extraction model corresponding to that first feature; each first feature corresponds to one feature extraction model. When the feature information of the first sample is obtained, the extraction model of a first feature performs feature extraction on the first sample to obtain that first feature of the first sample.
In this embodiment, the structural features are obtained by applying preset rules to the paragraphs, and the first features other than the structural features are extracted from the first text with feature extraction models, so the feature information of the first sample is more comprehensive and the target network model trained on the first sample identifies text quality more accurately. When the target network model is applied to the search field, it can be used to filter search results, so that accurate, rich, and friendly high-quality content is displayed to users while problematic low-quality content is kept out of the search results, further improving users' search satisfaction.
As shown in fig. 2, the inter-element relationship features include the relationship between the text title and a paragraph title, the relationship between the text title and a paragraph, and the relationships between a paragraph and other paragraphs of the text. The inter-element relationship features may be obtained by relation modeling: as described above, a third sample is manually labeled, the basic extraction model is trained to obtain a relation extraction model, and the relation extraction model extracts the inter-element relationship features of the text.
The intra-element features include structural features and text features, where the structural features include the organization form and key information of the paragraph, and the text features include the quality of the title of the text and the quality of the text of the paragraph. The quality of the title and the quality of the paragraph text can be determined by text modeling: as described above, a manually labeled third sample is used to train the basic extraction model to obtain a title text quality extraction model, and the title text quality of the text is then extracted with this model.
The basic network model is trained with the inter-element relationship features and the intra-element features to obtain a deep learning sequence model. The deep learning sequence model performs text quality identification and outputs a quality signal.
Fig. 3 is a diagram of a deep learning sequence model structure provided in an embodiment of the present application, and as shown in fig. 3, the deep learning sequence model includes an input layer, an output layer, a forward layer, and a backward layer. The samples input into the deep learning sequence model (hereinafter referred to as the sequence model) are feature information in units of paragraphs of a text, for example, if the text includes 10 paragraphs, there are 10 groups of data input into the deep learning sequence model, each group corresponds to feature information of one paragraph, and the first paragraph, the second paragraph, and the third paragraph in fig. 3 are different paragraphs in the same text.
Let the maximum number of input groups of the sequence model be n, where n is a positive integer. If the number of feature groups of the currently obtained paragraphs is greater than n, truncation is performed so that n groups are input into the sequence model; if it is less than n, padding is used to fill the missing groups so that n groups are input into the sequence model.
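The truncation and padding described above can be sketched as a single helper; the empty-dict padding group is an illustrative choice, not the patent's actual padding value:

```python
def pad_or_truncate(paragraph_features, n, pad_group=None):
    """Force the per-paragraph feature groups to exactly n entries:
    truncate when the text has more than n paragraphs, pad otherwise."""
    if pad_group is None:
        pad_group = {}
    if len(paragraph_features) >= n:
        return paragraph_features[:n]
    # Copy the padding group so padded entries do not share one dict.
    padding = [dict(pad_group) for _ in range(n - len(paragraph_features))]
    return paragraph_features + padding
```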
Furthermore, when identifying the quality of a long text, that is, a text whose number of characters exceeds a preset value, an abstract can first be generated from the long text, and the deep learning sequence model then performs text quality identification on the abstract, realizing quality identification with the abstract as the main body.
Referring to fig. 5, fig. 5 is a structural diagram of a model training apparatus provided in the present embodiment, and as shown in fig. 5, the present embodiment provides a model training apparatus 500, including:
an obtaining module 501, configured to obtain a first sample, where the first sample includes feature information of a paragraph of a first text and labeling information of text quality of the first text;
the training module 502 is configured to train the basic network model by using the first sample, and obtain a target network model for performing text quality recognition.
In an embodiment of the present application, the training module 502 includes:
the first obtaining submodule is used for training a basic network model by utilizing the first sample to obtain an intermediate network model;
the second obtaining submodule is used for predicting a second sample by adopting the intermediate network model to obtain a prediction result of the text quality of the second sample;
And the training submodule is used for training the basic network model by adopting the first sample and the second sample to obtain a target network model if the prediction result meets a preset condition.
In one embodiment of the present application, the prediction result includes at least two prediction values and a probability of each of the at least two prediction values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, the at least two predicted values comprise the first predicted value and the second predicted value, and the probability of the first predicted value is the maximum probability in the prediction results.
In one embodiment of the present application, the feature information includes an inter-element relationship feature and an intra-element feature of the paragraph;
wherein the inter-element relationship characteristic is determined according to at least one of the following relationships:
a relationship between a title of the first text and a title of the passage;
a relationship between a title of the first text and the passage;
relationships between the passage and other passages of the first text;
the element intrinsic features comprise structural features and textual features;
Wherein the structural features include an organization form and key information of the passage, and the text features include a quality of a title of the first text and a quality of text of the passage.
In one embodiment of the present application, the structural feature is obtained by applying a preset rule to the paragraph;
for the first features in the feature information other than the structural features, the first text is extracted by using feature extraction models, where each feature extraction model is obtained by training a basic extraction model with a labeled third sample;
wherein each first feature corresponds to a feature extraction model.
The model training apparatus 500 can implement each process implemented by the electronic device in the method embodiment shown in fig. 1, and is not described herein again to avoid repetition.
The model training device 500 in the embodiment of the present application obtains a first sample, where the first sample includes feature information of a paragraph of a first text and labeling information of the text quality of the first text, and trains the basic network model with the first sample to obtain a target network model for text quality identification. Since the target network model can then audit text information by identifying its quality, identification efficiency is improved.
Referring to fig. 6, fig. 6 is a structural diagram of a model training apparatus according to an embodiment of the present application, and as shown in fig. 6, the embodiment provides an information recognition apparatus 600 including:
an obtaining module 601, configured to obtain a text to be recognized;
the identification module 602 is configured to identify the text to be identified through a target network model, and obtain a quality identification result of the text to be identified; the target network model is obtained by training a basic network model through a first sample, wherein the first sample comprises feature information of a paragraph of a first text and labeling information of text quality of the first text.
In an embodiment of the present application, the process of training a base network model through a first sample to obtain the target network model includes:
training a basic network model by using the first sample to obtain an intermediate network model;
predicting a second sample by adopting the intermediate network model to obtain a prediction result of the text quality of the second sample;
and if the prediction result meets a preset condition, training the basic network model by adopting the first sample and the second sample to obtain a target network model.
In one embodiment of the present application, the prediction result includes at least two prediction values and a probability of each of the at least two prediction values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, the at least two predicted values comprise the first predicted value and the second predicted value, and the probability of the first predicted value is the maximum probability in the prediction results.
In one embodiment of the present application, the feature information includes an inter-element relationship feature and an intra-element feature of the paragraph;
wherein the inter-element relationship characteristic is determined according to at least one of the following relationships:
a relationship between a title of the first text and a title of the passage;
a relationship between a title of the first text and the passage;
relationships between the passage and other passages of the first text;
the element intrinsic features comprise structural features and textual features;
wherein the structural features include an organization form and key information of the passage, and the text features include a quality of a title of the first text and a quality of text of the passage.
In one embodiment of the present application, the structural feature is obtained by applying a preset rule to the paragraph;
for the first features in the feature information other than the structural features, the first text is extracted by using feature extraction models, where each feature extraction model is obtained by training a basic extraction model with a labeled third sample;
wherein each first feature corresponds to one feature extraction model.
The information identification apparatus 600 can implement each process implemented by the electronic device in the method embodiment shown in fig. 4, and is not described herein again to avoid repetition.
The information identification device 600 of the embodiment of the application obtains a text to be identified and identifies it through a target network model to obtain a quality identification result. The target network model is obtained by training a basic network model with a first sample, where the first sample includes feature information of a paragraph of a first text and labeling information of the text quality of the first text. Identifying the text quality of the text to be identified with the target network model saves labor cost and improves identification efficiency.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the present application described and/or claimed herein. The block diagram shown in fig. 7 may also serve as a block diagram of an electronic device for the information identification method.
As shown in fig. 7, the electronic apparatus includes: one or more processors 701, a memory 702, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 7, one processor 701 is taken as an example.
The memory 702 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method for model training or information recognition provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the method of model training or information recognition provided herein.
Memory 702, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the method for model training or information recognition in the embodiments of the present application (e.g., acquisition module 501 and training module 502 shown in fig. 5, acquisition module 601 and recognition module 602 shown in fig. 6). The processor 701 executes various functional applications of the server and data processing, i.e., a method of implementing model training or information recognition in the above-described method embodiments, by executing non-transitory software programs, instructions, and modules stored in the memory 702.
The memory 702 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created from use of the electronic device by model training or information recognition, or the like. Further, the memory 702 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 702 may optionally include memory located remotely from processor 701, which may be connected over a network to model training or information recognition electronics. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the method of model training or information recognition may further include: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703 and the output device 704 may be connected by a bus or other means, and fig. 7 illustrates an example of a connection by a bus.
The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic apparatus for model training or information recognition, such as a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, or other input devices. The output devices 704 may include a display device, auxiliary lighting devices (e.g., LEDs), and tactile feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiments of the present application, a first sample is obtained, where the first sample includes feature information of a paragraph of a first text and labeling information on the text quality of the first text; a basic network model is then trained with the first sample to obtain a target network model for text quality recognition. Because the target network model is trained on such samples, it can be used to audit text information and recognize its quality automatically, which improves recognition efficiency.
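As a minimal illustration of this training step, the sketch below pairs paragraph feature vectors with quality labels and fits a simple classifier. The feature values, the nearest-centroid rule, and all names are assumptions for illustration; the patent does not specify the model architecture.

```python
# Hypothetical sketch (not the patent's actual model): train a minimal
# quality classifier on first samples, where each sample pairs paragraph
# feature information with a quality label (1 = high quality, 0 = low).
# A nearest-centroid rule stands in for the "basic network model".

def train_target_model(first_samples):
    """first_samples: list of (feature_vector, quality_label) pairs."""
    centroids = {}
    for label in (0, 1):
        vecs = [f for f, l in first_samples if l == label]
        dim = len(vecs[0])
        centroids[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

    def predict(features):
        # pick the quality label whose centroid is closest to the features
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(features, c))
        return min(centroids, key=lambda l: dist(centroids[l]))

    return predict

# Feature dimensions are illustrative: e.g. title relevance, structure, text score.
first_sample = [
    ([0.9, 0.8, 0.7], 1), ([0.8, 0.9, 0.6], 1),
    ([0.2, 0.1, 0.3], 0), ([0.1, 0.2, 0.2], 0),
]
target_model = train_target_model(first_sample)
print(target_model([0.85, 0.8, 0.7]))  # → 1 (recognized as high quality)
```

In practice the basic network model would be a neural text classifier, but the interface is the same: features of a paragraph in, a quality prediction out.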
Second samples whose prediction results satisfy the prediction condition are labeled manually, and the basic network model is then trained with the first sample and the labeled second samples to obtain the target network model. In this way, the data labeled in each round is exactly the data that the intermediate network model handles poorly, which effectively improves the learning efficiency of the basic network model while avoiding manual labeling of excessive sample data, thereby improving labeling efficiency.
The prediction condition is set so that the absolute value of the difference between the probability of the first predicted value and the probability of the second predicted value is smaller than a preset threshold. Under this condition, the data that the intermediate network model handles poorly can be labeled manually, and the basic network model is then trained with the first sample and the labeled second samples to obtain the target network model. As a result, each round labels only the data the intermediate network model handles poorly, which effectively improves the learning efficiency of the basic network model while avoiding manual labeling of excessive sample data, thereby improving labeling efficiency.
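The prediction condition described above can be sketched directly: a second sample is routed to manual labeling when the margin between the intermediate model's two largest class probabilities falls below the preset threshold. The threshold value and function names here are illustrative assumptions.

```python
# Hypothetical sketch of the selection condition: a second sample needs
# manual labeling when the absolute difference between the probability of
# the first predicted value (the maximum probability) and the probability
# of the second predicted value is smaller than a preset threshold.
def needs_manual_label(probs, threshold=0.2):
    """probs: predicted probability per class for one second-sample text."""
    ranked = sorted(probs, reverse=True)
    first, second = ranked[0], ranked[1]
    return abs(first - second) < threshold

# A confident prediction is kept; an ambiguous one goes to the annotators.
print(needs_manual_label([0.9, 0.1]))    # → False: margin 0.8, model is confident
print(needs_manual_label([0.55, 0.45]))  # → True: margin 0.1, model is uncertain
```

This is the standard margin-based uncertainty criterion from active learning: small margins mark exactly the samples the intermediate network model handles poorly.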
The feature information includes both the inter-element relationship features and the intra-element features of the paragraph, so the feature information of the first sample is more comprehensive and the target network model trained on the first sample recognizes text quality more accurately. When the target network model is applied in the search field, it can be used to filter search results so that accurate, rich, and reader-friendly high-quality content is shown to the user while problematic low-quality content is kept out of the search results, thereby improving the user's satisfaction with search.
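One plausible way to realize the inter-element relationship features is as similarity scores between the first text's title, the paragraph's title, the paragraph body, and the other paragraphs. The word-overlap measure below is an illustrative assumption, not the patent's algorithm.

```python
# Illustrative sketch: inter-element relationship features computed as
# Jaccard word overlap between elements of the first text. The patent only
# names the relationships; the overlap measure here is an assumption.
def overlap(a, b):
    """Jaccard word overlap between two strings, in [0, 1]."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def relation_features(text_title, para_title, para_body, other_paras):
    return {
        # relationship between the title of the first text and the paragraph title
        "title_vs_para_title": overlap(text_title, para_title),
        # relationship between the title of the first text and the paragraph
        "title_vs_para_body": overlap(text_title, para_body),
        # relationship between the paragraph and the other paragraphs
        "para_vs_others": max((overlap(para_body, p) for p in other_paras), default=0.0),
    }

feats = relation_features(
    "how to train a model",
    "training the model",
    "first train a model on labeled samples",
    ["collect labeled samples", "evaluate the model"],
)
```

Each score lies in [0, 1]; in a real system these scalar relationships would be replaced by learned semantic similarity, but the feature layout is the same.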
The structural features are obtained from the paragraph with preset rules; the first features in the feature information other than the structural features are extracted from the first text with feature extraction models. This makes the feature information of the first sample more comprehensive, so the target network model trained on the first sample recognizes text quality more accurately. When the target network model is applied in the search field, it can be used to filter search results so that accurate, rich, and reader-friendly high-quality content is shown to the user while problematic low-quality content is kept out of the search results, thereby improving the user's satisfaction with search.
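The rule-based path can be sketched as a few regular-expression checks over a paragraph. The concrete rules below (list detection, presence of numbers, sentence count) are assumptions for illustration; the patent only states that structural features cover the paragraph's organization form and key information.

```python
# Hypothetical preset rules for structural features of a paragraph.
# These specific rules are illustrative assumptions, not the patent's rules.
import re

def structural_features(paragraph):
    lines = [l for l in paragraph.splitlines() if l.strip()]
    return {
        # organization form: is every line an enumerated or bulleted item?
        "is_list": bool(lines) and all(
            re.match(r"^\s*(\d+[.)]|[-*])\s", l) for l in lines
        ),
        # key information: digits often mark concrete facts (dates, counts)
        "has_numbers": bool(re.search(r"\d", paragraph)),
        # rough sentence count from terminal punctuation
        "n_sentences": max(1, len(re.findall(r"[.!?]", paragraph))),
    }

print(structural_features("1. collect data\n2. train the model"))
```

Features that rules cannot capture, such as title quality or text quality, are the ones delegated to trained feature extraction models, one model per first feature.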
It should be understood that the various forms of flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders; the present application is not limited in this respect, as long as the desired results of the technical solutions disclosed herein can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (13)

1. A model training method, comprising:
acquiring a first sample, wherein the first sample comprises feature information of a paragraph of a first text and labeling information of text quality of the first text;
and training the basic network model by using the first sample to obtain a target network model for text quality recognition.
2. The method of claim 1, wherein the training the base network model with the first sample to obtain the target network model for text quality recognition comprises:
training a basic network model by using the first sample to obtain an intermediate network model;
predicting a second sample by adopting the intermediate network model to obtain a prediction result of the text quality of the second sample;
and if the prediction result meets a preset condition, training the basic network model by adopting the first sample and the second sample to obtain a target network model.
3. The method of claim 2, wherein the prediction result comprises at least two prediction values and a probability for each of the at least two prediction values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, the at least two predicted values comprise the first predicted value and the second predicted value, and the probability of the first predicted value is the maximum probability in the prediction results.
4. The method of claim 1, wherein the feature information includes inter-element relational features and intra-element features of a paragraph;
wherein the inter-element relationship characteristic is determined according to at least one of the following relationships:
a relationship between a title of the first text and a title of the passage;
a relationship between a title of the first text and the passage;
relationships between the passage and other passages of the first text;
the element intrinsic features comprise structural features and textual features;
wherein the structural features include an organization form and key information of the passage, and the text features include a quality of a title of the first text and a quality of text of the passage.
5. The method of claim 4, wherein the structural features are obtained by applying a preset rule to the passage;
for first features in the feature information other than the structural features, extracting the first features from the first text by using a feature extraction model, the feature extraction model being obtained by training a basic extraction model with a labeled third sample;
wherein each first feature corresponds to a feature extraction model.
6. A model training apparatus comprising:
the acquisition module is used for acquiring a first sample, wherein the first sample comprises feature information of a paragraph of a first text and labeling information of text quality of the first text;
and the training module is used for training the basic network model by using the first sample to obtain a target network model for text quality recognition.
7. The apparatus of claim 6, wherein the training module comprises:
the first obtaining submodule is used for training a basic network model by utilizing the first sample to obtain an intermediate network model;
the second obtaining submodule is used for predicting a second sample by adopting the intermediate network model to obtain a prediction result of the text quality of the second sample;
and the training submodule is used for training the basic network model by adopting the first sample and the second sample to obtain a target network model if the prediction result meets a preset condition.
8. The apparatus of claim 7, wherein the prediction result comprises at least two prediction values and a probability for each of the at least two prediction values;
the preset condition is that the absolute value of the difference between the probability of a first predicted value and the probability of a second predicted value is smaller than a preset threshold, the at least two predicted values comprise the first predicted value and the second predicted value, and the probability of the first predicted value is the maximum probability in the prediction results.
9. The apparatus of claim 6, wherein the feature information comprises inter-element relational features and intra-element features of a paragraph;
wherein the inter-element relationship characteristic is determined according to at least one of the following relationships:
a relationship between a title of the first text and a title of the passage;
a relationship between a title of the first text and the passage;
relationships between the passage and other passages of the first text;
the element intrinsic features comprise structural features and textual features;
wherein the structural features include an organization form and key information of the passage, and the text features include a quality of a title of the first text and a quality of text of the passage.
10. The apparatus of claim 9, wherein the structural feature is obtained by applying a preset rule to the passage;
for first features in the feature information other than the structural features, extracting the first features from the first text by using a feature extraction model, the feature extraction model being obtained by training a basic extraction model with a labeled third sample;
wherein each first feature corresponds to a feature extraction model.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.
13. An information identification method, comprising:
acquiring a text to be identified;
identifying the text to be identified through a target network model to obtain a quality identification result of the text to be identified; the target network model is obtained by training a basic network model through a first sample, wherein the first sample comprises feature information of a paragraph of a first text and labeling information of text quality of the first text.
CN202010697599.4A 2020-07-20 2020-07-20 Model training method, information identification method, device, electronic equipment and storage medium Pending CN111858905A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010697599.4A CN111858905A (en) 2020-07-20 2020-07-20 Model training method, information identification method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010697599.4A CN111858905A (en) 2020-07-20 2020-07-20 Model training method, information identification method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111858905A true CN111858905A (en) 2020-10-30

Family

ID=73001312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010697599.4A Pending CN111858905A (en) 2020-07-20 2020-07-20 Model training method, information identification method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111858905A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464083A (en) * 2020-11-16 2021-03-09 北京达佳互联信息技术有限公司 Model training method, work pushing method, device, electronic equipment and storage medium
CN112507931A (en) * 2020-12-16 2021-03-16 华南理工大学 Deep learning-based information chart sequence detection method and system
CN112507931B (en) * 2020-12-16 2023-12-22 华南理工大学 Deep learning-based information chart sequence detection method and system
CN114417974A (en) * 2021-12-22 2022-04-29 北京百度网讯科技有限公司 Model training method, information processing method, device, electronic device and medium
CN114417974B (en) * 2021-12-22 2023-06-20 北京百度网讯科技有限公司 Model training method, information processing device, electronic equipment and medium
CN114462531A (en) * 2022-01-30 2022-05-10 支付宝(杭州)信息技术有限公司 Model training method and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN112507715B (en) Method, device, equipment and storage medium for determining association relation between entities
EP3851975A1 (en) Method and apparatus for generating text topics, and electronic device
CN112560912A (en) Method and device for training classification model, electronic equipment and storage medium
CN111858905A (en) Model training method, information identification method, device, electronic equipment and storage medium
CN111967262A (en) Method and device for determining entity tag
EP3933657A1 (en) Conference minutes generation method and apparatus, electronic device, and computer-readable storage medium
CN111191428B (en) Comment information processing method and device, computer equipment and medium
CN111325020A (en) Event argument extraction method and device and electronic equipment
CN111125435A (en) Video tag determination method and device and computer equipment
CN112148881B (en) Method and device for outputting information
CN111783468A (en) Text processing method, device, equipment and medium
CN111522967A (en) Knowledge graph construction method, device, equipment and storage medium
CN113220836A (en) Training method and device of sequence labeling model, electronic equipment and storage medium
CN111339268A (en) Entity word recognition method and device
CN111522944A (en) Method, apparatus, device and storage medium for outputting information
CN111310058B (en) Information theme recommendation method, device, terminal and storage medium
CN111858883A (en) Method and device for generating triple sample, electronic equipment and storage medium
CN115099239B (en) Resource identification method, device, equipment and storage medium
CN112380847A (en) Interest point processing method and device, electronic equipment and storage medium
CN111177462B (en) Video distribution timeliness determination method and device
CN112560461A (en) News clue generation method and device, electronic equipment and storage medium
CN112052397A (en) User feature generation method and device, electronic equipment and storage medium
CN111858880A (en) Method and device for obtaining query result, electronic equipment and readable storage medium
CN111984774A (en) Search method, device, equipment and storage medium
CN111738015A (en) Method and device for analyzing emotion polarity of article, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination