CN109165380B - Neural network model training method and device and text label determining method and device - Google Patents

Neural network model training method and device and text label determining method and device Download PDF

Info

Publication number
CN109165380B
CN109165380B CN201810837902.9A CN201810837902A CN109165380B CN 109165380 B CN109165380 B CN 109165380B CN 201810837902 A CN201810837902 A CN 201810837902A CN 109165380 B CN109165380 B CN 109165380B
Authority
CN
China
Prior art keywords
label
neural network
network model
word
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810837902.9A
Other languages
Chinese (zh)
Other versions
CN109165380A (en
Inventor
刘伟伟
史佳慧
骆世顺
黄萍萍
斯凌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MIGU Digital Media Co Ltd
Original Assignee
MIGU Digital Media Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MIGU Digital Media Co Ltd filed Critical MIGU Digital Media Co Ltd
Priority to CN201810837902.9A priority Critical patent/CN109165380B/en
Publication of CN109165380A publication Critical patent/CN109165380A/en
Application granted granted Critical
Publication of CN109165380B publication Critical patent/CN109165380B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

The invention discloses a neural network model training method, which comprises the following steps: acquiring a sample feature set formed by semantic topic feature vectors of a plurality of texts and a label set formed by a plurality of labels which can be used as text labels; training a neural network model based on the sample feature set and the label set in the following way: training the layer-1 neural network model by taking the sample feature set as the input of the layer-1 neural network model and taking the 1st label in the label set as the output of the layer-1 neural network model; training the mth-layer neural network model by taking the training result of the (m-1)th layer and the sample feature set as the input of the mth-layer neural network model and taking the mth label in the label set as the output of the mth-layer neural network model; where m is greater than or equal to 2 and less than or equal to M, and M is the total number of labels included in the label set. The invention also discloses a neural network model training device, a text label determining method and a text label determining device.

Description

Neural network model training method and device and text label determining method and device
Technical Field
The invention relates to the technical field of data processing, in particular to a neural network model training method and device and a text label determining method and device.
Background
In the related art, texts are labeled by a multi-class classification method in which each text corresponds to only one label, which leads to incomplete label identification results and low accuracy and robustness of text labels.
Disclosure of Invention
In view of this, embodiments of the present invention are expected to provide a neural network model training method and apparatus, and a text label determination method and apparatus, which can perform multi-label identification on a text, and improve accuracy and robustness of a text label.
In order to achieve the above purpose, the technical solution of the embodiment of the present invention is realized as follows:
in a first aspect, an embodiment of the present invention provides a neural network model training method, including:
acquiring a sample characteristic set formed by semantic topic characteristic vectors of a plurality of texts;
acquiring a label set consisting of a plurality of labels which can be used as text labels;
training a neural network model based on the sample feature set and the label set in the following way:
taking the sample feature set as the input of the layer-1 neural network model and the 1st label in the label set as the output of the layer-1 neural network model, and training the layer-1 neural network model so that it can predict the corresponding label from the keywords of a text to be labeled;
taking the training result of the (m-1)th layer and the sample feature set as the input of the mth-layer neural network model and the mth label in the label set as the output of the mth-layer neural network model, and training the mth-layer neural network model so that it can predict the corresponding label from the keywords; where m is greater than or equal to 2 and less than or equal to M, and M is the total number of labels included in the label set.
In a second aspect, an embodiment of the present invention provides a text label determining method based on the neural network model training method, including:
calculating a feature vector of a keyword corresponding to the text;
inputting the feature vectors of the keywords corresponding to the text into an m-level neural network model to obtain m corresponding labels, wherein m is more than or equal to 2;
calculating the distribution probability of each label in the label set under different categories;
and performing weighted calculation on the m labels and the distribution probability to obtain a label set corresponding to the text.
In a third aspect, an embodiment of the present invention provides a neural network model training apparatus, where the apparatus includes:
an acquisition unit, configured to acquire a sample feature set formed by semantic topic feature vectors of a plurality of texts and a label set formed by a plurality of labels which can be used as text labels;
the training unit is used for training the neural network model according to the sample feature set and the label set in the following modes:
taking the sample feature set as the input of the layer-1 neural network model and the 1st label in the label set as the output of the layer-1 neural network model, and training the layer-1 neural network model so that it can predict the corresponding label from the keywords of a text to be labeled;
taking the training result of the (m-1)th layer and the sample feature set as the input of the mth-layer neural network model and the mth label in the label set as the output of the mth-layer neural network model, and training the mth-layer neural network model so that it can predict the corresponding label from the keywords; where m is greater than or equal to 2 and less than or equal to M, and M is the total number of labels included in the label set.
In a fourth aspect, an embodiment of the present invention provides a text label determination apparatus, where the apparatus includes:
the first calculation unit is used for calculating a feature vector of a keyword corresponding to the text;
the input unit is used for inputting the feature vectors of the keywords corresponding to the text into an m-level neural network model to obtain m corresponding labels, wherein m is more than or equal to 2;
and the second calculation unit is used for calculating the distribution probability of each label in the label sets under different categories, and performing weighted calculation on the m labels and the distribution probability to obtain the label set corresponding to the text.
The embodiment of the invention provides a neural network model training method and device and a text label determining method and device, wherein the neural network model is trained based on a sample feature set formed by semantic subject feature vectors of a text and a label set formed by a plurality of labels, and the text label is determined based on the neural network model obtained by training; therefore, a plurality of labels can be determined for one text, and the accuracy and the robustness of the text labels are improved.
Drawings
FIG. 1 is a schematic diagram of an alternative process flow of a neural network model training method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a processing flow for obtaining a sample feature set composed of semantic topic feature vectors of a plurality of texts according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a process flow for obtaining keywords of a text according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a process for calculating a word weight for each word according to an embodiment of the present invention;
FIG. 5 is a schematic process flow diagram for training a neural network model based on a sample feature set and a tag set according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a detailed structure of a chain neural network CMLP according to an embodiment of the present invention;
FIG. 7 is a schematic view of another alternative processing flow of the neural network model training method according to the embodiment of the present invention;
FIG. 8 is a schematic processing flow diagram of a text label determination method according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a neural network model training apparatus according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of a text label determining apparatus according to an embodiment of the present invention;
fig. 11 is a schematic diagram of a hardware component structure of an electronic device according to an embodiment of the present invention.
Detailed Description
Before explaining the embodiments of the present invention in detail, terms related to the embodiments of the present invention will be explained.
1) Stop words are words that are automatically filtered out before or after processing natural language data (or text) in information retrieval, in order to save storage space and improve search efficiency.
2) Nonsense words refer to words such as "in", "go" and "how", which have no definite meaning on their own and only serve a function when placed in a complete sentence.
3) Named entity recognition refers to recognition of entities with specific meanings in text, and mainly includes names of people, places, organizations, proper nouns and the like.
So that the manner in which the features and aspects of the embodiments of the present invention can be understood in detail, a more particular description of the embodiments of the invention, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings.
In view of the problems of labeling texts by a multi-class classification method, sample label features may be trained using methods such as binary classification, label set combination, multi-label nearest-neighbor classification, one-versus-rest support vector machines and decision-tree algorithms such as random forests.
In view of the above problem, an optional processing flow diagram of the neural network model training method provided in the embodiment of the present invention, as shown in fig. 1, includes the following steps:
step S101, a sample feature set composed of semantic topic feature vectors of a plurality of texts is obtained.
In some optional embodiments, the neural network model training device processes the text to obtain a plurality of semantic topic feature vectors, and the plurality of semantic topic feature vectors form a sample feature set. The processing process of the neural network model training device for acquiring the sample feature set formed by semantic topic feature vectors of a plurality of texts, as shown in fig. 2, comprises the following steps:
in step S1011, keywords of the text are acquired.
In some embodiments, the processing flow of the neural network model training device to obtain the keywords of the text, as shown in fig. 3, includes the following steps:
and step S1a, performing word segmentation processing on the text to obtain a plurality of words.
In an optional embodiment, the neural network model training device adopts the Language Technology Platform (LTP) open-source tool to segment the text, and performs stop-word filtering, nonsense-word filtering and named entity recognition on the segmentation result. When named entity recognition is performed on the segmentation result, for example, "Wujiang Yin in Kun mountain" is recognized as ns (place name), "Bojian Junjian" as ni (organization name), and "Changshun" as nh (person name). By segmenting the text, the important features in the text can be effectively identified.
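For illustration only, a minimal sketch of step S1a is given below. It uses jieba as a stand-in segmenter (the original text names the LTP toolkit; the segmenter choice, the stop-word and nonsense-word lists, and all names here are assumptions, not part of the disclosure):

```python
# Hedged sketch of step S1a: word segmentation plus stop-word / nonsense-word filtering.
# jieba stands in for the LTP tool named above; the word lists are illustrative only.
import jieba.posseg as pseg

STOPWORDS = {"的", "了", "啊"}           # illustrative stop words
NONSENSE_WORDS = {"在", "去", "如何"}    # illustrative "nonsense" words

def segment_and_filter(text):
    """Segment the text and drop stop words and nonsense words, keeping POS tags."""
    pairs = pseg.lcut(text)              # [(word, pos_flag), ...], flags like n / v / ns / ni / nh
    return [(p.word, p.flag) for p in pairs
            if p.word not in STOPWORDS and p.word not in NONSENSE_WORDS]
```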
Step S1b, a word weight for each word is calculated.
In an alternative embodiment, the process flow of calculating the word weight of each word by the neural network model training device, as shown in fig. 4, includes the following steps:
step S1b1, a first word weight is calculated based on the self-attribute of each word.
Here, the self-attributes at least include: part of speech, position of the word, and degree of correlation with the topic; the parts of speech include nouns, verbs and adjectives.
In actual implementation, the first word weight of a noun is greater than that of a verb or an adjective, and the first word weight of particles such as auxiliary words and other function words is zero. The first word weights of words at the beginning and the end of the text are greater than those of words at other positions, and the first word weights of words related or similar to the title of the text are large. For example, if the title of the text is "most conscious of free children", the first word weight of "free" in the body text is large.
Step S1b2, computing an inter-word weight based on the first word weight of each word and the first weights of the words in the set of words.
Here, the word set is a set formed by a preset number of words before and after a given word; weighted iteration is performed on the first word weight of each word and the first weights of the words in its word set to obtain the inter-word weight.
Let W denote the word, S (W, n) denote the set of words, where n denotes the number of words in the set of words.
In some embodiments, the inter-word weight is calculated with a TextRank graph model over word features. Taking word A as an example, weighted iteration is performed on the first word weight of word A and the weight of each word in the word set corresponding to word A, as shown in the following formula:
Weight(A) = (1 - d) * WT1(A) + d * Σ_{W ∈ S(A, n)} Weight(W) / |S(W, n)|    (1)
where WT1(A) is the first word weight of word A and d is a damping factor.
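A possible realization of the weighted iteration of step S1b2 is sketched below; the damping factor, window size and iteration count are assumptions, since the original formula is only given as an image:

```python
# Hedged sketch of the TextRank-style inter-word weight over a window S(W, n).
def inter_word_weights(words, first_weights, n=2, d=0.85, iters=20):
    """words: word sequence of one text; first_weights: {word: first word weight}."""
    weights = dict(first_weights)                          # seeded with the first word weights
    for _ in range(iters):
        updated = {}
        for i, w in enumerate(words):
            window = words[max(0, i - n):i] + words[i + 1:i + 1 + n]   # S(w, n)
            acc = sum(weights.get(v, 0.0) for v in window) / max(len(window), 1)
            updated[w] = (1 - d) * first_weights.get(w, 0.0) + d * acc
        weights = updated
    return weights
```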
step S1b3, the word weight under the total amount and the word weight under the category are calculated.
In one embodiment, the neural network model training device calculates the word weights at full scale using the following formula:
Tf(W) = f(W) / f(all)
idf(W) = log( c(all) / c(W) )
Tfidf(W) = Tf(W) * idf(W)    (2)
wherein f (W) is the corresponding word frequency of the word W in all samples, f (all) is the total word number, c (all) is the number of all samples, and c (W) is the number of samples including W.
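For illustration, formula (2) can be computed as in the sketch below (the natural logarithm is an assumption; the original does not state the log base):

```python
# Hedged sketch of the word weight under the full corpus, TFidf_F, following formula (2).
import math
from collections import Counter

def tfidf_full(samples):
    """samples: list of word lists, one list per text. Returns {word: TFidf_F}."""
    freq = Counter(w for s in samples for w in s)            # f(W)
    f_all = sum(freq.values())                               # f(all)
    c_all = len(samples)                                     # c(all)
    doc_freq = Counter(w for s in samples for w in set(s))   # c(W)
    return {w: (freq[w] / f_all) * math.log(c_all / doc_freq[w]) for w in freq}
```

The word weight under a category, TFidf_C, can be obtained by calling the same function on the samples of one category only.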
The processing procedure of the neural network model training device for calculating the word weight TFidfc under the category is similar to the processing procedure for calculating the word weight under the full amount, and is not repeated here.
Step S1b4, determining the word weight of each word based on the product of the inter-word weight, the word weight under the full amount and the word weight under the category.
The neural network model training device multiplies the inter-word weight, the word weight under the full quantity and the word weight under the category to obtain a word weight, and the following formula is shown as follows:
WT(W) = Weight(W) * TFidfF(W) * TFidfC(W)    (3)
in some embodiments, the word weight may also be normalized, for example, by dividing the calculated word weight by the maximum weight value of the single sample to obtain a normalized weight of the word weight.
Step S1c, sorting the words by word weight and taking the N words with the largest word weights as the keywords of the text, where N is a positive integer.
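Putting the pieces together, formula (3), the optional normalization and the top-N selection of step S1c can be sketched as follows (function and variable names are illustrative assumptions):

```python
# Hedged sketch of the final word weight WT(W) = Weight(W) * TFidf_F(W) * TFidf_C(W),
# normalized by the maximum weight within the sample, followed by top-N keyword selection.
def word_weights(words, inter_w, tfidf_full_w, tfidf_cat_w):
    wt = {w: inter_w.get(w, 0.0) * tfidf_full_w.get(w, 0.0) * tfidf_cat_w.get(w, 0.0)
          for w in words}
    max_w = max(wt.values(), default=1.0) or 1.0
    return {w: v / max_w for w, v in wt.items()}             # normalized word weights

def top_n_keywords(wt, n=10):
    return [w for w, _ in sorted(wt.items(), key=lambda kv: kv[1], reverse=True)[:n]]
```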
In some embodiments, the obtained keywords may further be subjected to synonym conversion based on a synonym conversion dictionary, which is compiled from text-specific conversion relations; an example of the synonym dictionary is shown in Table 1 below:
TABLE 1
Based on the synonym dictionary shown in Table 1, words that are synonymous with labels in dimensions such as plot, character, background, writing style and character identity are converted, and labels that are unused or mislabeled in each dimension are cleaned and filtered, ensuring that labels are unique within each dimension. Meanwhile, because the semantics of the same label differ across categories, synonym conversion is also performed based on the semantic mapping between the labels of each category, in the format: category | original word | converted word; original words and converted words that are contained in the label library are identified by initial marks.
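For illustration, the "category | original word | converted word" conversion described above can be applied as in the following sketch (the dictionary entries are invented examples):

```python
# Hedged sketch of the per-category synonym conversion of keywords.
RAW_ENTRIES = [
    "romance|dark love|secret crush",             # category | original word | converted word (illustrative)
    "fantasy|cultivation|immortal cultivation",
]

def build_synonym_map(entries):
    mapping = {}
    for line in entries:
        category, original, converted = line.split("|")
        mapping[(category, original)] = converted
    return mapping

def convert_keywords(keywords, category, mapping):
    """Replace each keyword with its converted form for the given category, if any."""
    return [mapping.get((category, k), k) for k in keywords]
```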
Step S1012, determining a semantic topic feature vector of the text based on the keyword.
In some embodiments, the neural network model training device performs TF-idf calculation on the sample set library for the keywords, and then adopts Latent Semantic Indexing (LSI) for feature dimension reduction and semantic topic representation.
In a specific implementation, when performing the TF-idf calculation on the sample set library for the keywords, the samples of the library are used as the columns of a matrix and the keywords as its rows, so that the keywords are represented in a more abstract form.
When LSI latent semantic indexing is adopted for feature dimension reduction and semantic topic representation, the correlations among documents, topics, word senses and words are obtained based on Singular Value Decomposition (SVD); the matrix A(i×j) of j words for i documents can be decomposed as:
A(i×j) = U(i×k) * S(k×k) * V(k×j)    (4)
wherein U represents the degree of correlation between documents and topics, S represents the degree of correlation between topics and word senses, and V represents the degree of correlation between words and word senses.
The semantic topic feature vector of a text is obtained by folding its word feature vector into the k-dimensional topic space:
X = d * V(k×j)^T * S(k×k)^(-1)    (5)
wherein X is the semantic topic feature vector of the text and d is the word feature vector of the text.
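A minimal sketch of this step using scikit-learn is given below; the library choice, the number k of latent topics and the input format (keywords joined by spaces) are assumptions:

```python
# Hedged sketch of step S1012: keyword/document TF-idf matrix followed by SVD-based LSI.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

def semantic_topic_features(documents, k=100):
    """documents: one string of space-joined keywords per text. Returns topic vectors and fitted models."""
    vectorizer = TfidfVectorizer()
    tfidf = vectorizer.fit_transform(documents)     # documents x keywords matrix
    lsi = TruncatedSVD(n_components=k)              # truncated SVD, i.e. LSI with k topics
    topic_vectors = lsi.fit_transform(tfidf)        # documents x k semantic topic features
    return topic_vectors, vectorizer, lsi
```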
Step S1013, a sample feature set is constructed based on the semantic topic feature vector.
In some embodiments, the neural network model training device constructs a set based on semantic topic feature vectors of several texts, taking the set as a sample feature set.
Step S102, a label set formed by a plurality of labels which can be used as text labels is obtained.
In some embodiments, the label set may be preset by the neural network model training device, or sent to the neural network model training device by the server; the label at least comprises: city workplace, mysterious fantasy, science fantasy, swordsman, history, speech, romance, war, etc.
And step S103, training the neural network model based on the sample feature set and the label set.
Texts have complex and diverse characteristics: their modes of expression vary, their themes or viewpoints are often not strongly focused, and content such as personal plot assumptions, descriptions of experience and background culture is disorderly. A neural network model, on the other hand, has strong non-linear mapping capability, strong generalization over noisy data, the ability to learn sample features by itself, and high adaptivity and fault tolerance. Therefore, the embodiment of the invention trains the neural network model based on the sample feature set and the label set, so that the neural network model can predict the labels corresponding to a text.
In some embodiments, the processing flow for training the neural network model based on the sample feature set and the label set is shown in Fig. 5. The sample feature set includes n-dimensional sample features, denoted X(x1, x2, x3, ..., xn); the label set includes m-dimensional labels, denoted Y(y1, y2, y3, ..., ym); and a chained neural network (CMLP) model is constructed by sequentially combining neural network MLPs.
In a specific implementation, the sample feature set X(x1, x2, x3, ..., xn) is used as the input of the layer-1 neural network model and the 1st label y1 in the label set as the output of the layer-1 neural network model, and the layer-1 neural network model is trained so that it can predict the corresponding label from the keywords of a text to be labeled.
The training result ŷ1 of layer 1 and the sample feature set X(x1, x2, x3, ..., xn) are then used as the input of the layer-2 neural network model and the 2nd label y2 in the label set as the output of the layer-2 neural network model, and the layer-2 neural network model is trained so that it can predict the corresponding label from the keywords of a text to be labeled.
By analogy, the training result ŷ(m-1) of layer (m-1) and the sample feature set are used as the input of the mth-layer neural network model and the mth label in the label set as the output of the mth-layer neural network model, and the mth-layer neural network model is trained so that it can predict the corresponding label from the keywords; where m is greater than or equal to 3 and less than or equal to M, and M is the total number of labels included in the label set.
In the above embodiment, the CMLP is composed of an input layer, a plurality of hidden layers and an output layer, as shown in Fig. 6. The input layer is the semantic topic feature vector X of the text. To reduce feature noise and avoid feature sparsity, feature extraction is performed on the keywords of the text in advance; meanwhile, to highlight important implicit features in the samples, the sample features are represented by the topic distribution of the text, and the semantic topic vector X of the text is used as the input of the chained neural network. The semantic topic vector X of the text is first trained against a label y1 in the label set Y to obtain the first training result ŷ1. The first training result ŷ1 and the semantic topic vector X of the text are then used as input for parameter training against the next label y2 in the label set Y to obtain the second training result ŷ2. In the same way, the training result of the previous-layer neural network model is combined with the semantic topic vector X of the text as feature input to the hidden-layer training; in the iterative training, different perceptrons C (C1, C2, ..., Cm) are constructed and trained according to the different inputs of each layer of the neural network model, and features are passed on through weight calculation and an activation function (such as the relu function) until all m labels in the label set have been trained.
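As an illustration of the chained training described above, the sketch below uses scikit-learn's ClassifierChain with an MLP base classifier as a stand-in for the CMLP; the hyper-parameters and the use of this particular library are assumptions, not the patent's own implementation:

```python
# Hedged sketch of chained per-label MLP training: each label's classifier receives
# the topic vector X plus the predictions for all earlier labels in the chain.
from sklearn.multioutput import ClassifierChain
from sklearn.neural_network import MLPClassifier

def train_cmlp(X, Y):
    """X: (n_samples, n_features) semantic topic vectors; Y: (n_samples, M) binary label matrix."""
    base = MLPClassifier(hidden_layer_sizes=(128,), activation="relu", max_iter=300)
    chain = ClassifierChain(base)        # default order follows the label columns y1, y2, ..., yM
    chain.fit(X, Y)
    return chain
```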
The training process of the neural network model is converted into the process of finding the combination of weights and biases between neurons that minimizes the loss between the actual value and the expected value. For the forward propagation of the input features, the weighted sum f = w·x + w'·y_l + b is computed, where y_l is the prediction passed on from the previous layer, and the activation function
relu(z) = max(0, z)    (6)
is applied, yielding the loss value between the actual value and the expected value.
The loss function is the cross-entropy loss shown below. When the true value y of a single sample is 1, the loss is 0 if the predicted probability h is 1; if h is 0, the loss value becomes infinite. The parameters w, w', b and the like are continuously updated through chain-rule derivation with stochastic gradient descent, and after several iterations the loss function reaches its minimum value, giving the optimal parameter model.
L(h, y) = -[ y·log(h) + (1 - y)·log(1 - h) ]    (7)
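The forward computation and loss above can be made concrete with the following numpy sketch; the shapes, the sigmoid output and the parameter names are assumptions consistent with a per-label binary classifier:

```python
# Hedged sketch of one chain step: f = w·x + w'·y_prev + b, relu hidden layer,
# sigmoid output h, and the cross-entropy loss of formula (7).
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, y_prev, w, w_prime, b, w_out, b_out):
    hidden = relu(w @ x + w_prime @ y_prev + b)            # weighted sum plus activation
    h = 1.0 / (1.0 + np.exp(-(w_out @ hidden + b_out)))    # predicted probability
    return h

def cross_entropy(h, y, eps=1e-12):
    return -(y * np.log(h + eps) + (1.0 - y) * np.log(1.0 - h + eps))
```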
In the embodiment of the invention, each text corresponds to a group of labels through the multi-label classification method; for example, a movie can correspond to several labels such as comedy, history and war at the same time, which improves the accuracy and robustness of the text labels. By contrast, the multi-class classification method in the related art makes each text correspond to only one label (a movie could be labeled only one of comedy, history or war), which reduces the accuracy and robustness of the text labels.
Based on the neural network model training method, the embodiment of the invention takes the abstract labels corresponding to the books of each category and each dimension of web novels, as shown in Table 2 below, and performs training on the training sample set with cross-dimension labels according to the sample category and the abstract label dimensions. For example, chained network mapping training is performed on 11,499 online books of the "boys" category against 502 target abstract labels; the whole training process goes from the book content to the final label set, always subordinate to the category to which the book and the labels belong. First, preprocessing such as synonym conversion is performed on the content keywords of the online books in each category, and the feature vector of each book is obtained through feature identification and distribution calculation. Next, any label in the target label set is taken to train the sample set; if the first training label is "love", feature learning and model training are performed on the "love" results over the whole training set to obtain the identification result of that label over the whole training set. Then, for the first training result combined with the features of the whole training set, a second label, e.g. "romantic", is trained to obtain the training result of the second label. The iteration is repeated until all labels in the whole abstract label set have been trained, and the training of the model is finished when the loss value reaches its minimum. During recognition with the trained model, each sample obtains recognition results for the 502 labels; taking one web novel as an example, the label result set obtained by multi-label recognition through the neural network model is "youth", "campus", "dark love" ....
TABLE 2
Finally, the ratio of the number of samples covered by one label under a category to the number of samples of all labels under that category is calculated to obtain the distribution probability of each label in the label set under different categories. The distribution probability of label t under a category is:
P(t) = S(t) / Σ S(i)    (8)
where S(t) is the number of samples covered by label t under the category and Σ S(i) is the total number of samples associated with all labels of that category.
Finally, for each sample, the recognition results are weighted by the label distribution probabilities under the sample's category to obtain the final label result set, as shown in Table 3 below:
TABLE 3
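For illustration, the distribution probability of formula (8) and the final weighting can be sketched as follows (data structures and names are assumptions):

```python
# Hedged sketch of the per-category label distribution probability and the weighted
# combination of a sample's recognition scores with that distribution.
from collections import Counter

def label_distribution(category_samples):
    """category_samples: list of label lists, one per sample of a single category."""
    counts = Counter(label for labels in category_samples for label in labels)   # S(t)
    total = sum(counts.values())                                                 # sum of S(i)
    return {label: c / total for label, c in counts.items()}                     # P(t)

def weighted_labels(recognition_scores, distribution, top_k=5):
    """recognition_scores: {label: score from the chain model} for one sample."""
    weighted = {l: s * distribution.get(l, 0.0) for l, s in recognition_scores.items()}
    return sorted(weighted, key=weighted.get, reverse=True)[:top_k]
```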
As shown in fig. 7, another optional processing flow of the neural network model training method provided in the embodiment of the present invention includes the following steps:
step S201, extracting keywords based on the book brief introduction and the book text to obtain book keywords.
And step S202, carrying out synonym conversion on the book keywords.
Step S203, judging whether the keyword can be subjected to label identification, and if so, taking the keyword as a primary label result; if the determination result is no, step S204 is executed.
Step S204, TF-idf calculation is carried out on the keywords and a topic feature vector is determined by the LSI topic model.
Step S205, performing multi-label classification with the classification model, using the topic feature vectors as the sample set together with the label set, to obtain the book label result.
An embodiment of the present invention further provides a text label determination method based on the above neural network model training method; its processing flow, as shown in Fig. 8, includes:
step S301, calculating semantic topic feature vectors of the text.
In the embodiment of the present invention, the processing procedure for calculating the semantic subject feature vector of the text is the same as the processing procedure described in step S101, and is not described herein again.
Step S302, inputting the semantic topic feature vector of the text into an m-level neural network model to obtain m corresponding labels, wherein m is more than or equal to 2.
Step S303, calculating the distribution probability of each label in the label set under different categories.
In the embodiment of the invention, the distribution probability of each label in the label set under different categories is calculated by using the formula (8).
And S304, performing weighted calculation on the m labels and the distribution probability to obtain a label set corresponding to the text.
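Under the assumptions of the earlier sketches (the fitted vectorizer, lsi and chain from the training sketches, and the weighted_labels helper above), steps S301 to S304 can be illustrated end to end as follows:

```python
# Hedged sketch of text label determination for a new text.
def predict_text_labels(text_keywords, vectorizer, lsi, chain, label_names, distribution, top_k=5):
    tfidf = vectorizer.transform([" ".join(text_keywords)])   # keyword TF-idf vector
    topic_vec = lsi.transform(tfidf)                          # semantic topic feature vector (S301)
    probs = chain.predict_proba(topic_vec)[0]                 # one probability per label (S302)
    scores = dict(zip(label_names, probs))
    return weighted_labels(scores, distribution, top_k)       # S303 + S304
```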
Based on the above neural network model training method, an embodiment of the present invention further provides a neural network model training device, and a structure of the neural network model training device 400, as shown in fig. 9, includes:
an obtaining unit 401, configured to obtain a sample feature set formed by semantic topic feature vectors of a plurality of texts and a tag set formed by a plurality of tags that can be used as text tags;
a training unit 402, configured to train a neural network model according to the sample feature set and the label set in the following manner:
taking the sample feature set as the input of a layer 1 neural network model, taking the 1 st label in the label set as the output of the layer 1 neural network model, and training the layer 1 neural network model to predict the performance of the corresponding label according to the keywords of the text to be assigned with the label;
taking the training result of the (m-1)th layer and the sample feature set as the input of the mth-layer neural network model and the mth label in the label set as the output of the mth-layer neural network model, and training the mth-layer neural network model so that it can predict the corresponding label from the keywords; where m is greater than or equal to 2 and less than or equal to M, and M is the total number of labels included in the label set.
In this embodiment of the present invention, the obtaining unit 401 is further configured to obtain a keyword of a text; determining semantic topic feature vectors of the text based on the keywords; and constructing a sample feature set based on the semantic topic feature vector.
In this embodiment of the present invention, the obtaining unit 401 is further configured to perform word segmentation processing on the text to obtain a plurality of words;
calculating a word weight of each word;
and sequencing the word weights according to the sizes, and taking N words with large word weights as the keywords of the text, wherein N is a positive integer.
In this embodiment of the present invention, the obtaining unit 401 is further configured to calculate a first word weight based on the attribute of each word;
calculating an inter-word weight based on a first word weight of each word and first weights of words in a word set, wherein the word set is a set formed by words with a preset number forward and backward around one word;
calculating the word weight under the total amount and the word weight under the category;
determining a word weight for each word based on a product of the inter-word weight, the word weight at the full amount, and the word weight at the category.
In this embodiment of the present invention, the obtaining unit 401 is further configured to perform weighted iteration on the first word weight of each word and the first weights of the words in the word set, so as to obtain an inter-word weight.
In this embodiment of the present invention, the obtaining unit 401 is further configured to calculate a semantic topic feature vector of a text;
inputting the semantic topic feature vector of the text into an m-level neural network model to obtain m corresponding labels, wherein m is more than or equal to 2;
calculating the distribution probability of each label in the label set under different categories;
and performing weighted calculation on the m labels and the distribution probability to obtain a label set corresponding to the text.
In this embodiment of the present invention, the obtaining unit 401 is further configured to calculate a ratio between the number of samples covered by one label in the class labels and the number of samples of all labels in the class labels, so as to obtain the distribution probability of each label in the label set under different classes.
Based on the foregoing text label determining method, an embodiment of the present invention further provides a text label determining apparatus, where a constituent structure of the text label determining apparatus 500 is as shown in fig. 10, and the text label determining apparatus includes:
a first calculating unit 501, configured to calculate a feature vector of a keyword corresponding to a text;
an input unit 502, configured to input the feature vectors of the keywords corresponding to the text into an m-level neural network model to obtain m corresponding tags, where m is greater than or equal to 2;
the second calculating unit 503 is configured to calculate a distribution probability of each label in the label sets under different categories, and perform weighted calculation on the m labels and the distribution probability to obtain a label set corresponding to the text.
In this embodiment of the present invention, the second calculating unit 503 is further configured to calculate a ratio between the number of samples covered by one label in the class label and the number of samples of all labels in the class label, so as to obtain the distribution probability of each label in the label set under different classes.
Fig. 11 is a schematic diagram of a hardware component structure of an electronic device (a neural network model training apparatus or a text label determination apparatus) according to an embodiment of the present invention, where the electronic device 700 includes: at least one processor 701, a memory 702, and at least one network interface 704. The various components in the electronic device 700 are coupled together by a bus system 705. It is understood that the bus system 705 is used to enable communications among the components. The bus system 705 includes a power bus, a control bus, and a status signal bus in addition to a data bus. But for clarity of illustration the various busses are labeled in figure 11 as the bus system 705.
It will be appreciated that the memory 702 can be either volatile memory or non-volatile memory, and can include both volatile and non-volatile memory. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a magnetic random access memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM) and Direct Rambus Random Access Memory (DRRAM). The memory 702 described in connection with the embodiments of the invention is intended to comprise, without being limited to, these and any other suitable types of memory.
The memory 702 in embodiments of the present invention is used to store various types of data in support of the operation of the electronic device 700. Examples of such data include: any computer program for operating on electronic device 700, such as application 7022. Programs that implement methods in accordance with embodiments of the present invention can be included within application program 7022.
The method disclosed in the above embodiments of the present invention may be applied to the processor 701, or implemented by the processor 701. The processor 701 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be implemented by integrated logic circuits of hardware or instructions in the form of software in the processor 701. The Processor 701 may be a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. The processor 701 may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 702, and the processor 701 may read the information in the memory 702 and perform the steps of the aforementioned methods in conjunction with its hardware.
In an exemplary embodiment, the electronic Device 700 may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), FPGAs, general purpose processors, controllers, MCUs, MPUs, or other electronic components for performing the foregoing methods.
Accordingly, an embodiment of the present invention further provides a storage medium, where a computer program is stored in the storage medium, and when the computer program is executed by a processor, the computer program is used to implement the neural network model training method according to the embodiment of the present invention or the text label determining method according to the embodiment of the present invention.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only exemplary of the present invention and should not be taken as limiting the scope of the present invention, and any modifications, equivalents, improvements, etc. that are within the spirit and principle of the present invention should be included in the present invention.

Claims (9)

1. A neural network model training method, the method comprising:
acquiring a sample characteristic set formed by semantic topic characteristic vectors of a plurality of texts;
acquiring a label set consisting of a plurality of labels as text labels;
training the neural network model based on the sample feature set and the label set in the following way:
taking the sample feature set as the input of a layer 1 neural network model, taking the 1 st label in the label set as the output of the layer 1 neural network model, and training the layer 1 neural network model to predict the performance of the corresponding label according to the keywords of the text to be assigned with the label;
training the mth-level neural network model to predict the performance of the corresponding label according to the keyword by taking the training result of the (m-1) th layer and the sample feature set as the input of the mth layer neural network model and taking the mth label in the label set as the output of the mth layer neural network model; m is more than or equal to 2 and less than or equal to M, and M is the total number of the labels included in the label set; the neural network model includes: an input layer, M hidden layers and an output layer, wherein the M hidden layers take neural networks of M multilayer perceptrons (MLPs) as M classifiers and construct the M classifiers in a chain structure.
2. The method of claim 1, wherein obtaining a sample feature set consisting of semantic topic feature vectors of a plurality of texts comprises:
acquiring keywords of a text;
determining semantic subject feature vectors of the text based on the keywords;
and constructing a sample feature set based on the semantic topic feature vector.
3. The method of claim 2, wherein obtaining keywords of the text comprises:
performing word segmentation processing on the text to obtain a plurality of words;
calculating a word weight of each word;
and sequencing the word weights according to the sizes, and taking N words with large word weights as the keywords of the text, wherein N is a positive integer.
4. The method of claim 3, wherein the calculating a word weight for each word comprises:
calculating a first word weight based on the self-attribute of each word;
calculating an inter-word weight based on a first word weight of each word and first weights of words in a word set, wherein the word set is a set formed by words with a preset number forward and backward around one word;
calculating the word weight under the total amount and the word weight under the category;
determining a word weight for each word based on a product of the inter-word weight, the word weight at the full amount, and the word weight at the category.
5. The method of claim 4, wherein computing inter-word weights based on the first word weight for each word and the first weights for words in the set of words comprises:
and carrying out weighted iteration on the first word weight of each word and the first weight of each word in the word set to obtain the inter-word weight.
6. A method for text label determination, the method comprising:
calculating semantic topic feature vectors of the text;
inputting the semantic topic feature vector into an m-level neural network model, wherein the m-level neural network model is obtained by training according to the neural network model training method of claim 1, and corresponding m labels are obtained, wherein m is more than or equal to 2;
calculating the distribution probability of each label in the label set under different categories;
and performing weighted calculation on the m labels and the distribution probability to obtain a label set corresponding to the text.
7. The method of claim 6, wherein the calculating the distribution probability of each label in the label set under different categories comprises:
and calculating the ratio of the number of samples covered by one label in the class labels to the number of samples of all labels in the class labels to obtain the distribution probability of each label in the label set under different classes.
8. An apparatus for neural network model training, the apparatus comprising:
the system comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a sample characteristic set formed by semantic subject characteristic vectors of a plurality of texts and a label set formed by a plurality of labels as text labels;
a training unit, configured to train the neural network model according to the sample feature set and the label set in the following manner:
taking the sample feature set as the input of a layer 1 neural network model, taking the 1 st label in the label set as the output of the layer 1 neural network model, and training the layer 1 neural network model to predict the performance of the corresponding label according to the keywords of the text to be assigned with the label;
training the mth-level neural network model to predict the performance of the corresponding label according to the keyword by taking the training result of the (m-1) th layer and the sample feature set as the input of the mth layer neural network model and taking the mth label in the label set as the output of the mth layer neural network model; m is more than or equal to 2 and less than or equal to M, and M is the total number of the labels included in the label set; the neural network model includes: an input layer, M hidden layers and an output layer, wherein the M hidden layers take neural networks of M multilayer perceptrons (MLPs) as M classifiers and construct the M classifiers in a chain structure.
9. A text label determination apparatus, the apparatus comprising:
the first calculation unit is used for calculating a feature vector of a keyword corresponding to the text;
an input unit, configured to input feature vectors of keywords corresponding to the text into an m-level neural network model, where the m-level neural network model is obtained by training according to the neural network model training method of claim 1, and m corresponding tags are obtained, where m is greater than or equal to 2;
and the second calculating unit is used for calculating the distribution probability of each label in the label sets under different categories, and performing weighted calculation on the m labels and the distribution probability to obtain the label set corresponding to the text.
CN201810837902.9A 2018-07-26 2018-07-26 Neural network model training method and device and text label determining method and device Active CN109165380B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810837902.9A CN109165380B (en) 2018-07-26 2018-07-26 Neural network model training method and device and text label determining method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810837902.9A CN109165380B (en) 2018-07-26 2018-07-26 Neural network model training method and device and text label determining method and device

Publications (2)

Publication Number Publication Date
CN109165380A CN109165380A (en) 2019-01-08
CN109165380B true CN109165380B (en) 2022-07-01

Family

ID=64898322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810837902.9A Active CN109165380B (en) 2018-07-26 2018-07-26 Neural network model training method and device and text label determining method and device

Country Status (1)

Country Link
CN (1) CN109165380B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109992646B (en) * 2019-03-29 2021-03-26 腾讯科技(深圳)有限公司 Text label extraction method and device
CN110222160B (en) * 2019-05-06 2023-09-15 平安科技(深圳)有限公司 Intelligent semantic document recommendation method and device and computer readable storage medium
CN110147499B (en) * 2019-05-21 2021-09-14 智者四海(北京)技术有限公司 Labeling method, recommendation method and recording medium
CN110472665A (en) * 2019-07-17 2019-11-19 新华三大数据技术有限公司 Model training method, file classification method and relevant apparatus
CN110428052B (en) * 2019-08-01 2022-09-06 江苏满运软件科技有限公司 Method, device, medium and electronic equipment for constructing deep neural network model
CN110491374A (en) * 2019-08-27 2019-11-22 北京明日汇科技管理有限公司 Hotel service interactive voice recognition methods neural network based and device
CN111177385B (en) * 2019-12-26 2023-04-07 北京明略软件系统有限公司 Multi-level classification model training method, multi-level classification method and device
CN111339301B (en) * 2020-02-28 2023-11-28 创新奇智(青岛)科技有限公司 Label determining method, label determining device, electronic equipment and computer readable storage medium
CN111666769A (en) * 2020-06-11 2020-09-15 暨南大学 Method for extracting financial field event sentences in annual newspaper
CN111695053A (en) * 2020-06-12 2020-09-22 上海智臻智能网络科技股份有限公司 Sequence labeling method, data processing device and readable storage medium
CN111695052A (en) * 2020-06-12 2020-09-22 上海智臻智能网络科技股份有限公司 Label classification method, data processing device and readable storage medium
CN113822013B (en) * 2021-03-08 2024-04-05 京东科技控股股份有限公司 Labeling method and device for text data, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834747A (en) * 2015-05-25 2015-08-12 中国科学院自动化研究所 Short text classification method based on convolution neutral network
CN105046274A (en) * 2015-07-13 2015-11-11 浪潮软件集团有限公司 Automatic labeling method for electronic commerce commodity category
KR20170039951A (en) * 2015-10-02 2017-04-12 네이버 주식회사 Method and system for classifying data consisting of multiple attribues represented by sequences of text words or symbols using deep learning
CN106909654A (en) * 2017-02-24 2017-06-30 北京时间股份有限公司 A kind of multiclass classification system and method based on newsletter archive information
CN107944946A (en) * 2017-11-03 2018-04-20 清华大学 Commercial goods labels generation method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11086918B2 (en) * 2016-12-07 2021-08-10 Mitsubishi Electric Research Laboratories, Inc. Method and system for multi-label classification

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834747A (en) * 2015-05-25 2015-08-12 中国科学院自动化研究所 Short text classification method based on convolution neutral network
CN105046274A (en) * 2015-07-13 2015-11-11 浪潮软件集团有限公司 Automatic labeling method for electronic commerce commodity category
KR20170039951A (en) * 2015-10-02 2017-04-12 네이버 주식회사 Method and system for classifying data consisting of multiple attribues represented by sequences of text words or symbols using deep learning
CN106909654A (en) * 2017-02-24 2017-06-30 北京时间股份有限公司 A kind of multiclass classification system and method based on newsletter archive information
CN107944946A (en) * 2017-11-03 2018-04-20 清华大学 Commercial goods labels generation method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Classifier chains for multi-label classification;Jsse Read 等;《Machine Learning》;20110630(第85期);第333-359页 *
基于Spark的组合分类器链多标签分类方法;王进 等;《中国科学技术大学学报》;20170415;第47卷(第4期);第350-357页 *
基于路径选择的层次多标签分类;张春焰等;《计算机技术与发展》;20180516(第10期);第40-50页 *

Also Published As

Publication number Publication date
CN109165380A (en) 2019-01-08

Similar Documents

Publication Publication Date Title
CN109165380B (en) Neural network model training method and device and text label determining method and device
WO2018049960A1 (en) Method and apparatus for matching resource for text information
Zhao et al. ZYJ123@ DravidianLangTech-EACL2021: Offensive language identification based on XLM-RoBERTa with DPCNN
Wang et al. Encoding syntactic dependency and topical information for social emotion classification
Mahmoud et al. A text semantic similarity approach for Arabic paraphrase detection
Schütz et al. Automatic sexism detection with multilingual transformer models
Li et al. LSTM-based deep learning models for answer ranking
Celikyilmaz et al. An empirical investigation of word class-based features for natural language understanding
Mahmoud et al. Hybrid Attention-based Approach for Arabic Paraphrase Detection
Ramesh et al. Abstractive text summarization using t5 architecture
Yuan et al. Personalized sentence generation using generative adversarial networks with author-specific word usage
Alarcón et al. Hulat-ALexS CWI Task-CWI for Language and Learning Disabilities Applied to University Educational Texts.
CN107729509A (en) The chapter similarity decision method represented based on recessive higher-dimension distributed nature
Zhu et al. A named entity recognition model based on ensemble learning
Lazemi et al. Persian plagirisim detection using CNN s
CN113641789A (en) Viewpoint retrieval method and system based on hierarchical fusion of multi-head attention network and convolutional network
Aydinov et al. Investigation of automatic part-of-speech tagging using CRF, HMM and LSTM on misspelled and edited texts
Ma et al. ASR hypothesis reranking using prior-informed restricted boltzmann machine
Sathyanarayanan et al. Kannada named entity recognition and classification using bidirectional long short-term memory networks
Magooda et al. Rdi_team at semeval-2016 task 3: RDI unsupervised framework for text ranking
Thu et al. Generating myanmar news headlines using recursive neural network
Le-Hong et al. A semantics-aware approach for multilingual natural language inference
Bosc et al. Learning word embeddings from dictionary definitions only
Song et al. A hybrid model for community-oriented lexical simplification
Rahmath et al. Pre-trained Word Embeddings for Malayalam Language: A Review

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant