CN109783801B - Electronic device, multi-label classification method and storage medium - Google Patents


Info

Publication number
CN109783801B
Authority
CN
China
Prior art keywords: sentences, zero, sentence, splitting, antecedent
Prior art date
Legal status
Active
Application number
CN201811529912.2A
Other languages
Chinese (zh)
Other versions
CN109783801A (en)
Inventor
刘俊
肖龙源
蔡振华
李稀敏
刘晓葳
谭玉坤
Current Assignee
Xiamen Kuaishangtong Technology Corp ltd
Original Assignee
Xiamen Kuaishangtong Technology Corp ltd
Priority date
Filing date
Publication date
Application filed by Xiamen Kuaishangtong Technology Corp ltd
Priority to CN201811529912.2A
Publication of CN109783801A
Application granted
Publication of CN109783801B
Legal status: Active
Anticipated expiration


Classifications

    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an electronic device, a multi-label classification method and a storage medium, the method comprising the following steps. A zero-pronoun identification and resolution step: identifying and resolving the zero pronouns of the sentence to be classified to obtain an expanded sentence. A sentence splitting step: performing syntactic analysis on the expanded sentence, extracting the parallel relation items in the expanded sentence, and splitting the expanded sentence by replacement to form a plurality of split sentences; or designing a targeted corpus-labeling scheme, manually labeling the parallel relation items and the other items in the resolved expanded sentence, training a Bi-LSTM-CRF model for sentence splitting, and using the trained Bi-LSTM-CRF model to classify and split the expanded sentence into a plurality of split sentences. The method can effectively split a complex multi-label sentence into a plurality of simple single-label sentences.

Description

Electronic device, multi-label classification method and storage medium
Technical Field
The invention relates to the technical field of multi-label classification, in particular to an electronic device, a multi-label classification method and a storage medium.
Background
Existing deep-learning multi-label sentence classification techniques follow two main directions: first, adopting multi-label classification criteria, such as the Hamming loss, to predict the label set directly; second, converting the task into multiple single-label binary classification problems and predicting the probability of each label separately. The former suffers from the high degree of freedom of the label set, high training difficulty, the need for a large number of independent training samples, and the inability to reuse single-label training samples; the latter's predictions may be disturbed by information from labels other than the one currently being predicted, or show systematic bias because the single-label training samples are distributed differently from the multi-label test samples.
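The second direction above can be sketched as follows. This is a minimal illustration of binary relevance: the keyword rules and label names are hypothetical stand-ins for trained single-label models, not part of the patent itself:

```python
# Binary-relevance sketch: one independent binary classifier per label.
# The per-label "classifiers" here are trivial keyword rules standing in
# for trained models; the label names are illustrative.
def make_keyword_classifier(keyword):
    """Return a trivial binary classifier: does the sentence mention keyword?"""
    return lambda sentence: keyword in sentence

CLASSIFIERS = {
    "price_inquiry": make_keyword_classifier("price"),
    "hair_removal": make_keyword_classifier("hair removal"),
}

def predict_labels(sentence):
    """Run every binary classifier independently and collect positive labels."""
    return {label for label, clf in CLASSIFIERS.items() if clf(sentence)}
```

Because each classifier sees only its own label's training data, a multi-label test sentence is out of distribution for every one of them, which is exactly the bias the description points out.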
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an electronic device, a multi-label classification method and a storage medium.
In order to achieve the above object, the present invention provides an electronic device, including a memory and a processor connected to the memory, where the memory stores a processing system that can be executed on the processor, and the processing system when executed by the processor implements the following steps:
a zero-pronoun identification and resolution step:
identifying and resolving the zero pronouns of the sentence to be classified to obtain an expanded sentence, wherein a zero pronoun is a gap in the sentence to be classified left by an identifiable phrase or word;
a sentence splitting step:
performing syntactic analysis on the expanded sentence and extracting the parallel relation items in the expanded sentence; splitting the expanded sentence by replacement to form a plurality of split sentences;
or designing a targeted corpus-labeling scheme, manually labeling the parallel relation items and the other items in the resolved expanded sentence, training a Bi-LSTM-CRF model for sentence splitting, and using the trained Bi-LSTM-CRF model to classify and split the expanded sentence into a plurality of split sentences; the other items include shared items and deleted items.
Further, the processing system of the electronic device, when executed by the processor, also implements an intention recognition step: inputting each of the plurality of split sentences obtained in the sentence splitting step into a single-intention recognition model, respectively, to obtain a plurality of intentions.
In the above electronic device, preferably, the zero-pronoun identification and resolution step specifically comprises:
segmenting the sentence to be classified with full-mode jieba word segmentation to obtain a candidate antecedent set;
using a first recurrent neural network to perform feature learning on the preceding text of the zero pronoun, obtaining a vector representation of the zero pronoun's preceding text; computing the attention weight of each word in each candidate antecedent with a general attention model and taking the attention-weighted average of the word vectors to obtain a representation of the candidate antecedent; concatenating the candidate antecedent representation with the preceding-text vector representation; and computing, with a first feedforward neural network, the probability that the candidate antecedent is the antecedent of the zero pronoun;
and using a second recurrent neural network to perform feature learning on the context of the zero pronoun, obtaining a vector representation of the zero pronoun's context; likewise computing the attention weight of each word in each candidate antecedent with the general attention model and taking the attention-weighted average of the word vectors to obtain a representation of the candidate antecedent; concatenating the candidate antecedent representation with the context vector representation of the zero pronoun; and computing, with a second feedforward neural network, the probability that the candidate antecedent is the antecedent of the zero pronoun.
In the steps implemented when the processing system is executed by the processor, the syntactic analysis of the expanded sentence uses the parsing function of the Stanford NLP toolkit: the expanded sentence obtained after zero-pronoun resolution is parsed to obtain a syntactic structure tree, and the parallel relation items in the expanded sentence are extracted.
Correspondingly, the invention also provides a multi-label classification method, which comprises the following steps:
a zero-pronoun identification and resolution step:
identifying and resolving the zero pronouns of the sentence to be classified to obtain an expanded sentence, wherein a zero pronoun is a gap in the sentence to be classified left by an identifiable phrase or word;
a sentence splitting step:
performing syntactic analysis on the expanded sentence and extracting the parallel relation items in the expanded sentence; splitting the expanded sentence by replacement to form a plurality of split sentences;
or designing a targeted corpus-labeling scheme, manually labeling the parallel relation items and the other items in the resolved expanded sentence, training a Bi-LSTM-CRF model for sentence splitting, and using the trained Bi-LSTM-CRF model to classify and split the expanded sentence into a plurality of split sentences; the other items include shared items and deleted items.
Further, the multi-label classification method also comprises
an intention recognition step: inputting each of the plurality of split sentences obtained in the sentence splitting step into a single-intention recognition model, respectively, to obtain a plurality of intentions.
Further, optionally, the zero-pronoun identification and resolution step specifically comprises:
segmenting the sentence to be classified with full-mode jieba word segmentation to obtain a candidate antecedent set;
using a first recurrent neural network to perform feature learning on the preceding text of the zero pronoun, obtaining a vector representation of the zero pronoun's preceding text; computing the attention weight of each word in each candidate antecedent with a general attention model and taking the attention-weighted average of the word vectors to obtain a representation of the candidate antecedent; concatenating the candidate antecedent representation with the preceding-text vector representation; and computing, with a first feedforward neural network, the probability that the candidate antecedent is the antecedent of the zero pronoun;
and using a second recurrent neural network to perform feature learning on the context of the zero pronoun, obtaining a vector representation of the zero pronoun's context; likewise computing the attention weight of each word in each candidate antecedent with the general attention model and taking the attention-weighted average of the word vectors to obtain a representation of the candidate antecedent; concatenating the candidate antecedent representation with the context vector representation of the zero pronoun; and computing, with a second feedforward neural network, the probability that the candidate antecedent is the antecedent of the zero pronoun.
In the multi-label classification method, the syntactic analysis of the expanded sentence uses the parsing function of the Stanford NLP toolkit: the expanded sentence obtained after zero-pronoun resolution is parsed to obtain a syntactic structure tree, and the parallel relation items in the expanded sentence are extracted.
The invention also provides a computer readable storage medium having stored thereon a processing system which when executed by a processor implements the steps of the multi-label classification method described above.
The beneficial effects of the invention are as follows: the multi-label sentence sample to be classified is split into a set of valid single-label sentence samples, so that trained single-label classification models can be used for multi-label prediction without loss of prediction accuracy, avoiding the problem of prediction samples being distributed differently from training samples. In industrial applications this saves the development and training cost of dedicated multi-label classification algorithms, effectively integrates existing resources, and makes maximal use of existing single-label training data and models. In addition, the invention is extensible and can meet the demand for rapid feedback in fast-changing markets: when a new in-demand label appears, only single-label data for that label needs to be collected and a model trained on it, which can then be added to the multi-label classification system without retraining a multi-label model. The approach also makes it easy to port high-quality open-source classification models built by others; once such a model has been validated, it can be grafted into the system.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:
FIG. 1 is a schematic diagram of an electronic device according to the present invention;
fig. 2 is a flow chart of the multi-label classification method according to the present invention.
FIG. 3 is a schematic diagram of the syntactic structure tree obtained, in an embodiment of the present invention, by syntactic analysis of the expanded sentence after zero-pronoun resolution;
FIG. 4 is a schematic diagram of classification splitting by Bi-LSTM-CRF model according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems to be solved, the technical solutions and the beneficial effects clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It should be noted that the description of "first", "second", etc. in this disclosure is for descriptive purposes only and is not to be construed as indicating or implying a relative importance or implying an indication of the number of technical features being indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature.
The invention provides an electronic device, which is an electronic device capable of automatically performing numerical calculation and/or information processing according to a preset or stored instruction. The electronic device comprises an electronic computer, a single server, a server group formed by a plurality of servers or a cloud server formed by a large number of hosts or servers based on cloud computing. As shown in fig. 1, in an embodiment of the present invention, the electronic device includes, but is not limited to, a memory 2 and a processor 1 connected to the memory 2, where the memory 2 stores a processing system that can run on the processor 1.
The memory 2 to which the present invention refers comprises an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the electronic device; the readable storage medium may be any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
The processor 1 to which the present invention refers may be a central processing unit or other data processing chip. The processor 1 is arranged to control the overall operation of the electronic device for running program code or processing data stored in the memory 2, such as running a processing system or the like.
The processing system, when executed by the processor 1, performs the steps of:
a zero-pronoun identification and resolution step:
identifying and resolving the zero pronouns of the sentence to be classified to obtain an expanded sentence, wherein a zero pronoun is a gap in the sentence to be classified left by an identifiable phrase or word;
a sentence splitting step:
performing syntactic analysis on the expanded sentence and extracting the parallel relation items in the expanded sentence; splitting the expanded sentence by replacement to form a plurality of split sentences;
or designing a targeted corpus-labeling scheme, manually labeling the parallel relation items and the other items in the resolved expanded sentence, training a Bi-LSTM-CRF model for sentence splitting, and using the trained Bi-LSTM-CRF model to classify and split the expanded sentence into a plurality of split sentences; the other items include shared items and deleted items.
Further, the processing system of the electronic device, when executed by the processor 1, also implements an intention recognition step: inputting each of the plurality of split sentences obtained in the sentence splitting step into a single-intention recognition model, respectively, to obtain a plurality of intentions.
In one embodiment, the zero-pronoun identification and resolution step preferably comprises:
segmenting the sentence to be classified with full-mode jieba word segmentation to obtain a candidate antecedent set;
using a first recurrent neural network to perform feature learning on the preceding text of the zero pronoun, obtaining a vector representation of the zero pronoun's preceding text; computing the attention weight of each word in each candidate antecedent with a general attention model and taking the attention-weighted average of the word vectors to obtain a representation of the candidate antecedent; concatenating the candidate antecedent representation with the preceding-text vector representation; and computing, with a first feedforward neural network, the probability that the candidate antecedent is the antecedent of the zero pronoun;
and using a second recurrent neural network to perform feature learning on the context of the zero pronoun, obtaining a vector representation of the zero pronoun's context; likewise computing the attention weight of each word in each candidate antecedent with the general attention model and taking the attention-weighted average of the word vectors to obtain a representation of the candidate antecedent; concatenating the candidate antecedent representation with the context vector representation of the zero pronoun; and computing, with a second feedforward neural network, the probability that the candidate antecedent is the antecedent of the zero pronoun. The candidate antecedent with the highest resolution probability is then placed into the gap of the corresponding zero pronoun in the original sentence, yielding the sentence after zero-pronoun resolution.
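The attention-weighted candidate representation and the concatenate-then-score computation described above can be sketched in NumPy as follows. The dimensions, the bilinear attention matrix `W`, and the feedforward weights are illustrative placeholders, not the trained parameters of the invention, and the scoring network is reduced to a single sigmoid layer for brevity:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, word_vecs, W):
    """General (bilinear) attention: score each word vector in the candidate
    antecedent against the zero-pronoun context query, then return the
    attention-weighted average of the word vectors."""
    scores = word_vecs @ W @ query       # one attention score per word
    weights = softmax(scores)            # normalize to attention weights
    return weights @ word_vecs           # weighted average = candidate repr.

def antecedent_probability(context_vec, word_vecs, W, ff_w, ff_b):
    """Concatenate the candidate-antecedent representation with the
    zero-pronoun context representation and score the pair with a one-layer
    feedforward network (sigmoid output in (0, 1))."""
    cand = attend(context_vec, word_vecs, W)
    joined = np.concatenate([cand, context_vec])
    return 1.0 / (1.0 + np.exp(-(ff_w @ joined + ff_b)))
```

In the patented scheme this computation is run twice, with separate recurrent networks and feedforward networks for the preceding text and the following context; here both branches share the same sketch.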
In the steps implemented when the processing system is executed by the processor, the syntactic analysis of the expanded sentence uses the parsing function of the Stanford NLP toolkit: the expanded sentence obtained after zero-pronoun resolution is parsed to obtain a syntactic structure tree, and the parallel relation items in the expanded sentence are extracted.
In addition, the invention also provides a multi-label classification method, as shown in fig. 2, comprising the following steps:
step S1, identifying and resolving zero pronouns:
recognizing and resolving zero pronouns of the sentences to be classified to obtain expanded sentences, wherein the zero pronouns are recognizable phrases or blank spaces of words in the sentences to be classified;
for example, a sentence to be classified: "I want to visit and stroll with girl friends to Beijing hometown museum. "segmentation obtains candidate antecedent sets: i want, and, girl friends, go, beijing palace museum, visiting, and strolling
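The full-mode segmentation used to produce such a candidate set can be sketched as follows. The tiny dictionary here is an illustrative placeholder; in the invention the jieba segmenter's full mode plays this role over a real lexicon:

```python
def full_mode_segment(sentence, dictionary):
    """Full-mode segmentation sketch: emit every substring of the sentence
    that appears in the dictionary, so all segmentation granularities are
    covered at once and no candidate antecedent is missed."""
    words = []
    n = len(sentence)
    for i in range(n):
        for j in range(i + 1, n + 1):
            if sentence[i:j] in dictionary:
                words.append(sentence[i:j])
    return words
```

Note that overlapping words such as "北", "京" and "北京" are all emitted, which is exactly the property the description relies on when it says full mode considers every granularity.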
Step S2, sentence splitting step:
performing syntactic analysis on the expanded sentence and extracting the parallel relation items in the expanded sentence; splitting the expanded sentence by replacement to form a plurality of split sentences;
it should be noted that, the traditional zero pronoun refers to a grammar space of a recognizable noun phrase, but in the present invention, for practical requirement, the zero pronoun refers to not only a noun phrase, but also words or phrases with various parts of speech. Such as sentences to be classified: "ask you what price is you dehairing lips and armpits? "zero-pronoun follows" lip "in the sentence to be classified, which refers to the verb phrase" unhairing ". The word "unhairing" for a zero-pronoun is a precursor word to that zero-pronoun. It follows that the antecedent may appear after the zero pronoun.
Further, the multi-label classification method also comprises
Step S3, an intention recognition step: inputting each of the plurality of split sentences obtained in the sentence splitting step into a single-intention recognition model, respectively, to obtain a plurality of intentions.
Further, optionally, the zero-pronoun identification and resolution step specifically comprises:
segmenting the sentence to be classified with full-mode jieba word segmentation to obtain a candidate antecedent set;
using a first recurrent neural network to perform feature learning on the preceding text of the zero pronoun, obtaining a vector representation of the zero pronoun's preceding text; computing the attention weight of each word in each candidate antecedent with a general attention model and taking the attention-weighted average of the word vectors to obtain a representation of the candidate antecedent; concatenating the candidate antecedent representation with the preceding-text vector representation; and computing, with a first feedforward neural network, the probability that the candidate antecedent is the antecedent of the zero pronoun;
and using a second recurrent neural network to perform feature learning on the context of the zero pronoun, obtaining a vector representation of the zero pronoun's context; likewise computing the attention weight of each word in each candidate antecedent with the general attention model and taking the attention-weighted average of the word vectors to obtain a representation of the candidate antecedent; concatenating the candidate antecedent representation with the context vector representation of the zero pronoun; and computing, with a second feedforward neural network, the probability that the candidate antecedent is the antecedent of the zero pronoun.
A candidate antecedent in the present invention is a word obtained by segmenting the sentence to be classified. Since this technical scheme cannot determine the granularity of candidate antecedents in advance, the invention preferably adopts full-mode segmentation, which fully considers every segmentation granularity of the sentence to be classified and covers as many possible candidate antecedents as it can.
In the multi-label classification method, the syntactic analysis of the expanded sentence uses the parsing function of the Stanford NLP toolkit: the expanded sentence obtained after zero-pronoun resolution is parsed to obtain a syntactic structure tree, and the parallel relation items in the expanded sentence are extracted.
For example, for the sentence to be classified "May I ask what is the price of hair removal for the lips and the armpits?", the expanded sentence obtained after zero-pronoun resolution is parsed with the parsing function of the Stanford NLP toolkit to obtain the syntactic structure tree shown in FIG. 3.
The parallel relation indicator word in the sentence is "and", and the parallel relation items are "lips" and "armpits". The part consisting of the indicator word and all the parallel relation items is then replaced by each parallel relation item in turn, yielding split sentence 1, "May I ask what is the price of hair removal for the lips?", and split sentence 2, "May I ask what is the price of hair removal for the armpits?".
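The replacement-based splitting just described can be sketched as a plain string operation. This is a simplified illustration: the real method locates the coordination via the parse tree rather than by literal string matching:

```python
def split_by_replacement(sentence, indicator, items):
    """Replace the whole coordination (the parallel relation items joined by
    the indicator word) with each parallel relation item in turn, producing
    one split sentence per item."""
    coordination = indicator.join(items)          # e.g. "the lips and the armpits"
    return [sentence.replace(coordination, item) for item in items]
```

Each split sentence keeps the shared parts of the original sentence and exactly one parallel relation item, which is what turns one multi-label sentence into several single-label ones.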
In another embodiment of the present invention, there is provided a multi-label classification method, comprising:
a zero-pronoun identification and resolution step: identifying and resolving the zero pronouns of the sentence to be classified to obtain an expanded sentence, wherein a zero pronoun is a gap in the sentence to be classified left by an identifiable phrase or word;
a sentence splitting step: a corpus-labeling scheme is designed for this task; the parallel relation items and the other items in the resolved expanded sentence are labeled manually; a Bi-LSTM-CRF model for sentence splitting is trained; and the trained Bi-LSTM-CRF model classifies and splits the expanded sentence into a plurality of split sentences. The other items include shared items and deleted items: a shared item is a part of the original sentence retained in both split sentences, a deleted item is a part retained in neither split sentence, and the parallel relation items are the parts retained in one split sentence each. For the sentence to be classified "I want hair removal for the arms and the lower legs.", the parallel relation items in the resolved expanded sentence are labeled manually as "arms" and "lower legs", the shared items as "I", "want", "hair removal" and ".", and the deleted item as "and". The trained Bi-LSTM-CRF model then classifies and splits the expanded sentence into split sentence 1, "I want hair removal for the arms.", and split sentence 2, "I want hair removal for the lower legs.". The Bi-LSTM-CRF model is shown in FIG. 4: word vectors (word embeddings) are fed into a bidirectional long short-term memory model (Bi-LSTM), where l_i represents token i together with its left context and r_i represents token i together with its right context, and the two representation vectors are concatenated to produce a vector c_i for token i and its context. From c_i, a fully connected layer produces the unnormalized probability of mapping each token to each tag, and finally a CRF layer selects the tag sequence with the maximum probability for each sentence.
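Once the Bi-LSTM-CRF has tagged each token, the split sentences can be reconstructed from the tag sequence as sketched below. The tag names SHARED, PARALLEL-1, PARALLEL-2 and DELETE are an illustrative labeling scheme; the patent only states that the tags distinguish parallel relation items, shared items and deleted items:

```python
# Reconstruct the two split sentences from a token/tag sequence produced by
# a sequence-labeling model such as Bi-LSTM-CRF.  Tag names are hypothetical.
def rebuild_splits(tokens, tags):
    """SHARED tokens go into both split sentences, PARALLEL-k tokens go only
    into split sentence k, and DELETE tokens are dropped from both."""
    splits = [[], []]
    for token, tag in zip(tokens, tags):
        if tag == "SHARED":
            splits[0].append(token)
            splits[1].append(token)
        elif tag == "PARALLEL-1":
            splits[0].append(token)
        elif tag == "PARALLEL-2":
            splits[1].append(token)
        # DELETE: token appears in neither split sentence
    return [" ".join(s) for s in splits]
```

With a real model the `tags` argument would be the maximum-probability tag sequence chosen by the CRF layer; here it is supplied by hand for illustration.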
In addition, the invention also provides a computer-readable storage medium on which a processing system is stored; when executed by a processor, the processing system implements the steps of the multi-label classification method described above, which are not repeated here.
According to the method, the multi-label sentence sample to be classified is split into a set of valid single-label sentence samples, so that trained single-label classification models can be used for multi-label prediction without loss of prediction accuracy, avoiding the problem of prediction samples being distributed differently from training samples. In industrial applications this saves the development and training cost of dedicated multi-label classification algorithms, effectively integrates existing resources, and makes maximal use of existing single-label training data and models. In addition, the invention is extensible and can meet the demand for rapid feedback in fast-changing markets: when a new in-demand label appears, only single-label data for that label needs to be collected and a model trained on it, which can then be added to the multi-label classification system without retraining a multi-label model. The approach also makes it easy to port high-quality open-source classification models built by others; once such a model has been validated, it can be grafted into the system.
The foregoing describes preferred embodiments of the present invention, but the invention is not limited to them, nor should other embodiments be regarded as excluded. Numerous variations, changes, substitutions and alterations will be apparent to those skilled in the art without departing from the principles and spirit of the invention in light of the above teachings, the prior art and common knowledge.

Claims (5)

1. An electronic device, which is characterized in that,
the electronic device comprises a memory and a processor connected with the memory, wherein a processing system capable of running on the processor is stored in the memory, and the processing system realizes the following steps when being executed by the processor:
and (3) identifying and resolving zero-pronouns:
recognizing and resolving zero pronouns of sentences to be classified to obtain expanded sentences, wherein the zero pronouns are recognizable phrases or word gaps in the sentences to be classified, and refer to noun phrases and words or phrases with other various parts of speech;
the identifying and digesting step of the zero pronoun specifically comprises the following steps:
dividing sentences to be classified by adopting full-mode crust segmentation to obtain candidate antecedent sets;
performing feature learning according to the zero pronoun text by using a first cyclic neural network to obtain zero pronoun text vector representation, calculating the attention of each word in each candidate antecedent by using a general attention model, performing attention calculation on each word in each candidate antecedent by using a general attention model, performing weighted average on the vector of each word according to the attention to obtain the candidate antecedent representation, splicing the candidate antecedent representation and the zero pronoun text vector representation together, and calculating the probability of whether the candidate antecedent is the zero pronoun antecedent or not by using the first feedforward neural network;
performing feature learning according to the context of the zero pronoun by using a second cyclic neural network to obtain a zero pronoun context vector representation, calculating the attention of each word in each candidate antecedent by using a general attention model, performing attention calculation on each word according to the general attention model, performing weighted average on the vector of each word according to the attention to obtain a representation of the candidate antecedent, splicing the representation of the candidate antecedent and the context vector representation of the zero pronoun together, and calculating the probability of whether the candidate antecedent is the zero pronoun antecedent by using a second feedforward neural network;
sentence splitting:
performing syntactic analysis on the expanded sentence and extracting the parallel-relation items in the expanded sentence; splitting the expanded sentence by replacement or by annotation training to form a plurality of split sentences;
or designing corpus labels for this purpose, manually annotating the parallel-relation items and other items in the resolved expanded sentence, training a Bi-LSTM-CRF model for sentence splitting, and using the trained splitting Bi-LSTM-CRF model to classify and split the expanded sentence into a plurality of split sentences, wherein the other items comprise shared items and deletable items;
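The label-driven splitting can be sketched as follows, assuming the trained model has already tagged every token as a shared item (`S`), a parallel-relation item (`P`), or a deletable item (`D`); this S/P/D tag set and the reconstruction rule are illustrative assumptions, not the patent's actual corpus labels.

```python
def split_by_labels(tokens, labels):
    """One split sentence per parallel-relation item: shared (S) tokens are
    kept in every output, deletable (D) tokens are dropped, and each maximal
    run of parallel (P) tokens fills the parallel slot in turn."""
    runs, i, n = [], 0, len(tokens)
    while i < n:                      # collect maximal runs of P-labelled tokens
        if labels[i] == "P":
            j = i
            while j < n and labels[j] == "P":
                j += 1
            runs.append((i, j))
            i = j
        else:
            i += 1
    splits = []
    for start, end in runs:           # rebuild one sentence per parallel item
        parts = [t for k, (t, l) in enumerate(zip(tokens, labels))
                 if l == "S" or start <= k < end]
        splits.append("".join(parts))
    return splits

tokens = ["我", "想", "买", "苹果", "和", "香蕉"]   # "I want to buy apples and bananas"
labels = ["S", "S", "S", "P", "D", "P"]
splits = split_by_labels(tokens, labels)
```

For the example sentence the function returns `["我想买苹果", "我想买香蕉"]` ("I want to buy apples" / "I want to buy bananas"), one split sentence per parallel item, each feeding the single-intention model downstream.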
the processing system, when executed by the processor, further implements an intention recognition step,
the intention recognition step: inputting each of the plurality of split sentences obtained in the sentence splitting step into a single-intention recognition model to obtain a plurality of intentions.
2. The electronic device of claim 1, characterized in that,
the syntactic analysis of the expanded sentence uses the parsing function of the Stanford NLP toolkit to parse the expanded sentence obtained after zero-pronoun resolution into a syntactic structure tree, from which the parallel-relation items in the expanded sentence are extracted.
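A toy sketch of the extraction step: the syntactic structure tree is hard-coded here as nested tuples rather than produced by the Stanford NLP parser, and the heuristic of collecting siblings joined by a coordinating conjunction (`CC`) node is an illustrative stand-in for the patent's parallel-relation extraction rule.

```python
def parallel_items(tree):
    """Collect groups of sibling phrases joined by a CC node.
    A tree node is (label, child, ...); a leaf is (label, word) with a
    string word."""
    results = []

    def leaves(t):
        if isinstance(t[1], str):     # leaf: (POS tag, word)
            return t[1]
        return "".join(leaves(c) for c in t[1:])

    def walk(t):
        if isinstance(t[1], str):
            return
        children = t[1:]
        if any(c[0] == "CC" for c in children):
            results.append([leaves(c) for c in children if c[0] != "CC"])
        for c in children:
            walk(c)

    walk(tree)
    return results

# Hypothetical constituency parse of "苹果和香蕉" ("apples and bananas").
tree = ("NP", ("NP", ("NN", "苹果")), ("CC", "和"), ("NP", ("NN", "香蕉")))
items = parallel_items(tree)
```

Each group of non-CC siblings under a coordinated node becomes one set of parallel items, which the splitting step then expands into separate sentences.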
3. A multi-label classification method, comprising:
a zero-pronoun identification and resolution step:
identifying and resolving the zero pronouns of the sentence to be classified to obtain an expanded sentence, wherein a zero pronoun is an identifiable phrase or word gap in the sentence to be classified, and refers to a noun phrase or to a word or phrase of another part of speech;
the zero-pronoun identification and resolution step specifically comprises the following steps:
segmenting the sentence to be classified using full-mode jieba word segmentation to obtain a set of candidate antecedents;
performing feature learning on the zero-pronoun text with a first recurrent neural network to obtain a zero-pronoun text vector representation, computing with a general attention model the attention of each word in each candidate antecedent, taking the attention-weighted average of the word vectors to obtain the candidate antecedent representation, concatenating the candidate antecedent representation with the zero-pronoun text vector representation, and computing with a first feedforward neural network the probability that the candidate antecedent is the antecedent of the zero pronoun;
performing feature learning on the context of the zero pronoun with a second recurrent neural network to obtain a zero-pronoun context vector representation, computing with a general attention model the attention of each word in each candidate antecedent, taking the attention-weighted average of the word vectors to obtain the candidate antecedent representation, concatenating the candidate antecedent representation with the zero-pronoun context vector representation, and computing with a second feedforward neural network the probability that the candidate antecedent is the antecedent of the zero pronoun;
sentence splitting:
performing syntactic analysis on the expanded sentence and extracting the parallel-relation items in the expanded sentence; splitting the expanded sentence by replacement or by annotation training to form a plurality of split sentences;
or designing corpus labels for this purpose, manually annotating the parallel-relation items and other items in the resolved expanded sentence, training a Bi-LSTM-CRF model for sentence splitting, and using the trained splitting Bi-LSTM-CRF model to classify and split the expanded sentence into a plurality of split sentences, wherein the other items comprise shared items and deletable items;
the multi-label classification method further comprises,
an intention recognition step: inputting each of the plurality of split sentences obtained in the sentence splitting step into a single-intention recognition model to obtain a plurality of intentions.
4. The multi-label classification method according to claim 3, characterized in that,
the syntactic analysis of the expanded sentence uses the parsing function of the Stanford NLP toolkit to parse the expanded sentence obtained after zero-pronoun resolution into a syntactic structure tree, from which the parallel-relation items in the expanded sentence are extracted.
5. A computer-readable storage medium, characterized in that,
the computer-readable storage medium has stored thereon a processing system which, when executed by a processor, implements the steps of the multi-label classification method according to any one of claims 3 to 4.
CN201811529912.2A 2018-12-14 2018-12-14 Electronic device, multi-label classification method and storage medium Active CN109783801B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811529912.2A CN109783801B (en) 2018-12-14 2018-12-14 Electronic device, multi-label classification method and storage medium

Publications (2)

Publication Number Publication Date
CN109783801A (en) 2019-05-21
CN109783801B (en) 2023-08-25

Family

ID=66496196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811529912.2A Active CN109783801B (en) 2018-12-14 2018-12-14 Electronic device, multi-label classification method and storage medium

Country Status (1)

Country Link
CN (1) CN109783801B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674630B (en) * 2019-09-24 2023-03-21 北京明略软件系统有限公司 Reference resolution method and device, electronic equipment and storage medium
CN111400438A (en) * 2020-02-21 2020-07-10 镁佳(北京)科技有限公司 Method and device for identifying multiple intentions of user, storage medium and vehicle
CN112256868A (en) * 2020-09-30 2021-01-22 华为技术有限公司 Zero-reference resolution method, method for training zero-reference resolution model and electronic equipment
CN112214992A (en) * 2020-10-14 2021-01-12 哈尔滨福涛科技有限责任公司 Deep learning and rule combination based narrative structure analysis method
CN113392629B (en) * 2021-06-29 2022-10-28 哈尔滨工业大学 Personal pronoun resolution method based on a pre-trained model
CN113850078B (en) * 2021-09-29 2024-06-18 平安科技(深圳)有限公司 Multi-intention recognition method, equipment and readable storage medium based on machine learning

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005025659A (en) * 2003-07-01 2005-01-27 Nippon Telegr & Teleph Corp <Ntt> Zero pronoun resolving method, device and program, and recording medium to which the program is recorded
CN102880645A (en) * 2012-08-24 2013-01-16 上海云叟网络科技有限公司 Semantic intelligent search method
CN103440252A (en) * 2013-07-25 2013-12-11 北京师范大学 Method and device for extracting parallel information in Chinese sentence
JP2015049545A (en) * 2013-08-29 2015-03-16 株式会社ジャストシステム Promoted questionnaire program and questionnaire system
CN105988990A (en) * 2015-02-26 2016-10-05 索尼公司 Device and method for resolving zero anaphora in Chinese language, as well as training method
CN106294322A (en) * 2016-08-04 2017-01-04 哈尔滨工业大学 LSTM-based Chinese zero-anaphora resolution method
CN107885844A (en) * 2017-11-10 2018-04-06 南京大学 Automatic question-answering method and system based on systematic searching
CN108563790A (en) * 2018-04-28 2018-09-21 科大讯飞股份有限公司 A kind of semantic understanding method and device, equipment, computer-readable medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chinese zero-pronoun resolution based on semantic structure analysis; Cao Jun, Zhou Jingye, Xiao Chixin; Natural Science Journal of Xiangtan University (No. 04); full text *

Similar Documents

Publication Publication Date Title
CN109783801B (en) Electronic device, multi-label classification method and storage medium
US10417350B1 (en) Artificial intelligence system for automated adaptation of text-based classification models for multiple languages
CN109543181B (en) Named entity model and system based on combination of active learning and deep learning
CN110827929B (en) Disease classification code recognition method and device, computer equipment and storage medium
CN110298035B (en) Word vector definition method, device, equipment and storage medium based on artificial intelligence
CN110688854B (en) Named entity recognition method, device and computer readable storage medium
CN110705206B (en) Text information processing method and related device
CN109190110A (en) A kind of training method of Named Entity Extraction Model, system and electronic equipment
CN111428493A (en) Entity relationship acquisition method, device, equipment and storage medium
CN112464662B (en) Medical phrase matching method, device, equipment and storage medium
CN110569332B (en) Sentence feature extraction processing method and device
CN111143571B (en) Entity labeling model training method, entity labeling method and device
WO2022222300A1 (en) Open relationship extraction method and apparatus, electronic device, and storage medium
CN112052684A (en) Named entity identification method, device, equipment and storage medium for power metering
CN112188311B (en) Method and apparatus for determining video material of news
CN103823857A (en) Space information searching method based on natural language processing
CN111401065A (en) Entity identification method, device, equipment and storage medium
CN113326380A (en) Equipment measurement data processing method, system and terminal based on deep neural network
CN113723077B (en) Sentence vector generation method and device based on bidirectional characterization model and computer equipment
CN109657052B (en) Method and device for extracting fine-grained knowledge elements contained in paper abstract
CN112989043B (en) Reference resolution method, reference resolution device, electronic equipment and readable storage medium
CN114416976A (en) Text labeling method and device and electronic equipment
CN112199954A (en) Disease entity matching method and device based on voice semantics and computer equipment
CN110851597A (en) Method and device for sentence annotation based on similar entity replacement
CN114398482A (en) Dictionary construction method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant