CN109710757A - Construction method and system for a text classification model, and computer-readable storage medium - Google Patents

Construction method and system for a text classification model, and computer-readable storage medium

Info

Publication number
CN109710757A
Authority
CN
China
Prior art keywords
classification model
text classification
input
dialog information
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811440834.9A
Other languages
Chinese (zh)
Inventor
程源泉
欧阳一村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE ICT Technologies Co Ltd
Original Assignee
ZTE ICT Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE ICT Technologies Co Ltd
Priority to CN201811440834.9A
Publication of CN109710757A
Legal status: Pending


Landscapes

  • Image Analysis (AREA)

Abstract

The invention proposes a construction method for a text classification model, a construction system for a text classification model, and a computer-readable storage medium. The construction method for the text classification model comprises: acquiring at least three rounds of dialog information; inputting the at least three rounds of dialog information in parallel into a convolutional neural network text classification model; and training the convolutional neural network text classification model according to the at least three rounds of dialog information to obtain the text classification model. Because the convolutional neural network text classification model is trained with dialog information input in parallel, and the information input in parallel during training consists of dialog rounds with contextual relationships, the obtained training result can classify text in combination with its context, thereby improving the accuracy of text classification.

Description

Construction method and system for a text classification model, and computer-readable storage medium
Technical field
The present invention relates to the field of text classification, and in particular to a construction method for a text classification model, a construction system for a text classification model, and a computer-readable storage medium.
Background art
Multi-channel convolutional neural networks are mostly applied in the field of image processing, for example in image recognition for human-computer interaction, or for quickly locating a target in video object tracking.
In the related art, text classification uses a softmax classifier (a classifier whose loss function is the softmax function). A serious problem of the softmax function is that the output probabilities (0 to 1) of the softmax classes are mutually exclusive: a high probability for one class forces the probabilities of all other classes to be low. For example, in government-affairs text classification, a piece of data may belong both to the social-security class and to the fee-payment service class, so it is difficult to judge the data with a single class. The softmax classifier was originally used for image recognition with convolutional neural networks, where it outputs 1000 labels to determine the class. Moreover, in the related art, text classification is trained on single pieces of data, so the obtained training model cannot perform associated processing on information that has contextual relationships, which results in low accuracy.
Therefore, a construction method for a text classification model is needed, so that the constructed model can perform contextual association processing and thereby improve classification accuracy.
Summary of the invention
The present invention aims to solve at least one of the technical problems existing in the prior art or the related art.
To this end, a first aspect of the present invention proposes a construction method for a text classification model.
A second aspect of the present invention proposes a construction system for a text classification model.
A third aspect of the present invention proposes a computer-readable storage medium.
In view of this, according to an aspect of the present invention, a construction method for a text classification model is proposed, comprising: acquiring at least three rounds of dialog information; inputting the at least three rounds of dialog information in parallel into a convolutional neural network text classification model; and training the convolutional neural network text classification model according to the at least three rounds of dialog information to obtain the text classification model.
In the construction method for a text classification model provided by the present invention, at least three rounds of dialog information are acquired. The at least three rounds of dialog information may come from a human-computer interaction scenario in a real scene, or may be recorded information of the two parties to a dialog, for example dialog information in a medical scenario, or conversation records between staff of a human resources and social security office and the persons they serve. The acquired dialog information is input in parallel into a CNN (Convolutional Neural Network) text classification model, and the CNN text classification model is trained with the dialog information input in parallel. Because the information input in parallel during training consists of dialog rounds with contextual relationships, the obtained training result can classify text in combination with its context, thereby improving the accuracy of text classification.
The construction method for a text classification model according to the present invention may further have the following technical features:
In the above technical solution, preferably, the at least three rounds of dialog information are input in parallel into the convolutional neural network text classification model in either of the following ways: using word vector mapping, each round of dialog information in the at least three rounds is mapped word by word into a vector space to generate corresponding first images, and the first images corresponding to the at least three rounds of dialog information are input in parallel into the convolutional neural network text classification model; or the at least three rounds of dialog information are encoded into a second image by one-hot encoding and input into the convolutional neural network text classification model.
In this technical solution, each round of dialog information in the at least three rounds is mapped word by word into a vector space by means of word vector mapping, generating at least three first images, and the first images corresponding to the at least three rounds of dialog information are input in parallel into the CNN text classification model. Preferably, the CNN text classification model is a multi-channel convolutional neural network, and the at least three first images are input in parallel into the multi-channel convolutional neural network for training. Word vector mapping allows the at least three rounds of dialog information to be mapped simultaneously, which speeds up the generation of the first images and reduces the time spent generating training samples. Alternatively, the at least three rounds of dialog information are encoded into a second image by one-hot encoding, where the second image stores all of the at least three rounds of dialog information, and the second image is input into the CNN text classification model. This reduces the number of images in storage, simplifying the management of training samples, and lowers the requirements on the convolutional neural network: samples with contextual relations can be input without a multi-channel structure.
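By way of illustration only, the two input modes above can be sketched in Python as follows. This sketch is not part of the original patent text: the sizes, the toy vocabulary, and the random embedding table (standing in for the pre-trained word vectors described later) are all assumptions.

```python
import numpy as np

EMBED_DIM, MAX_LEN, VOCAB_SIZE = 50, 32, 1000   # illustrative sizes only
rng = np.random.default_rng(0)
embed_table = rng.normal(size=(VOCAB_SIZE, EMBED_DIM)).astype(np.float32)
vocab = {"<pad>": 0}

def token_id(tok):
    # Grow a toy vocabulary on the fly.
    return vocab.setdefault(tok, len(vocab))

def rounds_to_first_images(rounds, embed_table):
    # Mode 1: map each dialog round word by word into vector space, giving one
    # MAX_LEN x EMBED_DIM "first image" per round, stacked as parallel channels.
    channels = np.zeros((len(rounds), MAX_LEN, EMBED_DIM), dtype=np.float32)
    for c, text in enumerate(rounds):
        for i, tok in enumerate(text.split()[:MAX_LEN]):
            channels[c, i] = embed_table[token_id(tok) % VOCAB_SIZE]
    return channels

def rounds_to_second_image(rounds):
    # Mode 2: one-hot encode all rounds together into a single "second image".
    image = np.zeros((len(rounds) * MAX_LEN, VOCAB_SIZE), dtype=np.float32)
    for c, text in enumerate(rounds):
        for i, tok in enumerate(text.split()[:MAX_LEN]):
            image[c * MAX_LEN + i, token_id(tok) % VOCAB_SIZE] = 1.0
    return image

dialog = ["how do I pay social security", "you can pay online", "which website"]
first = rounds_to_first_images(dialog, embed_table)   # shape (3, 32, 50)
second = rounds_to_second_image(dialog)               # shape (96, 1000)
```

The first mode yields one channel per dialog round, matching a multi-channel convolutional neural network; the second mode packs the whole dialog into a single plane, so a single-channel network suffices.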
In any of the above technical solutions, preferably, training the convolutional neural network text classification model according to the at least three rounds of dialog information to obtain the text classification model specifically includes: inputting the first images or the second image into a convolution layer for convolution operations, and inputting the operation result into a pooling layer for down-sampling with a preset method; and inputting the down-sampling result into a fully connected layer, classifying with a classifier, and inputting the classification result into an optimizer for optimization, to obtain the text classification model.
In this technical solution, the first images or the second image are input into the convolution layer for convolution operations, which extract the features of the first images or the second image. The extracted operation result is input into the pooling layer and sampled with the preset method to reduce the number of samples, and the sampled result is input into the fully connected layer for classification and optimization, obtaining the trained model. Because the text classification model obtained by the above steps is trained on dialog information with contextual relationships, the obtained text classification model can use contextual association information when classifying text, improving classification accuracy relative to a text classification model trained without contextual information.
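A minimal PyTorch sketch of such a network is given below, anticipating the max-pooling, dropout, relu and sigmoid classifier described in the following paragraphs. It is an illustration under assumed layer sizes, not the patent's exact architecture.

```python
import torch
import torch.nn as nn

class DialogCNN(nn.Module):
    """Multi-channel text CNN: each dialog round is one input channel."""
    def __init__(self, num_rounds=3, embed_dim=50, num_filters=64,
                 kernel_height=3, num_classes=10):
        super().__init__()
        # Convolve over (words x embedding dimensions), rounds as channels.
        self.conv = nn.Conv2d(num_rounds, num_filters,
                              kernel_size=(kernel_height, embed_dim))
        self.dropout = nn.Dropout(0.5)
        self.relu = nn.ReLU()
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, x):                 # x: (batch, rounds, max_len, embed_dim)
        h = self.conv(x).squeeze(3)       # (batch, filters, max_len - kernel + 1)
        h = torch.max(h, dim=2).values    # max-pooling over time (down-sampling)
        h = self.relu(self.dropout(h))    # dropout, then relu, as described below
        return torch.sigmoid(self.fc(h))  # independent per-class probabilities

model = DialogCNN()
probs = model(torch.randn(4, 3, 32, 50))  # -> (4, 10), each entry in (0, 1)
```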
In any of the above technical solutions, preferably, the preset method is max-pooling.
In this technical solution, the preset method is max-pooling: only the strongest of the feature values input into the pooling layer is retained, and the other, weaker feature values are discarded. This preserves the position and rotation invariance of the features. In addition, it reduces the number of parameters of the text classification model and alleviates model over-fitting, while allowing inputs of varying length X to be treated as fixed-length inputs, so that the number of neurons can be determined when designing the network structure.
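A toy example of this pooling step (values invented for illustration): whatever the sequence length, each filter contributes exactly one value, which is what fixes the input length seen by the fully connected layer.

```python
import torch

feature_map = torch.tensor([[0.1, 0.9, 0.3, 0.2, 0.4],   # filter 1 over 5 positions
                            [0.5, 0.2, 0.8, 0.1, 0.0]])  # filter 2
pooled = feature_map.max(dim=1).values  # tensor([0.9000, 0.8000]); only the
                                        # strongest feature value is retained
```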
In any of the above technical solutions, preferably, inputting the down-sampling result into the fully connected layer, classifying with the classifier, and inputting the classification result into the optimizer for optimization specifically includes: inputting the down-sampling result into the fully connected layer, classifying with a sigmoid classifier, and iterating according to a selected sigmoid loss function until the value of the sigmoid loss function is minimized.
In this technical solution, classification is performed with a sigmoid classifier, which avoids the situation where the classification result can only take one of two mutually exclusive outcomes, that is, the mutual exclusion present when a softmax classifier is used. The sigmoid classifier is a convolutional neural network classification model based on the sigmoid function. Iterative operations are performed with the sigmoid loss function until the value of the sigmoid loss function is minimized, and the quality of the obtained model is judged by the minimum of the loss function: when the value of the loss function is minimal, the obtained text classification model has reached its optimum under this classifier, and classification with the obtained model is more accurate.
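The contrast with softmax can be seen on the government-affairs example from the background section: with independent sigmoid outputs, one piece of data can receive a high probability for both the social-security class and the fee-payment service class. The logits below are invented for illustration.

```python
import torch

logits = torch.tensor([3.0, 2.5, -4.0])  # social security, fee payment, other
print(torch.softmax(logits, dim=0))      # ~[0.62, 0.38, 0.00]: classes compete
print(torch.sigmoid(logits))             # ~[0.95, 0.92, 0.02]: two can be high

target = torch.tensor([1.0, 1.0, 0.0])   # the data belongs to two classes at once
loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, target)
```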
In any of the above technical solutions, preferably, iterating according to the selected sigmoid loss function until the value of the sigmoid loss function is minimized specifically includes:
iterating on the selected sigmoid loss function with an Adam-improved stochastic gradient descent algorithm until the value of the sigmoid loss function is minimized.
In this technical solution, when iterating on the loss function with a stochastic gradient descent algorithm, the ill-suited mathematical parts of the loss function are improved with the Adam method. Adam is a first-order optimization method that can replace the traditional stochastic gradient descent process; the improved stochastic gradient descent algorithm is thus adapted to the selected sigmoid loss function, the value of the sigmoid loss function is minimized, and the accuracy of the classification results of the text classification model is improved.
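A training loop consistent with this description, reusing the DialogCNN sketch above, might look as follows; the learning rate, batch contents, and iteration count are assumptions, not values from the patent.

```python
import torch

model = DialogCNN()                              # from the earlier sketch
criterion = torch.nn.BCELoss()                   # sigmoid loss on the probabilities
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam in place of plain SGD

x = torch.randn(8, 3, 32, 50)                    # a toy batch of 8 three-round dialogs
y = (torch.rand(8, 10) > 0.8).float()            # multi-hot (multi-label) targets
for step in range(100):                          # iterate until the loss stops decreasing
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```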
In any of the above technical solutions, preferably, the word vectors are CoVe word vectors pre-trained with improved Chinese-English translation training.
In this technical solution, the word vectors are CoVe word vectors pre-trained with improved Chinese-English translation training, where the CoVe pre-trained word vectors are word vectors pre-trained as context vectors (CoVe). Because more contextual information is produced during Chinese-English translation, CoVe word vectors pre-trained with improved Chinese-English translation training are chosen, so the first images generated after mapping into the vector space contain richer contextual information, and the finally obtained text classification model classifies more accurately.
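In an implementation, the pre-trained vectors would typically be loaded as a frozen lookup table before the mapping step. The sketch below is generic: the file name and format are hypothetical assumptions, and the CoVe translation encoder itself is omitted.

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical export of the pre-trained vectors as a (vocab_size x embed_dim) array.
pretrained = np.load("cove_zh_en_vectors.npy")
embedding = nn.Embedding.from_pretrained(torch.from_numpy(pretrained).float(),
                                         freeze=True)  # keep the pre-training intact
vectors = embedding(torch.tensor([[5, 17, 42]]))       # one tokenized dialog round
```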
In any of the above technical solutions, preferably, after the down-sampling result is input into the fully connected layer and before classification with the sigmoid classifier, the method further includes:
sequentially applying dropout and relu activation to the down-sampling result.
In this technical solution, to avoid the situation where the model fits the training set well during training but fits the validation set poorly, the down-sampling result is input into dropout. During model training, the network parameters are randomly sampled with a certain probability, and the resulting sub-network serves as the target network of the current update, so that the same sub-network is not used in every iteration; this prevents the network from over-fitting the training set. The relu activation accelerates the convergence of the stochastic gradient descent algorithm and reduces training time.
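Placed between the pooling layer and the classifier, this corresponds to two extra operations in the classification head; in the sketch below the drop probability and layer sizes are assumptions.

```python
import torch.nn as nn

head = nn.Sequential(
    nn.Dropout(p=0.5),  # random sub-network each iteration, curbing over-fitting
    nn.ReLU(),          # speeds up convergence of stochastic gradient descent
    nn.Linear(64, 10),  # fully connected layer
    nn.Sigmoid(),       # per-class probabilities for the sigmoid classifier
)
```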
In any of the above technical solutions, preferably, after acquiring the at least three rounds of dialog information and before inputting the at least three rounds of dialog information in parallel into the convolutional neural network text classification model, the method further includes:
performing data cleansing on the at least three rounds of dialog information.
In this technical solution, to prevent repeated sentences or erroneous punctuation marks in the dialog information from disturbing the contextual relationships, data cleansing is performed on the at least three rounds of dialog information: duplicate sentences and punctuation marks are deleted, and dialogs shorter than three rounds are spliced into three-round dialogs. This ensures that the model is trained in combination with the context of the training samples, improving the classification accuracy of the trained text classification model.
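A cleansing pass consistent with this paragraph could look like the following sketch; the de-duplication and splicing rules shown here are illustrative assumptions, not the patent's exact procedure.

```python
import re

def clean_dialog(rounds):
    # Delete duplicate sentences and stray punctuation, then splice dialogs
    # shorter than three rounds up to three rounds.
    seen, cleaned = set(), []
    for text in rounds:
        text = re.sub(r"[^\w\s]", " ", text)   # delete punctuation marks
        text = re.sub(r"\s+", " ", text).strip()
        if text and text not in seen:          # delete repeated sentences
            seen.add(text)
            cleaned.append(text)
    while 0 < len(cleaned) < 3:                # pad dialogs shorter than three rounds
        cleaned.append(cleaned[-1])
    return cleaned

clean_dialog(["How do I pay??", "How do I pay??", "Pay online."])
# -> ['How do I pay', 'Pay online', 'Pay online']
```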
According to the second aspect of the present invention, a construction system for a text classification model is proposed, comprising: a memory for storing a computer program; and a processor for executing the computer program to: acquire at least three rounds of dialog information; input the at least three rounds of dialog information in parallel into a convolutional neural network text classification model; and train the convolutional neural network text classification model according to the at least three rounds of dialog information to obtain the text classification model.
In the construction system for a text classification model provided by the present invention, the memory stores a computer program, and when the processor executes the computer program, at least three rounds of dialog information are acquired. The at least three rounds of dialog information may come from a human-computer interaction scenario in a real scene, or may be recorded information of the two parties to a dialog, for example dialog information in a medical scenario, or conversation records between staff of a human resources and social security office and the persons they serve. The acquired dialog information is input in parallel into a CNN (Convolutional Neural Network) text classification model, and the CNN text classification model is trained with the dialog information input in parallel. Because the information input in parallel during training consists of dialog rounds with contextual relationships, the obtained training result can classify text in combination with its context, thereby improving the accuracy of text classification.
The construction system for a text classification model according to the present invention may further have the following technical features:
In the above technical solution, preferably, the at least three rounds of dialog information are input in parallel into the convolutional neural network text classification model in either of the following ways: using word vector mapping, each round of dialog information in the at least three rounds is mapped word by word into a vector space to generate corresponding first images, and the first images corresponding to the at least three rounds of dialog information are input in parallel into the convolutional neural network text classification model; or the at least three rounds of dialog information are encoded into a second image by one-hot encoding and input into the convolutional neural network text classification model.
In this technical solution, each round of dialog information in the at least three rounds is mapped word by word into a vector space by means of word vector mapping, generating at least three first images, and the first images corresponding to the at least three rounds of dialog information are input in parallel into the CNN text classification model. Preferably, the CNN text classification model is a multi-channel convolutional neural network, and the at least three first images are input in parallel into the multi-channel convolutional neural network for training. Word vector mapping allows the at least three rounds of dialog information to be mapped simultaneously, which speeds up the generation of the first images and reduces the time spent generating training samples. Alternatively, the at least three rounds of dialog information are encoded into a second image by one-hot encoding, where the second image stores all of the at least three rounds of dialog information, and the second image is input into the CNN text classification model. This reduces the number of images in storage, simplifying the management of training samples, and lowers the requirements on the convolutional neural network: samples with contextual relations can be input without a multi-channel structure.
In any of the above technical solutions, preferably, the processor is specifically configured to execute the computer program to: input the first images or the second image into a convolution layer for convolution operations, and input the operation result into a pooling layer for down-sampling with a preset method;
and input the down-sampling result into a fully connected layer, classify with a classifier, and input the classification result into an optimizer for optimization, to obtain the text classification model.
In this technical solution, the first images or the second image are input into the convolution layer for convolution operations, which extract the features of the first images or the second image. The extracted operation result is input into the pooling layer and sampled with the preset method to reduce the number of samples, and the sampled result is input into the fully connected layer for classification and optimization, obtaining the trained model. Because the text classification model obtained by the above steps is trained on dialog information with contextual relationships, the obtained text classification model can use contextual association information when classifying text, improving classification accuracy relative to a text classification model trained without contextual information.
In any of the above technical solutions, preferably, the preset method is max-pooling.
In this technical solution, the preset method is max-pooling: only the strongest of the feature values input into the pooling layer is retained, and the other, weaker feature values are discarded. This preserves the position and rotation invariance of the features. In addition, it reduces the number of parameters of the text classification model and alleviates model over-fitting, while allowing inputs of varying length X to be treated as fixed-length inputs, so that the number of neurons can be determined when designing the network structure.
In any of the above technical solutions, preferably, the processor is specifically configured to execute the computer program to: input the down-sampling result into the fully connected layer, classify with a sigmoid classifier, and iterate according to a selected sigmoid loss function until the value of the sigmoid loss function is minimized.
In this technical solution, classification is performed with a sigmoid classifier, which avoids the situation where the classification result can only take one of two mutually exclusive outcomes, that is, the mutual exclusion present when a softmax classifier is used. Iterative operations are performed with the sigmoid loss function until the value of the sigmoid loss function is minimized, and the quality of the obtained model is judged by the minimum of the loss function: when the value of the loss function is minimal, the obtained text classification model has reached its optimum under this classifier, and classification with the obtained model is more accurate.
In any of the above technical solutions, preferably, the processor is specifically configured to execute the computer program to: iterate on the selected sigmoid loss function with an Adam-improved stochastic gradient descent algorithm until the value of the sigmoid loss function is minimized.
In this technical solution, when iterating on the loss function with a stochastic gradient descent algorithm, the ill-suited mathematical parts of the loss function are improved with the Adam method, so that the improved stochastic gradient descent algorithm is adapted to the selected sigmoid loss function, the value of the sigmoid loss function is minimized, and the accuracy of the classification results of the text classification model is improved.
In any of the above technical solutions, preferably, the word vectors are CoVe word vectors pre-trained with improved Chinese-English translation training.
In this technical solution, the word vectors are CoVe word vectors pre-trained with improved Chinese-English translation training. Because more contextual information is produced during Chinese-English translation, CoVe word vectors pre-trained with improved Chinese-English translation training are chosen, so the first images generated after mapping into the vector space contain richer contextual information, and the finally obtained text classification model classifies more accurately.
In any of the above technical solutions, preferably, the processor is further configured to execute the computer program to: sequentially apply dropout and relu activation to the down-sampling result.
In this technical solution, to avoid the situation where the model fits the training set well during training but fits the validation set poorly, the down-sampling result is input into dropout. During model training, the network parameters are randomly sampled with a certain probability, and the resulting sub-network serves as the target network of the current update, so that the same sub-network is not used in every iteration; this prevents the network from over-fitting the training set. The relu activation accelerates the convergence of the stochastic gradient descent algorithm and reduces training time.
In any of the above technical solutions, preferably, the processor is further configured to execute the computer program to: perform data cleansing on the at least three rounds of dialog information.
In this technical solution, to prevent repeated sentences or erroneous punctuation marks in the dialog information from disturbing the contextual relationships, data cleansing is performed on the at least three rounds of dialog information: duplicate sentences and punctuation marks are deleted, and dialogs shorter than three rounds are spliced into three-round dialogs. This ensures that the model is trained in combination with the context of the training samples, improving the classification accuracy of the trained text classification model.
According to the third aspect of the present invention, a computer-readable storage medium is provided, on which a computer program is stored. When the computer program is executed by a processor, the steps of the construction method for a text classification model in any of the above technical solutions are implemented.
The computer-readable storage medium provided by the present invention stores a computer program, and when the computer program is executed by a processor, the steps of the construction method for a text classification model in any of the above technical solutions are implemented. It therefore has all the technical effects of that construction method, which are not repeated here.
Additional aspects and advantages of the present invention will become apparent in the following description, or will be learned through practice of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 shows a schematic flowchart of the construction method of the text classification model according to an embodiment of the present invention;
Fig. 2 shows a schematic flowchart of the construction method of the text classification model according to another embodiment of the present invention;
Fig. 3 shows a schematic flowchart of the construction method of the text classification model according to yet another embodiment of the present invention;
Fig. 4 shows a schematic flowchart of the construction method of the text classification model according to still another embodiment of the present invention;
Fig. 5 shows a schematic block diagram of the construction system of the text classification model according to an embodiment of the present invention.
Detailed description of the embodiments
In order that the above aspects, features and advantages of the present invention can be understood more clearly, the present invention is further described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments can be combined with each other.
In the following description, numerous specific details are set forth to facilitate a full understanding of the present invention. However, the present invention can also be implemented in ways other than those described here; therefore, the scope of protection of the present invention is not limited by the specific embodiments disclosed below.
An embodiment of the first aspect of the present invention proposes a construction method for a text classification model. Fig. 1 shows a schematic flowchart of the construction method of the text classification model according to an embodiment of the present invention. As shown in Fig. 1, the method comprises:
S102, acquiring at least three rounds of dialog information;
S104, inputting the at least three rounds of dialog information in parallel into a convolutional neural network text classification model;
S106, training the convolutional neural network text classification model according to the at least three rounds of dialog information to obtain the text classification model.
In the construction method for a text classification model provided by the present invention, at least three rounds of dialog information are acquired. The at least three rounds of dialog information may come from a human-computer interaction scenario in a real scene, or may be recorded information of the two parties to a dialog, for example dialog information in a medical scenario, or conversation records between staff of a human resources and social security office and the persons they serve. The acquired dialog information is input in parallel into a CNN (Convolutional Neural Network) text classification model, and the CNN text classification model is trained with the dialog information input in parallel. Because the information input in parallel during training consists of dialog rounds with contextual relationships, the obtained training result can classify text in combination with its context, thereby improving the accuracy of text classification.
It is worth noting that, to ensure the result of model training, the acquired at least three rounds of dialog information may be preset questions and answers, and the questions and answers are input into the CNN text classification model for training; alternatively, after the at least three rounds of dialog information have been acquired and used for training, preset questions and answers may be input into the CNN text classification model for further training, so as to improve the complexity and fitting degree of the model.
In the above embodiment, preferably, in S104 the at least three rounds of dialog information are input in parallel into the convolutional neural network text classification model in either of the following ways: using word vector mapping, each round of dialog information in the at least three rounds is mapped word by word into a vector space to generate corresponding first images, and the first images corresponding to the at least three rounds of dialog information are input in parallel into the convolutional neural network text classification model; or the at least three rounds of dialog information are encoded into a second image by one-hot encoding and input into the convolutional neural network text classification model.
In this embodiment, each round of dialog information in the at least three rounds is mapped word by word into a vector space by means of word vector mapping, generating at least three first images, and the first images corresponding to the at least three rounds of dialog information are input in parallel into the CNN text classification model. Preferably, the CNN text classification model is a multi-channel convolutional neural network, and the at least three first images are input in parallel into the multi-channel convolutional neural network for training. Word vector mapping allows the at least three rounds of dialog information to be mapped simultaneously, which speeds up the generation of the first images and reduces the time spent generating training samples. Alternatively, the at least three rounds of dialog information are encoded into a second image by one-hot encoding, where the second image stores all of the at least three rounds of dialog information, and the second image is input into the CNN text classification model. This reduces the number of images in storage, simplifying the management of training samples, and lowers the requirements on the convolutional neural network: samples with contextual relations can be input without a multi-channel structure.
Fig. 2 shows a schematic flowchart of the construction method of the text classification model according to another embodiment of the present invention. The method comprises:
S202, acquiring at least three rounds of dialog information;
S204, inputting the at least three rounds of dialog information in parallel into a convolutional neural network text classification model;
S206, inputting the first images or the second image into a convolution layer for convolution operations, and inputting the operation result into a pooling layer for down-sampling with a preset method;
S208, inputting the down-sampling result into a fully connected layer, classifying with a classifier, and inputting the classification result into an optimizer for optimization to obtain the text classification model.
In this embodiment, the first images or the second image are input into the convolution layer for convolution operations, which extract the features of the first images or the second image. The extracted operation result is input into the pooling layer and sampled with the preset method to reduce the number of samples, and the sampled result is input into the fully connected layer for classification and optimization, obtaining the trained model. Because the text classification model obtained by the above steps is trained on dialog information with contextual relationships, the obtained text classification model can use contextual association information when classifying text, improving classification accuracy relative to a text classification model trained without contextual information.
In the above embodiment, preferably, the preset method is max-pooling.
In this embodiment, the preset method is max-pooling: only the strongest of the feature values input into the pooling layer is retained, and the other, weaker feature values are discarded. This preserves the position and rotation invariance of the features. In addition, it reduces the number of parameters of the text classification model and alleviates model over-fitting, while allowing inputs of varying length X to be treated as fixed-length inputs, so that the number of neurons can be determined when designing the network structure.
Fig. 3 shows a schematic flowchart of the construction method of the text classification model according to yet another embodiment of the present invention. The method comprises:
S302, acquiring at least three rounds of dialog information;
S304, inputting the at least three rounds of dialog information in parallel into a convolutional neural network text classification model;
S306, inputting the first images or the second image into a convolution layer for convolution operations, and inputting the operation result into a pooling layer for down-sampling with a preset method;
S308, inputting the down-sampling result into a fully connected layer, classifying with a sigmoid classifier, and iterating according to a selected sigmoid loss function until the value of the sigmoid loss function is minimized, to obtain the text classification model.
In this embodiment, classification is performed with a sigmoid classifier, which avoids the situation where the classification result can only take one of two mutually exclusive outcomes, that is, the mutual exclusion present when a softmax classifier is used. Iterative operations are performed with the sigmoid loss function until the value of the sigmoid loss function is minimized, and the quality of the obtained model is judged by the minimum of the loss function: when the value of the loss function is minimal, the obtained text classification model has reached its optimum under this classifier, and classification with the obtained model is more accurate.
In any of the above embodiments, preferably, iterating according to the selected sigmoid loss function until the value of the sigmoid loss function is minimized specifically includes:
iterating on the selected sigmoid loss function with an Adam-improved stochastic gradient descent algorithm until the value of the sigmoid loss function is minimized.
In this embodiment, when iterating on the loss function with a stochastic gradient descent algorithm, the ill-suited mathematical parts of the loss function are improved with the Adam method, so that the improved stochastic gradient descent algorithm is adapted to the selected sigmoid loss function, the value of the sigmoid loss function is minimized, and the accuracy of the classification results of the text classification model is improved.
In any of the above embodiments, preferably, the word vectors are CoVe word vectors pre-trained with improved Chinese-English translation training.
In this embodiment, the word vectors are CoVe word vectors pre-trained with improved Chinese-English translation training. Because more contextual information is produced during Chinese-English translation, CoVe word vectors pre-trained with improved Chinese-English translation training are chosen, so the first images generated after mapping into the vector space contain richer contextual information, and the finally obtained text classification model classifies more accurately.
In any of the above embodiments, preferably, after the down-sampling result is input into the fully connected layer and before classification with the sigmoid classifier, the method further includes:
sequentially applying dropout and relu activation to the down-sampling result.
In this embodiment, to avoid the situation where the model fits the training set well during training but fits the validation set poorly, the down-sampling result is input into dropout. During model training, the network parameters are randomly sampled with a certain probability, and the resulting sub-network serves as the target network of the current update, so that the same sub-network is not used in every iteration; this prevents the network from over-fitting the training set. The relu activation accelerates the convergence of the stochastic gradient descent algorithm and reduces training time.
In any of the above embodiments, preferably, after acquiring the at least three rounds of dialog information and before inputting the at least three rounds of dialog information in parallel into the convolutional neural network text classification model, the method further includes:
performing data cleansing on the at least three rounds of dialog information.
In this embodiment, to prevent repeated sentences or erroneous punctuation marks in the dialog information from disturbing the contextual relationships, data cleansing is performed on the at least three rounds of dialog information: duplicate sentences and punctuation marks are deleted, and dialogs shorter than three rounds are spliced into three-round dialogs. This ensures that the model is trained in combination with the context of the training samples, improving the classification accuracy of the trained text classification model.
Fig. 4 shows a schematic flowchart of the construction method of the text classification model according to still another embodiment of the present invention. The method comprises:
S402, collecting data from human resources and social security office scenarios and from hospital scenarios, and performing data cleansing;
S404, mapping the cleaned data with CoVe word vector mapping;
S406, replacing the softmax loss function with a sigmoid loss function;
S408, inputting the three images obtained after word vector mapping in parallel into a multi-channel convolutional neural network for training, to obtain the text classification model (see the sketch below).
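Putting S402 to S408 together, one run of this embodiment could be sketched as follows, reusing clean_dialog, rounds_to_first_images, embed_table and DialogCNN from the earlier sketches; the toy corpus and labels are invented for illustration.

```python
import torch

# Toy stand-ins for the collected office and hospital dialogs (S402).
dialogs = [
    ["how do I pay social security", "you can pay online", "which website do I use"],
    ["I need to book a checkup", "which department", "internal medicine please"],
]
labels = torch.tensor([[1.0, 0.0], [0.0, 1.0]])  # e.g. social-security vs. medical

x = torch.stack([
    torch.from_numpy(rounds_to_first_images(clean_dialog(d), embed_table))
    for d in dialogs])                           # S404: word-vector "images"

model = DialogCNN(num_rounds=3, embed_dim=50, num_classes=2)
criterion = torch.nn.BCELoss()                   # S406: sigmoid loss replaces softmax
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):                          # S408: train the multi-channel CNN
    optimizer.zero_grad()
    loss = criterion(model(x), labels)
    loss.backward()
    optimizer.step()
```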
An embodiment of the second aspect of the present invention proposes a construction system 500 for a text classification model. Fig. 5 shows a schematic block diagram of the construction system 500 of the text classification model according to an embodiment of the present invention. As shown in Fig. 5, the construction system 500 of the text classification model comprises: a memory 502 for storing a computer program; and a processor 504 for executing the computer program to: acquire at least three rounds of dialog information; input the at least three rounds of dialog information in parallel into a convolutional neural network text classification model; and train the convolutional neural network text classification model according to the at least three rounds of dialog information to obtain the text classification model.
In the construction system 500 for a text classification model provided by the present invention, the memory 502 stores a computer program, and when the processor 504 executes the computer program, at least three rounds of dialog information are acquired. The at least three rounds of dialog information may come from a human-computer interaction scenario in a real scene, or may be recorded information of the two parties to a dialog, for example dialog information in a medical scenario, or conversation records between staff of a human resources and social security office and the persons they serve. The acquired dialog information is input in parallel into a CNN (Convolutional Neural Network) text classification model, and the CNN text classification model is trained with the dialog information input in parallel. Because the information input in parallel during training consists of dialog rounds with contextual relationships, the obtained training result can classify text in combination with its context, thereby improving the accuracy of text classification.
It is worth noting that, to ensure the result of model training, the acquired at least three rounds of dialog information may be preset questions and answers, and the questions and answers are input into the CNN text classification model for training; alternatively, after the at least three rounds of dialog information have been acquired and used for training, preset questions and answers may be input into the CNN text classification model for further training, so as to improve the complexity and fitting degree of the model.
In the above embodiment, preferably, the at least three rounds of dialog information are input in parallel into the convolutional neural network text classification model in either of the following ways: using word vector mapping, each round of dialog information in the at least three rounds is mapped word by word into a vector space to generate corresponding first images, and the first images corresponding to the at least three rounds of dialog information are input in parallel into the convolutional neural network text classification model; or the at least three rounds of dialog information are encoded into a second image by one-hot encoding and input into the convolutional neural network text classification model.
In this embodiment, each round of dialog information in the at least three rounds is mapped word by word into a vector space by means of word vector mapping, generating at least three first images, and the first images corresponding to the at least three rounds of dialog information are input in parallel into the CNN text classification model. Preferably, the CNN text classification model is a multi-channel convolutional neural network, and the at least three first images are input in parallel into the multi-channel convolutional neural network for training. Word vector mapping allows the at least three rounds of dialog information to be mapped simultaneously, which speeds up the generation of the first images and reduces the time spent generating training samples. Alternatively, the at least three rounds of dialog information are encoded into a second image by one-hot encoding, where the second image stores all of the at least three rounds of dialog information, and the second image is input into the CNN text classification model. This reduces the number of images in storage, simplifying the management of training samples, and lowers the requirements on the convolutional neural network: samples with contextual relations can be input without a multi-channel structure.
In any of the above embodiments, preferably, the processor 504 is specifically configured to execute the computer program to: input the first images or the second image into a convolution layer for convolution operations, and input the operation result into a pooling layer for down-sampling with a preset method;
and input the down-sampling result into a fully connected layer, classify with a classifier, and input the classification result into an optimizer for optimization, to obtain the text classification model.
In this embodiment, the processor 504 is specifically configured to execute the computer program to: input the first images or the second image into the convolution layer for convolution operations, extracting the features of the first images or the second image; input the extracted operation result into the pooling layer and sample it with the preset method to reduce the number of samples; and input the sampled result into the fully connected layer for classification and optimization, obtaining the trained model. Because the text classification model obtained by the above steps is trained on dialog information with contextual relationships, the obtained text classification model can use contextual association information when classifying text, improving classification accuracy relative to a text classification model trained without contextual information.
In any of the above embodiments, preferably, the preset method is max-pooling.
In this embodiment, the preset method is max-pooling: only the strongest of the feature values input into the pooling layer is retained, and the other, weaker feature values are discarded. This preserves the position and rotation invariance of the features. In addition, it reduces the number of parameters of the text classification model and alleviates model over-fitting, while allowing inputs of varying length X to be treated as fixed-length inputs, so that the number of neurons can be determined when designing the network structure.
In any of the above embodiments, preferably, the processor 504 is specifically configured to execute the computer program to: input the down-sampling result into the fully connected layer, classify with a sigmoid classifier, and iterate according to a selected sigmoid loss function until the value of the sigmoid loss function is minimized.
In this embodiment, the processor 504 is specifically configured to execute the computer program to classify with a sigmoid classifier, which avoids the situation where the classification result can only take one of two mutually exclusive outcomes, that is, the mutual exclusion present when a softmax classifier is used. Iterative operations are performed with the sigmoid loss function until the value of the sigmoid loss function is minimized, and the quality of the obtained model is judged by the minimum of the loss function: when the value of the loss function is minimal, the obtained text classification model has reached its optimum under this classifier, and classification with the obtained model is more accurate.
In any of the above embodiments, preferably, the processor 504 is specifically configured to execute the computer program to: iterate on the selected sigmoid loss function with an Adam-improved stochastic gradient descent algorithm until the value of the sigmoid loss function is minimized.
In this embodiment, the processor 504 is specifically configured to execute the computer program to: when iterating on the loss function with a stochastic gradient descent algorithm, improve the ill-suited mathematical parts of the loss function with the Adam method, so that the improved stochastic gradient descent algorithm is adapted to the selected sigmoid loss function, the value of the sigmoid loss function is minimized, and the accuracy of the classification results of the text classification model is improved.
In any of the above embodiments, preferably, the word vectors are CoVe word vectors pre-trained with improved Chinese-English translation training.
In this embodiment, the word vectors are CoVe word vectors pre-trained with improved Chinese-English translation training. Because more contextual information is produced during Chinese-English translation, CoVe word vectors pre-trained with improved Chinese-English translation training are chosen, so the first images generated after mapping into the vector space contain richer contextual information, and the finally obtained text classification model classifies more accurately.
In any of the above embodiments, preferably, the processor 504 is further configured to execute the computer program to: sequentially apply dropout and relu activation to the down-sampling result.
In this embodiment, to avoid the situation where the model fits the training set well during training but fits the validation set poorly, the down-sampling result is input into dropout. During model training, the network parameters are randomly sampled with a certain probability, and the resulting sub-network serves as the target network of the current update, so that the same sub-network is not used in every iteration; this prevents the network from over-fitting the training set. The relu activation accelerates the convergence of the stochastic gradient descent algorithm and reduces training time.
In any of the above embodiments, preferably, the processor 504 is further configured to execute the computer program to: perform data cleansing on the at least three rounds of dialog information.
In this embodiment, to prevent repeated sentences or erroneous punctuation marks in the dialog information from disturbing the contextual relationships, data cleansing is performed on the at least three rounds of dialog information: duplicate sentences and punctuation marks are deleted, and dialogs shorter than three rounds are spliced into three-round dialogs. This ensures that the model is trained in combination with the context of the training samples, improving the classification accuracy of the trained text classification model.
An embodiment of the third aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored. When the computer program is executed by a processor, the steps of the construction method for a text classification model in any of the above embodiments are implemented.
The computer-readable storage medium provided by the present invention stores a computer program, and when the computer program is executed by a processor, the steps of the construction method for a text classification model in any of the above embodiments are implemented. It therefore has all the technical effects of that construction method, which are not repeated here.
In the description of this specification, descriptions with reference to the terms "one embodiment", "some embodiments", "a specific embodiment" and the like mean that particular features, structures, materials or characteristics described in conjunction with the embodiment or example are included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the particular features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included in the scope of protection of the present invention.

Claims (19)

1. A construction method for a text classification model, characterized by comprising:
acquiring at least three rounds of dialog information;
inputting the at least three rounds of dialog information in parallel into a convolutional neural network text classification model;
training the convolutional neural network text classification model according to the at least three rounds of dialog information to obtain the text classification model.
2. The construction method for a text classification model according to claim 1, characterized in that
the at least three rounds of dialog information are input in parallel into the convolutional neural network text classification model in either of the following ways:
mapping each round of dialog information in the at least three rounds of dialog information word by word into a vector space by means of word vector mapping to generate corresponding first images, and inputting the first images corresponding to the at least three rounds of dialog information in parallel into the convolutional neural network text classification model;
encoding the at least three rounds of dialog information into a second image by one-hot encoding and inputting the second image into the convolutional neural network text classification model.
3. The construction method for a text classification model according to claim 2, characterized in that
training the convolutional neural network text classification model according to the at least three rounds of dialog information to obtain the text classification model specifically includes:
inputting the first images or the second image into a convolution layer for convolution operations, and inputting the operation result into a pooling layer for down-sampling with a preset method;
inputting the down-sampling result into a fully connected layer, classifying with a classifier, and inputting the classification result into an optimizer for optimization to obtain the text classification model.
4. The construction method for a text classification model according to claim 3, characterized in that the preset method is max-pooling.
5. The construction method for a text classification model according to claim 3, characterized in that inputting the down-sampling result into the fully connected layer, classifying with the classifier, and inputting the classification result into the optimizer for optimization specifically includes:
inputting the down-sampling result into the fully connected layer, classifying with a sigmoid classifier, and iterating according to a selected sigmoid loss function until the value of the sigmoid loss function is minimized.
6. The method for constructing a text classification model according to claim 5, characterized in that
iterating according to the selected sigmoid loss function until the value of the sigmoid loss function is minimized specifically comprises:
iterating on the selected sigmoid loss function using the Adam improved stochastic gradient descent algorithm until the value of the sigmoid loss function is minimized.
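(Illustrative note, not part of the claims.) A hedged sketch of claims 5 and 6: binary cross-entropy as one common choice of "sigmoid loss", minimized with the Adam optimizer. The model class is the illustrative DialogueTextCNN from the sketch after claim 3, and `train_loader` is a placeholder for any DataLoader yielding (input, multi-hot label) batches.

```python
import torch

model = DialogueTextCNN(num_classes=10)   # the illustrative model sketched after claim 3
criterion = torch.nn.BCELoss()            # one common "sigmoid loss": binary cross-entropy
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam, an improved SGD

for epoch in range(20):                   # iterate until the loss stops improving
    for inputs, labels in train_loader:   # `train_loader` is a placeholder DataLoader;
        optimizer.zero_grad()             # labels: multi-hot float tensors, [batch, 10]
        loss = criterion(model(inputs), labels)
        loss.backward()                   # back-propagate the sigmoid loss
        optimizer.step()                  # one Adam update
```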
7. The method for constructing a text classification model according to claim 2, characterized in that the word vectors are CoVe pre-trained word vectors improved through Chinese-English translation training.
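(Illustrative note, not part of the claims.) CoVe ("context vectors") derives word representations from a machine-translation encoder. Loading any such pre-trained vectors into the embedding layer could, as a sketch, look like this; the weight file name is hypothetical.

```python
import torch
import torch.nn as nn

# "cove_vectors.pt" is a hypothetical file holding a [vocab_size, embed_dim]
# tensor of pre-trained word vectors, e.g. produced by translation training.
pretrained = torch.load("cove_vectors.pt")
embedding = nn.Embedding.from_pretrained(pretrained, freeze=False)  # allow fine-tuning
```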
8. The method for constructing a text classification model according to claim 2, characterized in that,
after the down-sampled result is input into the fully connected layer and before classification by the sigmoid classifier, the method further comprises:
sequentially passing the down-sampled result through dropout and relu activation.
9. The method for constructing a text classification model according to claim 1, characterized in that,
after acquiring the at least three rounds of dialogue information and before inputting the at least three rounds of dialogue information in parallel into the convolutional neural network text classification model, the method further comprises:
performing data cleansing on the at least three rounds of dialogue information.
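(Illustrative note, not part of the claims.) The claim leaves the cleansing rules unspecified; the sketch below shows one assumed cleanup pass (markup stripping, whitespace collapsing, empty-round removal) over raw dialogue text before tokenization.

```python
import re

def clean_dialogue(rounds: list[str]) -> list[str]:
    """Illustrative data cleansing; the concrete rules are assumptions."""
    cleaned = []
    for text in rounds:
        text = re.sub(r"<[^>]+>", " ", text)  # strip stray markup
        text = re.sub(r"\s+", " ", text)      # collapse whitespace runs
        cleaned.append(text.strip())
    return [t for t in cleaned if t]          # drop rounds left empty

print(clean_dialogue(["  How do I <b>pay</b> social insurance?  ", "Which office?   "]))
```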
10. A system for constructing a text classification model, characterized by comprising:
a memory for storing a computer program; and
a processor for executing the computer program to:
acquire at least three rounds of dialogue information;
input the at least three rounds of dialogue information in parallel into a convolutional neural network text classification model; and
train the convolutional neural network text classification model according to the at least three rounds of dialogue information to obtain the text classification model.
11. The system for constructing a text classification model according to claim 10, characterized in that
the at least three rounds of dialogue information are input in parallel into the convolutional neural network text classification model in either of the following ways:
mapping each round of the at least three rounds of dialogue information word by word into a vector space by way of word-vector mapping to generate a corresponding first image, and inputting the first images corresponding to the at least three rounds in parallel into the convolutional neural network text classification model; or
encoding the at least three rounds of dialogue information into second images by one-hot encoding and inputting the second images into the convolutional neural network text classification model.
12. The system for constructing a text classification model according to claim 11, characterized in that
the processor is specifically configured to execute the computer program to: input the first images or the second images into a convolution layer for a convolution operation, and feed the operation result into a pooling layer for down-sampling by a preset method; and
input the down-sampled result into a fully connected layer, classify it by a classifier, and input the classification result into an optimizer for optimization, to obtain the text classification model.
13. The system for constructing a text classification model according to claim 12, characterized in that the preset method is max-pooling.
14. The system for constructing a text classification model according to claim 12, characterized in that
the processor is specifically configured to execute the computer program to: input the down-sampled result into the fully connected layer, classify it with a sigmoid classifier, and iterate according to a selected sigmoid loss function until the value of the sigmoid loss function is minimized.
15. The system for constructing a text classification model according to claim 14, characterized in that
the processor is specifically configured to execute the computer program to: iterate on the selected sigmoid loss function using the Adam improved stochastic gradient descent algorithm until the value of the sigmoid loss function is minimized.
16. The system for constructing a text classification model according to claim 11, characterized in that the word vectors are CoVe pre-trained word vectors improved through Chinese-English translation training.
17. The system for constructing a text classification model according to claim 11, characterized in that the processor is further configured to execute the computer program to: sequentially pass the down-sampled result through dropout and relu activation after the down-sampled result is input into the fully connected layer and before classification.
18. The system for constructing a text classification model according to claim 10, characterized in that
the processor is further configured to execute the computer program to: perform data cleansing on the at least three rounds of dialogue information.
19. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the computer program implements the steps of the method for constructing a text classification model according to any one of claims 1 to 9.
CN201811440834.9A 2018-11-29 2018-11-29 Construction method, system and the computer readable storage medium of textual classification model Pending CN109710757A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811440834.9A CN109710757A (en) 2018-11-29 2018-11-29 Construction method, system and the computer readable storage medium of textual classification model


Publications (1)

Publication Number Publication Date
CN109710757A true CN109710757A (en) 2019-05-03

Family

ID=66255257

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811440834.9A Pending CN109710757A (en) 2018-11-29 2018-11-29 Construction method, system and the computer readable storage medium of textual classification model

Country Status (1)

Country Link
CN (1) CN109710757A (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025284A (en) * 2017-04-06 2017-08-08 中南大学 Method for recognizing the sentiment tendency of online comment text, and convolutional neural network model
CN108170848A (en) * 2018-01-18 2018-06-15 重庆邮电大学 A dialogue scenario classification method for China Mobile's intelligent customer service

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111400452A (en) * 2020-03-16 2020-07-10 腾讯科技(深圳)有限公司 Text information classification processing method, electronic device and computer readable storage medium
CN111400452B (en) * 2020-03-16 2023-04-07 腾讯科技(深圳)有限公司 Text information classification processing method, electronic device and computer readable storage medium

Similar Documents

Publication Publication Date Title
CN110070183B (en) Neural network model training method and device for weakly labeled data
CN108536679B (en) Named entity recognition method, device, equipment and computer readable storage medium
CN110597991B (en) Text classification method and device, computer equipment and storage medium
CN112541501B (en) Scene character recognition method based on visual language modeling network
CN109785833A (en) Human-computer interaction audio recognition method and system for smart machine
CN110083700A An enterprise public opinion sentiment classification method and system based on convolutional neural networks
CN116168352B (en) Power grid obstacle recognition processing method and system based on image processing
CN108549658A A deep learning video question answering method and system based on an attention mechanism over syntactic analysis trees
CN109977428A A method and device for obtaining answers
CN109840322A A cloze-type reading comprehension analysis model and method based on reinforcement learning
CN109711356B (en) Expression recognition method and system
CN111597341B (en) Document-level relation extraction method, device, equipment and storage medium
CN108509833A A face recognition method, device and equipment based on a structured analysis dictionary
CN116311483B (en) Micro-expression recognition method based on local facial area reconstruction and memory contrast learning
CN114841151B (en) Medical text entity relation joint extraction method based on decomposition-recombination strategy
CN107832721A Method and apparatus for outputting information
CN116994188A (en) Action recognition method and device, electronic equipment and storage medium
CN115455194A (en) Knowledge extraction and analysis method and device for railway faults
CN111126155A (en) Pedestrian re-identification method for generating confrontation network based on semantic constraint
CN109710757A (en) Construction method, system and the computer readable storage medium of textual classification model
CN113362852A (en) User attribute identification method and device
CN111522923A A multi-turn task-oriented dialogue state tracking method
CN116758379A (en) Image processing method, device, equipment and storage medium
CN116311493A (en) Two-stage human-object interaction detection method based on coding and decoding architecture
CN115116444A (en) Processing method, device and equipment for speech recognition text and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190503