WO2020164267A1 - Text classification model construction method, device, terminal and storage medium - Google Patents

Text classification model construction method, device, terminal and storage medium

Info

Publication number
WO2020164267A1
WO2020164267A1 (PCT/CN2019/117225)
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
convolutional neural
text
network model
word
Prior art date
Application number
PCT/CN2019/117225
Other languages
English (en)
French (fr)
Inventor
徐亮
金戈
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020164267A1 publication Critical patent/WO2020164267A1/zh

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]

Definitions

  • This application relates to the technical field of neural networks, and in particular to a method, device, terminal and storage medium for building a text classification model.
  • Text classification is a research hotspot and core technology in the fields of data mining and information retrieval.
  • This application provides a text classification model construction method, device, terminal and storage medium to solve the problem that, when the tensorflow framework is currently used to construct text classification models, the constructed models are difficult to maintain, inconvenient to debug and hard to operate because of the framework's verbose code and obscure interface design.
  • This application provides a method for constructing a text classification model, including the following steps:
  • building a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer; obtaining text classification training data, and performing word vector training on the text training data using the Word2Vec algorithm to obtain word vectors; and inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model.
  • In an embodiment, before the word vector training is performed on the text training data using the Word2Vec algorithm, the method further includes:
  • removing stop words and symbols from the text training data according to regular expression matching rules; and performing Chinese word segmentation, using the jieba word segmentation library, on the text training data from which the stop words and symbols have been removed.
  • In an embodiment, the step of performing Chinese word segmentation on the text training data from which the stop words and symbols have been removed, using the jieba word segmentation library, includes:
  • determining the correlation between Chinese characters in the text training data using the jieba word segmentation library; and grouping Chinese characters whose correlation is greater than a preset value into words to obtain a segmentation result.
  • In an embodiment, the step of inputting the word vectors into the convolutional neural network model for classification training includes:
  • classifying and training the convolutional neural network model, based on the word vectors, through the cross-entropy loss function and the ADAM optimization algorithm.
  • In an embodiment, the step of performing word vector training on the text training data using the Word2Vec algorithm includes:
  • performing word vector training on large-scale corpus data using the Word2Vec algorithm to obtain a word vector dictionary; and converting the text training data into word vectors according to the word vector dictionary.
  • the method further includes:
  • establishing a positional attention mechanism and a channel attention mechanism on the convolutional neural network model, wherein the inputs of the positional attention mechanism and the channel attention mechanism are connected to the output of the activation layer of the convolutional neural network model, and the outputs of the positional attention mechanism and the channel attention mechanism are connected to the input of the fully connected layer of the convolutional neural network model.
  • In an embodiment, the step of inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model, includes:
  • calculating the classification accuracy of the convolutional neural network model according to the classification training results; and, when the classification accuracy is lower than a preset value, adjusting the parameters of the convolutional neural network model and retraining the convolutional neural network model with the word vectors until convergence, to obtain a text classification model.
  • a text classification model construction device includes:
  • a building module, used to build a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer;
  • an acquisition module, used to obtain text classification training data and perform word vector training on the text training data using the Word2Vec algorithm to obtain word vectors;
  • a training module, used to input the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model.
  • The present application provides a terminal, including a memory and a processor.
  • The memory stores computer-readable instructions.
  • When the computer-readable instructions are executed by the processor, the processor executes the steps of any of the text classification model construction methods described above.
  • The present application provides a storage medium on which computer-readable instructions are stored; when the computer-readable instructions are executed by a processor, the text classification model construction method described in any one of the above is implemented.
  • In the text classification model construction method provided by this application, a convolutional neural network model is built using the pytorch framework; text classification training data is then obtained and word vector training is performed on the text training data using the Word2Vec algorithm to obtain word vectors; the word vectors are input into the convolutional neural network model for classification training until convergence, and a text classification model is obtained.
  • The text classification model of this application uses the Pytorch framework. Because the object-oriented interface design of the Pytorch framework comes from torch, the torch interface design is flexible and easy to use, and the PyTorch framework can print out computation results layer by layer for debugging, the constructed text classification model is easier to maintain and debug.
  • FIG. 1 is a flowchart of an embodiment of a method for constructing a text classification model of the present application
  • FIG. 2 is a block diagram of an embodiment of the text classification model construction device of this application.
  • FIG. 3 is a block diagram of the internal structure of a terminal in an embodiment of the application.
  • This application provides a text classification model construction method to solve the problem that, when the tensorflow framework is currently used to construct text classification models, the constructed models are difficult to maintain, inconvenient to debug and hard to operate because of the framework's verbose code and obscure interface design.
  • the text classification model construction method includes the following steps:
  • the convolutional neural network model of this embodiment is built based on the pytorch framework.
  • The pytorch framework is a python-first deep learning framework. Compared with the Tensorflow deep learning framework it is more flexible: it can construct or adjust the computation graph dynamically during execution, so hidden variable values can be printed directly during training for debugging.
  • The Tensorflow deep learning framework, by contrast, must build a static computation graph in advance and then execute the built graph repeatedly through feed and run. Because the graph is static, the network structure must be built and compiled before training; during training, hidden variables cannot be printed directly, and the data must be reloaded to produce the corresponding output, which is inconvenient.
  • the embedding layer has the function of dimensionality reduction.
  • The vectors input into the convolutional neural network model are often high-dimensional data, for example 8000-dimensional.
  • The embedding layer can reduce them to a space of, for example, 100 dimensions for computation, compressing the data while minimizing the loss of information and thereby improving computational efficiency.
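  • As a minimal illustration of the embedding layer and of PyTorch's layer-by-layer debugging described above (not code from the patent; the vocabulary size of 8000 and the 100-dimensional target space are the illustrative figures used in the text), an embedding lookup might look like this:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 8000, 100                 # illustrative sizes from the text
embedding = nn.Embedding(vocab_size, embed_dim)   # maps token ids to 100-dim vectors

token_ids = torch.tensor([[15, 742, 3991, 8]])    # a toy sentence of four token ids
vectors = embedding(token_ids)                    # shape: (1, 4, 100)
print(vectors.shape)                              # intermediate values can be printed at any point for debugging
```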
  • In this embodiment, Internet information can be captured automatically by web-crawling technology, text classification training data can be extracted from the Internet information, and word vector training is then performed on the text training data using the Word2Vec algorithm to obtain word vectors.
  • The word2vec algorithm is based on a shallow neural network; it can be trained efficiently on dictionaries of millions of words and data sets of hundreds of millions of samples to obtain the training result, namely word vectors, and it can measure the similarity between words well.
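  • A minimal sketch of such word vector training with the gensim library (which the application mentions later as one way to implement this step); the toy sentences and the gensim 4.x parameter names are illustrative assumptions, not part of the patent:

```python
from gensim.models import Word2Vec

# toy tokenised sentences standing in for the segmented text training data
sentences = [["燕子", "去了", "再来"], ["杨柳", "枯了", "再青"], ["桃花", "谢了", "再开"]]

model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, workers=4)
print(model.wv["燕子"][:5])                 # first values of a 100-dimensional word vector
print(model.wv.similarity("燕子", "桃花"))   # similarity between two words
```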
  • The word vectors obtained by training are input into a pre-built convolutional neural network model to classify and train the convolutional neural network model until it converges, that is, until the training results meet the requirements, and a trained text classification model is obtained.
  • The text classification model is subsequently used to classify text data, for example news headline classification and comment sentiment classification. It should be noted that the more word vectors are input into the convolutional neural network model, the higher the classification accuracy of the trained text classification model.
  • In the text classification model construction method provided in this application, a convolutional neural network model is built using the pytorch framework; text classification training data is then obtained and word vector training is performed on the text training data using the Word2Vec algorithm to obtain word vectors; finally the word vectors are input into the convolutional neural network model for classification training until convergence, and a qualified text classification model is obtained, improving the classification accuracy of the text classification model. At the same time, since the text classification model of this application uses the Pytorch framework, whose object-oriented interface design comes from torch, the torch interface design is flexible and easy to use, and the PyTorch framework can print out computation results layer by layer for debugging, the constructed text classification model is easier to maintain and debug.
  • In an embodiment, before step S12 the method may further include: removing stop words and symbols from the text training data according to regular expression matching rules.
  • A regular expression is a string composed of characters with special meanings, and is mostly used to find and replace strings that match a rule.
  • Regular expression matching rules can operate on strings to simplify complex string operations; their main functions are matching, splitting, replacing and extracting.
  • In this embodiment, regular expression matching rules may be used to remove stop words and symbols from the text training data, for example deleting the punctuation marks in the text, to obtain effective text training data.
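  • A small sketch of this cleaning step, assuming an illustrative stop-word list (a real pipeline would load a full Chinese stop-word file):

```python
import re

stopwords = ["的", "了", "啊"]                                   # assumed toy stop-word list
text = "燕子去了,有再来的时候;杨柳枯了,有再青的时候。"

text = re.sub(r"[^\u4e00-\u9fa5A-Za-z0-9]", "", text)            # remove punctuation and symbols
text = re.sub("|".join(map(re.escape, stopwords)), "", text)     # remove stop words
print(text)
```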
  • Chinese word segmentation is then performed, using the jieba word segmentation library, on the text training data from which the stop words and symbols have been removed.
  • In this embodiment, Chinese word segmentation can be performed on the text training data according to the collocation frequencies of the words in the jieba word segmentation library.
  • Text training data that has undergone Chinese word segmentation is trained more efficiently and gives better results when word vector training is performed on it with the Word2Vec algorithm.
  • For example, when segmenting the passage 燕子去了,有再来的时候;杨柳枯了,有再青的时候;桃花谢了,有再开的时候 (roughly, "the swallows have gone, but they will come again; the willows have withered, but they will turn green again; the peach blossoms have faded, but they will bloom again"), the text can be divided, according to the collocation frequencies in the jieba dictionary, into 燕子/去了,有/再来/的/时候;杨柳/枯了,有/再/青/的/时候;桃花/谢了,有/再开/的/时候.
  • the text training data can also be segmented in other ways, which is not specifically limited here.
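  • For reference, a one-line use of the jieba library's default segmentation; the exact output depends on jieba's dictionary and is shown here only as an assumption:

```python
import jieba

text = "燕子去了,有再来的时候"
print("/".join(jieba.cut(text)))     # e.g. 燕子/去/了/,/有/再来/的/时候
```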
  • In an embodiment, the step of performing Chinese word segmentation on the text training data from which the stop words and symbols have been removed, using the jieba word segmentation library, includes: determining the correlation between Chinese characters in the text training data using the jieba word segmentation library;
  • and grouping Chinese characters whose correlation is greater than a preset value into words to obtain a segmentation result.
  • In this embodiment, the correlation between adjacent Chinese characters in the text training data can be calculated, and the highly correlated Chinese characters can be grouped into words to obtain the segmentation result, improving the accuracy of segmentation.
  • For example, when segmenting the text 枯藤老树昏鸦,小桥流水人家 (roughly, "withered vines, old trees, crows at dusk; a small bridge, flowing water, people's homes"), the jieba library determines that the correlation between 枯 and 藤 is higher than that between 藤 and 老,
  • so they form the word 枯藤 ("withered vine"); by analogy, the segmentation result of this text is 枯藤 ("withered vine"), 老树 ("old tree"), 昏鸦 ("crow at dusk"), 小桥 ("small bridge"), 流水 ("flowing water") and 人家 ("people's homes").
  • The preset value of the correlation can be adjusted flexibly as required.
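  • The grouping idea can be sketched as a toy function; the correlation scores below are invented for illustration (jieba derives them from its dictionary and corpus statistics), and the threshold of 0.5 is an assumed preset value:

```python
def merge_by_correlation(chars, score, threshold=0.5):
    """Group adjacent characters whose correlation exceeds the preset threshold."""
    words, current = [], chars[0]
    for a, b in zip(chars, chars[1:]):
        if score.get((a, b), 0.0) > threshold:
            current += b              # high correlation: extend the current word
        else:
            words.append(current)     # low correlation: close the current word
            current = b
    words.append(current)
    return words

chars = list("枯藤老树昏鸦")
score = {("枯", "藤"): 0.9, ("老", "树"): 0.8, ("昏", "鸦"): 0.7}
print(merge_by_correlation(chars, score))   # ['枯藤', '老树', '昏鸦']
```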
  • In an embodiment, the step in S13 of inputting the word vectors into the convolutional neural network model for classification training includes:
  • classifying and training the convolutional neural network model, based on the word vectors, through the cross-entropy loss function and the ADAM optimization algorithm.
  • In this embodiment, the cross-entropy loss function can be used to evaluate the difference between the probability distribution obtained by the current training and the true distribution, so as to track the classification accuracy of the text classification model and adjust the relevant parameters of the text classification model in real time until the training passes.
  • The ADAM optimization algorithm is formed by fusing an accelerated gradient descent algorithm on the basis of momentum-based gradient descent; compared with momentum-based gradient descent, the ADAM optimization algorithm can correct the bias of the classification training results, improving classification accuracy.
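  • A minimal sketch of this training step in PyTorch; the tiny linear classifier and the random batch stand in for the real convolutional model and word-vector data, and are assumptions made only so the example runs:

```python
import torch
import torch.nn as nn

model = nn.Linear(100, 4)                                    # placeholder for the CNN text classifier
criterion = nn.CrossEntropyLoss()                            # cross-entropy loss function
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)    # ADAM optimization algorithm

features = torch.randn(32, 100)                              # a batch of 32 text feature vectors
labels = torch.randint(0, 4, (32,))                          # their class labels

for step in range(100):
    optimizer.zero_grad()
    loss = criterion(model(features), labels)                # gap between predicted and true distribution
    loss.backward()                                          # back-propagate
    optimizer.step()                                         # bias-corrected Adam update
```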
  • In an embodiment, the step in S12 of performing word vector training on the text training data using the Word2Vec algorithm may specifically include:
  • performing word vector training on large-scale corpus data through the Word2Vec algorithm to obtain a word vector dictionary.
  • This step can be implemented with the gensim library in Python, a Python-based natural language processing library that can convert text into vector form using models such as TF-IDF, LDA and LSI for further processing.
  • the text training data is converted into word vectors according to the word vector dictionary.
  • The text training data can be converted into word vectors through the word vector dictionary obtained by training; each word in the text training data has a corresponding word vector in the word vector dictionary, so the word vectors of all the words in the text training data are obtained.
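  • A small sketch of this lookup, reusing a gensim model as the word vector dictionary; the three-word sentence and the 100-dimension setting are illustrative assumptions:

```python
import numpy as np
from gensim.models import Word2Vec

model = Word2Vec([["枯藤", "老树", "昏鸦"]], vector_size=100, min_count=1)   # stands in for the trained dictionary
sentence = ["枯藤", "老树", "昏鸦"]                                          # a segmented training sentence
vectors = np.stack([model.wv[word] for word in sentence])                    # one 100-dim vector per word
print(vectors.shape)                                                         # (3, 100)
```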
  • In an embodiment, after the convolutional neural network model is built using the pytorch framework in step S11, the method may further include:
  • establishing a positional attention mechanism and a channel attention mechanism on the convolutional neural network model, wherein the inputs of the positional attention mechanism and the channel attention mechanism are connected to the output of the activation layer of the convolutional neural network model, and the outputs of the positional attention mechanism and the channel attention mechanism are connected to the input of the fully connected layer of the convolutional neural network model.
  • the input of the positional attention mechanism and the channel attention mechanism comes from the output of the activation layer of the convolutional neural network.
  • the output of the convolutional neural network model can be a three-dimensional matrix of 384*100*1.
  • For the positional attention mechanism, the three-dimensional output matrix of the convolutional neural network model can first be converted into a 384*100 matrix; two parallel fully connected layers output 100*384 and 384*100 matrices, and matrix multiplication followed by softmax mapping yields a 100*100 positional attention matrix.
  • On this basis, another parallel fully connected layer outputs a 384*100 matrix, which is multiplied by the positional attention matrix to obtain a 384*100 matrix; this is converted into a 384*100*1 three-dimensional matrix and summed with the output of the convolutional neural network model to obtain the output of the positional attention mechanism.
  • For the channel attention mechanism, the three-dimensional output matrix of the convolutional neural network model is first converted into a 384*100 matrix; two parallel fully connected layers output 384*100 and 100*384 matrices, and matrix multiplication followed by softmax mapping yields a 384*384 channel attention matrix.
  • On this basis, another parallel fully connected layer outputs a 100*384 matrix, which is multiplied by the channel attention matrix to obtain a 100*384 matrix; this is converted into a 384*100*1 three-dimensional matrix and summed with the output of the convolutional neural network model to obtain the output of the channel attention mechanism.
  • Finally, the outputs of the positional attention mechanism and the channel attention mechanism are fed into the fully connected layer to complete the output of the entire convolutional neural network model.
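  • One possible PyTorch reading of the positional attention branch, following the 384*100 matrix sizes given above; the exact layer shapes are my interpretation rather than code from the patent, and the channel attention branch would be analogous with the two axes swapped (yielding a 384*384 attention matrix):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PositionAttention(nn.Module):
    def __init__(self, channels=384, length=100):
        super().__init__()
        self.fc_q = nn.Linear(channels, channels)    # parallel FC producing the 100*384 matrix
        self.fc_k = nn.Linear(length, length)        # parallel FC producing the 384*100 matrix
        self.fc_v = nn.Linear(length, length)        # third parallel FC producing the 384*100 matrix

    def forward(self, x):                            # x: (batch, 384, 100, 1), the activation-layer output
        feat = x.squeeze(-1)                         # (batch, 384, 100)
        q = self.fc_q(feat.transpose(1, 2))          # (batch, 100, 384)
        k = self.fc_k(feat)                          # (batch, 384, 100)
        attn = F.softmax(torch.bmm(q, k), dim=-1)    # (batch, 100, 100) positional attention matrix
        v = self.fc_v(feat)                          # (batch, 384, 100)
        out = torch.bmm(v, attn)                     # (batch, 384, 100)
        return out.unsqueeze(-1) + x                 # summed with the original output

x = torch.randn(2, 384, 100, 1)
print(PositionAttention()(x).shape)                  # torch.Size([2, 384, 100, 1])
```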
  • The fully connected layer acts as a "classifier" in the entire convolutional neural network model. If the convolutional layer, pooling layer and activation layer of the convolutional neural network model map the original data into the hidden-layer feature space, the fully connected layer maps the learned "distributed feature representation" into the sample label space.
  • In an embodiment, the convolutional layer of the convolutional neural network model contains one-dimensional convolutions with kernel heights of 1, 3 and 5 and 128 channels (padding keeps the input and output dimensions of the convolutional layer consistent); the activation-layer function can be ReLU, which converges faster while maintaining the same effect.
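  • A sketch of a convolutional block consistent with these sizes: three parallel one-dimensional convolutions with kernel heights 1, 3 and 5 and 128 channels each (reading the 384 channels used in the attention description as 3*128), padding chosen so the output length matches the input, ReLU activation, and a fully connected classifier. The sentence length of 100, the 100-dimensional embeddings and the four output classes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, embed_dim=100, seq_len=100, num_classes=4):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, 128, kernel_size=k, padding=k // 2)   # padding keeps the length unchanged
            for k in (1, 3, 5)
        )
        self.act = nn.ReLU()                                   # activation layer: ReLU converges quickly
        self.fc = nn.Linear(3 * 128 * seq_len, num_classes)    # fully connected "classifier"

    def forward(self, x):                                      # x: (batch, seq_len, embed_dim) word vectors
        x = x.transpose(1, 2)                                  # (batch, embed_dim, seq_len) for Conv1d
        feats = torch.cat([self.act(conv(x)) for conv in self.convs], dim=1)   # (batch, 384, seq_len)
        return self.fc(feats.flatten(1))                       # logits over the classes

logits = TextCNN()(torch.randn(8, 100, 100))                   # a batch of 8 sentences of 100 word vectors
print(logits.shape)                                            # torch.Size([8, 4])
```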
  • In an embodiment, the step in S13 of inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model, may specifically include:
  • calculating the classification accuracy of the convolutional neural network model according to the classification training results; and, when the classification accuracy is lower than a preset value, adjusting the parameters of the convolutional neural network model and retraining the convolutional neural network model with the word vectors until convergence, to obtain a text classification model.
  • In this embodiment, the classification accuracy of the convolutional neural network model is calculated and it is judged whether it is lower than the preset value; if so, the parameters of the convolutional neural network model are adjusted and the model is retrained with the word vectors until its classification accuracy is higher than the preset value, yielding a qualified text classification model and thereby ensuring that the trained model has a good classification effect.
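  • A small sketch of the accuracy check; the stand-in linear model, the random validation batch and the 0.9 threshold are assumptions used only to make the example runnable:

```python
import torch

def classification_accuracy(model, features, labels):
    with torch.no_grad():
        predictions = model(features).argmax(dim=1)          # predicted class per sample
    return (predictions == labels).float().mean().item()

model = torch.nn.Linear(100, 4)                              # stand-in for the trained CNN
val_x, val_y = torch.randn(64, 100), torch.randint(0, 4, (64,))
preset_value = 0.9                                           # assumed accuracy threshold
if classification_accuracy(model, val_x, val_y) < preset_value:
    pass  # adjust parameters (learning rate, epochs, ...) and retrain until convergence
```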
  • Referring to FIG. 2, an embodiment of the present application also provides a text classification model construction device.
  • In one embodiment, it includes a building module 21, an acquisition module 22 and a training module 23, wherein:
  • the building module 21 is used to build a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer;
  • the convolutional neural network model of this embodiment is built based on the pytorch framework.
  • The pytorch framework is a python-first deep learning framework. Compared with the Tensorflow deep learning framework it is more flexible: it can construct or adjust the computation graph dynamically during execution, so hidden variable values can be printed directly during training for debugging.
  • The Tensorflow deep learning framework, by contrast, must build a static computation graph in advance and then execute the built graph repeatedly through feed and run. Because the graph is static, the network structure must be built and compiled before training; during training, hidden variables cannot be printed directly, and the data must be reloaded to produce the corresponding output, which is inconvenient.
  • The embedding layer has the function of dimensionality reduction.
  • The vectors input into the convolutional neural network model are often high-dimensional data, for example 8000-dimensional.
  • The embedding layer can reduce them to a space of, for example, 100 dimensions for computation, compressing the data while minimizing the loss of information and thereby improving computational efficiency.
  • the acquisition module 22 is configured to obtain text classification training data and perform word vector training on the text training data using the Word2Vec algorithm to obtain word vectors;
  • In this embodiment, Internet information can be captured automatically by web-crawling technology, text classification training data can be extracted from the Internet information, and word vector training is then performed on the text training data using the Word2Vec algorithm to obtain word vectors.
  • The word2vec algorithm is based on a shallow neural network; it can be trained efficiently on dictionaries of millions of words and data sets of hundreds of millions of samples to obtain the training result, namely word vectors, and it can measure the similarity between words well.
  • the training module 23 is configured to input the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model.
  • The word vectors obtained by training are input into a pre-built convolutional neural network model to classify and train the convolutional neural network model until it converges, that is, until the training results meet the requirements, and a trained text classification model is obtained.
  • The text classification model is subsequently used to classify text data, for example news headline classification and comment sentiment classification. It should be noted that the more word vectors are input into the convolutional neural network model, the higher the classification accuracy of the trained text classification model.
  • In the text classification model construction device provided by this application, a convolutional neural network model is built using the pytorch framework; text classification training data is then obtained and word vector training is performed on the text training data using the Word2Vec algorithm to obtain word vectors; finally the word vectors are input into the convolutional neural network model for classification training until convergence, and a qualified text classification model is obtained, improving the classification accuracy of the text classification model. At the same time, since the text classification model of this application uses the Pytorch framework, whose object-oriented interface design comes from torch, the torch interface design is flexible and easy to use, and the PyTorch framework can print out computation results layer by layer for debugging, the constructed text classification model is easier to maintain and debug.
  • A terminal provided by the present application includes a memory and a processor, and computer-readable instructions are stored in the memory.
  • When the computer-readable instructions are executed by the processor, the processor executes the steps of any of the text classification model construction methods described above.
  • the terminal is a computer device, as shown in FIG. 3.
  • the computer equipment described in this embodiment may be equipment such as servers, personal computers, and network equipment.
  • the computer equipment includes a processor 302, a memory 303, an input unit 304, a display unit 305 and other devices.
  • the memory 303 may be used to store computer-readable instructions 301 and various functional modules, and the processor 302 runs the computer-readable instructions 301 stored in the memory 303 to execute various functional applications and data processing of the device.
  • the memory may be internal memory or external memory, or include both internal memory and external memory.
  • the internal memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory.
  • External storage can include hard disks, floppy disks, ZIP disks, U disks, tapes, etc.
  • the memory disclosed in this application includes but is not limited to these types of memory.
  • the memory disclosed in this application is only an example and not a limitation.
  • the input unit 304 is used for receiving signal input and receiving keywords input by the user.
  • the input unit 304 may include a touch panel and other input devices.
  • The touch panel can collect the user's touch operations on or near it (for example operations performed by the user on or near the touch panel with a finger, a stylus or any other suitable object or accessory) and drive the corresponding connection device according to a preset program;
  • other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as playback control keys and switch keys), a trackball, a mouse and a joystick.
  • the display unit 305 can be used to display information input by the user or information provided to the user and various menus of the computer device.
  • the display unit 305 may take the form of a liquid crystal display, an organic light emitting diode, or the like.
  • The processor 302 is the control center of the computer device. It connects the various parts of the entire computer through various interfaces and lines, and performs the device's various functions and processes data by running or executing the software programs and/or modules stored in the memory 303 and calling the data stored in the memory.
  • As an embodiment, the computer device includes: one or more processors 302, a memory 303, and one or more computer-readable instructions 301, wherein the one or more computer-readable instructions 301 are stored in the memory 303 and configured to be executed by the one or more processors 302, and the one or more computer-readable instructions 301 are configured to execute the text classification model construction method described in the above embodiments.
  • this application also proposes a storage medium storing computer-readable instructions.
  • the storage medium of the computer-readable instructions may be a non-volatile readable storage medium.
  • the storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
  • The computer-readable instructions can be stored in a storage medium; when executed, they may include the processes of the above method embodiments.
  • The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk or a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
  • In the text classification model construction method, device, terminal and storage medium provided in this application, a convolutional neural network model is built using the pytorch framework; text classification training data is then obtained and word vector training is performed on the text training data using the Word2Vec algorithm to obtain word vectors; finally the word vectors are input into the convolutional neural network model for classification training until convergence, and a trained text classification model is obtained, improving the classification accuracy of the text classification model. At the same time, since the text classification model of this application uses the Pytorch framework, whose object-oriented interface design comes from torch, the torch interface design is flexible and easy to use, and the PyTorch framework can print out computation results layer by layer for debugging, the constructed text classification model is easier to maintain and debug.

Abstract

A text classification model construction method, device, terminal and storage medium. The text classification model construction method includes: building a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer (S11); obtaining text classification training data, and performing word vector training on the text training data using the Word2Vec algorithm to obtain word vectors (S12); inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model (S13). The text classification model uses the Pytorch framework; since the object-oriented interface design of the Pytorch framework comes from torch, the torch interface design is flexible and easy to use, and the PyTorch framework can print out computation results layer by layer for debugging, the constructed text classification model is easier to maintain and debug.

Description

Text classification model construction method, device, terminal and storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on February 13, 2019 under application number
CN201910113183.0 and entitled "Text classification model construction method, device, terminal and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of neural networks, and in particular to a text classification model construction method, device, terminal and storage medium.
Background
With the arrival of the mobile Internet era, the production and dissemination of content have changed profoundly. To meet users' diverse needs against the background of the information explosion, text information urgently needs to be organized effectively; text classification is a research hotspot and core technology in the fields of data mining and information retrieval.
In existing text classification practice, the tensorflow framework is generally adopted when a neural network algorithm is applied to the text classification problem. However, the tensorflow framework's code is verbose and its interface design is obscure and hard to understand, so the text classification models built with it are difficult to maintain, inconvenient to debug and hard to operate.
Summary
This application provides a text classification model construction method, device, terminal and storage medium to solve the problem that, when the tensorflow framework is currently used to construct text classification models, the constructed models are difficult to maintain, inconvenient to debug and hard to operate because of the framework's verbose code and obscure interface design.
To solve the above problems, this application adopts the following technical solution:
This application provides a text classification model construction method, including the following steps:
building a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer;
obtaining text classification training data, and performing word vector training on the text training data using the Word2Vec algorithm to obtain word vectors;
inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model.
In an embodiment, before the word vector training is performed on the text training data using the Word2Vec algorithm, the method further includes:
removing stop words and symbols from the text training data according to regular expression matching rules;
performing Chinese word segmentation, using the jieba word segmentation library, on the text training data from which the stop words and symbols have been removed.
In an embodiment, the step of performing Chinese word segmentation on the text training data from which the stop words and symbols have been removed, using the jieba word segmentation library, includes:
determining the correlation between Chinese characters in the text training data using the jieba word segmentation library;
grouping Chinese characters whose correlation is greater than a preset value into words to obtain a segmentation result.
In an embodiment, the step of inputting the word vectors into the convolutional neural network model for classification training includes:
classifying and training the convolutional neural network model, based on the word vectors, through the cross-entropy loss function and the ADAM optimization algorithm.
In an embodiment, the step of performing word vector training on the text training data using the Word2Vec algorithm includes:
performing word vector training on large-scale corpus data using the Word2Vec algorithm to obtain a word vector dictionary;
converting the text training data into word vectors according to the word vector dictionary.
In an embodiment, after the convolutional neural network model is built using the pytorch framework, the method further includes:
establishing a positional attention mechanism and a channel attention mechanism on the convolutional neural network model, wherein inputs of the positional attention mechanism and the channel attention mechanism are connected to an output of an activation layer of the convolutional neural network model, and outputs of the positional attention mechanism and the channel attention mechanism are connected to an input of a fully connected layer of the convolutional neural network model.
In an embodiment, the step of inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model, includes:
calculating the classification accuracy of the convolutional neural network model according to the classification training results;
when the classification accuracy is lower than a preset value, adjusting the parameters of the convolutional neural network model and retraining the convolutional neural network model with the word vectors until convergence, to obtain a text classification model.
This application provides a text classification model construction device, including:
a building module, configured to build a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer;
an acquisition module, configured to obtain text classification training data and perform word vector training on the text training data using the Word2Vec algorithm to obtain word vectors;
a training module, configured to input the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model.
This application provides a terminal, including a memory and a processor; the memory stores computer-readable instructions which, when executed by the processor, cause the processor to execute the steps of the text classification model construction method described in any one of the above.
This application provides a storage medium on which computer-readable instructions are stored; when the computer-readable instructions are executed by a processor, the text classification model construction method described in any one of the above is implemented.
Compared with the prior art, the technical solution of this application has at least the following advantages:
In the text classification model construction method provided by this application, a convolutional neural network model is built using the pytorch framework; text classification training data is then obtained and word vector training is performed on the text training data using the Word2Vec algorithm to obtain word vectors; the word vectors are input into the convolutional neural network model for classification training until convergence, and a text classification model is obtained. The text classification model of this application uses the Pytorch framework; since the object-oriented interface design of the Pytorch framework comes from torch, the torch interface design is flexible and easy to use, and the PyTorch framework can print out computation results layer by layer for debugging, the constructed text classification model is easier to maintain and debug.
Additional aspects and advantages of this application will be given in part in the following description, and will become apparent from the description or be learned through practice of this application.
Brief Description of the Drawings
FIG. 1 is a flowchart of an embodiment of the text classification model construction method of this application;
FIG. 2 is a module block diagram of an embodiment of the text classification model construction device of this application;
FIG. 3 is a block diagram of the internal structure of a terminal in an embodiment of this application.
The realization of the purposes, the functional features and the advantages of this application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
To enable those skilled in the art to better understand the solution of this application, the technical solutions in the embodiments of this application will be described clearly and completely below with reference to the accompanying drawings in the embodiments of this application.
Some of the processes described in the specification, claims and the above drawings of this application contain operations that appear in a specific order, but it should be clearly understood that these operations may be executed out of the order in which they appear herein or executed in parallel. Operation numbers such as S11 and S12 are only used to distinguish different operations, and the numbers themselves do not represent any execution order. In addition, these processes may include more or fewer operations, and these operations may be executed sequentially or in parallel. It should be noted that descriptions such as "first" and "second" herein are used to distinguish different messages, devices, modules, etc.; they do not represent a sequence, nor do they limit "first" and "second" to different types.
Those of ordinary skill in the art will understand that, unless specifically stated, the singular forms "a", "an", "the" and "said" used here may also include plural forms. It should be further understood that the word "comprising" used in the specification of this application refers to the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or an intermediate element may also be present. In addition, "connected" or "coupled" as used here may include a wireless connection or wireless coupling. The word "and/or" as used here includes all or any unit and all combinations of one or more of the associated listed items.
Those of ordinary skill in the art will understand that, unless otherwise defined, all terms used here (including technical and scientific terms) have the same meaning as commonly understood by those of ordinary skill in the art to which this application belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have meanings consistent with their meanings in the context of the prior art, and will not be interpreted with idealized or overly formal meanings unless specifically defined as here.
The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the accompanying drawings in the embodiments of this application, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. Obviously, the described embodiments are only a part of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative work fall within the protection scope of this application.
Referring to FIG. 1, this application provides a text classification model construction method to solve the problem that, when the tensorflow framework is currently used to construct text classification models, the constructed models are difficult to maintain, inconvenient to debug and hard to operate because of the framework's verbose code and obscure interface design. In one embodiment, the text classification model construction method includes the following steps:
S11: building a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer;
The convolutional neural network model of this embodiment is built on the pytorch framework. The pytorch framework is a python-first deep learning framework; compared with the Tensorflow deep learning framework it is more flexible and can construct or adjust the computation graph dynamically during execution, so that hidden variable values can be printed directly during training for debugging. The Tensorflow deep learning framework, by contrast, must build a static computation graph in advance and then execute the built graph repeatedly through feed and run; because the graph is static, the network structure must be built and compiled before training, and during training hidden variables cannot be printed directly but require the data to be reloaded for the corresponding output, which is inconvenient. The embedding layer performs dimensionality reduction: the vectors input into the convolutional neural network model are often high-dimensional data, for example 8000-dimensional, and the embedding layer can reduce them to a space of, for example, 100 dimensions for computation, compressing the data while minimizing the loss of information and thereby improving computational efficiency.
S12: obtaining text classification training data, and performing word vector training on the text training data using the Word2Vec algorithm to obtain word vectors;
In this embodiment, Internet information can be captured automatically by web-crawling technology, text classification training data can be extracted from the Internet information, and word vector training is then performed on the text training data using the Word2Vec algorithm to obtain word vectors. The word2vec algorithm is based on a shallow neural network; it can be trained efficiently on dictionaries of millions of words and data sets of hundreds of millions of samples to obtain the training result, namely word vectors, and it can measure the similarity between words well.
S13: inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model.
In this embodiment, the word vectors obtained by training are input into the pre-built convolutional neural network model to classify and train the convolutional neural network model until it converges, that is, until the training results meet the requirements, and a qualified text classification model is obtained, which is subsequently used to classify text data, for example news headline classification and comment sentiment classification. It should be noted that the more word vectors are input into the convolutional neural network model, the higher the classification accuracy of the trained text classification model.
In the text classification model construction method provided by this application, a convolutional neural network model is built using the pytorch framework; text classification training data is then obtained and word vector training is performed on the text training data using the Word2Vec algorithm to obtain word vectors; finally the word vectors are input into the convolutional neural network model for classification training until convergence, and a qualified text classification model is obtained, improving the classification accuracy of the text classification model. At the same time, since the text classification model of this application uses the Pytorch framework, whose object-oriented interface design comes from torch, the torch interface design is flexible and easy to use, and the PyTorch framework can print out computation results layer by layer for debugging, the constructed text classification model is easier to maintain and debug.
In an embodiment, before the word vector training is performed on the text training data using the Word2Vec algorithm in step S12, the method may further include:
removing stop words and symbols from the text training data according to regular expression matching rules;
In this embodiment, a regular expression is a string composed of characters with special meanings, mostly used to find and replace strings that match a rule. Regular expression matching rules can operate on strings to simplify complex string operations; their main functions are matching, splitting, replacing and extracting. In this embodiment, regular expression matching rules can be used to remove stop words and symbols from the text training data, for example deleting the punctuation marks in the text, to obtain effective text training data.
performing Chinese word segmentation, using the jieba word segmentation library, on the text training data from which the stop words and symbols have been removed.
In this embodiment, Chinese word segmentation can be performed on the text training data according to the collocation frequencies of the words in the jieba word segmentation library. Text training data that has undergone Chinese word segmentation is trained more efficiently and gives better results when word vector training is performed on it with the Word2Vec algorithm. For example, when segmenting the passage "燕子去了,有再来的时候;杨柳枯了,有再青的时候;桃花谢了,有再开的时候", it can be divided, according to the collocation frequencies in the jieba dictionary, into "燕子/去了,有/再来/的/时候;杨柳/枯了,有/再/青/的/时候;桃花/谢了,有/再开/的/时候". Of course, the text training data can also be segmented in other ways, which is not specifically limited here.
In an embodiment, the step of performing Chinese word segmentation on the text training data from which the stop words and symbols have been removed, using the jieba word segmentation library, includes:
determining the correlation between Chinese characters in the text training data using the jieba word segmentation library;
grouping Chinese characters whose correlation is greater than a preset value into words to obtain a segmentation result.
In this embodiment, the correlation between adjacent Chinese characters in the text training data can be calculated, and the highly correlated Chinese characters can be grouped into words to obtain the segmentation result, improving the accuracy of segmentation. For example, when segmenting the text "枯藤老树昏鸦,小桥流水人家", the jieba word segmentation library determines that the correlation between "枯" and "藤" is higher than that between "藤" and "老", so the word "枯藤" is formed; by analogy, the segmentation result of this text is "枯藤", "老树", "昏鸦", "小桥", "流水", "人家". The preset value of the correlation can be adjusted flexibly as required.
In an embodiment, the step in S13 of inputting the word vectors into the convolutional neural network model for classification training includes:
classifying and training the convolutional neural network model, based on the word vectors, through the cross-entropy loss function and the ADAM optimization algorithm.
In this embodiment, the cross-entropy loss function can be used to evaluate the difference between the probability distribution obtained by the current training and the true distribution, so as to track the classification accuracy of the text classification model and adjust the relevant parameters of the text classification model in real time until the training passes. The ADAM optimization algorithm is formed by fusing an accelerated gradient descent algorithm on the basis of momentum-based gradient descent; compared with momentum-based gradient descent, the ADAM optimization algorithm can correct the bias of the classification training results, improving classification accuracy.
In an embodiment, the step in S12 of performing word vector training on the text training data using the Word2Vec algorithm may specifically include:
performing word vector training on large-scale corpus data using the Word2Vec algorithm to obtain a word vector dictionary;
In this embodiment, word vector training can be performed on large-scale corpus data through the Word2Vec algorithm to obtain a word vector dictionary. This step can be implemented with the gensim library in Python, a Python-based natural language processing library that can convert text into vector form using models such as TF-IDF, LDA and LSI for further processing.
converting the text training data into word vectors according to the word vector dictionary.
In this embodiment, the text training data can be converted into word vectors through the word vector dictionary obtained by training; each word in the text training data has a corresponding word vector in the word vector dictionary, so the word vectors of all the words in the text training data are obtained.
In an embodiment, after the convolutional neural network model is built using the pytorch framework in step S11, the method may further include:
establishing a positional attention mechanism and a channel attention mechanism on the convolutional neural network model, wherein inputs of the positional attention mechanism and the channel attention mechanism are connected to an output of an activation layer of the convolutional neural network model, and outputs of the positional attention mechanism and the channel attention mechanism are connected to an input of a fully connected layer of the convolutional neural network model.
In this embodiment, the inputs of the positional attention mechanism and the channel attention mechanism come from the output of the activation layer of the convolutional neural network. The output of the convolutional neural network model may be a 384*100*1 three-dimensional matrix. For the positional attention mechanism, the three-dimensional output matrix of the convolutional neural network model can first be converted into a 384*100 matrix; two parallel fully connected layers output 100*384 and 384*100 matrices, and matrix multiplication followed by softmax mapping yields a 100*100 positional attention matrix. On this basis, another parallel fully connected layer outputs a 384*100 matrix, which is multiplied by the positional attention matrix to obtain a 384*100 matrix; this is converted into a 384*100*1 three-dimensional matrix and summed with the output of the convolutional neural network model to obtain the output of the positional attention mechanism.
For channel attention, the three-dimensional output matrix of the convolutional neural network model is first converted into a 384*100 matrix; two parallel fully connected layers output 384*100 and 100*384 matrices, and matrix multiplication followed by softmax mapping yields a 384*384 channel attention matrix. On this basis, another parallel fully connected layer outputs a 100*384 matrix, which is multiplied by the channel attention matrix to obtain a 100*384 matrix; this is converted into a 384*100*1 three-dimensional matrix and summed with the output of the convolutional neural network model to obtain the output of the channel attention mechanism. Finally, the outputs of the positional attention mechanism and the channel attention mechanism are fed into the fully connected layer to complete the output of the entire convolutional neural network model.
The fully connected layer acts as a "classifier" in the entire convolutional neural network model. If operations such as the convolutional layer, pooling layer and activation layer of the convolutional neural network model map the original data into the hidden-layer feature space, the fully connected layer maps the learned "distributed feature representation" into the sample label space.
In an embodiment, the convolutional layer of the convolutional neural network model contains one-dimensional convolutions with heights of 1, 3 and 5 and 128 channels (padding keeps the input and output dimensions of the convolutional layer consistent); the activation-layer function may be ReLU, which converges faster while maintaining the same effect.
In an embodiment, the step in S13 of inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model, may specifically include:
calculating the classification accuracy of the convolutional neural network model according to the classification training results;
when the classification accuracy is lower than a preset value, adjusting the parameters of the convolutional neural network model and retraining the convolutional neural network model with the word vectors until convergence, to obtain a text classification model.
In this embodiment, the classification accuracy of the convolutional neural network model is calculated, and it is judged whether the classification accuracy of the convolutional neural network model is lower than the preset value. If so, the parameters of the convolutional neural network model are adjusted and the convolutional neural network model is retrained with the word vectors until its classification accuracy is higher than the preset value, yielding a qualified text classification model and thereby ensuring that the trained text classification model has a good classification effect.
Referring to FIG. 2, an embodiment of this application also provides a text classification model construction device. In one embodiment, it includes a building module 21, an acquisition module 22 and a training module 23, wherein:
the building module 21 is configured to build a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer;
The convolutional neural network model of this embodiment is built on the pytorch framework. The pytorch framework is a python-first deep learning framework; compared with the Tensorflow deep learning framework it is more flexible and can construct or adjust the computation graph dynamically during execution, so that hidden variable values can be printed directly during training for debugging. The Tensorflow deep learning framework, by contrast, must build a static computation graph in advance and then execute the built graph repeatedly through feed and run; because the graph is static, the network structure must be built and compiled before training, and during training hidden variables cannot be printed directly but require the data to be reloaded for the corresponding output, which is inconvenient. The embedding layer performs dimensionality reduction: the vectors input into the convolutional neural network model are often high-dimensional data, for example 8000-dimensional, and the embedding layer can reduce them to a space of, for example, 100 dimensions for computation, compressing the data while minimizing the loss of information and thereby improving computational efficiency.
the acquisition module 22 is configured to obtain text classification training data and perform word vector training on the text training data using the Word2Vec algorithm to obtain word vectors;
In this embodiment, Internet information can be captured automatically by web-crawling technology, text classification training data can be extracted from the Internet information, and word vector training is then performed on the text training data using the Word2Vec algorithm to obtain word vectors. The word2vec algorithm is based on a shallow neural network; it can be trained efficiently on dictionaries of millions of words and data sets of hundreds of millions of samples to obtain the training result, namely word vectors, and it can measure the similarity between words well.
the training module 23 is configured to input the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model.
In this embodiment, the word vectors obtained by training are input into the pre-built convolutional neural network model to classify and train the convolutional neural network model until it converges, that is, until the training results meet the requirements, and a qualified text classification model is obtained, which is subsequently used to classify text data, for example news headline classification and comment sentiment classification. It should be noted that the more word vectors are input into the convolutional neural network model, the higher the classification accuracy of the trained text classification model.
In the text classification model construction device provided by this application, a convolutional neural network model is built using the pytorch framework; text classification training data is then obtained and word vector training is performed on the text training data using the Word2Vec algorithm to obtain word vectors; finally the word vectors are input into the convolutional neural network model for classification training until convergence, and a qualified text classification model is obtained, improving the classification accuracy of the text classification model. At the same time, since the text classification model of this application uses the Pytorch framework, whose object-oriented interface design comes from torch, the torch interface design is flexible and easy to use, and the PyTorch framework can print out computation results layer by layer for debugging, the constructed text classification model is easier to maintain and debug.
With regard to the device in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method, and will not be elaborated here.
This application provides a terminal, including a memory and a processor; the memory stores computer-readable instructions which, when executed by the processor, cause the processor to execute the steps of the text classification model construction method described in any one of the above.
In an embodiment, the terminal is a computer device, as shown in FIG. 3. The computer device described in this embodiment may be a server, a personal computer, a network device, or the like. The computer device includes a processor 302, a memory 303, an input unit 304, a display unit 305 and other components. Those skilled in the art will understand that the device structure shown in FIG. 3 does not limit all devices; more or fewer components than shown may be included, or certain components may be combined. The memory 303 may be used to store computer-readable instructions 301 and the functional modules, and the processor 302 runs the computer-readable instructions 301 stored in the memory 303 to perform the various functional applications and data processing of the device. The memory may be internal memory or external memory, or include both. The internal memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory. The external memory may include a hard disk, a floppy disk, a ZIP disk, a USB flash drive, a magnetic tape, etc. The memory disclosed in this application includes but is not limited to these types of memory. The memory disclosed in this application is only an example and not a limitation.
The input unit 304 is used to receive signal input and receive keywords entered by the user. The input unit 304 may include a touch panel and other input devices. The touch panel can collect the user's touch operations on or near it (for example operations performed by the user on or near the touch panel with a finger, a stylus or any other suitable object or accessory) and drive the corresponding connection device according to a preset program; other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as playback control keys and switch keys), a trackball, a mouse and a joystick. The display unit 305 may be used to display information entered by the user or information provided to the user, as well as the various menus of the computer device. The display unit 305 may take the form of a liquid crystal display, an organic light-emitting diode, or the like. The processor 302 is the control center of the computer device; it connects the various parts of the whole computer through various interfaces and lines, and performs various functions and processes data by running or executing the software programs and/or modules stored in the memory 303 and calling the data stored in the memory.
As an embodiment, the computer device includes one or more processors 302, a memory 303 and one or more computer-readable instructions 301, wherein the one or more computer-readable instructions 301 are stored in the memory 303 and configured to be executed by the one or more processors 302, and the one or more computer-readable instructions 301 are configured to execute the text classification model construction method described in the above embodiments.
In an embodiment, this application also proposes a storage medium storing computer-readable instructions; the storage medium of the computer-readable instructions may be a non-volatile readable storage medium. When the computer-readable instructions are executed by one or more processors, the one or more processors execute the above text classification model construction method. For example, the storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by instructing the relevant hardware through computer-readable instructions, which can be stored in a storage medium; when the program is executed, it may include the processes of the embodiments of the above methods. The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk or a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
From the above embodiments, the essence of this application is as follows:
In the text classification model construction method, device, terminal and storage medium provided by this application, a convolutional neural network model is built using the pytorch framework; text classification training data is then obtained and word vector training is performed on the text training data using the Word2Vec algorithm to obtain word vectors; finally the word vectors are input into the convolutional neural network model for classification training until convergence, and a qualified text classification model is obtained, improving the classification accuracy of the text classification model. At the same time, since the text classification model of this application uses the Pytorch framework, whose object-oriented interface design comes from torch, the torch interface design is flexible and easy to use, and the PyTorch framework can print out computation results layer by layer for debugging, the constructed text classification model is easier to maintain and debug.
The technical features of the above embodiments can be combined arbitrarily. To keep the description concise, not all possible combinations of the technical features in the above embodiments have been described; however, as long as the combinations of these technical features are not contradictory, they should all be considered within the scope of this specification.
The above embodiments only express several implementations of this application, and their descriptions are relatively specific and detailed, but they should not therefore be understood as limiting the patent scope of this application. It should be pointed out that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application shall be subject to the appended claims.

Claims (20)

  1. A text classification model construction method, comprising:
    building a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer, and a positional attention mechanism and a channel attention mechanism are established on the convolutional neural network model, wherein inputs of the positional attention mechanism and the channel attention mechanism are connected to an output of an activation layer of the convolutional neural network model, and outputs of the positional attention mechanism and the channel attention mechanism are connected to an input of a fully connected layer of the convolutional neural network model;
    obtaining text classification training data, and performing word vector training on the text training data using the Word2Vec algorithm to obtain word vectors;
    inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model.
  2. The text classification model construction method according to claim 1, wherein before the word vector training is performed on the text training data using the Word2Vec algorithm, the method further comprises:
    removing stop words and symbols from the text training data according to regular expression matching rules;
    performing Chinese word segmentation, using the jieba word segmentation library, on the text training data from which the stop words and symbols have been removed.
  3. The text classification model construction method according to claim 2, wherein the step of performing Chinese word segmentation on the text training data from which the stop words and symbols have been removed, using the jieba word segmentation library, comprises:
    determining the correlation between Chinese characters in the text training data using the jieba word segmentation library;
    grouping Chinese characters whose correlation is greater than a preset value into words to obtain a segmentation result.
  4. The text classification model construction method according to claim 1, wherein the step of inputting the word vectors into the convolutional neural network model for classification training comprises:
    classifying and training the convolutional neural network model, based on the word vectors, through the cross-entropy loss function and the ADAM optimization algorithm.
  5. The text classification model construction method according to claim 1, wherein the step of performing word vector training on the text training data using the Word2Vec algorithm comprises:
    performing word vector training on large-scale corpus data using the Word2Vec algorithm to obtain a word vector dictionary;
    converting the text training data into word vectors according to the word vector dictionary.
  6. The text classification model construction method according to claim 1, wherein the step of inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model, comprises:
    calculating the classification accuracy of the convolutional neural network model according to the classification training results;
    when the classification accuracy is lower than a preset value, adjusting the parameters of the convolutional neural network model and retraining the convolutional neural network model with the word vectors until convergence, to obtain a text classification model.
  7. A text classification model construction device, comprising:
    a building module, configured to build a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer, and a positional attention mechanism and a channel attention mechanism are established on the convolutional neural network model, wherein inputs of the positional attention mechanism and the channel attention mechanism are connected to an output of an activation layer of the convolutional neural network model, and outputs of the positional attention mechanism and the channel attention mechanism are connected to an input of a fully connected layer of the convolutional neural network model;
    an acquisition module, configured to obtain text classification training data and perform word vector training on the text training data using the Word2Vec algorithm to obtain word vectors;
    a training module, configured to input the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model.
  8. The text classification model construction device according to claim 7, further comprising:
    a removal module, configured to remove stop words and symbols from the text training data according to regular expression matching rules;
    a segmentation module, configured to perform Chinese word segmentation, using the jieba word segmentation library, on the text training data from which the stop words and symbols have been removed.
  9. The text classification model construction device according to claim 8, wherein the segmentation module is further configured to:
    determine the correlation between Chinese characters in the text training data using the jieba word segmentation library;
    group Chinese characters whose correlation is greater than a preset value into words to obtain a segmentation result.
  10. The text classification model construction device according to claim 7, wherein the training module is further configured to:
    classify and train the convolutional neural network model, based on the word vectors, through the cross-entropy loss function and the ADAM optimization algorithm.
  11. The text classification model construction device according to claim 7, wherein the acquisition module is further configured to:
    perform word vector training on large-scale corpus data using the Word2Vec algorithm to obtain a word vector dictionary;
    convert the text training data into word vectors according to the word vector dictionary.
  12. The text classification model construction device according to claim 7, wherein the training module is further configured to:
    calculate the classification accuracy of the convolutional neural network model according to the classification training results;
    when the classification accuracy is lower than a preset value, adjust the parameters of the convolutional neural network model and retrain the convolutional neural network model with the word vectors until convergence, to obtain a text classification model.
  13. A terminal, comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, cause the processor to execute the following steps:
    building a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer, and a positional attention mechanism and a channel attention mechanism are established on the convolutional neural network model, wherein inputs of the positional attention mechanism and the channel attention mechanism are connected to an output of an activation layer of the convolutional neural network model, and outputs of the positional attention mechanism and the channel attention mechanism are connected to an input of a fully connected layer of the convolutional neural network model;
    obtaining text classification training data, and performing word vector training on the text training data using the Word2Vec algorithm to obtain word vectors;
    inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model.
  14. The terminal according to claim 13, wherein before the word vector training is performed on the text training data using the Word2Vec algorithm, the steps further comprise:
    removing stop words and symbols from the text training data according to regular expression matching rules;
    performing Chinese word segmentation, using the jieba word segmentation library, on the text training data from which the stop words and symbols have been removed.
  15. The terminal according to claim 14, wherein the step of performing Chinese word segmentation on the text training data from which the stop words and symbols have been removed, using the jieba word segmentation library, comprises:
    determining the correlation between Chinese characters in the text training data using the jieba word segmentation library;
    grouping Chinese characters whose correlation is greater than a preset value into words to obtain a segmentation result.
  16. The terminal according to claim 13, wherein the step of inputting the word vectors into the convolutional neural network model for classification training comprises:
    classifying and training the convolutional neural network model, based on the word vectors, through the cross-entropy loss function and the ADAM optimization algorithm.
  17. The terminal according to claim 13, wherein the step of performing word vector training on the text training data using the Word2Vec algorithm comprises:
    performing word vector training on large-scale corpus data using the Word2Vec algorithm to obtain a word vector dictionary;
    converting the text training data into word vectors according to the word vector dictionary.
  18. The terminal according to claim 13, wherein the step of inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model, comprises:
    calculating the classification accuracy of the convolutional neural network model according to the classification training results;
    when the classification accuracy is lower than a preset value, adjusting the parameters of the convolutional neural network model and retraining the convolutional neural network model with the word vectors until convergence, to obtain a text classification model.
  19. A storage medium on which computer-readable instructions are stored, wherein, when the computer-readable instructions are executed by a processor, the following steps are implemented:
    building a convolutional neural network model using the pytorch framework, wherein the convolutional neural network model is arranged on an embedding layer, and a positional attention mechanism and a channel attention mechanism are established on the convolutional neural network model, wherein inputs of the positional attention mechanism and the channel attention mechanism are connected to an output of an activation layer of the convolutional neural network model, and outputs of the positional attention mechanism and the channel attention mechanism are connected to an input of a fully connected layer of the convolutional neural network model;
    obtaining text classification training data, and performing word vector training on the text training data using the Word2Vec algorithm to obtain word vectors;
    inputting the word vectors into the convolutional neural network model for classification training until convergence, to obtain a text classification model.
  20. The storage medium according to claim 19, wherein before the word vector training is performed on the text training data using the Word2Vec algorithm, the steps further comprise:
    removing stop words and symbols from the text training data according to regular expression matching rules;
    performing Chinese word segmentation, using the jieba word segmentation library, on the text training data from which the stop words and symbols have been removed.
PCT/CN2019/117225 2019-02-13 2019-11-11 文本分类模型构建方法、装置、终端及存储介质 WO2020164267A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910113183.0A CN109960726B (zh) 2019-02-13 2019-02-13 文本分类模型构建方法、装置、终端及存储介质
CN201910113183.0 2019-02-13

Publications (1)

Publication Number Publication Date
WO2020164267A1 true WO2020164267A1 (zh) 2020-08-20

Family

ID=67023660

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117225 WO2020164267A1 (zh) 2019-02-13 2019-11-11 文本分类模型构建方法、装置、终端及存储介质

Country Status (2)

Country Link
CN (1) CN109960726B (zh)
WO (1) WO2020164267A1 (zh)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112069317A (zh) * 2020-09-07 2020-12-11 北京理工大学 一种装配工时的获取方法及处理器
CN112597764A (zh) * 2020-12-23 2021-04-02 青岛海尔科技有限公司 文本分类方法及装置、存储介质、电子装置
CN112699243A (zh) * 2021-01-15 2021-04-23 上海交通大学 基于法条图卷积网络文本的案件文书案由分类方法及介质
CN112711423A (zh) * 2021-01-18 2021-04-27 深圳中兴网信科技有限公司 引擎构建方法、入侵检测方法、电子设备和可读存储介质
CN113010674A (zh) * 2021-03-11 2021-06-22 平安科技(深圳)有限公司 文本分类模型封装方法、文本分类方法及相关设备
CN113268599A (zh) * 2021-05-31 2021-08-17 平安国际智慧城市科技股份有限公司 文件分类模型的训练方法、装置、计算机设备及存储介质
CN113282710A (zh) * 2021-06-01 2021-08-20 平安国际智慧城市科技股份有限公司 文本关系抽取模型的训练方法、装置以及计算机设备
CN113688237A (zh) * 2021-08-10 2021-11-23 北京小米移动软件有限公司 文本分类方法、文本分类网络的训练方法及装置
CN113723102A (zh) * 2021-06-30 2021-11-30 平安国际智慧城市科技股份有限公司 命名实体识别方法、装置、电子设备及存储介质
CN115859837A (zh) * 2023-02-23 2023-03-28 山东大学 基于数字孪生建模的风机叶片动态冲击检测方法及系统
CN116975863A (zh) * 2023-07-10 2023-10-31 福州大学 基于卷积神经网络的恶意代码检测方法
CN117370809A (zh) * 2023-11-02 2024-01-09 快朵儿(广州)云科技有限公司 一种基于深度学习的人工智能模型构建方法、系统及存储介质
CN113723102B (zh) * 2021-06-30 2024-04-26 平安国际智慧城市科技股份有限公司 命名实体识别方法、装置、电子设备及存储介质

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109960726B (zh) * 2019-02-13 2024-01-23 平安科技(深圳)有限公司 文本分类模型构建方法、装置、终端及存储介质
CN110472659B (zh) * 2019-07-05 2024-03-08 中国平安人寿保险股份有限公司 数据处理方法、装置、计算机可读存储介质和计算机设备
CN111382269B (zh) * 2020-03-02 2021-07-23 拉扎斯网络科技(上海)有限公司 文本分类模型训练方法、文本分类方法及相关装置
CN111984762B (zh) * 2020-08-05 2022-12-13 中国科学院重庆绿色智能技术研究院 一种对抗攻击敏感的文本分类方法
CN116644157B (zh) * 2023-07-27 2023-10-10 交通运输部公路科学研究所 基于桥梁养护非结构化数据构建Embedding数据的方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180181864A1 (en) * 2016-12-27 2018-06-28 Texas Instruments Incorporated Sparsified Training of Convolutional Neural Networks
CN108520535A (zh) * 2018-03-26 2018-09-11 天津大学 基于深度恢复信息的物体分类方法
CN108717439A (zh) * 2018-05-16 2018-10-30 哈尔滨理工大学 一种基于注意力机制和特征强化融合的中文文本分类方法
CN109960726A (zh) * 2019-02-13 2019-07-02 平安科技(深圳)有限公司 文本分类模型构建方法、装置、终端及存储介质

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10438117B1 (en) * 2015-05-21 2019-10-08 Google Llc Computing convolutions using a neural network processor
US20170308790A1 (en) * 2016-04-21 2017-10-26 International Business Machines Corporation Text classification by ranking with convolutional neural networks
CN107301246A (zh) * 2017-07-14 2017-10-27 河北工业大学 基于超深卷积神经网络结构模型的中文文本分类方法
CN108364023A (zh) * 2018-02-11 2018-08-03 北京达佳互联信息技术有限公司 基于注意力模型的图像识别方法和系统
CN108573047A (zh) * 2018-04-18 2018-09-25 广东工业大学 一种中文文本分类模型的训练方法及装置
CN108509427B (zh) * 2018-04-24 2022-03-11 北京慧闻科技(集团)有限公司 文本数据的数据处理方法及应用

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180181864A1 (en) * 2016-12-27 2018-06-28 Texas Instruments Incorporated Sparsified Training of Convolutional Neural Networks
CN108520535A (zh) * 2018-03-26 2018-09-11 天津大学 基于深度恢复信息的物体分类方法
CN108717439A (zh) * 2018-05-16 2018-10-30 哈尔滨理工大学 一种基于注意力机制和特征强化融合的中文文本分类方法
CN109960726A (zh) * 2019-02-13 2019-07-02 平安科技(深圳)有限公司 文本分类模型构建方法、装置、终端及存储介质

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112069317A (zh) * 2020-09-07 2020-12-11 北京理工大学 一种装配工时的获取方法及处理器
CN112597764A (zh) * 2020-12-23 2021-04-02 青岛海尔科技有限公司 文本分类方法及装置、存储介质、电子装置
CN112699243A (zh) * 2021-01-15 2021-04-23 上海交通大学 基于法条图卷积网络文本的案件文书案由分类方法及介质
CN112711423A (zh) * 2021-01-18 2021-04-27 深圳中兴网信科技有限公司 引擎构建方法、入侵检测方法、电子设备和可读存储介质
CN113010674B (zh) * 2021-03-11 2023-12-22 平安创科科技(北京)有限公司 文本分类模型封装方法、文本分类方法及相关设备
CN113010674A (zh) * 2021-03-11 2021-06-22 平安科技(深圳)有限公司 文本分类模型封装方法、文本分类方法及相关设备
CN113268599A (zh) * 2021-05-31 2021-08-17 平安国际智慧城市科技股份有限公司 文件分类模型的训练方法、装置、计算机设备及存储介质
CN113268599B (zh) * 2021-05-31 2024-03-19 平安国际智慧城市科技股份有限公司 文件分类模型的训练方法、装置、计算机设备及存储介质
CN113282710A (zh) * 2021-06-01 2021-08-20 平安国际智慧城市科技股份有限公司 文本关系抽取模型的训练方法、装置以及计算机设备
CN113723102A (zh) * 2021-06-30 2021-11-30 平安国际智慧城市科技股份有限公司 命名实体识别方法、装置、电子设备及存储介质
CN113723102B (zh) * 2021-06-30 2024-04-26 平安国际智慧城市科技股份有限公司 命名实体识别方法、装置、电子设备及存储介质
CN113688237B (zh) * 2021-08-10 2024-03-05 北京小米移动软件有限公司 文本分类方法、文本分类网络的训练方法及装置
CN113688237A (zh) * 2021-08-10 2021-11-23 北京小米移动软件有限公司 文本分类方法、文本分类网络的训练方法及装置
CN115859837A (zh) * 2023-02-23 2023-03-28 山东大学 基于数字孪生建模的风机叶片动态冲击检测方法及系统
CN116975863A (zh) * 2023-07-10 2023-10-31 福州大学 基于卷积神经网络的恶意代码检测方法
CN117370809A (zh) * 2023-11-02 2024-01-09 快朵儿(广州)云科技有限公司 一种基于深度学习的人工智能模型构建方法、系统及存储介质
CN117370809B (zh) * 2023-11-02 2024-04-12 快朵儿(广州)云科技有限公司 一种基于深度学习的人工智能模型构建方法、系统及存储介质

Also Published As

Publication number Publication date
CN109960726A (zh) 2019-07-02
CN109960726B (zh) 2024-01-23

Similar Documents

Publication Publication Date Title
WO2020164267A1 (zh) 文本分类模型构建方法、装置、终端及存储介质
WO2020215681A1 (zh) 指示信息生成方法、装置、终端及存储介质
WO2021132927A1 (en) Computing device and method of classifying category of data
WO2021051558A1 (zh) 基于知识图谱的问答方法、装置和存储介质
WO2021121198A1 (zh) 基于语义相似度的实体关系抽取方法、装置、设备及介质
WO2020119069A1 (zh) 基于自编码神经网络的文本生成方法、装置、终端及介质
EP3105668A1 (en) Dynamically modifying elements of user interface based on knowledge graph
WO2010120101A2 (ko) 역 벡터 공간 모델을 이용한 키워드 추천방법 및 그 장치
WO2021157897A1 (en) A system and method for efficient multi-relational entity understanding and retrieval
CN105975604A (zh) 一种分布迭代式数据处理程序异常检测与诊断方法
WO2020251233A1 (ko) 영상데이터의 추상적특성 획득 방법, 장치 및 프로그램
WO2015050321A1 (ko) 자율학습 정렬 기반의 정렬 코퍼스 생성 장치 및 그 방법과, 정렬 코퍼스를 사용한 파괴 표현 형태소 분석 장치 및 그 형태소 분석 방법
WO2021175005A1 (zh) 基于向量的文档检索方法、装置、计算机设备及存储介质
WO2019125054A1 (en) Method for content search and electronic device therefor
WO2018135723A1 (ko) 복수 문단 텍스트의 추상적 요약문 생성 장치 및 방법, 그 방법을 수행하기 위한 기록 매체
WO2017115994A1 (ko) 인공 지능 기반 연관도 계산을 이용한 노트 제공 방법 및 장치
WO2020082766A1 (zh) 输入法的联想方法、装置、设备及可读存储介质
WO2021157863A1 (ko) 준 지도 학습을 위한 오토인코더 기반 그래프 설계
CN112084794A (zh) 一种藏汉翻译方法和装置
WO2021068349A1 (zh) 基于区块链的图片标注方法、装置及存储介质、服务器
Yin et al. Sentiment lexical-augmented convolutional neural networks for sentiment analysis
WO2021051557A1 (zh) 基于语义识别的关键词确定方法、装置和存储介质
WO2019039659A1 (ko) 감성 기반의 사용자 관리 방법 및 이를 수행하는 장치들
Li et al. Exploit a multi-head reference graph for semi-supervised relation extraction
WO2023195769A1 (ko) 신경망 모델을 활용한 유사 특허 문헌 추출 방법 및 이를 제공하는 장치

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19915182

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19915182

Country of ref document: EP

Kind code of ref document: A1