CN110222329A - A kind of Chinese word cutting method and device based on deep learning - Google Patents
- Publication number
- CN110222329A (application CN201910322127.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- convolutional neural
- neural networks
- random field
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
An embodiment of the invention provides a Chinese word segmentation method and device based on deep learning, in the field of artificial intelligence. The method comprises: converting training corpus data into character-level data; converting the character-level data into sequence data; cutting the sequence data at predetermined symbols to obtain multiple subsequence data, and grouping the subsequence data by length to obtain K data sets; training K temporal convolutional network-conditional random field (TCN-CRF) models according to the K data sets; and inputting the processed data of the target corpus data into at least one of the K trained TCN-CRF models to obtain the word segmentation result of the target corpus data. The technical solution provided by the embodiment of the invention thereby solves the problem of low Chinese word segmentation accuracy in the prior art.
Description
[technical field]
The present invention relates to the field of artificial intelligence, and in particular to a Chinese word segmentation method and device based on deep learning.
[background technique]
Current deep-learning Chinese word segmentation technology is based mainly on recurrent neural network models, represented by long short-term memory (LSTM) networks, and their derivatives. However, the ability of an LSTM model to process sequence data declines as the sequence length increases, so the accuracy of Chinese word segmentation is low.
[summary of the invention]
In view of this, embodiments of the invention provide a Chinese word segmentation method and device based on deep learning, to solve the problem of low Chinese word segmentation accuracy in the prior art.
In one aspect, an embodiment of the invention provides a Chinese word segmentation method based on deep learning, the method comprising: converting training corpus data into character-level data; converting the character-level data into sequence data; cutting the sequence data at predetermined symbols to obtain multiple subsequence data, and grouping the subsequence data by length to obtain K data sets, where the subsequences in each of the K data sets have equal length and K is a natural number greater than 1; extracting multiple subsequence data from the i-th data set, inputting the extracted subsequence data into the i-th temporal convolutional network-conditional random field (TCN-CRF) model, and training that model to obtain the i-th trained TCN-CRF model, with i taking each natural number from 1 to K in turn, so that K trained TCN-CRF models are obtained in total; and converting target corpus data into character-level data to obtain first data, converting the first data into sequence data to obtain second data, and inputting the second data into at least one of the K trained TCN-CRF models to obtain the word segmentation result of the target corpus data.
Further, converting the character-level data into sequence data comprises: converting the character-level data into the sequence data by a preset encoding mode, the preset encoding mode being any one of the following: one-hot encoding or word-to-vector encoding.
Further, inputting the extracted subsequence data into the i-th TCN-CRF model and training it to obtain the i-th trained TCN-CRF model comprises: S1, inputting the extracted subsequence data into the i-th temporal convolutional network for forward propagation to obtain first output data, the i-th temporal convolutional network being the TCN in the i-th TCN-CRF model; S2, calculating the value of a loss function from the first output data and the input subsequence data; S3, if the value of the loss function is greater than a preset value, inputting the subsequence data into the i-th temporal convolutional network for backpropagation and optimizing the network parameters of the i-th temporal convolutional network; S4, repeating steps S1 to S3 until the value of the loss function is less than or equal to the preset value; S5, if the value of the loss function is less than or equal to the preset value, determining that training is complete and obtaining the i-th trained temporal convolutional network; S6, inputting the data output by the i-th trained temporal convolutional network into the i-th conditional random field and training that conditional random field, obtaining the i-th trained TCN-CRF model, the i-th conditional random field being the CRF in the i-th TCN-CRF model.
Further, training the i-th conditional random field comprises: calculating the conditional probability of the output data of the i-th conditional random field from the data output by the i-th trained temporal convolutional network; and obtaining the maximum of that conditional probability by maximum likelihood estimation.
Further, inputting the second data into at least one of the K trained TCN-CRF models to obtain the word segmentation result of the target corpus data comprises: cutting the second data at predetermined symbols to obtain multiple sequence data; grouping the sequence data by length to obtain L data sets, where the sequences in each of the L data sets have equal length and L is a natural number with 1 ≤ L ≤ K; according to the lengths of the subsequence data used in training, filtering out L trained TCN-CRF models from the K trained TCN-CRF models, namely the L1-th to the LL-th trained TCN-CRF models; inputting all sequence data of the j-th data set into the Lj-th trained TCN-CRF model to obtain multiple word segmentation results, where the length of the subsequence data used in training the Lj-th TCN-CRF model equals the length of the sequence data in the j-th data set, j takes each natural number from 1 to L in turn, and Lj is a natural number from 1 to K; and splicing the multiple word segmentation results to obtain the word segmentation result of the target corpus data.
In one aspect, an embodiment of the invention provides a Chinese word segmentation device based on deep learning, the device comprising: a first converting unit, for converting training corpus data into character-level data; a second converting unit, for converting the character-level data into sequence data; a first cutting unit, for cutting the sequence data at predetermined symbols to obtain multiple subsequence data and grouping the subsequence data by length to obtain K data sets, where the subsequences in each of the K data sets have equal length and K is a natural number greater than 1; a first determination unit, for extracting multiple subsequence data from the i-th data set, inputting the extracted subsequence data into the i-th TCN-CRF model, and training that model to obtain the i-th trained TCN-CRF model, with i taking each natural number from 1 to K in turn, so that K trained TCN-CRF models are obtained; and a second determination unit, for converting target corpus data into character-level data to obtain first data, converting the first data into sequence data to obtain second data, and inputting the second data into at least one of the K trained TCN-CRF models to obtain the word segmentation result of the target corpus data.
Further, the second converting unit comprises: a conversion subunit, for converting the character-level data into the sequence data by a preset encoding mode, the preset encoding mode being any one of the following: one-hot encoding or word-to-vector encoding.
Further, the first determination unit is configured to execute: S1, inputting the extracted subsequence data into the i-th temporal convolutional network for forward propagation to obtain first output data, the i-th temporal convolutional network being the TCN in the i-th TCN-CRF model; S2, calculating the value of a loss function from the first output data and the input subsequence data; S3, if the value of the loss function is greater than a preset value, inputting the subsequence data into the i-th temporal convolutional network for backpropagation and optimizing its network parameters; S4, repeating steps S1 to S3 until the value of the loss function is less than or equal to the preset value; S5, if the value of the loss function is less than or equal to the preset value, determining that training is complete and obtaining the i-th trained temporal convolutional network; S6, inputting the data output by the i-th trained temporal convolutional network into the i-th conditional random field and training it, obtaining the i-th trained TCN-CRF model, the i-th conditional random field being the CRF in the i-th TCN-CRF model.
Further, the first determination unit comprises: a first computation subunit, for calculating the conditional probability of the output data of the i-th conditional random field from the data output by the i-th trained temporal convolutional network; and a first determination subunit, for obtaining the maximum of that conditional probability by maximum likelihood estimation.
Further, the second determination unit comprises: a cutting subunit, for cutting the second data at predetermined symbols to obtain multiple sequence data; a grouping subunit, for grouping the sequence data by length to obtain L data sets, where the sequences in each of the L data sets have equal length and L is a natural number with 1 ≤ L ≤ K; a second determination subunit, for filtering out, according to the lengths of the subsequence data used in training, L trained TCN-CRF models from the K trained models, namely the L1-th to the LL-th trained TCN-CRF models, and inputting all sequence data of the j-th data set into the Lj-th trained TCN-CRF model to obtain multiple word segmentation results, where the length of the subsequence data used in training the Lj-th model equals the length of the sequence data in the j-th data set, j takes each natural number from 1 to L in turn, and Lj is a natural number from 1 to K; and a splicing subunit, for splicing the multiple word segmentation results to obtain the word segmentation result of the target corpus data.
In one aspect, an embodiment of the invention provides a storage medium comprising a stored program, wherein, when the program runs, the device on which the storage medium is located is controlled to execute the above Chinese word segmentation method based on deep learning.
In one aspect, an embodiment of the invention provides a computer device comprising a memory and a processor, the memory being used to store information including program instructions, and the processor being used to control the execution of the program instructions, which, when loaded and executed by the processor, implement the steps of the above Chinese word segmentation method based on deep learning.
In the embodiment of the invention, target corpus data are converted into character-level data; the character-level data are converted into sequence data; and the sequence data are input into trained TCN-CRF models to obtain the word segmentation result of the target corpus data. A temporal convolutional network can expand its receptive field at an exponential rate by adding network layers, so it can process sequence data of greater length, or data of otherwise complex characteristics, and improve the accuracy of its encoding, thereby improving the accuracy of Chinese word segmentation.
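The exponential growth of the receptive field mentioned above can be made concrete for a stack of dilated causal convolutions of the kind typically used in temporal convolutional networks: with kernel size k and the dilation doubling at each layer (1, 2, 4, ...), the receptive field after n layers is 1 + (k - 1)(2^n - 1). The sketch below illustrates this; the kernel size and layer counts are illustrative assumptions, not values taken from the patent.

```python
def tcn_receptive_field(kernel_size: int, num_layers: int) -> int:
    """Receptive field of a stack of dilated causal convolutions whose
    dilation doubles at every layer (1, 2, 4, ...): each layer with
    dilation d widens the field by (kernel_size - 1) * d positions."""
    return 1 + (kernel_size - 1) * sum(2 ** layer for layer in range(num_layers))

# The receptive field doubles (roughly) with every added layer:
for layers in (1, 2, 4, 8):
    print(layers, tcn_receptive_field(3, layers))
# 1 3 / 2 7 / 4 31 / 8 511
```

With kernel size 3, eight layers already cover 511 characters, which is why depth, rather than recurrence, lets the TCN handle long sequences.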
[Detailed description of the invention]
In order to illustrate the technical solutions of the embodiments of the invention more clearly, the drawings needed for the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from these drawings without any creative labor.
Fig. 1 is a flowchart of an optional Chinese word segmentation method based on deep learning according to an embodiment of the invention;
Fig. 2 is a schematic diagram of an optional Chinese word segmentation device based on deep learning according to an embodiment of the invention;
Fig. 3 is a schematic diagram of an optional computer device provided by an embodiment of the invention.
[specific embodiment]
For a better understanding of the technical solution of the invention, the embodiments of the invention are described in detail below with reference to the drawings.
It should be clear that the described embodiments are only a part of the embodiments of the invention, not all of them. Based on the embodiments of the invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the invention.
The terms used in the embodiments of the invention are for the purpose of describing particular embodiments only and are not intended to limit the invention. The singular forms "a", "said" and "the" used in the embodiments of the invention and in the appended claims are also intended to include the plural forms, unless the context clearly indicates another meaning.
It should be understood that the term "and/or" used herein only describes an association relationship between associated objects and indicates that three kinds of relationship may exist; for example, "A and/or B" can indicate three cases: A alone, both A and B, and B alone. In addition, the character "/" herein generally indicates an "or" relationship between the preceding and following objects.
Fig. 1 is a flowchart of an optional Chinese word segmentation method based on deep learning according to an embodiment of the invention. As shown in Fig. 1, the method comprises:
Step S102, converting training corpus data into character-level data.
Step S104, converting the character-level data into sequence data.
Step S106, cutting the sequence data at predetermined symbols to obtain multiple subsequence data, and grouping the subsequence data by length to obtain K data sets, where the subsequences in each of the K data sets have equal length and K is a natural number greater than 1. A predetermined symbol is a punctuation mark used for sentence breaks, such as a full stop, question mark, exclamation mark, comma, enumeration comma, semicolon or colon.
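Step S106 can be sketched as follows, shown on raw character strings rather than encoded sequences for readability; the input sentence and the function name are illustrative assumptions.

```python
import re
from collections import defaultdict

# Punctuation used for sentence breaks (full stop, question mark, exclamation
# mark, comma, enumeration comma, semicolon, colon), per step S106.
PREDETERMINED_SYMBOLS = "。？！，、；："

def cut_and_group(sequence_data):
    """Cut each sequence at the predetermined symbols, then group the
    resulting subsequences by length, yielding the K data sets of S106."""
    groups = defaultdict(list)  # subsequence length -> list of subsequences
    for seq in sequence_data:
        for sub in re.split("[" + PREDETERMINED_SYMBOLS + "]", seq):
            if sub:  # drop empty pieces produced by trailing punctuation
                groups[len(sub)].append(sub)
    return dict(groups)

groups = cut_and_group(["我爱北京天安门，你好。"])
# Two subsequences of lengths 7 and 2, hence K = 2 data sets here.
```

Grouping by length means every model in step S108 is trained only on inputs of one fixed length, which is what later allows a model to be selected by sequence length at inference time.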
Step S108, extracting multiple subsequence data from the i-th data set, inputting the extracted subsequence data into the i-th temporal convolutional network-conditional random field model, and training that model to obtain the i-th trained TCN-CRF model, with i taking each natural number from 1 to K in turn, so that K trained TCN-CRF models are obtained in total.
Step S110, converting the target corpus data into character-level data to obtain first data, converting the first data into sequence data to obtain second data, and inputting the second data into at least one of the K trained TCN-CRF models to obtain the word segmentation result of the target corpus data.
Corpus data are linguistic data that have actually occurred in the real use of language; they are a basic resource that carries linguistic knowledge, with the electronic computer as the carrier.
A temporal convolutional network-conditional random field model (TCN-CRF) is a combined model of a temporal convolutional network (TCN) and a conditional random field (CRF). The temporal convolutional network is a deep-learning convolutional network for time series; the conditional random field is a typical discriminative model. The conditional random field treats word segmentation as a position-of-character classification problem, and the position of a character within a word is usually defined as: word beginning, usually denoted B; word middle, usually denoted M; word end, usually denoted E; single-character word, usually denoted S. CRF segmentation then labels each character with its position and forms a word from the characters between each B and the following E, with each S character forming a single-character word. For example, for the sentence to be segmented "我爱北京天安门" ("I love Beijing Tiananmen"), the labelling is: 我/S 爱/S 北/B 京/E 天/B 安/M 门/E, and the word segmentation result is "我/爱/北京/天安门".
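The recovery of words from the B/M/E/S labels described above can be sketched as follows; the sentence and tags are taken from the example, while the function name is an assumption of ours.

```python
def labels_to_words(chars, tags):
    """Form words from position-of-character labels: the characters from a B
    through the following E make one word, each S is a single-character word,
    and M marks the middle characters of a word."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "S":
            words.append(ch)
        elif tag == "B":
            current = ch
        elif tag == "M":
            current += ch
        else:  # "E": word end, close the current word
            words.append(current + ch)
            current = ""
    return words

print("/".join(labels_to_words("我爱北京天安门", "SSBEBME")))
# 我/爱/北京/天安门
```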
In the embodiment of the invention, target corpus data are converted into character-level data; the character-level data are converted into sequence data; and the sequence data are input into the trained TCN-CRF models to obtain the word segmentation result of the target corpus data. A temporal convolutional network can expand its receptive field at an exponential rate by adding network layers, so it can process sequence data of greater length, or data of otherwise complex characteristics, and improve the accuracy of its encoding, thereby improving the accuracy of Chinese word segmentation.
Moreover, the neuron weights on the same feature map of a temporal convolutional network are identical, so it can learn in parallel and process data quickly; for this reason a TCN-CRF model can also be realized in a distributed system.
Optionally, converting the character-level data into sequence data comprises: converting the character-level data into sequence data by a preset encoding mode, the preset encoding mode being any one of the following: one-hot encoding or word-to-vector encoding.
One-hot encoding, also known as one-of-N encoding, uses an N-bit status register to encode N states; each state has its own independent register bit, and at any time only one of them is active. For example, suppose the feature of a group of data is colour, with the values yellow, red and green: after one-hot encoding, yellow becomes [100], red becomes [010] and green becomes [001]. Sequence data one-hot encoded in this way correspond to vectors and can be used in a neural network model.
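The colour example above can be reproduced with a small one-hot encoder; this is a minimal sketch, and a real pipeline would normally use a library implementation.

```python
def one_hot(values):
    """Encode each distinct value as an N-bit vector with exactly one bit
    set, where N is the number of distinct states."""
    states = list(dict.fromkeys(values))  # distinct states, first-seen order
    n = len(states)
    return {s: [1 if i == states.index(s) else 0 for i in range(n)]
            for s in states}

codes = one_hot(["yellow", "red", "green"])
print(codes["yellow"], codes["red"], codes["green"])
# [1, 0, 0] [0, 1, 0] [0, 0, 1]
```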
Word-to-vector encoding can be word2vec, an efficient algorithmic model that represents words as real-valued vectors; through training, the processing of text content can be reduced to vector operations in a K-dimensional vector space. The word vectors output by word2vec can be used for much natural language processing (NLP) work, for example clustering, finding synonyms and part-of-speech analysis. For instance, word2vec maps the character-level data, as features, into a K-dimensional vector space, obtaining sequence data represented by feature vectors.
Optionally, inputting the extracted subsequence data into the i-th TCN-CRF model and training it to obtain the i-th trained TCN-CRF model comprises: S1, inputting the extracted subsequence data into the i-th temporal convolutional network for forward propagation to obtain first output data, the i-th temporal convolutional network being the TCN in the i-th TCN-CRF model; S2, calculating the value of a loss function from the first output data and the input subsequence data; S3, if the value of the loss function is greater than a preset value, inputting the subsequence data into the i-th temporal convolutional network for backpropagation and optimizing the network parameters of the i-th temporal convolutional network; S4, repeating steps S1 to S3 until the value of the loss function is less than or equal to the preset value; S5, if the value of the loss function is less than or equal to the preset value, determining that training is complete and obtaining the i-th trained temporal convolutional network; S6, inputting the data output by the i-th trained temporal convolutional network into the i-th conditional random field and training it, obtaining the i-th trained TCN-CRF model, the i-th conditional random field being the CRF in the i-th TCN-CRF model.
Here, training the i-th temporal convolutional network is based on the value of the loss function and specifically comprises: initializing the network parameters of the i-th temporal convolutional network and training it iteratively by the stochastic gradient descent method, computing the value of the loss function once per iteration, until after repeated iterations the value of the loss function reaches its minimum, at which point the i-th trained temporal convolutional network and the corresponding converged network parameters are obtained.
The specific formula for calculating the loss function can be as follows, where Loss denotes the value of the loss function, N denotes the number of subsequence data input into the i-th temporal convolutional network, y^(i) denotes the i-th subsequence data input into the i-th temporal convolutional network, and ŷ^(i) denotes the data output after the i-th subsequence data are input into the i-th temporal convolutional network.
Optionally, training the i-th conditional random field comprises: calculating the conditional probability of the output data of the i-th conditional random field from the data output by the i-th trained temporal convolutional network; and obtaining the maximum of that conditional probability by maximum likelihood estimation.
A conditional random field is a Markov random field of a random variable Y under the condition of a given random variable X. In a Markov random field, any random variable is related only to the random variables adjacent to it and is unrelated to the non-adjacent ones.
In the conditional probability model P(Y|X), Y is the output variable, representing the label sequence (also called the state sequence), and X is the input variable, representing the observation sequence to be labelled. During training, the training data are used to obtain the conditional probability model by maximum likelihood estimation; the model is then used for prediction: for a given input sequence X, the output sequence Y is the one for which the conditional probability is maximal. Commonly a linear-chain conditional random field is used: for an input sequence X = (X1, X2, ..., Xn) and an output sequence of random variables Y = (Y1, Y2, ..., Yn) represented by a linear chain, if, under the condition of the given random variable sequence X, the conditional probability distribution P(Y|X) of the random variable sequence Y constitutes a Markov random field, this constitutes a conditional random field.
Maximum likelihood estimation refers to observing the results of several trials and using those results to find the parameter value that maximizes the probability of the observed sample. It provides a method of evaluating model parameters from given observation data, i.e., "the model is determined, the parameters are unknown". Given sample data X = (X1, X2, ..., Xn), where n is the number of sample data, a parameter t is to be estimated; the likelihood function of t relative to X is the product of p(Xi; t) over i = 1 to n, that is, f(t) = p(X1; t) × p(X2; t) × ... × p(Xn; t). If t' is the value of t in the parameter space that maximizes the likelihood function f(t), then t' is the "most probable" parameter value, and t' is the maximum likelihood estimate of t.
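As an illustrative sketch of this idea (not part of the claimed method), the maximum likelihood estimate of a Bernoulli parameter t from 0/1 observations has a closed form — the sample mean — and it indeed attains a higher (log-)likelihood than other candidate values:

```python
import math

def log_likelihood(t, xs):
    """Log of f(t): the sum of log p(Xi; t) for Bernoulli observations xs."""
    return sum(x * math.log(t) + (1 - x) * math.log(1 - t) for x in xs)

def mle_bernoulli(xs):
    """The maximizing t' has a closed form here: the sample mean."""
    return sum(xs) / len(xs)

xs = [1, 1, 0, 1]
t_prime = mle_bernoulli(xs)  # the "most probable" parameter for this sample
```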
Optionally, inputting the second data into at least one of the K trained timing convolutional neural network-conditional random field models to obtain the word segmentation result of the target corpus data comprises: cutting the second data according to the predetermined symbols to obtain multiple sequence data; grouping the multiple sequence data according to sequence-data length to obtain L data sets, where all sequence data in each of the L data sets are of equal length and L is a natural number with 1 ≤ L ≤ K; according to the subsequence-data lengths used during training, selecting L trained models from the K trained timing convolutional neural network-conditional random field models, namely the L1-th to LL-th trained models; inputting all sequence data contained in the j-th data set into the Lj-th trained timing convolutional neural network-conditional random field model to obtain multiple word segmentation results, where the subsequence-data length used in training the Lj-th model equals the length of the sequence data contained in the j-th data set, j successively takes the natural numbers 1 to L, and Lj is a natural number from 1 to K; and splicing the multiple word segmentation results to obtain the word segmentation result of the target corpus data.
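For illustration only, the select-by-length routing and final splicing described above can be sketched as follows; the per-length models here are stand-in callables, not the trained timing convolutional neural network-conditional random field models.

```python
def route_and_splice(sequences, models_by_length):
    """Send each sequence to the model trained on its length,
    then splice the per-sequence word lists together in order."""
    spliced = []
    for seq in sequences:
        model = models_by_length[len(seq)]  # select by matching length
        spliced.extend(model(seq))
    return spliced

# Stand-in "models": each tags its input to show which model was selected.
models = {2: lambda s: [("len2", s)], 3: lambda s: [("len3", s)]}
result = route_and_splice(["ab", "xyz"], models)
```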
For example, assume the value of K is 5, and the subsequence lengths used in training the 5 trained timing convolutional neural network-conditional random field models are 10, 20, 30, 40 and 50 respectively. After the second data are cut, 2 sequence data are obtained, of lengths 20 and 50. According to the subsequence lengths 20 and 50 used during training, 2 trained models are selected from the 5 trained timing convolutional neural network-conditional random field models: the subsequence length used in training the 1st selected model is 20, and the subsequence length used in training the 2nd selected model is 50. The sequence data of length 20 are then input into the 1st selected trained model to obtain multiple word segmentation results; the sequence data of length 50 are input into the 2nd selected trained model to obtain multiple word segmentation results; and the word segmentation results output by the two trained models are spliced to obtain the word segmentation result of the target corpus data.
Fig. 2 is a schematic diagram of an optional Chinese word segmentation device based on deep learning according to an embodiment of the present invention. The device is configured to execute the above Chinese word segmentation method based on deep learning. As shown in Fig. 2, the device comprises: a first converting unit 10, a second converting unit 20, a first cutting unit 30, a first determination unit 40 and a second determination unit 50.
The first converting unit 10 is configured to convert training corpus data into data of character level.
The second converting unit 20 is configured to convert the data of character level into sequence data.
The first cutting unit 30 is configured to cut the sequence data according to predetermined symbols to obtain multiple subsequence data, and to group the multiple subsequence data according to subsequence-data length to obtain K data sets, where the subsequence data contained in each of the K data sets are of equal length and K is a natural number greater than 1. The predetermined symbols are punctuation marks used to break sentences, such as the full stop, question mark, exclamation mark, comma, enumeration comma, semicolon and colon.
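The cutting and grouping performed by the first cutting unit can be sketched as below; the exact punctuation set is an assumption drawn from the examples listed above.

```python
import re

# Cut points: full stop, question/exclamation mark, comma, enumeration
# comma, semicolon, colon (Chinese forms plus ASCII equivalents).
CUT = re.compile(r"[。？！，、；：.?!,;:]")

def cut_on_punct(text):
    """Cut a sequence at the predetermined symbols, dropping empty pieces."""
    return [piece for piece in CUT.split(text) if piece]

def group_by_length(pieces):
    """Group subsequences so each group holds pieces of equal length."""
    groups = {}
    for p in pieces:
        groups.setdefault(len(p), []).append(p)
    return groups
```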
The first determination unit 40 is configured to extract multiple subsequence data from the i-th data set, input the extracted multiple subsequence data into the i-th timing convolutional neural network-conditional random field model, and train the i-th timing convolutional neural network-conditional random field model to obtain the trained i-th model; i successively takes the natural numbers 1 to K, so that K trained timing convolutional neural network-conditional random field models are obtained in total.
The second determination unit 50 is configured to convert target corpus data into data of character level to obtain first data, convert the first data into sequence data to obtain second data, and input the second data into at least one of the K trained timing convolutional neural network-conditional random field models to obtain the word segmentation result of the target corpus data.
Corpus data are the basic resource carrying linguistic knowledge, with the electronic computer as the carrier; they are linguistic material that has actually occurred in real language use.
The timing convolutional neural network-conditional random field model (TCN-CRF) is a combined model of a temporal convolutional network (TCN) and a conditional random field (CRF). The timing convolutional neural network is a deep-learning time convolutional network; the conditional random field is a typical discriminative model. The conditional random field treats word segmentation as a character-position classification problem. The character positions within a word are usually defined as: word beginning, usually denoted B; word middle, usually denoted M; word end, usually denoted E; and single-character word, usually denoted S. The CRF segmentation process labels the character positions and then composes words from the characters between each B and E together with the S single characters. For example, for the sentence to be segmented "我爱北京天安门" ("I love Beijing Tiananmen"), the labeling is: 我/S 爱/S 北/B 京/E 天/B 安/M 门/E, and the word segmentation result is: "我/爱/北京/天安门" ("I / love / Beijing / Tiananmen").
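The B/M/E/S recombination step can be sketched as follows; this is an illustrative decoding of already-given tags, not the CRF inference itself.

```python
def bmes_to_words(chars, tags):
    """Compose words from character-position tags: each S is a word by
    itself; the characters from a B through the next E form one word."""
    words, buf = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "S":
            words.append(ch)
        elif tag == "B":
            buf = ch
        elif tag == "M":
            buf += ch
        else:  # "E": close the current word
            words.append(buf + ch)
            buf = ""
    return words

# "我爱北京天安门" with tags S S B E B M E -> 我 / 爱 / 北京 / 天安门
words = bmes_to_words("我爱北京天安门", ["S", "S", "B", "E", "B", "M", "E"])
```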
In the embodiment of the present invention, target corpus data are converted into data of character level; the data of character level are converted into sequence data; and the sequence data are input into the trained timing convolutional neural network-conditional random field model to obtain the word segmentation result of the target corpus data. The timing convolutional neural network can expand its receptive field at an exponential rate by increasing the number of network layers, so it can process sequence data of longer length, or data of otherwise complex characteristics, and improve the accuracy of the encoded result, thereby improving the accuracy of Chinese word segmentation.
Moreover, the neuron weights on the same feature mapping plane of the timing convolutional neural network are identical, so the network can learn in parallel and process data quickly; the timing convolutional neural network-conditional random field model can therefore also be implemented in a distributed system.
Optionally, the second converting unit 20 comprises a conversion subunit. The conversion subunit is configured to convert the data of character level into sequence data by a preset encoding mode, the preset encoding mode being any one of the following: one-hot encoding or word-to-vector (word2vec) encoding.
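Of the two preset encodings, one-hot encoding can be sketched as follows; building the vocabulary from the text itself is an illustrative simplification.

```python
def one_hot_encode(text):
    """Map each character to a unit vector over the text's own vocabulary."""
    vocab = sorted(set(text))
    index = {ch: i for i, ch in enumerate(vocab)}
    return [[1 if j == index[ch] else 0 for j in range(len(vocab))]
            for ch in text]

vectors = one_hot_encode("aba")  # vocabulary here is ['a', 'b']
```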
Optionally, the first determination unit 40 is configured to execute the following steps. S1: input the extracted multiple subsequence data into the i-th timing convolutional neural network for forward propagation to obtain first output data, the i-th timing convolutional neural network being the timing convolutional neural network in the i-th timing convolutional neural network-conditional random field model. S2: calculate the value of the loss function according to the first output data and the input multiple subsequence data. S3: if the value of the loss function is greater than a preset value, input the multiple subsequence data into the i-th timing convolutional neural network for backpropagation, and optimize the network parameters of the i-th timing convolutional neural network. S4: repeat steps S1 to S3 until the value of the loss function is less than or equal to the preset value. S5: if the value of the loss function is less than or equal to the preset value, determine that training is complete, obtaining the trained i-th timing convolutional neural network. S6: input the data output by the trained i-th timing convolutional neural network into the i-th conditional random field, and train the i-th conditional random field to obtain the trained i-th timing convolutional neural network-conditional random field model, the i-th conditional random field being the conditional random field in the i-th timing convolutional neural network-conditional random field model.
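Steps S1 to S5 can be sketched generically as below; the toy one-parameter "network" and its update rule are stand-ins for the timing convolutional neural network's actual forward propagation and backpropagation.

```python
def train_until_preset(forward, backward, params, batch, preset, max_iters=1000):
    """S1: forward pass; S2: loss; S3: backpropagate while loss > preset;
    S4: repeat; S5: stop once loss <= preset."""
    loss = float("inf")
    for _ in range(max_iters):
        out = forward(params, batch)                          # S1
        loss = sum((o - b) ** 2 for o, b in zip(out, batch))  # S2
        if loss <= preset:                                    # S5
            break
        params = backward(params, batch)                      # S3/S4
    return params, loss

# Toy "network": output = w * x; learning w -> 1 reproduces the input.
forward = lambda w, xs: [w[0] * x for x in xs]
backward = lambda w, xs: [w[0] + 0.1 * (1.0 - w[0])]  # nudge w toward 1
params, loss = train_until_preset(forward, backward, [0.0], [1.0, 2.0], 1e-4)
```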
Optionally, the first determination unit comprises a first computation subunit and a first determining subunit. The first computation subunit is configured to calculate the conditional probability of the output data of the i-th conditional random field according to the data output by the trained i-th timing convolutional neural network. The first determining subunit is configured to obtain the maximum value of the conditional probability of the output data of the i-th conditional random field through maximum likelihood estimation training.
Optionally, the second determination unit 50 comprises a cutting subunit, a grouping subunit, a second determining subunit and a splicing subunit. The cutting subunit is configured to cut the second data according to the predetermined symbols to obtain multiple sequence data. The grouping subunit is configured to group the multiple sequence data according to sequence-data length to obtain L data sets, where all sequence data in each of the L data sets are of equal length and L is a natural number with 1 ≤ L ≤ K. The second determining subunit is configured to select, according to the subsequence-data lengths used during training, L trained models from the K trained timing convolutional neural network-conditional random field models, namely the L1-th to LL-th trained models, and to input all sequence data contained in the j-th data set into the Lj-th trained timing convolutional neural network-conditional random field model to obtain multiple word segmentation results, where the subsequence-data length used in training the Lj-th model equals the length of the sequence data contained in the j-th data set, j successively takes the natural numbers 1 to L, and Lj is a natural number from 1 to K. The splicing subunit is configured to splice the multiple word segmentation results to obtain the word segmentation result of the target corpus data.
In one aspect, an embodiment of the present invention provides a storage medium. The storage medium comprises a stored program, wherein when the program runs, the device on which the storage medium resides is controlled to execute the following steps: converting training corpus data into data of character level; converting the data of character level into sequence data; cutting the sequence data according to predetermined symbols to obtain multiple subsequence data, and grouping the multiple subsequence data according to subsequence-data length to obtain K data sets, where the subsequence data contained in each of the K data sets are of equal length and K is a natural number greater than 1; extracting multiple subsequence data from the i-th data set, inputting the extracted multiple subsequence data into the i-th timing convolutional neural network-conditional random field model, and training the i-th timing convolutional neural network-conditional random field model to obtain the trained i-th model, where i successively takes the natural numbers 1 to K, so that K trained timing convolutional neural network-conditional random field models are obtained in total; and converting target corpus data into data of character level to obtain first data, converting the first data into sequence data to obtain second data, and inputting the second data into at least one of the K trained timing convolutional neural network-conditional random field models to obtain the word segmentation result of the target corpus data.
Optionally, when the program runs, the device on which the storage medium resides is also controlled to execute the following step: converting the data of character level into sequence data by a preset encoding mode, the preset encoding mode being any one of the following: one-hot encoding or word-to-vector (word2vec) encoding.
Optionally, when the program runs, the device on which the storage medium resides is also controlled to execute the following steps. S1: input the extracted multiple subsequence data into the i-th timing convolutional neural network for forward propagation to obtain first output data, the i-th timing convolutional neural network being the timing convolutional neural network in the i-th timing convolutional neural network-conditional random field model. S2: calculate the value of the loss function according to the first output data and the input multiple subsequence data. S3: if the value of the loss function is greater than a preset value, input the multiple subsequence data into the i-th timing convolutional neural network for backpropagation, and optimize the network parameters of the i-th timing convolutional neural network. S4: repeat steps S1 to S3 until the value of the loss function is less than or equal to the preset value. S5: if the value of the loss function is less than or equal to the preset value, determine that training is complete, obtaining the trained i-th timing convolutional neural network. S6: input the data output by the trained i-th timing convolutional neural network into the i-th conditional random field, and train the i-th conditional random field to obtain the trained i-th timing convolutional neural network-conditional random field model, the i-th conditional random field being the conditional random field in the i-th timing convolutional neural network-conditional random field model.
Optionally, when the program runs, the device on which the storage medium resides is also controlled to execute the following steps: calculating the conditional probability of the output data of the i-th conditional random field according to the data output by the trained i-th timing convolutional neural network; and obtaining the maximum value of the conditional probability of the output data of the i-th conditional random field through maximum likelihood estimation training.
Optionally, when the program runs, the device on which the storage medium resides is also controlled to execute the following steps: cutting the second data according to the predetermined symbols to obtain multiple sequence data; grouping the multiple sequence data according to sequence-data length to obtain L data sets, where all sequence data in each of the L data sets are of equal length and L is a natural number with 1 ≤ L ≤ K; according to the subsequence-data lengths used during training, selecting L trained models from the K trained timing convolutional neural network-conditional random field models, namely the L1-th to LL-th trained models; inputting all sequence data contained in the j-th data set into the Lj-th trained timing convolutional neural network-conditional random field model to obtain multiple word segmentation results, where the subsequence-data length used in training the Lj-th model equals the length of the sequence data contained in the j-th data set, j successively takes the natural numbers 1 to L, and Lj is a natural number from 1 to K; and splicing the multiple word segmentation results to obtain the word segmentation result of the target corpus data.
In one aspect, an embodiment of the present invention provides a computer device comprising a memory and a processor. The memory is configured to store information including program instructions, and the processor is configured to control the execution of the program instructions. When loaded and executed by the processor, the program instructions implement the following steps: converting training corpus data into data of character level; converting the data of character level into sequence data; cutting the sequence data according to predetermined symbols to obtain multiple subsequence data, and grouping the multiple subsequence data according to subsequence-data length to obtain K data sets, where the subsequence data contained in each of the K data sets are of equal length and K is a natural number greater than 1; extracting multiple subsequence data from the i-th data set, inputting the extracted multiple subsequence data into the i-th timing convolutional neural network-conditional random field model, and training the i-th timing convolutional neural network-conditional random field model to obtain the trained i-th model, where i successively takes the natural numbers 1 to K, so that K trained timing convolutional neural network-conditional random field models are obtained in total; and converting target corpus data into data of character level to obtain first data, converting the first data into sequence data to obtain second data, and inputting the second data into at least one of the K trained timing convolutional neural network-conditional random field models to obtain the word segmentation result of the target corpus data.
Optionally, when loaded and executed by the processor, the program instructions also implement the following step: converting the data of character level into sequence data by a preset encoding mode, the preset encoding mode being any one of the following: one-hot encoding or word-to-vector (word2vec) encoding.
Optionally, when loaded and executed by the processor, the program instructions also implement the following steps. S1: input the extracted multiple subsequence data into the i-th timing convolutional neural network for forward propagation to obtain first output data, the i-th timing convolutional neural network being the timing convolutional neural network in the i-th timing convolutional neural network-conditional random field model. S2: calculate the value of the loss function according to the first output data and the input multiple subsequence data. S3: if the value of the loss function is greater than a preset value, input the multiple subsequence data into the i-th timing convolutional neural network for backpropagation, and optimize the network parameters of the i-th timing convolutional neural network. S4: repeat steps S1 to S3 until the value of the loss function is less than or equal to the preset value. S5: if the value of the loss function is less than or equal to the preset value, determine that training is complete, obtaining the trained i-th timing convolutional neural network. S6: input the data output by the trained i-th timing convolutional neural network into the i-th conditional random field, and train the i-th conditional random field to obtain the trained i-th timing convolutional neural network-conditional random field model, the i-th conditional random field being the conditional random field in the i-th timing convolutional neural network-conditional random field model.
Optionally, when loaded and executed by the processor, the program instructions also implement the following steps: calculating the conditional probability of the output data of the i-th conditional random field according to the data output by the trained i-th timing convolutional neural network; and obtaining the maximum value of the conditional probability of the output data of the i-th conditional random field through maximum likelihood estimation training.
Optionally, when loaded and executed by the processor, the program instructions also implement the following steps: cutting the second data according to the predetermined symbols to obtain multiple sequence data; grouping the multiple sequence data according to sequence-data length to obtain L data sets, where all sequence data in each of the L data sets are of equal length and L is a natural number with 1 ≤ L ≤ K; according to the subsequence-data lengths used during training, selecting L trained models from the K trained timing convolutional neural network-conditional random field models, namely the L1-th to LL-th trained models; inputting all sequence data contained in the j-th data set into the Lj-th trained timing convolutional neural network-conditional random field model to obtain multiple word segmentation results, where the subsequence-data length used in training the Lj-th model equals the length of the sequence data contained in the j-th data set, j successively takes the natural numbers 1 to L, and Lj is a natural number from 1 to K; and splicing the multiple word segmentation results to obtain the word segmentation result of the target corpus data.
Fig. 3 is a schematic diagram of a computer device provided by an embodiment of the present invention. As shown in Fig. 3, the computer device 50 of this embodiment comprises: a processor 51, a memory 52, and a computer program 53 stored in the memory 52 and executable on the processor 51. When executed by the processor 51, the computer program 53 implements the Chinese word segmentation method based on deep learning of the embodiment, which is not repeated here to avoid repetition. Alternatively, when executed by the processor 51, the computer program implements the functions of each model/unit of the Chinese word segmentation device based on deep learning of the embodiment, which are likewise not repeated here.
The computer device 50 may be a desktop computer, a notebook, a palmtop computer, a cloud server or another computing device. The computer device may include, but is not limited to, the processor 51 and the memory 52. Those skilled in the art will understand that Fig. 3 is only an example of the computer device 50 and does not constitute a limitation on the computer device 50, which may include more or fewer components than illustrated, combine certain components, or use different components; for example, the computer device may also include input/output devices, network access devices, buses, etc.
The processor 51 may be a central processing unit (Central Processing Unit, CPU), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 52 may be an internal storage unit of the computer device 50, such as the hard disk or internal memory of the computer device 50. The memory 52 may also be an external storage device fitted to the computer device 50, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), etc. Further, the memory 52 may include both an internal storage unit of the computer device 50 and an external storage device. The memory 52 is used to store the computer program and other programs and data needed by the computer device, and may also be used to temporarily store data that has been output or will be output.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems, devices and methods may be realized in other ways. For example, the device embodiments described above are merely exemplary: the division into units is only a logical functional division, and other division manners are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or units, and may be electrical, mechanical or of other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be realized either in the form of hardware or in the form of hardware plus a software functional unit.
The above integrated unit realized in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute part of the steps of the methods of the various embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk.
The foregoing describes merely preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution, improvement and the like made within the spirit and principle of the present invention shall be included within the scope of protection of the present invention.
Claims (10)
1. A Chinese word segmentation method based on deep learning, characterized in that the method comprises:
converting training corpus data into data of character level;
converting the data of the character level into sequence data;
cutting the sequence data according to predetermined symbols to obtain multiple subsequence data, and grouping the multiple subsequence data according to the length of the subsequence data to obtain K data sets, wherein the subsequence data contained in each of the K data sets are of equal length and K is a natural number greater than 1;
extracting multiple subsequence data from the i-th data set, inputting the extracted multiple subsequence data into the i-th timing convolutional neural network-conditional random field model, and training the i-th timing convolutional neural network-conditional random field model to obtain the trained i-th timing convolutional neural network-conditional random field model, wherein i successively takes the natural numbers 1 to K, so that K trained timing convolutional neural network-conditional random field models are obtained in total; and
converting target corpus data into data of character level to obtain first data, converting the first data into sequence data to obtain second data, and inputting the second data into at least one of the K trained timing convolutional neural network-conditional random field models to obtain a word segmentation result of the target corpus data.
2. The method according to claim 1, characterized in that converting the data of the character level into sequence data comprises:
converting the data of the character level into the sequence data by a preset encoding mode, the preset encoding mode being any one of the following: one-hot encoding or word-to-vector (word2vec) encoding.
3. The method according to claim 1, wherein inputting the extracted multiple sub-sequence data into the i-th temporal convolutional neural network-conditional random field model and training the i-th temporal convolutional neural network-conditional random field model to obtain the i-th trained temporal convolutional neural network-conditional random field model comprises:
S1: inputting the extracted multiple sub-sequence data into the i-th temporal convolutional neural network for forward propagation to obtain first output data, the i-th temporal convolutional neural network being the temporal convolutional neural network in the i-th temporal convolutional neural network-conditional random field model;
S2: calculating the value of a loss function from the first output data and the input multiple sub-sequence data;
S3: if the value of the loss function is greater than a preset value, inputting the multiple sub-sequence data into the i-th temporal convolutional neural network for backpropagation and optimizing the network parameters of the i-th temporal convolutional neural network;
S4: repeating steps S1 to S3 until the value of the loss function is less than or equal to the preset value;
S5: if the value of the loss function is less than or equal to the preset value, determining that training is complete and obtaining the i-th trained temporal convolutional neural network;
S6: inputting the output data of the i-th trained temporal convolutional neural network into the i-th conditional random field and training the i-th conditional random field to obtain the i-th trained temporal convolutional neural network-conditional random field model, the i-th conditional random field being the conditional random field in the i-th temporal convolutional neural network-conditional random field model.
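The S1 to S5 loop of claim 3 (forward pass, loss check against a preset threshold, backward pass, repeat) can be sketched with a deliberately tiny stand-in model; the quadratic "network" with one scalar parameter replaces the temporal convolutional neural network, and all names and values are illustrative assumptions.

```python
# Hedged sketch of the claim-3 training loop: iterate forward propagation,
# loss evaluation, and backpropagation until the loss reaches the preset
# threshold. A one-parameter linear model stands in for the network.
def train_until_threshold(x, target, lr=0.1, preset=1e-4, max_steps=1000):
    w = 0.0  # the network parameter being optimized
    for _ in range(max_steps):
        out = w * x                    # S1: forward propagation
        loss = (out - target) ** 2     # S2: value of the loss function
        if loss <= preset:             # S5: training is complete
            return w, loss
        grad = 2 * (out - target) * x  # S3: backpropagation (gradient)
        w -= lr * grad                 #     optimize the network parameter
    return w, loss                     # S4 is the loop itself

w, loss = train_until_threshold(x=1.0, target=2.0)
# the loop exits once the loss is at or below the preset value
```

Only after this loop converges does S6 pass the network's outputs on to the conditional random field for its own training stage.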
4. The method according to claim 3, wherein training the i-th conditional random field comprises:
calculating the conditional probability of the output data of the i-th conditional random field from the output data of the i-th trained temporal convolutional neural network;
maximizing the conditional probability of the output data of the i-th conditional random field by maximum likelihood estimation training.
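The conditional probability that claim 4 maximizes can be illustrated for a linear-chain conditional random field: p(y|x) is the path score normalized by the log-partition function, computed here with the forward algorithm. The toy emission and transition scores are illustrative assumptions, not learned values from the patent.

```python
# Sketch of claim 4's quantity: log p(tags | emissions) for a linear-chain
# CRF, i.e. the conditional probability whose maximum likelihood is sought.
import numpy as np

def crf_log_prob(emissions, transitions, tags):
    """Log conditional probability of a tag path under a linear-chain CRF."""
    # Score of the given tag path
    score = emissions[0, tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
    # Log-partition over all paths (forward algorithm)
    alpha = emissions[0]
    for t in range(1, emissions.shape[0]):
        alpha = emissions[t] + np.logaddexp.reduce(
            alpha[:, None] + transitions, axis=0)
    log_z = np.logaddexp.reduce(alpha)
    return score - log_z

emissions = np.log(np.array([[0.7, 0.3], [0.4, 0.6]]))
transitions = np.zeros((2, 2))  # uniform transitions for the toy example
p = np.exp(crf_log_prob(emissions, transitions, [0, 1]))
# with uniform transitions p(y|x) factorizes, giving p = 0.7 * 0.6
```

Maximum likelihood training then adjusts the emission and transition scores to raise this probability for the observed tag sequences.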
5. The method according to any one of claims 1 to 4, wherein inputting the second data into at least one of the K trained temporal convolutional neural network-conditional random field models to obtain the word segmentation result of the target corpus data comprises:
cutting the second data at the predetermined symbol to obtain multiple sequence data;
grouping the multiple sequence data by sequence length to obtain L data sets, wherein all sequence data in each of the L data sets are of equal length and L is a natural number with 1 ≤ L ≤ K;
filtering out, from the K trained temporal convolutional neural network-conditional random field models and according to the sub-sequence lengths used in training, L trained models, namely the L1-th to the LL-th trained temporal convolutional neural network-conditional random field models; inputting all sequence data of the j-th data set into the Lj-th trained temporal convolutional neural network-conditional random field model to obtain multiple word segmentation results, wherein the length of the sub-sequence data used in training the Lj-th trained model equals the length of the sequence data in the j-th data set, j takes each natural number from 1 to L in turn, and Lj is a natural number from 1 to K;
splicing the multiple word segmentation results to obtain the word segmentation result of the target corpus data.
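The inference flow of claim 5 (cut the target sequence, group pieces by length, route each group to the trained model whose training length matches, then splice the results back together) can be sketched as below; the `models` dictionary keyed by length stands in for the K trained temporal convolutional neural network-conditional random field models, and all names are illustrative.

```python
# Sketch of claim 5's inference: each piece of the target sequence is sent
# to the model trained on sub-sequences of the same length, and the
# per-piece segmentation results are spliced back in the original order.
def segment(sequence, delimiter, models):
    pieces = [s for s in sequence.split(delimiter) if s]
    # route each piece to the model trained on a matching sequence length
    segmented = [models[len(p)](p) for p in pieces]
    return delimiter.join(segmented)  # splice the results back together

# Toy stand-ins for trained models: one per input length.
models = {2: lambda s: "/".join(s), 3: lambda s: s}
out = segment("ab.cde.fg", ".", models)
# out == "a/b.cde.f/g"
```

Keying the models by training length is what lets the filtering step of claim 5 pick only the L models that are actually needed for the target corpus.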
6. A Chinese word segmentation device based on deep learning, characterized in that the device comprises:
a first conversion unit, configured to convert training corpus data into character-level data;
a second conversion unit, configured to convert the character-level data into sequence data;
a first cutting unit, configured to cut the sequence data at a predetermined symbol to obtain multiple sub-sequence data and to group the multiple sub-sequence data by sub-sequence length to obtain K data sets, wherein the sub-sequence data in each of the K data sets are of equal length and K is a natural number greater than 1;
a first determination unit, configured to extract multiple sub-sequence data from the i-th data set, input the extracted multiple sub-sequence data into the i-th temporal convolutional neural network-conditional random field model, and train the i-th temporal convolutional neural network-conditional random field model to obtain the i-th trained temporal convolutional neural network-conditional random field model, wherein i takes each natural number from 1 to K in turn, yielding K trained temporal convolutional neural network-conditional random field models;
a second determination unit, configured to convert target corpus data into character-level data to obtain first data, convert the first data into sequence data to obtain second data, and input the second data into at least one of the K trained temporal convolutional neural network-conditional random field models to obtain a word segmentation result for the target corpus data.
7. The device according to claim 6, wherein the second conversion unit comprises:
a conversion subunit, configured to convert the character-level data into the sequence data by a preset encoding scheme, the preset encoding scheme being any one of the following: one-hot encoding or word-to-vector (word2vec) encoding.
8. The device according to claim 6, wherein the first determination unit is configured to execute:
S1: inputting the extracted multiple sub-sequence data into the i-th temporal convolutional neural network for forward propagation to obtain first output data, the i-th temporal convolutional neural network being the temporal convolutional neural network in the i-th temporal convolutional neural network-conditional random field model;
S2: calculating the value of a loss function from the first output data and the input multiple sub-sequence data;
S3: if the value of the loss function is greater than a preset value, inputting the multiple sub-sequence data into the i-th temporal convolutional neural network for backpropagation and optimizing the network parameters of the i-th temporal convolutional neural network;
S4: repeating steps S1 to S3 until the value of the loss function is less than or equal to the preset value;
S5: if the value of the loss function is less than or equal to the preset value, determining that training is complete and obtaining the i-th trained temporal convolutional neural network;
S6: inputting the output data of the i-th trained temporal convolutional neural network into the i-th conditional random field and training the i-th conditional random field to obtain the i-th trained temporal convolutional neural network-conditional random field model, the i-th conditional random field being the conditional random field in the i-th temporal convolutional neural network-conditional random field model.
9. A storage medium, characterized in that the storage medium comprises a stored program, wherein, when the program runs, a device on which the storage medium is located is controlled to perform the Chinese word segmentation method based on deep learning according to any one of claims 1 to 5.
10. A computer device, comprising a memory and a processor, the memory being configured to store information including program instructions and the processor being configured to control execution of the program instructions, characterized in that the program instructions, when loaded and executed by the processor, implement the steps of the Chinese word segmentation method based on deep learning according to any one of claims 1 to 5.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910322127.8A CN110222329B (en) | 2019-04-22 | 2019-04-22 | Chinese word segmentation method and device based on deep learning |
SG11202111464WA SG11202111464WA (en) | 2019-04-22 | 2019-11-14 | Method, device, storage medium, and computing device for segmenting chinese word based on deep learning |
JP2021563188A JP7178513B2 (en) | 2019-04-22 | 2019-11-14 | Chinese word segmentation method, device, storage medium and computer equipment based on deep learning |
PCT/CN2019/118259 WO2020215694A1 (en) | 2019-04-22 | 2019-11-14 | Chinese word segmentation method and apparatus based on deep learning, and storage medium and computer device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910322127.8A CN110222329B (en) | 2019-04-22 | 2019-04-22 | Chinese word segmentation method and device based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110222329A true CN110222329A (en) | 2019-09-10 |
CN110222329B CN110222329B (en) | 2023-11-24 |
Family
ID=67819927
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910322127.8A Active CN110222329B (en) | 2019-04-22 | 2019-04-22 | Chinese word segmentation method and device based on deep learning |
Country Status (4)
Country | Link |
---|---|
JP (1) | JP7178513B2 (en) |
CN (1) | CN110222329B (en) |
SG (1) | SG11202111464WA (en) |
WO (1) | WO2020215694A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020215694A1 (en) * | 2019-04-22 | 2020-10-29 | 平安科技(深圳)有限公司 | Chinese word segmentation method and apparatus based on deep learning, and storage medium and computer device |
CN113341919A (en) * | 2021-05-31 | 2021-09-03 | 中国科学院重庆绿色智能技术研究院 | Computing system fault prediction method based on time sequence data length optimization |
TWI771841B (en) * | 2020-05-08 | 2022-07-21 | 南韓商韓領有限公司 | Systems and methods for word segmentation based on a competing neural character language model |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112528648A (en) * | 2020-12-10 | 2021-03-19 | 平安科技(深圳)有限公司 | Method, device, equipment and storage medium for predicting polyphone pronunciation |
CN112884087A (en) * | 2021-04-07 | 2021-06-01 | 山东大学 | Biological enhancer and identification method for type thereof |
CN114863995B (en) * | 2022-03-30 | 2024-05-07 | 安徽大学 | Silencer prediction method based on bidirectional gating cyclic neural network |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104182423A (en) * | 2013-05-27 | 2014-12-03 | 华东师范大学 | Conditional random field-based automatic Chinese personal name recognition method |
CN107977354A (en) * | 2017-10-12 | 2018-05-01 | 北京知道未来信息技术有限公司 | A kind of mixing language material segmenting method based on Bi-LSTM-CNN |
CN108268444A (en) * | 2018-01-10 | 2018-07-10 | 南京邮电大学 | A kind of Chinese word cutting method based on two-way LSTM, CNN and CRF |
US20180329897A1 (en) * | 2016-10-26 | 2018-11-15 | Deepmind Technologies Limited | Processing text sequences using neural networks |
US20190018836A1 (en) * | 2016-04-12 | 2019-01-17 | Huawei Technologies Co., Ltd. | Word Segmentation method and System for Language Text |
CN109255119A (en) * | 2018-07-18 | 2019-01-22 | 五邑大学 | A kind of sentence trunk analysis method and system based on the multitask deep neural network for segmenting and naming Entity recognition |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU4869601A (en) * | 2000-03-20 | 2001-10-03 | Robert J. Freeman | Natural-language processing system using a large corpus |
JP2008140117A (en) | 2006-12-01 | 2008-06-19 | National Institute Of Information & Communication Technology | Apparatus for segmenting chinese character sequence to chinese word sequence |
CN103020034A (en) | 2011-09-26 | 2013-04-03 | 北京大学 | Chinese words segmentation method and device |
CN104268200A (en) * | 2013-09-22 | 2015-01-07 | 中科嘉速(北京)并行软件有限公司 | Unsupervised named entity semantic disambiguation method based on deep learning |
CN108536679B (en) * | 2018-04-13 | 2022-05-20 | 腾讯科技(成都)有限公司 | Named entity recognition method, device, equipment and computer readable storage medium |
CN109086267B (en) * | 2018-07-11 | 2022-07-26 | 南京邮电大学 | Chinese word segmentation method based on deep learning |
CN110222329B (en) * | 2019-04-22 | 2023-11-24 | 平安科技(深圳)有限公司 | Chinese word segmentation method and device based on deep learning |
2019
- 2019-04-22 CN CN201910322127.8A patent/CN110222329B/en active Active
- 2019-11-14 JP JP2021563188A patent/JP7178513B2/en active Active
- 2019-11-14 SG SG11202111464WA patent/SG11202111464WA/en unknown
- 2019-11-14 WO PCT/CN2019/118259 patent/WO2020215694A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2020215694A1 (en) | 2020-10-29 |
JP7178513B2 (en) | 2022-11-25 |
JP2022530447A (en) | 2022-06-29 |
CN110222329B (en) | 2023-11-24 |
SG11202111464WA (en) | 2021-11-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110222329A (en) | A kind of Chinese word cutting method and device based on deep learning | |
CN107392973B (en) | Pixel-level handwritten Chinese character automatic generation method, storage device and processing device | |
CN108509413A (en) | Digest extraction method, device, computer equipment and storage medium | |
CN110287961A (en) | Chinese word cutting method, electronic device and readable storage medium storing program for executing | |
CN106951825A (en) | A kind of quality of human face image assessment system and implementation method | |
CN108733644B (en) | A kind of text emotion analysis method, computer readable storage medium and terminal device | |
CN106997474A (en) | A kind of node of graph multi-tag sorting technique based on deep learning | |
CN108595585A (en) | Sample data sorting technique, model training method, electronic equipment and storage medium | |
CN110222184A (en) | A kind of emotion information recognition methods of text and relevant apparatus | |
CN107480143A (en) | Dialogue topic dividing method and system based on context dependence | |
CN109829162A (en) | A kind of text segmenting method and device | |
CN110134961A (en) | Processing method, device and the storage medium of text | |
CN107958230A (en) | Facial expression recognizing method and device | |
CN109829478A (en) | One kind being based on the problem of variation self-encoding encoder classification method and device | |
CN110245353B (en) | Natural language expression method, device, equipment and storage medium | |
CN110472268A (en) | A kind of bridge monitoring data modality recognition methods and device | |
CN113220876A (en) | Multi-label classification method and system for English text | |
CN113011532A (en) | Classification model training method and device, computing equipment and storage medium | |
CN109101984A (en) | A kind of image-recognizing method and device based on convolutional neural networks | |
CN109033078B (en) | The recognition methods of sentence classification and device, storage medium, processor | |
CN113722477B (en) | Internet citizen emotion recognition method and system based on multitask learning and electronic equipment | |
CN106855852A (en) | The determination method and device of sentence emotion | |
CN110245332A (en) | Chinese character code method and apparatus based on two-way length memory network model in short-term | |
CN110489552A (en) | A kind of microblog users suicide risk checking method and device | |
CN115906861A (en) | Statement emotion analysis method and device based on interaction aspect information fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||