CN108446273B - Kalman filtering word vector learning method based on Dirichlet process
- Publication number
- CN108446273B (application CN201810212606.XA)
- Authority
- CN
- China
- Prior art keywords
- calculating
- lds
- model
- kalman filter
- distribution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
Abstract
A Kalman filtering word vector learning method based on the Dirichlet process, the method comprising: training and preprocessing the corpus and generating an LDS language model system; initializing the system parameters; assuming the process noise follows a normal distribution, defining the cluster θ_t = (μ_t, Σ_t), where μ_t is the frequency of occurrence of word t in the corpus, and computing the Dirichlet prior distribution of θ_t; computing the posterior distribution through Kalman filter derivation and Gibbs sampling estimation; drawing candidate clusters with an MCMC sampling algorithm, computing their selection probabilities, selecting the candidate cluster with the highest probability value as θ_t, and computing the minimum mean square error estimate of the cluster; substituting the result into the LDS language model and training the model with an EM algorithm until the model parameters stabilize; and inputting the preprocessed corpus into the trained LDS language model and computing the implicit vector representation with the Kalman filter one-step update formula.
Description
Technical Field
The invention relates to the field of natural language processing, in particular to a Kalman filtering word vector learning method based on the Dirichlet process.
Background
In Natural Language Processing (NLP) tasks, natural language is processed by machine learning algorithms, which usually requires the language to be put into mathematical form first, since machines recognize only mathematical symbols. A vector is such a mathematical abstraction handed to the machine for processing, and a word vector is a way to represent the words of a language mathematically.
One of the simplest word vector representations is One-hot Representation: a word is represented by a very long vector whose length equals the size of the dictionary; the vector has a single component equal to 1, all other components are 0, and the position of the 1 corresponds to the position of the word in the dictionary. This representation has two disadvantages: (1) it is vulnerable to the curse of dimensionality, especially when used in deep learning algorithms; (2) it cannot characterize the similarity between words.
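For illustration, a minimal Python sketch of One-hot Representation with a toy five-word dictionary (real vocabularies are orders of magnitude larger, which is exactly where both disadvantages appear):

```python
import numpy as np

# Toy dictionary; a realistic |V| is 10^5 to 10^6, which is where the
# dimensionality problem becomes severe.
dictionary = ["cat", "dog", "fish", "bird", "horse"]

def one_hot(word, dictionary):
    """One-hot indicator vector: length |V|, a single 1 at the word's
    dictionary position, 0 everywhere else."""
    v = np.zeros(len(dictionary))
    v[dictionary.index(word)] = 1.0
    return v

print(one_hot("fish", dictionary))  # [0. 0. 1. 0. 0.]
# Every pair of distinct words has dot product 0, so this representation
# carries no similarity information; see disadvantage (2) above.
```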
Another word vector representation is Distributed Representation, first proposed by Hinton in 1986, which overcomes the drawbacks of One-hot Representation. The basic principle: train a mapping of every word in a language to a fixed-length short vector ("short" relative to the "long" vectors of One-hot Representation); all these vectors together form a word vector space in which each vector is a point. By introducing a "distance" on this space, the (lexical, semantic) similarity between words can be judged from the distance between their vectors.
Due to many characteristics of text data, such as near-synonyms and ambiguous words, the feature vectors representing text data are often high-dimensional. High-dimensional feature vectors are not necessarily beneficial for classification; they can lead to feature sparsity, reduce algorithm efficiency, and even degrade classification performance. A good word vector representation method is therefore of real importance.
To derive context-based word vectors, a Gaussian Linear Dynamical System (LDS) is used to describe the language model. The advantage of this approach is that the context of the current word can be fully exploited. The model is learned from a large unlabeled corpus, and the implicit vector representation of each word can be inferred by a Kalman filter. Let the lexicon be V and the corpus length be T. Use w_t to denote the indicator vector of word t, of dimension |V|. If word t is at the i-th position in the dictionary, only the i-th component of w_t is nonzero. Define μ_i as the frequency of occurrence of word t in the corpus.
The model is
x_t = A x_{t-1} + η_t
w_t = C x_t + ξ_t
where x_t is the implicit vector representation of the word to be learned, w_t is the observed value, η_t and ξ_t are the process noise and measurement noise of the system, and A and C are the state transition matrix and the observation matrix. Define μ (μ_i being one element of μ) such that w_t - μ has zero mean, and let E[·] denote the expectation function.
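For concreteness, the following Python sketch simulates this Gaussian LDS forward in time; the dimensions, matrices, and noise covariances are illustrative assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
d, V, T = 4, 10, 50            # latent dim, vocabulary size, corpus length (toy)

A = 0.9 * np.eye(d)            # state transition matrix (assumed stable)
C = rng.normal(size=(V, d))    # observation matrix
Q = 0.1 * np.eye(d)            # process-noise covariance: eta_t ~ N(0, Q)
R = 0.1 * np.eye(V)            # measurement-noise covariance: xi_t ~ N(0, R)

x = np.zeros(d)
states, observations = [], []
for t in range(T):
    x = A @ x + rng.multivariate_normal(np.zeros(d), Q)   # x_t = A x_{t-1} + eta_t
    w = C @ x + rng.multivariate_normal(np.zeros(V), R)   # w_t = C x_t + xi_t
    states.append(x)
    observations.append(w)
```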
The lag between the word at the i-th position and the word at the j-th position in a sentence is defined as the number of words separating them. When the lag is k, μ_i and the lag-k second moments can be estimated over a corpus of length T by the corresponding empirical averages, where E[·] denotes the expectation and E_x[·] denotes the expectation taken with respect to x.
The model parameter estimation process is divided into two steps. First, the EM algorithm is initialized by the SSID (subspace identification) method, and the parameters A, C, Q, and R (Q and R being basic parameters of the Kalman filter) are optimized by EM with ASOS (Martens, 2010), an accelerated EM algorithm. Then the estimate based on the first t-1 observations is obtained, with K the Kalman filter gain, and the estimate based on the whole corpus (with I and J further basic parameters of the Kalman filter) is used to compute the implicit vector representation x_t, which is taken as the word vector.
In the prior art it is assumed that the system process noise and measurement noise are zero-mean white Gaussian noise. In general, however, the system noise is uncertain; for a language model in particular, many problems require mining new and unknown information in the corpus, so this assumption is unrealistic, and word vectors obtained under it are inaccurate.
The Dirichlet process is a well-known nonparametric Bayesian model, particularly suitable for all kinds of clustering problems. The advantage of this type of model is that the number of clusters used in building the mixture model need not be specified manually; it is determined autonomously from the model and the data. In natural language processing, many problems require mining new and unknown information in the corpus, information that often lacks prior knowledge, so the advantage of the Dirichlet process can be fully exploited in many NLP applications. The prior art has not applied the Dirichlet process to word vector representation in natural language processing.
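The autonomously determined number of clusters can be made concrete with a small sketch of the Chinese restaurant process, a standard constructive view of the Dirichlet process; the value of α and the item count are illustrative:

```python
import numpy as np

def crp_assignments(n_items, alpha, seed):
    """Chinese-restaurant-process draw of cluster labels: item i joins an
    existing cluster with probability proportional to its size, or opens a
    new cluster with probability proportional to alpha."""
    rng = np.random.default_rng(seed)
    counts, labels = [], []            # current cluster sizes, assigned labels
    for _ in range(n_items):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)           # a new cluster created by the model itself
        else:
            counts[k] += 1
        labels.append(k)
    return labels

print(crp_assignments(20, alpha=1.0, seed=0))
```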
Therefore, it is necessary to develop a new word vector representation method.
Disclosure of Invention
In order to achieve the object of the invention, the invention provides a Kalman filtering word vector learning method based on the Dirichlet process, comprising the following steps:
training and preprocessing the corpus,
generating an LDS language model and initializing the system parameters,
assuming the process noise follows a normal distribution, defining the cluster θ_t = (μ_t, Σ_t), where μ_t is the frequency of occurrence of word t in the corpus, and computing the Dirichlet prior distribution of θ_t,
computing the posterior distribution through Kalman filter derivation and Gibbs sampling estimation,
drawing candidate clusters with an MCMC sampling algorithm, computing their selection probabilities, selecting the candidate cluster with the highest probability value as θ_t, and computing the minimum mean square error estimate of the cluster,
substituting the result into the LDS language model and training the model with the EM algorithm until the model parameters stabilize,
inputting the preprocessed corpus into the trained LDS language model and computing the implicit vector representation with the Kalman filter one-step update formula.
The LDS language model is:
x_t = A x_{t-1} + η_t
w_t = C x_t + ξ_t
where x_t is the implicit vector representation of the word to be learned, w_t is the observed value, η_t and ξ_t are the process noise and measurement noise of the system, and A and C are the state transition matrix and the observation matrix.
Here θ_t satisfies the Dirichlet process prior assumption: θ_t ~ G; G ~ DP(α, G_0),
where ~ denotes "is distributed as", DP denotes the Dirichlet process, α is a scale factor, G_0 denotes the base distribution, G_0 = NIW(μ_0, κ_0, ν_0, Λ_0), and μ_0, κ_0, ν_0, Λ_0 are hyperparameters (NIW: normal-inverse-Wishart distribution).
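A minimal sketch of drawing one cluster parameter (μ, Σ) from such a NIW base distribution with SciPy; the hyperparameter values are illustrative assumptions, not taken from the patent:

```python
import numpy as np
from scipy.stats import invwishart, multivariate_normal

def sample_niw(mu0, kappa0, nu0, Lambda0, seed=None):
    """Draw (mu, Sigma) from a normal-inverse-Wishart base distribution G0:
    Sigma ~ IW(nu0, Lambda0), then mu | Sigma ~ N(mu0, Sigma / kappa0)."""
    Sigma = invwishart.rvs(df=nu0, scale=Lambda0, random_state=seed)
    mu = multivariate_normal.rvs(mean=mu0, cov=Sigma / kappa0, random_state=seed)
    return mu, Sigma

# Illustrative hyperparameters for a 3-dimensional cluster parameter.
d = 3
mu, Sigma = sample_niw(np.zeros(d), kappa0=1.0, nu0=d + 2, Lambda0=np.eye(d), seed=0)
print(mu, Sigma, sep="\n")
```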
The posterior distribution is calculated as:
p(x_{0:T}, θ_{1:T} | w_{1:T}) = p(x_{0:T} | θ_{1:T}, w_{1:T}) · p(θ_{1:T} | w_{1:T})
where p(x_{0:T} | θ_{1:T}, w_{1:T}) can be derived by Kalman filtering and p(θ_{1:T} | w_{1:T}) can be estimated by Gibbs sampling.
The specific procedure for drawing candidate clusters with the MCMC sampling algorithm, calculating their selection probabilities, and selecting the candidate cluster with the highest probability value as θ_t is as follows:
drawing, from the words 1, …, T, the extraction result of the i-th sampling pass with word t removed, i ≥ 2;
the MH algorithm extracts the candidate clusters according to its acceptance formula.
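The patent's exact acceptance formula is not reproduced in the text, so the sketch below shows only the generic Metropolis-Hastings step that such candidate-cluster extraction relies on; the scalar target and the symmetric random-walk proposal are placeholders:

```python
import numpy as np

def mh_step(theta, log_target, propose, rng):
    """One Metropolis-Hastings step with a symmetric proposal: accept the
    candidate with probability min(1, target(candidate) / target(current))."""
    candidate = propose(theta, rng)
    if np.log(rng.uniform()) < log_target(candidate) - log_target(theta):
        return candidate          # candidate cluster accepted
    return theta                  # rejected: keep the current cluster

# Placeholder target (standard normal) and random-walk proposal.
rng = np.random.default_rng(0)
log_target = lambda th: -0.5 * th ** 2
propose = lambda th, rng: th + rng.normal(scale=0.5)

theta, draws = 0.0, []
for _ in range(1000):
    theta = mh_step(theta, log_target, propose, rng)
    draws.append(theta)
```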
The specific process of substituting the calculated result into the LDS language model and training the model through the EM algorithm until the model parameters stabilize is as follows:
E step: calculate the state estimate of the Kalman filter at time t from the parameter values at time t-1, then calculate the covariance matrix of the state estimate:
(1) first make the following definitions, where R is the covariance matrix of the observation noise;
(2) calculate the data at time t with a BP neural network model, where N_V is a unit diagonal matrix and B(θ_t) = G_0 · chol(Σ_t)^T; then
forward propagation: t = 1, …, T.
M step: calculate an expected value using the covariance matrix, maximize the expected value, and solve for the relevant parameters of the LDS model, namely the state transition matrix A and the observation matrix C;
then update the parameters and repeat the two steps until the LDS model is stable.
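Below is a self-contained toy version of this E/M alternation for a scalar LDS with known noise variances, sketched under the assumption that the E step is the standard Kalman filter plus RTS smoother; the patent's model is multivariate and also involves the Dirichlet-process clusters, so this only illustrates the structure of the training loop:

```python
import numpy as np

def em_lds_scalar(w, q=0.1, r=0.1, n_iter=50):
    """Toy EM for a scalar LDS  x_t = a x_{t-1} + eta_t,  w_t = c x_t + xi_t,
    with known noise variances q and r.  E step: Kalman filter + RTS smoother;
    M step: closed-form updates of a and c."""
    T = len(w)
    a, c = 0.5, 1.0                              # initial guesses
    for _ in range(n_iter):
        # E step, forward pass: Kalman filter.
        mu_f, P_f = np.zeros(T), np.zeros(T)
        mu_pred, P_pred = np.zeros(T), np.zeros(T)
        mp, Pp = 0.0, 1.0                        # prior on x_1
        for t in range(T):
            if t > 0:
                mp, Pp = a * mu_f[t-1], a * a * P_f[t-1] + q
            mu_pred[t], P_pred[t] = mp, Pp
            k = Pp * c / (c * c * Pp + r)        # Kalman gain
            mu_f[t] = mp + k * (w[t] - c * mp)
            P_f[t] = (1.0 - k * c) * Pp
        # E step, backward pass: RTS smoother.
        mu_s, P_s = mu_f.copy(), P_f.copy()
        P_cross = np.zeros(T)                    # Cov(x_t, x_{t-1} | w_{1:T})
        for t in range(T - 2, -1, -1):
            J = P_f[t] * a / P_pred[t+1]
            mu_s[t] = mu_f[t] + J * (mu_s[t+1] - mu_pred[t+1])
            P_s[t] = P_f[t] + J * J * (P_s[t+1] - P_pred[t+1])
            P_cross[t+1] = J * P_s[t+1]
        Ex2 = P_s + mu_s ** 2                     # E[x_t^2]
        Exx = P_cross[1:] + mu_s[1:] * mu_s[:-1]  # E[x_t x_{t-1}]
        # M step: maximize the expected complete-data log-likelihood.
        a = Exx.sum() / Ex2[:-1].sum()
        c = (w * mu_s).sum() / Ex2.sum()
    return a, c

# Toy check: simulate with a = 0.8, c = 1.0 and recover the parameters.
rng = np.random.default_rng(0)
x, ws = 0.0, []
for _ in range(500):
    x = 0.8 * x + rng.normal(scale=np.sqrt(0.1))
    ws.append(1.0 * x + rng.normal(scale=np.sqrt(0.1)))
print(em_lds_scalar(np.array(ws)))               # approximately (0.8, 1.0)
```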
The specific process of calculating the implicit vector representation through the Kalman filter one-step update formula is as follows:
the Kalman filter one-step update formula is the standard one-step recursion, in which K, R, B, Q, P, and I are basic parameters of the Kalman filter; the estimate calculated by the one-step update formula gives the implicit vector representation x_t, and x_t is taken as the word vector.
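A sketch of a textbook Kalman filter predict-update cycle, which is what the "one-step update formula" refers to here; the patent's own symbols B, I, and J are not fully specified in the text, so only the standard form is shown:

```python
import numpy as np

def kalman_one_step(x_prev, P_prev, w, A, C, Q, R):
    """Textbook Kalman predict-update cycle for x_t = A x_{t-1} + eta_t,
    w_t = C x_t + xi_t; returns the posterior state estimate (the word's
    implicit vector representation) and its covariance."""
    # Predict step: propagate the previous estimate through the state equation.
    x_pred = A @ x_prev
    P_pred = A @ P_prev @ A.T + Q
    # Update step: correct the prediction with the new observation w_t.
    S = C @ P_pred @ C.T + R                      # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)           # Kalman gain
    x_est = x_pred + K @ (w - C @ x_pred)         # implicit vector estimate x_t
    P_est = (np.eye(len(x_prev)) - K @ C) @ P_pred
    return x_est, P_est
```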
The invention can fully exploit unknown information in the corpus and learn better word vector representations; word vectors obtained with the model of the invention express more accurately the meaning a word represents and its latent relations to other words, such as near-synonyms, synonyms, and antonyms.
The features and advantages of the present invention will become apparent by reference to the following drawings and detailed description of specific embodiments of the invention.
Drawings
FIG. 1 shows a flow chart of the word vector learning method of the present invention.
Detailed Description
The invention provides a Kalman filtering word vector learning method based on the Dirichlet process. The process noise of the system is assumed to follow a Dirichlet process prior, so the corresponding posterior distribution can be computed; sampling with an MCMC (Markov chain Monte Carlo) algorithm yields the candidate cluster with the highest selection probability; this cluster is substituted into the LDS (linear dynamical system) model to train the model parameters; finally, the preprocessed corpus is input into the trained language model, and the Kalman filter one-step update formula is used to compute the estimate of the implicit vector representation.
The technical solution of the present invention will be described in detail with reference to fig. 1.
First, the corpus is trained and preprocessed, including word segmentation and dictionary generation; these are well-known preprocessing steps for word vector learning in the field of natural language processing and are not described further here.
Then, the LDS language model system of the invention is generated, and the system parameters are initialized.
The LDS language model of the invention is:
x_t = A x_{t-1} + η_t
w_t = C x_t + ξ_t
where x_t is the implicit vector representation of the word to be learned and w_t is the observed value, taken as the one-hot representation of the word; η_t and ξ_t are the process noise and measurement noise of the system, and A and C are the state transition matrix and the observation matrix. The measurement noise ξ_t is set to zero-mean white Gaussian noise, while the process noise η_t is modeled by a Dirichlet process.
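The distinguishing assumption here, process noise η_t drawn from a Dirichlet process mixture of Gaussians rather than a single Gaussian, can be sketched with a truncated stick-breaking construction; the truncation level and base-distribution hyperparameters are illustrative assumptions:

```python
import numpy as np
from scipy.stats import invwishart

def sample_dp_noise(T, d, alpha, seed):
    """Draw process-noise terms eta_1..eta_T from a (truncated) Dirichlet
    process mixture of Gaussians via stick-breaking: a sketch of
    eta_t ~ N(mu_t, Sigma_t) with (mu_t, Sigma_t) ~ G, G ~ DP(alpha, G0)."""
    rng = np.random.default_rng(seed)
    K = 20                                   # truncation level (assumed)
    # Stick-breaking weights: pi_k = v_k * prod_{j<k} (1 - v_j), v_k ~ Beta(1, alpha).
    v = rng.beta(1.0, alpha, size=K)
    pi = v * np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))
    pi /= pi.sum()                           # renormalize after truncation
    # Cluster parameters from a simple NIW-style base distribution G0.
    Sigmas = [invwishart.rvs(df=d + 2, scale=np.eye(d), random_state=rng)
              for _ in range(K)]
    mus = [rng.multivariate_normal(np.zeros(d), S) for S in Sigmas]
    z = rng.choice(K, size=T, p=pi)          # cluster assignment of each eta_t
    return np.array([rng.multivariate_normal(mus[k], Sigmas[k]) for k in z])

eta = sample_dp_noise(T=100, d=4, alpha=1.0, seed=0)
```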
1. Assume η_t follows a normal distribution, η_t ~ N(μ_t, Σ_t), and define the cluster θ_t = (μ_t, Σ_t), where μ_t is the frequency of occurrence of word t in the corpus. θ_t satisfies the Dirichlet process prior assumption: θ_t ~ G; G ~ DP(α, G_0), where ~ denotes "is distributed as", DP denotes the Dirichlet process, α is a scale factor, G_0 denotes the base distribution, G_0 = NIW(μ_0, κ_0, ν_0, Λ_0), and μ_0, κ_0, ν_0, Λ_0 are hyperparameters.
2. Calculate the posterior distribution:
p(x_{0:T}, θ_{1:T} | w_{1:T}) = p(x_{0:T} | θ_{1:T}, w_{1:T}) · p(θ_{1:T} | w_{1:T})
where p(x_{0:T} | θ_{1:T}, w_{1:T}) can be derived by Kalman filtering and p(θ_{1:T} | w_{1:T}) can be estimated by Gibbs sampling.
Extracting from 1, …, T wordsThe extraction result after the word t is removed in the i times of sampling is shown, and i is more than or equal to 2;
then, an MH algorithm in the MCMC sampling algorithm is used for extracting alternative clusters according to the following formula:
Selecting the candidate cluster with the highest probability value as thetatAnd performing subsequent calculation.
(3) Compute the minimum mean square error estimate of the cluster θ_t.
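Since the minimum mean square error estimate coincides with the posterior mean, it is approximated in practice by averaging retained MCMC draws; the snippet below illustrates this generic identity, with synthetic draws standing in for samples of a component of θ_t:

```python
import numpy as np

# Synthetic stand-ins for retained MCMC draws of a scalar component of theta_t.
rng = np.random.default_rng(0)
draws = rng.normal(loc=2.0, scale=0.3, size=5000)

theta_mmse = draws.mean()   # MMSE estimate = posterior mean (approximated)
print(theta_mmse)           # close to 2.0
```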
3. Substitute the calculated result into the LDS language model and train the model through the EM algorithm until the model parameters stabilize. The specific process is as follows:
E step: calculate the state estimate of the Kalman filter at time t from the parameter values at time t-1, then calculate the covariance matrix of the state estimate. Specifically:
(1) first make the following definitions, where R is the covariance matrix of the observation noise.
(2) Calculate the data at time t with a BP neural network model. The model consists of two passes, forward propagation of information and backward propagation of errors: the forward pass infers the data at time t from the data before time t, and the backward pass infers the data at time t from the data after time t. Here N_V is a unit diagonal matrix and B(θ_t) = G_0 · chol(Σ_t)^T; then
forward propagation: t = 1, …, T.
M step: calculate an expected value using the covariance matrix, maximize the expected value, and solve for the relevant parameters of the LDS model, namely the state transition matrix A and the observation matrix C.
Update the parameters and repeat the two steps until the LDS model is stable.
4. Input the preprocessed corpus into the trained LDS language model and compute the implicit vector representation x_t with the Kalman filter one-step update formula. This one-step update formula is the standard formula of the prior art, in which K, R, B, Q, P, and I are basic parameters of the Kalman filter; the estimate computed by the one-step update formula gives the implicit vector representation x_t, and x_t is taken as the word vector.
The foregoing is illustrative only. Modifications and variations of the arrangements and details described herein will be apparent to those skilled in the art, and any obvious substitution that does not depart from the inventive concept falls within the scope of the present invention. The scope of the invention is therefore intended to be limited only by the appended claims and not by the specific details presented by way of the foregoing description and explanation.
Claims (6)
1. A Kalman filtering word vector learning method based on the Dirichlet process, the method comprising:
the corpus is trained and preprocessed,
generating an LDS language model, initializing system parameters,
the LDS language model being:
x_t = A x_{t-1} + η_t
w_t = C x_t + ξ_t
where x_t is the implicit vector representation of the word to be learned, w_t is the observed value, η_t and ξ_t are the process noise and measurement noise of the system, and A and C are the state transition matrix and the observation matrix;
assuming the process noise follows a normal distribution, defining the cluster θ_t = (μ_t, Σ_t), where μ_t is the frequency of occurrence of word t in the corpus and Σ_t is the covariance matrix of word t in the corpus, and computing the Dirichlet prior distribution of θ_t,
calculating the posterior distribution through Kalman filter derivation and Gibbs sampling estimation,
drawing candidate clusters with an MCMC sampling algorithm, calculating their selection probabilities, selecting the candidate cluster with the highest probability value as θ_t, and calculating the minimum mean square error estimate of the cluster,
substituting the calculated result into an LDS language model, training the model through an EM algorithm to stabilize the model parameters,
inputting the preprocessed corpus into a trained LDS language model, and calculating the implicit vector expression by using a one-step updating formula of a Kalman filter.
2. The method of claim 1, wherein
θ_t satisfies the Dirichlet process prior assumption: θ_t ~ G; G ~ DP(α, G_0),
where ~ denotes "is distributed as", DP denotes the Dirichlet process, α is a scale factor, G_0 denotes the base distribution, G_0 = NIW(μ_0, κ_0, ν_0, Λ_0), μ_0, κ_0, ν_0, Λ_0 are hyperparameters, and NIW denotes the normal-inverse-Wishart distribution.
3. The method of claim 2, wherein the posterior distribution is calculated as follows:
p(x_{0:T}, θ_{1:T} | w_{1:T}) = p(x_{0:T} | θ_{1:T}, w_{1:T}) · p(θ_{1:T} | w_{1:T})
where p(x_{0:T} | θ_{1:T}, w_{1:T}) can be derived by Kalman filtering and p(θ_{1:T} | w_{1:T}) can be estimated by Gibbs sampling.
4. The method as claimed in claim 3, wherein the specific procedure for drawing candidate clusters with the MCMC sampling algorithm, calculating their selection probabilities, and selecting the candidate cluster with the highest probability value as θ_t is as follows:
drawing, from the words 1, …, T, the extraction result of the i-th sampling pass with word t removed, i ≥ 2;
the MH algorithm extracts the candidate clusters according to its acceptance formula.
5. The method as claimed in claim 4, wherein the specific process of substituting the calculated result into the LDS language model and training the model through the EM algorithm until the model parameters stabilize is as follows:
E step: calculating the state estimate of the Kalman filter at time t from the parameter values at time t-1, then calculating the covariance matrix of the state estimate:
(1) first making the following definitions, wherein R is the covariance matrix of the observation noise;
(2) calculating the data at time t with a BP neural network model, wherein N_V is a unit diagonal matrix, B(θ_t) = G_0 · chol(Σ_t)^T, and chol denotes the Cholesky decomposition; then
forward propagation: t = 1, …, T;
M step: calculating an expected value using the covariance matrix, maximizing the expected value, and solving for the relevant parameters of the LDS model, namely the state transition matrix A and the observation matrix C;
and updating the parameters and repeating the two steps until the LDS language model is stable.
6. The method of claim 5, wherein the implicit vector representation is calculated by the Kalman filter one-step update formula as follows:
the Kalman filter one-step update formula is the standard one-step recursion, wherein K, R, B, Q, P, and I are basic parameters of the Kalman filter; the estimate calculated by the one-step update formula gives the implicit vector representation x_t, and x_t is taken as the word vector.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201810212606.XA CN108446273B (en) | 2018-03-15 | 2018-03-15 | Kalman filtering word vector learning method based on Dirichlet process
Publications (2)
Publication Number | Publication Date |
---|---|
CN108446273A CN108446273A (en) | 2018-08-24 |
CN108446273B true CN108446273B (en) | 2021-07-20 |
Family
ID=63195245
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103278170A (en) * | 2013-05-16 | 2013-09-04 | 东南大学 | Mobile robot cascading map building method based on remarkable scenic spot detection |
CN104933199A (en) * | 2015-07-14 | 2015-09-23 | 成都理工大学 | Geological big data fusion system and method based on trusted mechanism |
CN105760365A (en) * | 2016-03-14 | 2016-07-13 | 云南大学 | Probability latent parameter estimation model of image semantic data based on Bayesian algorithm |
CN106547735A (en) * | 2016-10-25 | 2017-03-29 | 复旦大学 | The structure and using method of the dynamic word or word vector based on the context-aware of deep learning |
CN106815297A (en) * | 2016-12-09 | 2017-06-09 | 宁波大学 | A kind of academic resources recommendation service system and method |
CN106971176A (en) * | 2017-05-10 | 2017-07-21 | 河海大学 | Tracking infrared human body target method based on rarefaction representation |
Legal Events

Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |