CN108446273B - Kalman filtering word vector learning method based on Dirichlet process - Google Patents

Kalman filtering word vector learning method based on Dirichlet process

Info

Publication number
CN108446273B
CN108446273B (application CN201810212606.XA)
Authority
CN
China
Prior art keywords
calculating
lds
model
kalman filter
distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810212606.XA
Other languages
Chinese (zh)
Other versions
CN108446273A (en)
Inventor
王磊
翟荣安
刘晶晶
王毓
王飞
于振中
李文兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HRG International Institute for Research and Innovation
Original Assignee
HRG International Institute for Research and Innovation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HRG International Institute for Research and Innovation filed Critical HRG International Institute for Research and Innovation
Priority to CN201810212606.XA priority Critical patent/CN108446273B/en
Publication of CN108446273A publication Critical patent/CN108446273A/en
Application granted granted Critical
Publication of CN108446273B publication Critical patent/CN108446273B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/247Thesauruses; Synonyms

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Algebra (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Machine Translation (AREA)

Abstract

A Kalman filtering word vector learning method based on a Dirichlet process, the method comprising: training and preprocessing the corpus; generating an LDS language model system and initializing the system parameters; assuming that the process noise satisfies a normal distribution and defining a cluster θ_t = (μ_t, Σ_t), where μ_t is the frequency of occurrence of word t in the corpus; calculating the Dirichlet prior distribution of θ_t; calculating the posterior distribution through Kalman filtering derivation and Gibbs sampling estimation; extracting candidate clusters with an MCMC sampling algorithm, calculating the selection probability of the candidate clusters, selecting the candidate cluster with the highest probability value as the value of θ_t, and calculating the minimum mean square error estimate of the cluster; substituting the calculation result into the LDS language model and training the model through the EM algorithm until the model parameters are stable; inputting the preprocessed corpus into the trained LDS language model and calculating the implicit vector representation through the Kalman filter one-step update formula.

Description

Kalman filtering word vector learning method based on Dirichlet process
Technical Field
The invention relates to the field of natural language processing, in particular to a Kalman filtering word vector learning method based on a Dirichlet process.
Background
In Natural Language Processing (NLP) related tasks, natural language is processed by machine learning algorithms, which usually requires the language to be mathematized first, since a machine recognizes only mathematical symbols. A vector is the abstraction people hand to a machine for processing, and a word vector is a way to mathematize the words of a language.
One of the simplest word vector representations is the One-hot Representation: a word is represented by a very long vector whose length is the size of the dictionary; the vector has exactly one component equal to 1 and all other components equal to 0, and the position of the 1 corresponds to the position of the word in the dictionary. However, this word vector representation has two disadvantages: (1) it is vulnerable to the curse of dimensionality, especially when used in some Deep Learning algorithms; (2) it does not characterize the similarity between words well.
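As a hedged illustration of the representation just described, the sketch below builds a One-hot vector for a word; the dictionary and the queried word are invented purely for the example.

```python
import numpy as np

dictionary = ["kalman", "filter", "word", "vector", "dirichlet"]   # toy dictionary, |V| = 5

def one_hot(word, dictionary):
    """Return a |V|-dimensional vector with a single 1 at the word's dictionary position."""
    v = np.zeros(len(dictionary))
    v[dictionary.index(word)] = 1.0
    return v

print(one_hot("word", dictionary))   # [0. 0. 1. 0. 0.]
```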
Another word vector representation is the Distributed Representation, originally proposed by Hinton in 1986, which overcomes the drawbacks of the One-hot Representation. The basic principle is as follows: each word of a language is trained and mapped into a short vector of fixed length (here "short" is relative to the "long" of the One-hot Representation); all of these vectors together form a word vector space, each vector being a point in that space; by introducing a "distance" on this space, the similarity (lexical, semantic) between words can be judged from the distance between them.
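A hedged sketch of how "distance" in such a word vector space is used to judge similarity follows; the three-dimensional vectors below are toy values chosen for illustration, not vectors produced by any trained model.

```python
import numpy as np

vectors = {                             # toy short, fixed-length word vectors
    "king":  np.array([0.8, 0.1, 0.3]),
    "queen": np.array([0.7, 0.2, 0.35]),
    "apple": np.array([0.1, 0.9, 0.0]),
}

def cosine(a, b):
    """Cosine similarity: a distance-like score between two word vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors["king"], vectors["queen"]))  # close to 1: similar words
print(cosine(vectors["king"], vectors["apple"]))  # smaller: dissimilar words
```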
Due to many characteristics of text data, such as near-synonyms and ambiguous words, the dimensionality of feature vectors representing text data is often high; but high-dimensional feature vectors are not necessarily beneficial to classification; instead they can lead to feature sparsity, reduce the efficiency of an algorithm, and even degrade the classification effect. A good word vector representation method is therefore of great significance.
To derive context-based word vectors, a Gaussian Linear Dynamical System (LDS) is used to describe the language model. The advantage of this approach is that the context of the current word can be fully exploited. The model is learned from a large unlabeled corpus, and the implicit vector representation of each word can be inferred by a Kalman filter. Let the lexicon be V and the corpus length be T. Let w_t denote the indicator vector of the word at position t, with dimension |V|; if the word t is at the i-th position in the dictionary, only the i-th component of w_t is nonzero. Define μ_i as the frequency of occurrence of the word in the corpus.
The LDS model is:
x_t = A·x_{t-1} + η_t
w_t = C·x_t + ξ_t
where x_t is the implicit vector representation of the word to be learned, w_t is the observed value, η_t and ξ_t are the process noise and measurement noise of the system, and A and C are the state transition matrix and the observation matrix. Define the centered observation w̃_t = w_t − μ (μ being the vector whose i-th element is μ_i), so that w̃_t has zero mean, and define the corresponding expected values, where E denotes the expectation function.
The number of positions separating the word at the i-th position from the word at the j-th position in a sentence is defined as the lag. For lag k, the lagged second moment E[w̃_t w̃_{t+k}^T] is defined; μ_i and the lagged moments are estimated over a corpus of length T by their empirical averages, where E[·] denotes the expectation and the corresponding subscripted expectation is taken with respect to x.
The process of estimating the model parameters is divided into two steps. First, the EM algorithm is initialized by the SSID (subspace identification) method, and the parameters A, C, Q and R (Q and R being basic parameters of the Kalman filter) are optimized by EM with ASOS (Martens, 2010), an enhanced EM algorithm. Then the estimate based on the first t−1 observations (K being the Kalman filter gain) and the estimate based on the whole corpus (I and J being basic parameters of the Kalman filter) are used to compute the implicit vector representation x_t, and this implicit vector x_t is taken as the word vector.
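A hedged sketch of the empirical moment estimates described above (word frequencies μ_i and lag-k co-occurrence averages) follows; the toy dictionary and corpus are invented for illustration.

```python
import numpy as np

dictionary = ["a", "b", "c"]                      # toy lexicon V
corpus = ["a", "b", "a", "c", "b", "a"]           # toy corpus of length T
T, V = len(corpus), len(dictionary)

W = np.zeros((T, V))                              # indicator vectors w_t
for t, word in enumerate(corpus):
    W[t, dictionary.index(word)] = 1.0

mu = W.mean(axis=0)                               # empirical word frequencies mu_i
k = 1                                             # lag
lag_moment = (W[:-k].T @ W[k:]) / (T - k)         # empirical average of w_t w_{t+k}^T
print(mu)
print(lag_moment)
```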
In the prior art, the distributions of the system process noise and measurement noise are assumed to be zero-mean white Gaussian noise. In general, however, the system noise is uncertain; in particular, for a language model, many problems require mining new and unknown information in the corpus, so this assumption is unrealistic, and word vectors obtained under it are inaccurate.
The Dirichlet process is a well-known nonparametric Bayesian model that is particularly suitable for various clustering problems. The advantage of this type of model is that the number of classes used in building the mixture model need not be specified manually but is inferred from the model and the data. In the field of natural language processing, many problems require mining new and unknown information in the corpus, and such information often lacks prior knowledge, so the advantage of the Dirichlet process can be fully exploited in many natural language processing applications. The prior art has not applied the Dirichlet process to the word vector representation of natural language.
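A hedged sketch of why the Dirichlet process needs no preset number of classes: a Chinese-restaurant-process draw in which the number of occupied clusters emerges from the data. The concentration value alpha and the sample size are illustrative assumptions, not values taken from the patent.

```python
import random

def crp_assignments(n, alpha, seed=0):
    """Assign n items to clusters with a Chinese restaurant process draw."""
    random.seed(seed)
    counts = []                        # current cluster sizes
    assignments = []
    for _ in range(n):
        weights = counts + [alpha]     # existing clusters vs. opening a new one
        table = random.choices(range(len(weights)), weights=weights)[0]
        if table == len(counts):
            counts.append(1)           # a new cluster appears
        else:
            counts[table] += 1
        assignments.append(table)
    return assignments, counts

assignments, counts = crp_assignments(n=20, alpha=1.0)
print(len(counts), "clusters were inferred from the data rather than preset")
```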
Therefore, it is necessary to develop a new word vector representation method.
Disclosure of Invention
In order to achieve the purpose of the invention, the invention provides a Kalman filtering word vector learning method based on the Dirichlet process, which comprises the following steps:
training and preprocessing the corpus,
generating an LDS language model system, initializing system parameters,
assuming that the process noise satisfies a normal distribution, defining a cluster θ_t = (μ_t, Σ_t), where μ_t is the frequency of occurrence of word t in the corpus, and calculating the Dirichlet prior distribution of θ_t,
the posterior distribution is calculated by kalman filter derivation and Gibbs sampling estimation,
extracting candidate clusters with an MCMC sampling algorithm, calculating the selection probability of the candidate clusters, selecting the candidate cluster with the highest probability value as the value of θ_t, and calculating the minimum mean square error estimate of the cluster,
substituting the calculated result into an LDS language model, training the model through an EM algorithm to stabilize the model parameters,
inputting the preprocessed corpus into a trained LDS language model, and calculating the implicit vector expression by using a one-step updating formula of a Kalman filter.
The LDS language model is:
x_t = A·x_{t-1} + η_t
w_t = C·x_t + ξ_t
where x_t is the implicit vector representation of the word to be learned, w_t is the observed value, η_t and ξ_t are the process noise and measurement noise of the system, and A and C are the state transition matrix and the observation matrix.
θ_t satisfies the prior distribution assumption of the Dirichlet process: θ_t ~ G; G ~ DP(α, G_0),
where the symbol ~ denotes "is distributed as", DP(α, G_0) denotes a Dirichlet process, α is a scale factor, and G_0 denotes the base distribution, G_0 = NIW(μ_0, κ_0, ν_0, Λ_0), with μ_0, κ_0, ν_0, Λ_0 being hyperparameters (NIW denoting the normal-inverse-Wishart distribution).
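A hedged sketch of drawing a cluster θ_t = (μ_t, Σ_t) from a normal-inverse-Wishart base distribution G_0 follows; the dimension and the hyperparameter values μ_0, κ_0, ν_0, Λ_0 below are illustrative assumptions, not values specified by the patent.

```python
import numpy as np
from scipy.stats import invwishart, multivariate_normal

d = 2                                 # dimension of mu_t (illustrative)
mu0 = np.zeros(d)                     # hyperparameter: prior mean
kappa0, nu0 = 1.0, d + 2              # hyperparameters: prior strength, degrees of freedom
Lambda0 = np.eye(d)                   # hyperparameter: prior scale matrix

def sample_cluster_from_G0(seed=0):
    """Draw theta = (mu, Sigma) from the NIW base distribution G0."""
    Sigma = invwishart.rvs(df=nu0, scale=Lambda0, random_state=seed)
    mu = multivariate_normal.rvs(mean=mu0, cov=Sigma / kappa0, random_state=seed)
    return mu, Sigma

mu_t, Sigma_t = sample_cluster_from_G0()
print(mu_t)
print(Sigma_t)
```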
The posterior distribution is calculated as:
p(x_{0:T}, θ_{1:T} | w_{1:T}) = p(x_{0:T} | θ_{1:T}, w_{1:T}) · p(θ_{1:T} | w_{1:T})
where p(x_{0:T} | θ_{1:T}, w_{1:T}) can be derived by Kalman filtering and p(θ_{1:T} | w_{1:T}) can be estimated by Gibbs sampling.
Extracting candidate clusters with the MCMC sampling algorithm, calculating the selection probability of the candidate clusters, and selecting the candidate cluster with the highest probability value as the value of θ_t proceeds as follows:
for the words 1, …, T, draw the sampling result at iteration i with the word t removed, i ≥ 2;
the MH algorithm extracts a candidate cluster from the proposal distribution;
the selection probability ρ of the candidate cluster is computed;
if ρ > α, the candidate cluster is accepted as the i-th sample of θ_t; otherwise the previous sample is retained.
Substituting the calculation results into the LDS language model and training the model through the EM algorithm until the model parameters are stable proceeds as follows:
E step: calculate the state estimate of the Kalman filter at time t from the parameter values at time t−1, and further calculate the covariance matrix of that state estimate:
(1) first make the following definitions of the auxiliary quantities used below, wherein R is the covariance matrix of the observation noise;
(2) calculate the data at time t by using a BP neural network model,
backward propagation: for t = T, …, 1, calculate the covariance matrices; let N_V be a unit diagonal matrix and B(θ_t) = G_0·chol(Σ_t)^T (chol denoting the Cholesky decomposition); the covariance matrices can then be derived by the Kalman filter;
forward propagation: for t = 1, …, T, the covariance matrix and m_{t|t}(θ_{1:t−1}) can likewise be derived and calculated by the Kalman filter;
M step: calculate an expected value using the covariance matrices, maximize the expected value, and solve for the related parameters of the LDS model, namely the state transition matrix A and the observation matrix C;
then update the parameters and repeat the two steps until the LDS model is stable.
The specific process of calculating the implicit vector representation through the Kalman filter one-step update formula is as follows.
The Kalman filter one-step update formula is:
x̂_{t|t−1} = A·x̂_{t−1|t−1}
P_{t|t−1} = A·P_{t−1|t−1}·A^T + Q
K_t = P_{t|t−1}·C^T·(C·P_{t|t−1}·C^T + R)^{−1}
x̂_{t|t} = x̂_{t|t−1} + K_t·(w_t − C·x̂_{t|t−1})
P_{t|t} = (I − K_t·C)·P_{t|t−1}
wherein K, R, B, Q, P and I are basic parameters of the Kalman filter, and x̂_{t|t} is the implicit vector representation x_t calculated with the one-step update formula; this estimated value gives the implicit vector representation x_t, which is used as the word vector.
The invention makes full use of unknown information in the corpus and provides a better learned word vector representation; word vectors obtained with the model of the invention express more accurately the meaning represented by a word and its latent relations with other words, such as near-synonyms, synonyms and antonyms.
The features and advantages of the present invention will become apparent by reference to the following drawings and detailed description of specific embodiments of the invention.
Drawings
FIG. 1 shows a flow chart of the word vector learning method of the present invention.
Detailed Description
The invention provides a Kalman filtering word vector learning method based on the Dirichlet process. The process noise and measurement noise of the system are assumed to obey a Dirichlet distribution, so the Dirichlet posterior distribution can be calculated; sampling is then performed with an MCMC (Markov chain Monte Carlo) sampling algorithm to obtain the candidate cluster with the highest selection probability; the result is substituted into the LDS (Linear Dynamical System) model to train the model parameters; finally, the preprocessed corpus is input into the trained language model, and the Kalman filter one-step update formula is used to calculate the estimated value of the implicit vector representation.
The technical solution of the present invention will be described in detail with reference to fig. 1.
Firstly, the corpus is trained and preprocessed, including word segmentation and dictionary generation, which are well-known steps of word vector learning in the field of natural language processing and are not described again here.
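A hedged preprocessing sketch covering tokenisation and dictionary generation follows; a real system would use a Chinese word segmenter (for example jieba) for the word segmentation step, while the toy corpus below is already space-separated.

```python
corpus_text = "kalman filter learns word vectors kalman filter"   # toy, already segmented
tokens = corpus_text.split()                    # word segmentation (toy stand-in)

dictionary = sorted(set(tokens))                # dictionary generation
word_to_id = {w: i for i, w in enumerate(dictionary)}
corpus_ids = [word_to_id[w] for w in tokens]    # corpus re-expressed as dictionary indices
print(dictionary)
print(corpus_ids)
```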
Then, the LDS language model system of the invention is generated, and the system parameters are initialized.
The LDS language model of the invention is:
x_t = A·x_{t-1} + η_t
w_t = C·x_t + ξ_t
where x_t is the implicit vector representation of the word to be learned and w_t is the observed value, taken here as the one-hot representation; η_t and ξ_t denote the process noise and measurement noise of the system, and A and C denote the state transition matrix and the observation matrix. The measurement noise ξ_t is set to zero-mean white Gaussian noise, and the process noise η_t is modeled by a Dirichlet process.
1. Let η_t satisfy a normal distribution, η_t ~ N(μ_t, Σ_t), and define a cluster θ_t = (μ_t, Σ_t), where μ_t is the frequency of occurrence of word t in the corpus. θ_t satisfies the prior distribution assumption of the Dirichlet process: θ_t ~ G; G ~ DP(α, G_0), where the symbol ~ denotes "is distributed as", DP(α, G_0) denotes a Dirichlet process, α is a scale factor, and G_0 denotes the base distribution, G_0 = NIW(μ_0, κ_0, ν_0, Λ_0), with μ_0, κ_0, ν_0, Λ_0 being hyperparameters (NIW denoting the normal-inverse-Wishart distribution).
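A hedged sketch of drawing θ_t ~ G with G ~ DP(α, G_0) via the truncated stick-breaking construction follows; the base draw here is a scalar stand-in rather than the (μ_t, Σ_t) pair, and α and the truncation level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 1.0                                    # scale (concentration) factor, illustrative
K = 50                                         # truncation level for the sketch

def sample_base():                             # stand-in for a draw from the base G0
    return rng.normal(0.0, 1.0)

# Stick-breaking: beta_k ~ Beta(1, alpha), weight pi_k = beta_k * prod_{j<k}(1 - beta_j)
betas = rng.beta(1.0, alpha, size=K)
pis = betas * np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
atoms = np.array([sample_base() for _ in range(K)])

# A draw theta_t ~ G picks one of the atoms with probability pi_k
theta_t = atoms[rng.choice(K, p=pis / pis.sum())]
print(theta_t)
```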
2. Calculate the posterior distribution:
p(x_{0:T}, θ_{1:T} | w_{1:T}) = p(x_{0:T} | θ_{1:T}, w_{1:T}) · p(θ_{1:T} | w_{1:T})
where p(x_{0:T} | θ_{1:T}, w_{1:T}) can be derived by Kalman filtering and p(θ_{1:T} | w_{1:T}) can be estimated by Gibbs sampling.
For the words 1, …, T, draw the sampling result at iteration i with the word t removed, i ≥ 2.
Then an MH (Metropolis-Hastings) algorithm within the MCMC sampling algorithm is used to extract a candidate cluster from the proposal distribution, and the selection probability ρ of the candidate cluster is computed.
If ρ > α, the candidate cluster is accepted as the i-th sample of θ_t; otherwise the previous sample is retained.
The candidate cluster with the highest probability value is selected as θ_t and used for the subsequent calculation.
(3) Compute the minimum mean square error estimate of the cluster θ_t.
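A hedged sketch of the Metropolis-Hastings accept/reject step described in step 2 follows; the likelihood, proposal distribution and data below are simplified stand-ins (a plain Gaussian cluster), and the standard uniform-draw acceptance test is used in place of the ρ > α comparison above.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_likelihood(theta, data):
    """Toy Gaussian log-likelihood of the data under cluster theta = (mu, var)."""
    mu, var = theta
    return float(-0.5 * np.sum((data - mu) ** 2) / var - 0.5 * len(data) * np.log(var))

def mh_step(theta_current, data, propose):
    theta_candidate = propose()                                  # candidate cluster
    log_rho = log_likelihood(theta_candidate, data) - log_likelihood(theta_current, data)
    rho = 1.0 if log_rho >= 0 else float(np.exp(log_rho))        # selection probability
    if rng.uniform() < rho:                                      # accept the candidate cluster
        return theta_candidate
    return theta_current                                         # otherwise keep the previous value

data = rng.normal(2.0, 1.0, size=50)                             # toy observations for one cluster
theta = (0.0, 1.0)                                               # initial cluster (mu, var)
for _ in range(200):
    theta = mh_step(theta, data,
                    propose=lambda: (rng.normal(0.0, 3.0), rng.uniform(0.5, 2.0)))
print(theta)                                                     # cluster retained after sampling
```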
3. Substitute the calculation results into the LDS language model and train the model by the EM algorithm until the model parameters are stable. The specific process is as follows:
E step: calculate the state estimate of the Kalman filter at time t from the parameter values at time t−1, and then calculate the covariance matrix of that state estimate. The method specifically comprises the following steps:
(1) First make the following definitions of the auxiliary quantities used below, where R is the covariance matrix of the observation noise.
(2) Calculate the data at time t with a BP neural network model. The model consists of two processes, forward propagation of information and backward propagation of errors: forward propagation infers the data at time t from the data before time t, and backward propagation infers the data at time t from the data after time t.
Backward propagation: for t = T, …, 1, calculate the covariance matrices. Let N_V be a unit diagonal matrix and B(θ_t) = G_0·chol(Σ_t)^T (chol denoting the Cholesky decomposition); the covariance matrices can then be derived by the Kalman filter.
Forward propagation: for t = 1, …, T, the covariance matrix and m_{t|t}(θ_{1:t−1}) can likewise be derived and calculated by the Kalman filter.
M step: calculate the expected value using the covariance matrices, maximize it, and solve for the relevant parameters of the LDS model, namely the state transition matrix A and the observation matrix C.
Update the parameters and repeat the two steps until the LDS model is stable.
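A hedged sketch of one version of the E and M steps for the LDS x_t = A·x_{t−1} + η_t, w_t = C·x_t + ξ_t follows: a forward Kalman filter and a simplified backward (RTS-style) smoothing pass for the E step, and least-squares re-estimation of A and C for the M step. The noise covariances are held fixed and the Dirichlet-process clusters are omitted for brevity; all values are toy data.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n, m = 50, 2, 3
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
C_true = rng.normal(size=(m, n))
x = np.zeros((T, n)); w = np.zeros((T, m))
for t in range(1, T):
    x[t] = A_true @ x[t - 1] + 0.1 * rng.normal(size=n)
for t in range(T):
    w[t] = C_true @ x[t] + 0.1 * rng.normal(size=m)

A, C = np.eye(n), rng.normal(size=(m, n))        # initial guesses for the M-step parameters
Q, R = 0.01 * np.eye(n), 0.01 * np.eye(m)        # noise covariances (held fixed here)

for _ in range(10):                              # EM iterations
    # E step, forward pass: Kalman filter
    xf = np.zeros((T, n)); Pf = np.zeros((T, n, n))
    xps = np.zeros((T, n)); Pps = np.zeros((T, n, n))
    xp, Pp = np.zeros(n), np.eye(n)
    for t in range(T):
        xps[t], Pps[t] = xp, Pp                  # one-step predictions
        K = Pp @ C.T @ np.linalg.inv(C @ Pp @ C.T + R)   # Kalman gain
        xf[t] = xp + K @ (w[t] - C @ xp)
        Pf[t] = (np.eye(n) - K @ C) @ Pp
        xp, Pp = A @ xf[t], A @ Pf[t] @ A.T + Q
    # E step, backward pass: simplified RTS smoothing of the state means
    xs = xf.copy()
    for t in range(T - 2, -1, -1):
        J = Pf[t] @ A.T @ np.linalg.inv(Pps[t + 1])
        xs[t] = xf[t] + J @ (xs[t + 1] - xps[t + 1])
    # M step: least-squares re-estimation of A and C from the smoothed states
    A = np.linalg.lstsq(xs[:-1], xs[1:], rcond=None)[0].T
    C = np.linalg.lstsq(xs, w, rcond=None)[0].T

print(np.round(A, 2))   # re-estimated transition matrix (identifiable only up to a linear transform)
```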
4. Input the preprocessed corpus into the trained LDS language model, and calculate the implicit vector representation x_t with the Kalman filter one-step update formula listed below:
x̂_{t|t−1} = A·x̂_{t−1|t−1}
P_{t|t−1} = A·P_{t−1|t−1}·A^T + Q
K_t = P_{t|t−1}·C^T·(C·P_{t|t−1}·C^T + R)^{−1}
x̂_{t|t} = x̂_{t|t−1} + K_t·(w_t − C·x̂_{t|t−1})
P_{t|t} = (I − K_t·C)·P_{t|t−1}
The above Kalman filter one-step update formula is the existing formula in the prior art, wherein K, R, B, Q, P and I are basic parameters of the Kalman filter, and x̂_{t|t} is the implicit vector representation x_t calculated with the one-step update formula; this estimated value gives the implicit vector representation x_t, which is used as the word vector.
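A hedged sketch of the Kalman filter one-step update used to read out the implicit vector x_t from the trained model follows; the matrices A, C, Q, R and the observation stream below are illustrative placeholders, not the patent's trained parameters.

```python
import numpy as np

def kalman_one_step(x_prev, P_prev, w_t, A, C, Q, R):
    """One prediction + correction step; returns the updated state estimate."""
    x_pred = A @ x_prev                                      # predicted state
    P_pred = A @ P_prev @ A.T + Q                            # predicted covariance
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)   # Kalman gain
    x_new = x_pred + K @ (w_t - C @ x_pred)                  # corrected state
    P_new = (np.eye(len(x_prev)) - K @ C) @ P_pred           # corrected covariance
    return x_new, P_new

rng = np.random.default_rng(0)
n, m = 2, 3                                    # state (word vector) and observation dimensions
A, C = 0.9 * np.eye(n), rng.normal(size=(m, n))
Q, R = 0.01 * np.eye(n), 0.05 * np.eye(m)
x, P = np.zeros(n), np.eye(n)
for w_t in rng.normal(size=(5, m)):            # toy observation stream standing in for the corpus
    x, P = kalman_one_step(x, P, w_t, A, C, Q, R)
print(x)                                       # implicit vector representation of the last word
```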
The foregoing is illustrative only. It will be apparent to those skilled in the art that modifications and variations may be made to the arrangements and details described herein, and any obvious substitution falls within the scope of the present invention without departing from the inventive concept. The scope of protection is therefore intended to be defined by the appended claims rather than by the specific details presented in the foregoing description and explanation.

Claims (6)

1. A Kalman filtering word vector learning method based on a Dirichlet process, the method comprising:
training and preprocessing the corpus,
generating an LDS language model, initializing system parameters,
the LDS language model comprises the following steps:
xt=Axt-1t
wt=Cxtt
wherein xtImplicit vector representation, w, representing the word to be learnedtRepresenting the observed value, ηtAnd xitRepresenting process and measurement noise of the system, A and C representing state transitionsShifting the matrix and observing the matrix;
assuming that the process noise satisfies a normal distribution, defining a cluster θ_t = (μ_t, Σ_t), wherein μ_t is the frequency of occurrence of word t in the corpus and Σ_t is the covariance matrix of word t in the corpus, and calculating the Dirichlet prior distribution of θ_t,
the posterior distribution is calculated by kalman filter derivation and Gibbs sampling estimation,
extracting candidate clusters with an MCMC sampling algorithm, calculating the selection probability of the candidate clusters, selecting the candidate cluster with the highest probability value as the value of θ_t, and calculating the minimum mean square error estimate of the cluster,
substituting the calculated result into an LDS language model, training the model through an EM algorithm to stabilize the model parameters,
inputting the preprocessed corpus into a trained LDS language model, and calculating the implicit vector expression by using a one-step updating formula of a Kalman filter.
2. The method of claim 1, wherein,
θ_t satisfies the prior distribution assumption of the Dirichlet process: θ_t ~ G; G ~ DP(α, G_0),
wherein the symbol ~ denotes "is distributed as", DP(α, G_0) denotes a Dirichlet process, α is a scale factor, G_0 denotes the base distribution, G_0 = NIW(μ_0, κ_0, ν_0, Λ_0), μ_0, κ_0, ν_0, Λ_0 are hyperparameters, and NIW represents a normal-inverse-Wishart distribution.
3. The method of claim 2, wherein the posterior distribution is calculated as follows:
p(x_{0:T}, θ_{1:T} | w_{1:T}) = p(x_{0:T} | θ_{1:T}, w_{1:T}) · p(θ_{1:T} | w_{1:T}),
wherein p(x_{0:T} | θ_{1:T}, w_{1:T}) can be derived by Kalman filtering and p(θ_{1:T} | w_{1:T}) can be estimated by Gibbs sampling.
4. The method as claimed in claim 3, wherein the extracting of the candidate clusters by the MCMC sampling algorithm, the calculating of the selection probability of the candidate clusters, and the selecting of the candidate cluster with the highest probability value as the value of θ_t proceed as follows:
extracting, for the words 1, …, T, the sampling result at iteration i with the word t removed, i ≥ 2;
the MH algorithm extracting a candidate cluster from the proposal distribution;
computing the selection probability ρ of the candidate cluster;
if ρ > α, accepting the candidate cluster as the i-th sample of θ_t; otherwise retaining the previous sample.
5. The method as claimed in claim 4, wherein the step of substituting the calculated result into the LDS language model and training the model by EM algorithm to make the model parameters stable is as follows:
E step: calculating the state estimate of the Kalman filter at time t from the parameter values at time t−1, and further calculating the covariance matrix of the state estimate:
(1) first making the following definitions of the auxiliary quantities used below, wherein R is the covariance matrix of the observation noise;
(2) calculating the data at time t by using a BP neural network model,
backward propagation: for t = T, …, 1, calculating the covariance matrices, letting N_V be a unit diagonal matrix and B(θ_t) = G_0·chol(Σ_t)^T, chol denoting the Cholesky decomposition, the covariance matrices then being derivable by the Kalman filter;
forward propagation: for t = 1, …, T, the covariance matrix and m_{t|t}(θ_{1:t−1}) being derivable and calculable by the Kalman filter;
M step: calculating an expected value by using the covariance matrices, maximizing the expected value, and solving for the related parameters of the LDS model, namely the state transition matrix A and the observation matrix C;
and updating the parameters and repeating the two steps until the LDS language model is stable.
6. The method of claim 5, wherein the calculation of the implicit vector representation by the Kalman filter one-step update formula is performed as follows:
the Kalman filter one-step update formula being:
x̂_{t|t−1} = A·x̂_{t−1|t−1}
P_{t|t−1} = A·P_{t−1|t−1}·A^T + Q
K_t = P_{t|t−1}·C^T·(C·P_{t|t−1}·C^T + R)^{−1}
x̂_{t|t} = x̂_{t|t−1} + K_t·(w_t − C·x̂_{t|t−1})
P_{t|t} = (I − K_t·C)·P_{t|t−1}
wherein K, R, B, Q, P and I are basic parameters of the Kalman filter, x̂_{t|t} is the implicit vector representation x_t calculated with the one-step update formula, and this estimated value gives the implicit vector representation x_t, which is used as the word vector.
CN201810212606.XA 2018-03-15 2018-03-15 Kalman filtering word vector learning method based on Dirichlet process Active CN108446273B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810212606.XA CN108446273B (en) 2018-03-15 2018-03-15 Kalman filtering word vector learning method based on Dirichlet process

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810212606.XA CN108446273B (en) 2018-03-15 2018-03-15 Kalman filtering word vector learning method based on Dirichlet process

Publications (2)

Publication Number Publication Date
CN108446273A CN108446273A (en) 2018-08-24
CN108446273B true CN108446273B (en) 2021-07-20

Family

ID=63195245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810212606.XA Active CN108446273B (en) 2018-03-15 2018-03-15 Kalman filtering word vector learning method based on Dield process

Country Status (1)

Country Link
CN (1) CN108446273B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109801073A (en) * 2018-12-13 2019-05-24 中国平安财产保险股份有限公司 Risk subscribers recognition methods, device, computer equipment and storage medium
CN116561814B (en) * 2023-05-17 2023-11-24 杭州君方科技有限公司 Textile chemical fiber supply chain information tamper-proof method and system thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103278170A (en) * 2013-05-16 2013-09-04 东南大学 Mobile robot cascading map building method based on remarkable scenic spot detection
CN104933199A (en) * 2015-07-14 2015-09-23 成都理工大学 Geological big data fusion system and method based on trusted mechanism
CN105760365A (en) * 2016-03-14 2016-07-13 云南大学 Probability latent parameter estimation model of image semantic data based on Bayesian algorithm
CN106547735A (en) * 2016-10-25 2017-03-29 复旦大学 The structure and using method of the dynamic word or word vector based on the context-aware of deep learning
CN106815297A (en) * 2016-12-09 2017-06-09 宁波大学 A kind of academic resources recommendation service system and method
CN106971176A (en) * 2017-05-10 2017-07-21 河海大学 Tracking infrared human body target method based on rarefaction representation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8862932B2 (en) * 2008-08-15 2014-10-14 Apple Inc. Read XF instruction for processing vectors
US9411829B2 (en) * 2013-06-10 2016-08-09 Yahoo! Inc. Image-based faceted system and method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103278170A (en) * 2013-05-16 2013-09-04 东南大学 Mobile robot cascading map building method based on remarkable scenic spot detection
CN104933199A (en) * 2015-07-14 2015-09-23 成都理工大学 Geological big data fusion system and method based on trusted mechanism
CN105760365A (en) * 2016-03-14 2016-07-13 云南大学 Probability latent parameter estimation model of image semantic data based on Bayesian algorithm
CN106547735A (en) * 2016-10-25 2017-03-29 复旦大学 The structure and using method of the dynamic word or word vector based on the context-aware of deep learning
CN106815297A (en) * 2016-12-09 2017-06-09 宁波大学 A kind of academic resources recommendation service system and method
CN106971176A (en) * 2017-05-10 2017-07-21 河海大学 Tracking infrared human body target method based on rarefaction representation

Also Published As

Publication number Publication date
CN108446273A (en) 2018-08-24

Similar Documents

Publication Publication Date Title
CN108595706B (en) Document semantic representation method based on topic word similarity, and text classification method and device
CN107085581B (en) Short text classification method and device
He et al. Discriminative learning in sequential pattern recognition
Goikoetxea et al. Random walks and neural network language models on knowledge bases
CN107480143A (en) Dialogue topic dividing method and system based on context dependence
CN111125367B (en) Multi-character relation extraction method based on multi-level attention mechanism
Chen et al. Matrix factorization with knowledge graph propagation for unsupervised spoken language understanding
Li et al. A generative word embedding model and its low rank positive semidefinite solution
CN108733647B (en) Word vector generation method based on Gaussian distribution
CN108446273B (en) Kalman filtering word vector learning method based on Dield process
Asadi et al. Creating discriminative models for time series classification and clustering by HMM ensembles
Luo et al. Unsupervised learning of morphological forests
CN112364659B (en) Automatic identification method and device for unsupervised semantic representation
JP6586026B2 (en) Word vector learning device, natural language processing device, method, and program
JP2017078919A (en) Word expansion device, classification device, machine learning device, method, and program
Savchenko Statistical recognition of a set of patterns using novel probability neural network
Lee et al. A multimodal variational approach to learning and inference in switching state space models [speech processing application]
KR20140077774A (en) Apparatus and method for adapting language model based on document clustering
CN112883158A (en) Method, device, medium and electronic equipment for classifying short texts
Klomsae et al. A novel string grammar fuzzy C-medians
Kudinov et al. A hybrid language model based on a recurrent neural network and probabilistic topic modeling
JP6057170B2 (en) Spoken language evaluation device, parameter estimation device, method, and program
Chen Optimization of Data Mining and Analysis System for Chinese Language Teaching Based on Convolutional Neural Network
Shustin et al. PCENet: High dimensional surrogate modeling for learning uncertainty
Mathivanan et al. Text Classification of E-Commerce Product via Hidden Markov Model.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant