CN108009150B - Input method and device based on recurrent neural network - Google Patents


Info

Publication number
CN108009150B
CN108009150B (application CN201711217459.7A)
Authority
CN
China
Prior art keywords
word
matrix
vector
sparse representation
subscript
Prior art date
Legal status
Active
Application number
CN201711217459.7A
Other languages
Chinese (zh)
Other versions
CN108009150A (en)
Inventor
阮翀
Current Assignee
Beijing Xinmei Hutong Technology Co ltd
Original Assignee
Beijing Xinmei Hutong Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Xinmei Hutong Technology Co ltd
Priority to CN201711217459.7A
Publication of CN108009150A
Application granted
Publication of CN108009150B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks

Abstract

The invention discloses an input method and device based on a recurrent neural network. A word vector matrix is decomposed into an overcomplete basis matrix, a sparse representation subscript matrix, and a sparse representation coefficient matrix, from which the word vector of any word can be recovered. When candidate words need to be recommended, the word vectors of the required words are obtained from these three matrices, and the candidate words are then obtained from the word vectors and the recurrent neural network model. Because the combined capacity of the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix is smaller than that of the word vector matrix, the scheme occupies less space on the terminal device and requires less computation, reducing the demands on the storage and computing capacity of the terminal device, so the method and device can be applied to a wide range of terminal devices.

Description

Input method and device based on recurrent neural network
Technical Field
The present application relates to the field of information input, and in particular, to an input method and device based on a recurrent neural network.
Background
With the development of science and technology, various terminal devices, such as mobile phones, smart televisions, and computers, have emerged to meet users' work and entertainment needs. While using a terminal device, the user sometimes needs to input information so that the terminal device can perform the corresponding operations. To improve input efficiency and optimize the user experience, existing input methods usually provide a candidate word recommendation function, which guesses the next word the user wants to input from the words input so far and recommends that word to the user as a candidate word.
At present, to implement candidate word recommendation, most input methods employ an n-gram language model: the occurrence frequencies of n-gram phrases are counted in advance over a large-scale corpus to build the model, and the next word to recommend (i.e., the candidate word) is then determined from the previous n-1 words the user has input and the n-gram language model. However, because of the limited model size of n-gram language models and the data sparseness problem, this scheme can only guess candidate words from the previous n-1 (or fewer) input words and cannot take longer context into account, so the recommended candidate words can be inaccurate. For example, if the user inputs "in two days" and the n-gram model is a trigram model, the candidate word is determined from the "two days" just input; since large-scale corpus statistics identify "two days ago" as a common phrase, the trigram model may guess that the candidate word is "ago". However, "in two days ago" does not conform to English grammar, so "ago" is not a proper candidate word, and the recommendation is inaccurate.
To address the inaccuracy of recommended candidate words, an input method using a recurrent neural network can be adopted. In this method, a recurrent neural network model is built, each word is represented as a multi-dimensional (usually several-hundred-dimensional) vector, and the vectors of all words form a word vector matrix: if the number of words used to construct it is n and the word vector dimension of each word is e, the word vector matrix is an n × e matrix. When a candidate word needs to be recommended, the word vectors of the historical words the user has input and of the current word are obtained from the word vector matrix and fed into the recurrent neural network, which outputs the candidate word. Because the candidate word is inferred from the word vectors of both the historical words and the current input word, the accuracy of the recommendation is improved.
However, in the course of the research behind the present application, the inventor found that when this recurrent-neural-network input method is applied, each word is usually represented as a vector of several hundred dimensions (i.e., e is several hundred), and the vocabulary often contains tens of thousands of words (i.e., n is tens of thousands). The word vector matrix therefore contains tens of thousands of several-hundred-dimensional vectors; its capacity is extremely large, and the amount of computation needed to determine candidate words from the word vectors is large. This places high demands on the storage and computing capacity of the terminal device and makes the method difficult to apply on some terminal devices (e.g., mobile phones).
Disclosure of Invention
The embodiment of the invention discloses an input method and device based on a recurrent neural network, aiming to solve the prior-art problems that the capacity of the word vector matrix is extremely large, the amount of computation when determining candidate words from word vectors is large, and the demands on the storage and computing capacity of the terminal device are high.
In a first aspect of the present invention, an input method based on a recurrent neural network is disclosed, which includes:
obtaining a word vector matrix, wherein the word vector matrix is an n × e matrix, n is the number of words, and e is the word vector dimension of each word;
obtaining the word vectors of b words in the word vector matrix and constructing an overcomplete basis matrix from the word vectors of the b words, wherein the overcomplete basis matrix is a b × e matrix, each basis vector in the overcomplete basis matrix is the word vector of one of the b words, and b is a positive integer less than n;
constructing a sparse representation subscript matrix and a sparse representation coefficient matrix from the word vector of each word and the overcomplete basis matrix, wherein the sparse representation subscript matrix and the sparse representation coefficient matrix are both n × s matrices and s is a preset sparse effect parameter;
when a candidate word needs to be recommended, obtaining the target basis vectors of the required word according to the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix, determining the word vector of the required word from the target basis vectors, and obtaining the candidate word according to the word vector of the required word and the recurrent neural network model;
wherein the target basis vectors of each word are used to combine into the word vector of that word; the sparse representation subscript matrix contains, for each word, a subscript vector giving the positions in the overcomplete basis matrix of the target basis vectors required to form that word's word vector; and the sparse representation coefficient matrix contains, for each word, a coefficient vector giving the coefficients corresponding to those target basis vectors.
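As an illustration of the data layout described above (not part of the claims), the following is a minimal numpy sketch of the three matrices and of how a word vector is recovered from them; the dimensions follow the 20000 × 400 example used later in the description, and all names and the stand-in data are illustrative:

    import numpy as np

    n, e, b, s = 20000, 400, 2000, 10       # example sizes from the description

    E = np.random.randn(n, e).astype(np.float32)  # word vector matrix (n x e), stand-in data
    B = E[:b].copy()                         # overcomplete basis matrix (b x e)
    D = np.zeros((n, s), dtype=np.int32)     # sparse representation subscript matrix (n x s)
    C = np.zeros((n, s), dtype=np.float32)   # sparse representation coefficient matrix (n x s)

    def reconstruct_word_vector(i):
        """Weighted combination of word i's s target basis vectors."""
        return C[i] @ B[D[i]]                # (s,) @ (s, e) -> (e,)

Only B, D, and C need to be stored on the device; the full matrix E can be discarded once D and C have been computed.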
Optionally, obtaining the word vectors of b words in the word vector matrix includes:
sorting the words by word frequency;
selecting the b words with the highest word frequency according to the sorting result;
and obtaining the word vectors of the b words by querying the word vector matrix.
Optionally, constructing a sparse representation subscript matrix and a sparse representation coefficient matrix from the word vector of each word and the overcomplete basis matrix includes:
letting the word vector of one of the words be $x_i$ and the overcomplete basis matrix be $B$, and creating the following LASSO regression problem:

$$\min_{w_i}\ \left\| B^{\top} w_i - x_i \right\|_2^2 + \lambda \left\| w_i \right\|_1$$

wherein $\lambda$ is a real-valued hyper-parameter and $w_i$ is a vector parameter whose dimension equals the number of rows of the overcomplete basis matrix;
solving the LASSO regression problem to obtain, when the number of non-zero components in $w_i$ is s, the subscripts of the non-zero components in $w_i$ and the values of the non-zero components;
determining the sparse representation subscript vector corresponding to the word from the subscripts of the non-zero components in $w_i$, and determining the sparse representation coefficient vector corresponding to the word from the values of the non-zero components;
and constructing the sparse representation subscript matrix from the sparse representation subscript vectors corresponding to the words, and constructing the sparse representation coefficient matrix from the sparse representation coefficient vectors corresponding to the words.
Optionally, solving the LASSO regression problem to obtain, when the number of non-zero components in $w_i$ is s, the subscripts of the non-zero components in $w_i$ and the values of the non-zero components includes:
41) setting $\lambda$ to a preset initial value;
42) substituting the value of $\lambda$ into the LASSO regression problem and solving it to obtain $w_i$;
43) obtaining the number of non-zero components in $w_i$ and judging whether it equals s; if so, obtaining the subscripts of the non-zero components in $w_i$ and the values of the non-zero components; if not, performing step 44);
44) if the number of non-zero components in $w_i$ is greater than s, enlarging $\lambda$ by a preset first adjustment amplitude; if the number of non-zero components in $w_i$ is less than s, reducing $\lambda$ by a preset second adjustment amplitude; then returning to step 42).
Optionally, when a candidate word needs to be recommended, obtaining the target basis vectors of the required word according to the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix, and determining the word vector of the required word from the target basis vectors, includes:
searching the sparse representation subscript matrix to obtain the sparse representation subscript vector corresponding to the required word, wherein the sparse representation subscript vector indicates the positions of the target basis vectors of the required word in the overcomplete basis matrix;
searching the sparse representation coefficient matrix to obtain the sparse representation coefficient vector corresponding to the required word, wherein the sparse representation coefficient vector indicates the coefficients corresponding to the target basis vectors of the required word;
searching the overcomplete basis matrix according to the sparse representation subscript vector to obtain the target basis vectors of the required word;
and performing a weighted combination of the target basis vectors with their corresponding coefficients to calculate the word vector of the required word.
In a second aspect of the present invention, an input device based on a recurrent neural network is disclosed, comprising:
a first matrix obtaining module, used for obtaining a word vector matrix, wherein the word vector matrix is an n × e matrix, n is the number of words, and e is the word vector dimension of each word;
a second matrix obtaining module, configured to obtain the word vectors of b words in the word vector matrix and construct an overcomplete basis matrix from the word vectors of the b words, wherein the overcomplete basis matrix is a b × e matrix, each basis vector in the overcomplete basis matrix is the word vector of one of the b words, and b is a positive integer less than n;
a third matrix obtaining module, used for constructing a sparse representation subscript matrix and a sparse representation coefficient matrix from the word vector of each word and the overcomplete basis matrix, wherein the sparse representation subscript matrix and the sparse representation coefficient matrix are both n × s matrices and s is a preset sparse effect parameter;
a candidate word obtaining module, used for obtaining, when a candidate word needs to be recommended, the target basis vectors of the required word according to the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix, determining the word vector of the required word from the target basis vectors, and obtaining the candidate word according to the word vector of the required word and the recurrent neural network model;
wherein the target basis vectors of each word are used to combine into the word vector of that word; the sparse representation subscript matrix contains, for each word, a subscript vector giving the positions in the overcomplete basis matrix of the target basis vectors required to form that word's word vector; and the sparse representation coefficient matrix contains, for each word, a coefficient vector giving the coefficients corresponding to those target basis vectors.
Optionally, the second matrix obtaining module includes:
the word sorting unit is used for sorting the words according to the word frequency;
the word selecting unit is used for selecting the b words with the highest word frequency according to the sorting result;
and the word vector acquisition unit is used for acquiring the word vectors of the b words by querying the word vector matrix.
Optionally, the third matrix obtaining module includes:
a regression problem creation unit, used for letting the word vector of one of the words be $x_i$ and the overcomplete basis matrix be $B$, and creating the following LASSO regression problem:

$$\min_{w_i}\ \left\| B^{\top} w_i - x_i \right\|_2^2 + \lambda \left\| w_i \right\|_1$$

wherein $\lambda$ is a real-valued hyper-parameter and $w_i$ is a vector parameter whose dimension equals the number of rows of the overcomplete basis matrix;
a regression problem solving unit, used for solving the LASSO regression problem to obtain, when the number of non-zero components in $w_i$ is s, the subscripts of the non-zero components in $w_i$ and the values of the non-zero components;
a vector determination unit, used for determining the sparse representation subscript vector corresponding to the word from the subscripts of the non-zero components in $w_i$, and determining the sparse representation coefficient vector corresponding to the word from the values of the non-zero components;
and a matrix construction unit, used for constructing the sparse representation subscript matrix from the sparse representation subscript vectors corresponding to the words, and constructing the sparse representation coefficient matrix from the sparse representation coefficient vectors corresponding to the words.
Optionally, the regression problem solving unit includes: an initial value determination subunit, a regression problem calculation subunit, a non-zero component counting subunit, and a value adjustment subunit;
the initial value determination subunit is used for setting $\lambda$ to a preset initial value;
the regression problem calculation subunit is used for substituting the value of $\lambda$ into the LASSO regression problem and solving it to obtain $w_i$;
the non-zero component counting subunit is used for obtaining the number of non-zero components in $w_i$ and judging whether it equals s; if so, obtaining the subscripts of the non-zero components in $w_i$ and the values of the non-zero components; if not, triggering the value adjustment subunit to perform the corresponding operation;
the value adjustment subunit is used for enlarging $\lambda$ by a preset first adjustment amplitude if the number of non-zero components in $w_i$ is greater than s, or reducing $\lambda$ by a preset second adjustment amplitude if the number of non-zero components in $w_i$ is less than s, and then triggering the regression problem calculation subunit to perform the corresponding operation.
Optionally, the candidate word obtaining module includes:
a first searching unit, configured to search the sparse representation subscript matrix and obtain the sparse representation subscript vector corresponding to the required word, wherein the sparse representation subscript vector indicates the positions of the target basis vectors of the required word in the overcomplete basis matrix;
a second searching unit, used for searching the sparse representation coefficient matrix and obtaining the sparse representation coefficient vector corresponding to the required word, wherein the sparse representation coefficient vector indicates the coefficients corresponding to the target basis vectors of the required word;
a target basis vector obtaining unit, configured to search the overcomplete basis matrix according to the sparse representation subscript vector and obtain the target basis vectors of the required word;
and a weighted combination unit, used for performing a weighted combination of the target basis vectors with their corresponding coefficients to calculate the word vector of the required word.
The embodiment of the invention discloses an input method and device based on a recurrent neural network. After the word vector matrix is obtained, an overcomplete basis matrix is constructed from it, and a sparse representation subscript matrix and a sparse representation coefficient matrix are constructed from the overcomplete basis matrix. When candidate words need to be recommended, the word vectors of the required words can be obtained from the constructed overcomplete basis matrix, sparse representation subscript matrix, and sparse representation coefficient matrix, and the candidate words are then obtained from those word vectors and the recurrent neural network model.
That is to say, according to the scheme disclosed in the embodiment of the present invention, the word vector matrix is decomposed into an overcomplete basis matrix, a sparse representation subscript matrix, and a sparse representation coefficient matrix, and the word vector of each word can be obtained from these three matrices. Because the combined capacity of these three matrices is smaller than that of the word vector matrix, compared with the prior art the method disclosed in the embodiment of the invention occupies less space on the terminal device and requires less computation, reducing the demands on the storage and computing capacity of the terminal device, so it can be applied to a wide range of terminal devices (e.g., mobile phones).
Furthermore, because the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix have a small combined capacity, the installation package of the terminal device application is smaller, which facilitates distribution and downloading in application stores and thus popularization. In addition, the small amount of computation improves computational efficiency, so that text input using the scheme disclosed in the embodiment of the invention has shorter program response times and less perceptible lag on the terminal device, improving the user experience.
Drawings
In order to illustrate the technical solution of the present invention more clearly, the drawings needed in the embodiments are briefly described below; for those of ordinary skill in the art, other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a schematic workflow diagram of an input method based on a recurrent neural network according to an embodiment of the present invention;
Fig. 2 is a schematic diagram illustrating a matrix comparison in an input method based on a recurrent neural network according to an embodiment of the present invention;
fig. 3 is a schematic view of a workflow for constructing a sparse representation subscript matrix and a sparse representation coefficient matrix in the input method based on the recurrent neural network disclosed in the embodiment of the present invention;
fig. 4 is a schematic diagram of a workflow for obtaining subscripts and values of non-zero components in an input method based on a recurrent neural network according to an embodiment of the present invention;
fig. 5 is a schematic view of a workflow for obtaining word vectors of required words in an input method based on a recurrent neural network according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an input device based on a recurrent neural network according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described below with reference to the accompanying drawings.
The embodiment of the invention discloses an input method and device based on a recurrent neural network, aiming to solve the prior-art problems that the capacity of the word vector matrix is extremely large, the amount of computation when determining candidate words from word vectors is large, and the demands on the storage and computing capacity of the terminal device are high.
The first embodiment of the invention discloses an input method based on a recurrent neural network. Referring to the workflow diagram shown in fig. 1, the method comprises the following steps:
step S11, obtaining a word vector matrix, where the word vector matrix is an n × e matrix, n is the number of words, and e is the word vector dimension of each word.
In this case, each row in the word vector matrix represents a word vector for the word corresponding to that row.
Step S12, obtaining the word vectors of b words in the word vector matrix and constructing an overcomplete basis matrix from the word vectors of the b words, where the overcomplete basis matrix is a b × e matrix, each basis vector in the overcomplete basis matrix is the word vector of one of the b words, and b is a positive integer less than n.
Here b is a positive integer less than n so that the capacity of the overcomplete basis matrix does not become too large. At the same time, b is usually greater than e: the overcomplete basis matrix contains the basis vectors corresponding to b words, and the number b of basis vectors is usually greater than the word vector dimension, ensuring that the overcomplete basis matrix contains enough basis vectors to meet the subsequent input requirements.
In practical applications, the value of b is usually set to 4 to 5 times the value of e. For example, if the size of the word vector matrix is 20000 × 400, i.e., the word vector dimension e is 400, b can be set to 2000, and the size of the overcomplete basis matrix is then 2000 × 400. Of course, b may also be set to other values, which is not limited in the embodiment of the present invention.
In the embodiment of the invention, since the number of words is usually in the tens of thousands while the word vector dimension is in the hundreds, the number of words far exceeds the word vector dimension, so the word vectors of the words can be expressed in terms of one another; this is what makes it possible to construct an overcomplete basis matrix. Each vector in the overcomplete basis matrix is a basis vector, i.e., the overcomplete basis matrix contains b basis vectors, and the word vector of each word can be formed by a weighted combination of some of these basis vectors.
Step S13, constructing a sparse representation subscript matrix and a sparse representation coefficient matrix from the word vector of each word and the overcomplete basis matrix, where the sparse representation subscript matrix and the sparse representation coefficient matrix are both n × s matrices and s is a preset sparse effect parameter.
The specific value of s is a preset positive integer; for example, s may be 10, in which case, if the size of the word vector matrix is 20000 × 400, the sizes of the sparse representation subscript matrix and the sparse representation coefficient matrix are both 20000 × 10. Of course, s may also take other values, which is not limited in the embodiment of the present invention.
The target basis vectors of each word are used to combine into the word vector of that word; the sparse representation subscript matrix contains, for each word, a subscript vector giving the positions in the overcomplete basis matrix of the target basis vectors required to form that word's word vector; and the sparse representation coefficient matrix contains, for each word, a coefficient vector giving the coefficients corresponding to those target basis vectors.
In the embodiment of the invention, the word vector of a word is obtained by a weighted combination of s basis vectors in the overcomplete basis matrix, and these s basis vectors are the target basis vectors of the word. That is, the target basis vectors of each word are used to combine into the word vector of that word.
Here s is the sparse effect parameter: for any word, s basis vectors (the target basis vectors of the word) are selected from the overcomplete basis matrix, and the word vector of the word is obtained by a weighted combination of these s basis vectors.
In this case, for a word it is not necessary to store the word vector itself, but only which basis vectors its word vector can be formed from by weighted combination. The positions of the word's target basis vectors in the overcomplete basis matrix are determined through the sparse representation subscript matrix, and the target basis vectors themselves are then determined by querying the overcomplete basis matrix. In addition, the coefficient of each target basis vector is determined through the sparse representation coefficient matrix; the coefficient is the weight of each target basis vector in the weighted combination that yields the word vector of the word.
Since each word's word vector is obtained by a weighted combination of that word's target basis vectors, n × s subscripts need to be recorded in total, and these recorded n × s subscripts form the sparse representation subscript matrix. Likewise, n × s coefficients need to be recorded in total, and these recorded n × s coefficients form the sparse representation coefficient matrix.
Each row vector of the sparse representation subscript matrix is called a subscript vector; each subscript vector gives the positions in the overcomplete basis matrix of the target basis vectors required to form the word vector of one word. For example, if the s target basis vectors of a word are located in rows a1, a2, …, as of the overcomplete basis matrix, the corresponding subscript vector may be (a1, a2, …, as).
In addition, each row vector of the sparse representation coefficient matrix is called a coefficient vector; each coefficient vector gives the coefficients corresponding to the target basis vectors required to form the word vector of one word.
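In symbols, the relationship described above can be written compactly as follows (notation added here for clarity; $d_{ij}$ and $c_{ij}$ denote the entries of word $i$'s subscript vector and coefficient vector):

$$x_i \approx \sum_{j=1}^{s} c_{ij}\, B_{d_{ij}}$$

where $B_{d_{ij}}$ denotes row $d_{ij}$ of the overcomplete basis matrix.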
Step S14, when a candidate word needs to be recommended, obtaining the target basis vectors of the required word according to the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix, determining the word vector of the required word from the target basis vectors, and obtaining the candidate word according to the word vector of the required word and the recurrent neural network model.
In this step, the word vector of the required word is obtained by a weighted combination of the target basis vectors of the required word. The required word may be the current word input by the user, a previously input historical word, and so on.
When a candidate word needs to be recommended, the target basis vectors of the required word and their corresponding coefficients are determined through the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix constructed in the above steps, and the word vector of the required word is obtained by weighted combination. The candidate word is then obtained from the word vector of the required word and the recurrent neural network: specifically, the word vector of the required word is input into the recurrent neural network model, and the model outputs the candidate word, thereby meeting the need to recommend candidate words to the user.
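Continuing the numpy sketch given after the first aspect, the following shows one way the reconstructed word vectors could feed a recurrent network to rank candidates. The patent does not fix a particular architecture, so a plain Elman-style RNN with untrained, randomly initialized weights stands in for the trained recurrent neural network model; all sizes and names are illustrative:

    rng = np.random.default_rng(0)
    h_dim = 128                                  # hidden size, illustrative

    W_xh = rng.standard_normal((h_dim, e)).astype(np.float32) * 0.01
    W_hh = rng.standard_normal((h_dim, h_dim)).astype(np.float32) * 0.01
    W_ho = rng.standard_normal((n, h_dim)).astype(np.float32) * 0.01  # scores every word

    def recommend(word_ids, top_k=3):
        """Run the RNN over the history plus current word; return top-k candidate indices."""
        h = np.zeros(h_dim, dtype=np.float32)
        for i in word_ids:
            x = reconstruct_word_vector(i)       # word vector rebuilt from B, D, C
            h = np.tanh(W_xh @ x + W_hh @ h)
        logits = W_ho @ h
        return np.argsort(logits)[-top_k:][::-1]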
The first embodiment of the invention thus discloses an input method based on a recurrent neural network in which, after the word vector matrix is obtained, an overcomplete basis matrix is constructed from it, and a sparse representation subscript matrix and a sparse representation coefficient matrix are constructed from the overcomplete basis matrix. When candidate words need to be recommended, the word vectors of the required words are obtained from the constructed overcomplete basis matrix, sparse representation subscript matrix, and sparse representation coefficient matrix, and the candidate words are then obtained from those word vectors and the recurrent neural network model.
That is to say, according to the scheme disclosed in the embodiment of the present invention, the word vector matrix is decomposed into an overcomplete basis matrix, a sparse representation subscript matrix, and a sparse representation coefficient matrix, and the word vector of each word can be obtained from these three matrices. Because the combined capacity of these three matrices is smaller than that of the word vector matrix, compared with the prior art the method occupies less space on the terminal device and requires less computation, reducing the demands on the storage and computing capacity of the terminal device, so it can be applied to a wide range of terminal devices (e.g., mobile phones).
Furthermore, because the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix have a small combined capacity, the installation package of the terminal device application is smaller, which facilitates distribution and downloading in application stores and thus popularization. In addition, the small amount of computation improves computational efficiency, so that text input using the scheme disclosed in the embodiment of the invention has shorter program response times and less perceptible lag on the terminal device, improving the user experience.
To clarify the difference in capacity between the word vector matrix on the one hand and the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix on the other, fig. 2 is provided.
Referring to the schematic diagram of matrix comparison shown in fig. 2, the leftmost rectangle is the word vector matrix, of size 20000 × 400. The three rectangles on the right are the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix; as established in the above steps, their sizes are 2000 × 400, 20000 × 10, and 20000 × 10, respectively. The area of each rectangle represents the size of the corresponding matrix and thus the number of its elements. Fig. 2 shows the word vector matrix split into these three matrices.
Accordingly, the compression ratio before and after the split can be calculated: (2000 × 400 + 20000 × 10 + 20000 × 10) / (20000 × 400) = 15%. That is, the method disclosed in the embodiment of the invention compresses the word vector matrix to about 15% of its original capacity, a reduction of roughly 6-7 times, which is a considerable amount of compression; the method effectively reduces the capacity of the word vector matrix.
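The ratio above can be checked directly (a trivial sketch using the example sizes):

    n, e, b, s = 20000, 400, 2000, 10
    ratio = (b * e + n * s + n * s) / (n * e)   # (800000 + 200000 + 200000) / 8000000
    print(ratio)                                 # 0.15 -> the matrices shrink to ~15%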
Further, the embodiment of the invention discloses the step of obtaining the word vectors of b words in the word vector matrix, so as to construct the overcomplete basis matrix from the word vectors of those b words. Obtaining the word vectors of b words in the word vector matrix includes the following steps:
first, sorting the words by word frequency;
then, selecting the b words with the highest word frequency according to the sorting result;
and finally, obtaining the word vectors of the b words by querying the word vector matrix.
When constructing the overcomplete basis matrix, b words could in principle be selected arbitrarily from the word vector matrix and the overcomplete basis matrix built from their word vectors. However, to improve the accuracy of the obtained candidate words, the overcomplete basis matrix is usually constructed from the word vectors of the words with the highest word frequency. In that case, following the steps above, the words are sorted by word frequency, the b words with the highest word frequency are selected according to the sorting result, and the word vectors of those b words are then obtained from the word vector matrix, so that the overcomplete basis matrix is built from the word vectors of the b highest-frequency words.
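A minimal sketch of this frequency-based selection, assuming the word frequencies are available as an array aligned with the rows of the word vector matrix (the array name and its source are assumptions):

    def build_basis(E, word_freq, b):
        """Stack the word vectors of the b most frequent words as the basis matrix."""
        top_b = np.argsort(word_freq)[-b:]   # indices of the b highest-frequency words
        return E[top_b]                      # overcomplete basis matrix, shape (b, e)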
Further, the embodiment of the invention discloses the operation of constructing the sparse representation subscript matrix and the sparse representation coefficient matrix from the word vector of each word and the overcomplete basis matrix. Referring to the workflow diagram shown in fig. 3, this construction includes the following steps:
Step S21, let the word vector of one of the words be $x_i$ and the overcomplete basis matrix be $B$, and create the following LASSO regression problem:

$$\min_{w_i}\ \left\| B^{\top} w_i - x_i \right\|_2^2 + \lambda \left\| w_i \right\|_1$$

wherein $\lambda$ is a real-valued hyper-parameter and $w_i$ is a vector parameter whose dimension equals the number of rows of the overcomplete basis matrix.
Here $\left\| B^{\top} w_i - x_i \right\|_2^2$ is the error between the word vector reconstructed from the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix, and the word vector taken from the word vector matrix.
For example, if the overcomplete basis matrix has 5 rows, i.e., b is 5, then the dimension of $w_i$ is 5.
Each row of the n × e word vector matrix is the word vector of one word, of dimension e; the word vectors can be denoted $x_1$ to $x_n$, the word vector of any one word being $x_i$, with i a positive integer not less than 1 and not greater than n. For $x_i$, an s-dimensional sparse representation subscript vector and an s-dimensional sparse representation coefficient vector can be constructed, so that the word vector $x_i$ can be determined from them.
Step S22, solving the LASSO regression problem to obtain, when the number of non-zero components in $w_i$ is s, the subscripts of the non-zero components in $w_i$ and the values of the non-zero components.
The LASSO regression problem may be solved by cyclic coordinate descent, the LARS method, and the like; of course, other methods may also be adopted, which is not limited in the embodiment of the present invention.
Step S23, determining the sparse representation subscript vector corresponding to the word from the subscripts of the non-zero components in $w_i$, and determining the sparse representation coefficient vector corresponding to the word from the values of the non-zero components.
In the embodiment of the invention, when the number of non-zero components in $w_i$ is s, the dimensions at which the non-zero components occur in $w_i$ give the sparse representation subscript vector of the word, and the values of the non-zero components give its sparse representation coefficient vector.
For example, suppose the preset sparse effect parameter s is 2, b is 5, and the number of words n is 10000; then the dimension of $w_i$ is 5. If solving the LASSO regression problem when the number of non-zero components in $w_i$ is s (i.e., 2) yields $w_i$ = (0, 0.3, 0, 0, 0.6), the non-zero components lie in the 2nd and 5th dimensions, so the sparse representation subscript vector of the word is (2, 5); since the values of the non-zero components are 0.3 and 0.6, the sparse representation coefficient vector is (0.3, 0.6).
Step S24, constructing the sparse representation subscript matrix from the sparse representation subscript vectors corresponding to the words, and constructing the sparse representation coefficient matrix from the sparse representation coefficient vectors corresponding to the words.
Each subscript vector is one row of the sparse representation subscript matrix, and each coefficient vector is one row of the sparse representation coefficient matrix.
Since each word vector in the word vector matrix yields one s-dimensional sparse representation subscript vector and one s-dimensional sparse representation coefficient vector, n s-dimensional subscript vectors and n s-dimensional coefficient vectors are constructed in total; the former form the n × s sparse representation subscript matrix and the latter the n × s sparse representation coefficient matrix.
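One possible realization of steps S21-S24 with an off-the-shelf coordinate-descent LASSO solver (scikit-learn); the patent does not prescribe a particular solver, and the alpha scaling noted in the docstring reflects scikit-learn's objective, not the patent's text:

    import numpy as np
    from sklearn.linear_model import Lasso

    def sparse_code(x_i, B, lam):
        """Solve min_w ||B^T w - x_i||_2^2 + lam * ||w||_1; return subscripts and values.

        scikit-learn's Lasso minimizes (1/(2m)) * ||y - X w||_2^2 + alpha * ||w||_1,
        so with X = B^T (m = e rows) we pass alpha = lam / (2 * e).
        """
        e_dim = B.shape[1]
        reg = Lasso(alpha=lam / (2 * e_dim), fit_intercept=False, max_iter=10000)
        reg.fit(B.T, x_i)
        w = reg.coef_                        # the vector parameter w_i, shape (b,)
        idx = np.flatnonzero(w)              # subscripts of the non-zero components
        return idx, w[idx]                   # sparse subscript / coefficient vectors

Running sparse_code for every row $x_i$ of the word vector matrix and stacking the results row by row would yield the n × s subscript and coefficient matrices of step S24, provided the number of non-zero components equals s; tuning $\lambda$ to achieve exactly that is the subject of steps S31-S37 below.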
In the above steps, the subscripts and values of the non-zero components of $w_i$ when their number is s are obtained by solving the LASSO regression problem, and the sparse representation subscript matrix and the sparse representation coefficient matrix are constructed from them. In the LASSO regression problem, $\left\| B^{\top} w_i - x_i \right\|_2^2$ is the error between the word vector reconstructed from the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix, and the word vector taken from the word vector matrix. The smaller this error, the closer the reconstructed word vector is to the word vector taken from the word vector matrix. However, the smaller the error, the larger the number of non-zero components in $w_i$, and thus the larger the capacity of the resulting sparse representation subscript matrix and sparse representation coefficient matrix. To avoid this, the hyper-parameter $\lambda$ is introduced into the LASSO regression problem, and the number of non-zero components in $w_i$ is adjusted through $\lambda$.
In this case, referring to fig. 4, in the embodiment of the invention, solving the LASSO regression problem to obtain, when the number of non-zero components in $w_i$ is s, the subscripts of the non-zero components in $w_i$ and the values of the non-zero components includes the following steps:
Step S31, setting $\lambda$ to a preset initial value.
In the embodiment of the present invention, $\lambda$ is a real-valued hyper-parameter, and the initial value in this step is preset as required. For example, $\lambda$ may be set to 15. Of course, $\lambda$ may also be set to other values, which is not limited in the embodiment of the present invention.
Step S32, substituting the value of $\lambda$ into the LASSO regression problem and solving it to obtain $w_i$.
Step S33, obtaining the number of non-zero components in $w_i$ and judging whether it equals s; if so, performing step S34; if not, performing step S35.
For different values of $\lambda$, the number of non-zero components in $w_i$ is usually different. Therefore, after $w_i$ is calculated, it is necessary to judge whether the number of non-zero components in $w_i$ equals s and, if not, to adjust the specific value of $\lambda$ accordingly.
Step S34, if the number of non-zero components in $w_i$ equals s, obtaining the subscripts of the non-zero components in $w_i$ and the values of the non-zero components.
Step S35, if the number of non-zero components in $w_i$ does not equal s, judging whether it is greater than s; if so, performing step S36; if not, performing step S37.
Step S36, if the number of non-zero components in $w_i$ is greater than s, enlarging $\lambda$ by a preset first adjustment amplitude and then returning to step S32.
The preset first adjustment amplitude may be 1.5, in which case the enlarged $\lambda$ is 1.5 times the $\lambda$ before enlargement. Of course, the preset first adjustment amplitude may also be other values, which is not limited in the embodiment of the present invention.
Step S37, if the number of non-zero components in $w_i$ is less than s, reducing $\lambda$ by a preset second adjustment amplitude and then returning to step S32.
The preset second adjustment amplitude may be 0.9, in which case the reduced $\lambda$ is 0.9 times the $\lambda$ before reduction. Of course, the preset second adjustment amplitude may also be other values, which is not limited in the embodiment of the present invention.
The larger $\lambda$ is, the fewer non-zero components $w_i$ has; the smaller $\lambda$ is, the more non-zero components $w_i$ has. Accordingly, in the embodiment of the invention, the value of $\lambda$ is adjusted, and after each adjustment of $\lambda$ the number of non-zero components in $w_i$ is recalculated: if the number of non-zero components in $w_i$ is greater than s, $\lambda$ is enlarged; if it is less than s, $\lambda$ is reduced; and so on, until the number of non-zero components in $w_i$ is exactly s.
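A minimal sketch of this tuning loop (steps S31-S37), reusing the sparse_code helper above; the initial value 15 and the adjustment amplitudes 1.5 and 0.9 are the examples from the text, while the iteration cap is an added safeguard not mentioned in the patent:

    def sparse_code_with_s(x_i, B, s, lam=15.0, grow=1.5, shrink=0.9, max_iter=100):
        """Adjust lam until w_i has exactly s non-zero components."""
        for _ in range(max_iter):
            idx, vals = sparse_code(x_i, B, lam)         # S32: solve with current lam
            k = len(idx)                                 # S33: count non-zero components
            if k == s:                                   # S34: done
                return idx, vals
            lam = lam * grow if k > s else lam * shrink  # S36 / S37: adjust lam
        raise RuntimeError("lam tuning did not converge to exactly s non-zeros")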
Step S14 discloses the operation of, when a candidate word needs to be recommended, obtaining the target basis vectors of the required word according to the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix, and determining the word vector of the required word from the target basis vectors. Referring to the workflow diagram shown in fig. 5, this operation includes the following steps:
Step S41, searching the sparse representation subscript matrix to obtain the sparse representation subscript vector corresponding to the required word.
The sparse representation subscript vector indicates the positions of the target basis vectors of the required word in the overcomplete basis matrix.
The sparse representation subscript matrix is an n × s matrix composed of the sparse representation subscript vectors of the n words; when the target basis vectors of the required word are to be obtained, the sparse representation subscript vector corresponding to the required word must first be obtained.
Step S42, searching the sparse representation coefficient matrix to obtain the sparse representation coefficient vector corresponding to the required word.
The sparse representation coefficient vector indicates the coefficients corresponding to the target basis vectors of the required word.
The sparse representation coefficient matrix is an n × s matrix composed of the sparse representation coefficient vectors of the n words; when the target basis vectors of the required word are to be obtained, the sparse representation coefficient vector corresponding to the required word must also be obtained.
Step S43, searching the overcomplete basis matrix according to the sparse representation subscript vector to obtain the target basis vectors of the required word.
For example, if the sparse representation subscript vector of the required word is (2, 5), the vectors in the second and fifth rows of the overcomplete basis matrix are the target basis vectors of the required word.
Step S44, performing a weighted combination of the target basis vectors with their corresponding coefficients to calculate the word vector of the required word.
For example, if the sparse representation coefficient vector corresponding to the required word is (0.3, 0.6), the coefficients of the word's two target basis vectors in the weighted combination are 0.3 and 0.6, respectively; the word vector of the required word is then obtained by the weighted combination.
After the word vector of the required word is obtained, it is input into the recurrent neural network model, which outputs the corresponding candidate words.
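Plugging the running example into the reconstruct_word_vector logic sketched earlier (the 0-based indices correspond to the 1-based rows 2 and 5 in the text; B here is assumed to be the 5-row basis from the $w_i$ example):

    D_i = np.array([1, 4])        # subscript vector (2, 5), converted to 0-based rows
    C_i = np.array([0.3, 0.6])    # coefficient vector
    x_hat = C_i @ B[D_i]          # steps S41-S44: look up target basis vectors, combine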
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention. For details which are not disclosed in the embodiments of the apparatus of the present invention, reference is made to the embodiments of the method of the present invention.
The embodiment of the invention discloses an input device based on a recurrent neural network. Referring to the schematic structural diagram shown in fig. 6, the input device based on the recurrent neural network includes: a first matrix obtaining module 100, a second matrix obtaining module 200, a third matrix obtaining module 300, and a candidate word obtaining module 400.
The first matrix obtaining module 100 is configured to obtain a word vector matrix, where the word vector matrix is an n × e matrix, n is the number of words, and e is the word vector dimension of each word.
The second matrix obtaining module 200 is configured to obtain the word vectors of b words in the word vector matrix and construct an overcomplete basis matrix from the word vectors of the b words, where the overcomplete basis matrix is a b × e matrix, each basis vector in the overcomplete basis matrix is the word vector of one of the b words, and b is a positive integer less than n.
Here b is a positive integer less than n so that the capacity of the overcomplete basis matrix does not become too large. At the same time, b is usually greater than e: the overcomplete basis matrix contains the basis vectors corresponding to b words, and the number b of basis vectors is usually greater than the word vector dimension, ensuring that the overcomplete basis matrix contains enough basis vectors to meet the subsequent input requirements.
In practical applications, the value of b is usually set to 4 to 5 times the value of e. For example, if the size of the word vector matrix is 20000 × 400, i.e., the word vector dimension e is 400, b can be set to 2000, and the size of the overcomplete basis matrix is then 2000 × 400.
Of course, b may also be set to other values, which are not limited in the embodiment of the present invention.
The third matrix obtaining module 300 is configured to construct a sparse representation subscript matrix and a sparse representation coefficient matrix according to the word vector of each word and the overcomplete basis matrix, where the sparse representation subscript matrix and the sparse representation coefficient matrix are both n × s matrices, and s is a preset sparse effect parameter.
The specific value of s is a preset positive integer, for example, s may be 10, in which case, if the size of the word vector matrix is 20000 × 400, the size of the sparse representation index matrix and the size of the sparse representation coefficient matrix are both 20000 × 10. Of course, s may also be other values, which is not limited in the embodiment of the present invention.
Wherein the target base vector of each term is used to combine into a term vector for the term; the sparse representation subscript matrix comprises subscript vectors representing the positions of the required target substrate vectors in the overcomplete substrate matrix if the word vectors forming the words are formed; the sparse representation coefficient matrix contains coefficient vectors representing coefficients corresponding to the required target basis vectors if the word vectors of the words are formed.
In the embodiment of the present invention, it is considered that the word vector of a word can be obtained by performing weighted combination on s basis vectors in the overcomplete basis matrix, and the s basis vectors are the target basis vectors of the word. That is, the target base vector for each word is used to combine the word vectors for that word.
Because each word can be weighted and combined through the target basis vector corresponding to the word to obtain the word vector of the word, n × s subscripts need to be recorded in total, and the recorded n × s subscripts are the sparse representation subscript matrix. In addition, n × s coefficients need to be recorded in total, and the recorded n × s coefficients are a sparse representation coefficient matrix.
The candidate word obtaining module 400 is configured to, when a candidate word needs to be recommended, obtain a target base vector of the required word according to the overcomplete base matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix, determine a word vector of the required word according to the target base vector, and obtain the candidate word according to the word vector of the required word and the recurrent neural network model.
Wherein the target base vector of each term is used to combine into a term vector for the term; the sparse representation subscript matrix comprises subscript vectors representing the positions of the required target substrate vectors in the overcomplete substrate matrix if the word vectors forming the words are formed; the sparse representation coefficient matrix contains coefficient vectors representing coefficients corresponding to the required target basis vectors if the word vectors of the words are formed.
When candidate words need to be recommended, the target basis vectors of the needed words and the coefficients corresponding to those target basis vectors can be determined through the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix, and weighted combination is carried out to obtain the word vectors of the needed words. Candidate words are then acquired from the word vectors of the needed words and the recurrent neural network: specifically, the word vectors of the needed words are input into the recurrent neural network model, and the model outputs the candidate words, thereby meeting the requirement of recommending candidate words to the user.
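As an illustration of this lookup-and-combine step, here is a minimal sketch in Python/NumPy. The array names `basis`, `subscripts`, and `coefficients` and the word index `i` are hypothetical; the embodiment does not prescribe any particular implementation.

```python
import numpy as np

def reconstruct_word_vector(i, basis, subscripts, coefficients):
    """Rebuild the word vector of word i by weighted combination.

    basis:        (b, e) overcomplete basis matrix
    subscripts:   (n, s) integer matrix; row i holds the positions of
                  word i's target basis vectors within the basis matrix
    coefficients: (n, s) float matrix; row i holds the matching weights
    """
    target_vectors = basis[subscripts[i]]    # (s, e) target basis vectors
    return coefficients[i] @ target_vectors  # weighted combination -> (e,)
```

The reconstructed vector is then fed to the recurrent neural network model, whose output is used to rank candidate words.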
According to the scheme disclosed in the embodiment of the present invention, the word vector matrix is decomposed into the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix, and the word vectors of all words can be recovered from these three matrices. Since the sum of the capacities of the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix is smaller than the capacity of the word vector matrix, the scheme occupies less space on the terminal device and requires less computation than the prior art, thereby reducing the requirements on the storage and computing capacity of the terminal device, so that it can be applied to various terminal devices (such as mobile phones).
Furthermore, because the combined capacity of the overcomplete basis matrix, the sparse representation subscript matrix, and the sparse representation coefficient matrix is small, the installation package of the terminal device application program is reduced in size, which facilitates distribution and downloading in an application store and thus popularization. In addition, the small amount of computation in the scheme improves computing efficiency, so that when the scheme disclosed in the embodiment of the present invention is adopted for text input, the program response time is shortened, lag on the terminal device is reduced, and the user's input experience is improved.
Further, in the input device based on the recurrent neural network disclosed in the embodiment of the present invention, the second matrix obtaining module includes:
the word sorting unit is used for sorting the words according to the word frequency;
the word selecting unit is used for selecting b words with the highest word frequency according to the sequencing result;
and the word vector acquisition unit is used for acquiring the word vectors of the b words by inquiring the word vector matrix.
Through the above units, the words are sorted according to word frequency, the b words with the highest word frequency are selected according to the sorting result, the word vectors of these b words are then obtained from the word vector matrix, and the overcomplete basis matrix is constructed from the word vectors of the b highest-frequency words, which improves the accuracy of the obtained candidate words.
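A minimal sketch of these units in Python/NumPy follows; the input names `word_vectors` and `freq` are hypothetical, and word frequencies are assumed to be available, e.g. counted from a training corpus.

```python
import numpy as np

def build_overcomplete_basis(word_vectors, freq, b):
    """Select the vectors of the b most frequent words as the basis.

    word_vectors: (n, e) word vector matrix
    freq:         (n,) word frequency of each word
    """
    top_b = np.argsort(freq)[::-1][:b]  # indices of the b highest-frequency words
    return word_vectors[top_b]          # (b, e) overcomplete basis matrix
```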
In the input device based on the recurrent neural network disclosed in the embodiment of the present invention, the third matrix acquisition module includes:
a regression problem creation unit, configured to set the word vector of one of the words as x_i and the overcomplete basis matrix as B, and create the following LASSO regression problem:

$$\min_{w_i} \left\| x_i - B^{\top} w_i \right\|_2^2 + \lambda \left\| w_i \right\|_1$$

where λ is a real-valued hyperparameter and w_i is the vector parameter, the dimension of w_i being equal to the number of rows of the overcomplete basis matrix;
a regression problem solving unit, configured to solve the LASSO regression problem and obtain, when the number of non-zero components in w_i is s, the dimensions at which the non-zero components occur in w_i and the values of those non-zero components;
a vector determination unit, configured to determine the sparse representation subscript vector corresponding to the word from the dimensions of the non-zero components in w_i, and determine the sparse representation coefficient vector corresponding to the word from the values of the non-zero components;
and the matrix construction unit is used for constructing a corresponding sparse representation subscript matrix according to the sparse representation subscript vector corresponding to each word, and constructing a corresponding sparse representation coefficient matrix according to the sparse representation coefficient vector corresponding to each word.
In the embodiment of the present invention, when the number of non-zero components in w_i is s, the dimensions at which the non-zero components occur in w_i indicate the sparse representation subscript vector corresponding to the word, and the values of the non-zero components indicate the sparse representation coefficient vector corresponding to the word.
For example, suppose the preset sparse effect parameter s is 2, b is 5, and the number of words n is 10000; in this case, the dimension of w_i is 5. Solving the LASSO regression problem until the number of non-zero components in w_i is s (namely 2) may yield w_i = (0, 0.3, 0, 0, 0.6), where the non-zero components occupy the 2nd and 5th dimensions, so the sparse representation subscript vector corresponding to the word is (2, 5). Since the values of the non-zero components are 0.3 and 0.6 respectively, the sparse representation coefficient vector is (0.3, 0.6).
Because each word vector in the word vector matrix yields one s-dimensional sparse representation subscript vector and one s-dimensional sparse representation coefficient vector, n s-dimensional subscript vectors and n s-dimensional coefficient vectors are constructed in total; the former form the sparse representation subscript matrix and the latter form the sparse representation coefficient matrix, both of size n × s.
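Assembling the two n × s matrices from per-word solutions can be sketched as follows. The `solver` callable stands in for the regression problem solving unit, for example the λ-adjustment routine sketched after the subunit description below; all names here are illustrative.

```python
import numpy as np

def build_sparse_matrices(word_vectors, B, s, solver):
    """Build the n x s subscript and coefficient matrices.

    solver(x_i, B, s) must return, for one word, its subscript vector
    (positions of the target basis vectors in B) and the matching
    coefficient vector, e.g. obtained by LASSO as described in the text.
    """
    n = word_vectors.shape[0]
    subscripts = np.empty((n, s), dtype=np.int32)
    coefficients = np.empty((n, s), dtype=np.float32)
    for i, x_i in enumerate(word_vectors):
        subscripts[i], coefficients[i] = solver(x_i, B, s)
    return subscripts, coefficients
```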
In the input device based on the recurrent neural network disclosed in the embodiment of the present invention, the regression problem solving unit includes: an initial value determining subunit, a regression problem calculation subunit, a non-zero component counting subunit, and a value adjusting subunit.
The initial value determining subunit is configured to set the value of λ to a preset initial value.
The regression problem calculation subunit is configured to substitute the value of λ into the LASSO regression problem and solve it to obtain w_i.
The non-zero component counting subunit is configured to obtain the number of non-zero components in w_i and determine whether that number equals s; if so, it obtains the dimensions of the non-zero components in w_i and the values of the non-zero components; if not, it triggers the value adjusting subunit to perform the corresponding operation.
The value adjusting subunit is configured to enlarge λ according to a preset first adjustment amplitude if the number of non-zero components in w_i is greater than s, and to reduce λ according to a preset second adjustment amplitude if the number of non-zero components in w_i is less than s, and then trigger the regression problem calculation subunit to perform the corresponding operation.
The preset first adjustment amplitude may be, for example, 1.5, in which case the enlarged λ is 1.5 times the value of λ before enlargement. Of course, the preset first adjustment amplitude may also be other values, which is not limited in the embodiment of the present invention.
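A minimal sketch of this λ-adjustment loop, using scikit-learn's `Lasso` as the inner solver: the initial value `lam=0.1`, the iteration cap, and reusing the factor 1.5 for both adjustment amplitudes are assumptions. Note also that scikit-learn scales the squared-error term by 1/(2m), so its `alpha` differs from the λ above by a constant factor, which does not change the adjustment logic.

```python
import numpy as np
from sklearn.linear_model import Lasso

def solve_for_s_nonzeros(x_i, B, s, lam=0.1, factor=1.5, max_rounds=50):
    """Adjust lambda until the LASSO solution w_i has exactly s non-zeros.

    Solves min_w ||x_i - B^T w||^2 + lam * ||w||_1, enlarging lam when
    there are more than s non-zero components and reducing it when
    there are fewer, as described in the text.
    """
    for _ in range(max_rounds):
        model = Lasso(alpha=lam, fit_intercept=False)
        model.fit(B.T, x_i)              # design matrix B^T has shape (e, b)
        w = model.coef_
        nonzero = np.flatnonzero(w)
        if len(nonzero) == s:
            return nonzero, w[nonzero]   # subscript vector, coefficient vector
        lam = lam * factor if len(nonzero) > s else lam / factor
    raise RuntimeError("did not reach exactly s non-zero components")
```

In practice the loop may oscillate around s for ill-conditioned words, which is why the round cap (an assumption, not part of the embodiment) is included.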
Further, in the input device based on the recurrent neural network disclosed in the embodiment of the present invention, the candidate word acquiring module includes:
a first searching unit, configured to search the sparse representation subscript matrix, and obtain a sparse representation subscript vector corresponding to the required word in the sparse representation subscript matrix, where the sparse representation subscript vector is used to indicate a position of a target base vector of the required word in the overcomplete base matrix;
the second searching unit is used for searching the sparse representation coefficient matrix and acquiring a sparse representation coefficient vector corresponding to a required word in the sparse representation coefficient matrix, wherein the sparse representation coefficient vector is used for indicating a coefficient corresponding to a target base vector of the required word;
a target base vector obtaining unit, configured to search the overcomplete base matrix according to the sparse representation subscript vector, and obtain a target base vector of the required word;
and the weighted combination unit is used for carrying out weighted combination according to the target base vector and the coefficient corresponding to the target base vector, and calculating the word vector of the required word.
After the word vector of the required word is obtained, it is input into the recurrent neural network model, which outputs the corresponding candidate words.
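For the final step, the following is a minimal sketch assuming a PyTorch LSTM language model; the architecture, hidden size, and top-k selection are illustrative assumptions, since the embodiment only requires that the reconstructed word vector be fed into a recurrent neural network model that outputs candidate words.

```python
import torch
import torch.nn as nn

class CandidatePredictor(nn.Module):
    """Toy RNN language model: word vector in, candidate-word scores out."""
    def __init__(self, e, hidden, n):
        super().__init__()
        self.rnn = nn.LSTM(input_size=e, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, n)  # one score per vocabulary word

    def forward(self, word_vector, state=None):
        # word_vector: (e,) vector reconstructed from the three matrices;
        # add batch and time dimensions before the recurrent step
        output, state = self.rnn(word_vector.view(1, 1, -1), state)
        return self.out(output[:, -1]), state

model = CandidatePredictor(e=400, hidden=512, n=20000)
scores, state = model(torch.randn(400))         # stand-in reconstructed vector
candidates = scores.topk(5).indices.squeeze(0)  # top-5 candidate word ids
```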
Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented as software plus a required general purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present invention may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
The same and similar parts in the various embodiments in this specification may be referred to each other. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is simple, and the relevant points can be referred to the description in the method embodiment.
The above-described embodiments of the present invention should not be construed as limiting the scope of the present invention.

Claims (8)

1. An input method based on a recurrent neural network, comprising:
obtaining a word vector matrix, wherein the word vector matrix is an n × e matrix, n is the number of words, and e is the word vector dimension of each word;
obtaining word vectors of b words in the word vector matrix, and constructing an overcomplete basis matrix according to the word vectors of the b words, wherein the overcomplete basis matrix is a b × e matrix, each basis vector in the overcomplete basis matrix is the word vector of one of the b words, and b is a positive integer less than n;
constructing a sparse representation subscript matrix and a sparse representation coefficient matrix according to the word vector of each word and the overcomplete basis matrix, wherein the sparse representation subscript matrix and the sparse representation coefficient matrix are both n × s matrices, and s is a preset sparse effect parameter;
when a candidate word needs to be recommended, acquiring a target base vector of the required word according to the overcomplete base matrix, the sparse representation subscript matrix and the sparse representation coefficient matrix, determining a word vector of the required word through the target base vector, and acquiring the candidate word according to the word vector of the required word and the recurrent neural network model;
wherein the target basis vectors of each word are used to combine into the word vector of that word; the sparse representation subscript matrix contains, for each word, a subscript vector indicating the positions in the overcomplete basis matrix of the target basis vectors needed to form that word's word vector; the sparse representation coefficient matrix contains, for each word, a coefficient vector indicating the coefficients applied to those target basis vectors;
when a candidate word needs to be recommended, acquiring a target base vector of a required word according to the overcomplete base matrix, the sparse representation subscript matrix and the sparse representation coefficient matrix, and determining a word vector of the required word according to the target base vector, wherein the method comprises the following steps:
searching the sparse representation subscript matrix, and acquiring a sparse representation subscript vector corresponding to the required word in the sparse representation subscript matrix, wherein the sparse representation subscript vector is used for indicating the position of a target base vector of the required word in the overcomplete base matrix;
searching the sparse representation coefficient matrix, and acquiring a sparse representation coefficient vector corresponding to the required word in the sparse representation coefficient matrix, wherein the sparse representation coefficient vector is used for indicating a coefficient corresponding to a target base vector of the required word;
searching the overcomplete base matrix according to the sparse representation subscript vector to obtain a target base vector of the required word;
and performing weighted combination according to the target base vector and the coefficient corresponding to the target base vector to calculate the word vector of the required word.
2. The recurrent neural network-based input method of claim 1, wherein said obtaining a word vector of b words in said word vector matrix comprises:
ordering the words according to word frequency;
b words with the highest word frequency are selected according to the sequencing result;
and obtaining the word vectors of the b words by inquiring the word vector matrix.
3. The recurrent neural network-based input method of claim 1, wherein said constructing a sparse representation subscript matrix and a sparse representation coefficient matrix from the word vector of each word and the overcomplete basis matrix comprises:
setting the word vector of one of the words as x_i and the overcomplete basis matrix as B, and creating the following LASSO regression problem:

$$\min_{w_i} \left\| x_i - B^{\top} w_i \right\|_2^2 + \lambda \left\| w_i \right\|_1$$

wherein λ is a real-valued hyperparameter and w_i is the vector parameter, the dimension of w_i being equal to the number of rows of the overcomplete basis matrix;
solving the LASSO regression problem to obtain, when the number of non-zero components in w_i is s, the dimensions at which the non-zero components occur in w_i and the values of the non-zero components;
determining the sparse representation subscript vector corresponding to the word according to the dimensions of the non-zero components in w_i, and determining the sparse representation coefficient vector corresponding to the word according to the values of the non-zero components;
and constructing a corresponding sparse representation subscript matrix according to the sparse representation subscript vector corresponding to each word, and constructing a corresponding sparse representation coefficient matrix according to the sparse representation coefficient vector corresponding to each word.
4. The recurrent neural network-based input method of claim 3, wherein said solving the LASSO regression problem to obtain, when the number of non-zero components in w_i is s, the dimensions at which the non-zero components occur in w_i and the values of the non-zero components comprises:
41) determining the value of λ as a preset initial value;
42) substituting the value of λ into the LASSO regression problem and calculating to obtain w_i;
43) obtaining the number of non-zero components in w_i and determining whether that number equals s; if so, obtaining the dimensions of the non-zero components in w_i and the values of the non-zero components; if not, performing the operation of step 44);
44) if the number of non-zero components in w_i is greater than s, enlarging λ according to a preset first adjustment amplitude; if the number of non-zero components in w_i is less than s, reducing λ according to a preset second adjustment amplitude; then returning to perform the operation of step 42).
5. An input device based on a recurrent neural network, comprising:
the first matrix obtaining module is used for obtaining a word vector matrix, wherein the word vector matrix is an n × e matrix, n is the number of words, and e is the word vector dimension of each word;
a second matrix obtaining module, configured to obtain word vectors of b words in the word vector matrix and construct an overcomplete basis matrix according to the word vectors of the b words, wherein the overcomplete basis matrix is a b × e matrix, each basis vector in the overcomplete basis matrix is the word vector of one of the b words, and b is a positive integer smaller than n;
the third matrix acquisition module is used for constructing a sparse representation subscript matrix and a sparse representation coefficient matrix according to the word vector of each word and the overcomplete basis matrix, wherein the sparse representation subscript matrix and the sparse representation coefficient matrix are both n × s matrices, and s is a preset sparse effect parameter;
the candidate word acquisition module is used for acquiring a target base vector of a required word according to the overcomplete base matrix, the sparse representation subscript matrix and the sparse representation coefficient matrix when the candidate word needs to be recommended, determining a word vector of the required word according to the target base vector, and acquiring the candidate word according to the word vector of the required word and the recurrent neural network model;
wherein the target basis vectors of each word are used to combine into the word vector of that word; the sparse representation subscript matrix contains, for each word, a subscript vector indicating the positions in the overcomplete basis matrix of the target basis vectors needed to form that word's word vector; the sparse representation coefficient matrix contains, for each word, a coefficient vector indicating the coefficients applied to those target basis vectors;
the candidate word acquisition module comprises:
a first searching unit, configured to search the sparse representation subscript matrix, and obtain a sparse representation subscript vector corresponding to the required word in the sparse representation subscript matrix, where the sparse representation subscript vector is used to indicate a position of a target base vector of the required word in the overcomplete base matrix;
the second searching unit is used for searching the sparse representation coefficient matrix and acquiring a sparse representation coefficient vector corresponding to the required word in the sparse representation coefficient matrix, wherein the sparse representation coefficient vector is used for indicating a coefficient corresponding to a target base vector of the required word;
a target base vector obtaining unit, configured to search the overcomplete base matrix according to the sparse representation subscript vector, and obtain a target base vector of the required word;
and the weighted combination unit is used for carrying out weighted combination according to the target base vector and the coefficient corresponding to the target base vector, and calculating the word vector of the required word.
6. The recurrent neural network-based input device of claim 5, wherein said second matrix acquisition module comprises:
the word sorting unit is used for sorting the words according to the word frequency;
the word selecting unit is used for selecting b words with the highest word frequency according to the sequencing result;
and the word vector acquisition unit is used for acquiring the word vectors of the b words by inquiring the word vector matrix.
7. The recurrent neural network-based input device of claim 5, wherein said third matrix acquisition module comprises:
a regression problem creation unit, configured to set the word vector of one of the words as x_i and the overcomplete basis matrix as B, and create the following LASSO regression problem:

$$\min_{w_i} \left\| x_i - B^{\top} w_i \right\|_2^2 + \lambda \left\| w_i \right\|_1$$

wherein λ is a real-valued hyperparameter and w_i is the vector parameter, the dimension of w_i being equal to the number of rows of the overcomplete basis matrix;
a regression problem solving unit, configured to solve the LASSO regression problem and obtain, when the number of non-zero components in w_i is s, the dimensions at which the non-zero components occur in w_i and the values of the non-zero components;
a vector determination unit, configured to determine the sparse representation subscript vector corresponding to the word according to the dimensions of the non-zero components in w_i, and determine the sparse representation coefficient vector corresponding to the word according to the values of the non-zero components;
and the matrix construction unit is used for constructing a corresponding sparse representation subscript matrix according to the sparse representation subscript vector corresponding to each word, and constructing a corresponding sparse representation coefficient matrix according to the sparse representation coefficient vector corresponding to each word.
8. The recurrent neural network-based input device of claim 7, wherein the regression problem solving unit comprises: an initial value determining subunit, a regression problem calculation subunit, a non-zero component counting subunit, and a value adjusting subunit;
the initial value determining subunit is configured to determine the value of λ as a preset initial value;
the regression problem calculation subunit is configured to substitute the value of λ into the LASSO regression problem and calculate to obtain w_i;
the non-zero component counting subunit is configured to obtain the number of non-zero components in w_i and determine whether that number equals s; if so, obtain the dimensions of the non-zero components in w_i and the values of the non-zero components; if not, trigger the value adjusting subunit to perform the corresponding operation;
the value adjusting subunit is configured to enlarge λ according to a preset first adjustment amplitude if the number of non-zero components in w_i is greater than s, reduce λ according to a preset second adjustment amplitude if the number of non-zero components in w_i is less than s, and then trigger the regression problem calculation subunit to perform the corresponding operation.
CN201711217459.7A 2017-11-28 2017-11-28 Input method and device based on recurrent neural network Active CN108009150B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711217459.7A CN108009150B (en) 2017-11-28 2017-11-28 Input method and device based on recurrent neural network

Publications (2)

Publication Number Publication Date
CN108009150A CN108009150A (en) 2018-05-08
CN108009150B true CN108009150B (en) 2021-01-05

Family

ID=62054262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711217459.7A Active CN108009150B (en) 2017-11-28 2017-11-28 Input method and device based on recurrent neural network

Country Status (1)

Country Link
CN (1) CN108009150B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522052B (en) * 2018-11-27 2020-05-08 中科寒武纪科技股份有限公司 Computing device and board card
CN112614500A (en) * 2019-09-18 2021-04-06 北京声智科技有限公司 Echo cancellation method, device, equipment and computer storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002059772A2 (en) * 2000-11-09 2002-08-01 Hrl Laboratories, Llc Blind decomposition using fourier and wavelet transforms
CN103280221B (en) * 2013-05-09 2015-07-29 北京大学 A kind of audio lossless compressed encoding, coding/decoding method and system of following the trail of based on base
CN106874292B (en) * 2015-12-11 2020-05-05 北京国双科技有限公司 Topic processing method and device
CN106875280A (en) * 2017-03-02 2017-06-20 深圳大图科创技术开发有限公司 Integrated community service platform

Also Published As

Publication number Publication date
CN108009150A (en) 2018-05-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant