CN107422872A - Input method, input apparatus, and device for input - Google Patents

Input method, input apparatus, and device for input

Info

Publication number
CN107422872A
Authority
CN
China
Prior art keywords
word
sequence
vector
tuple
score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610350134.5A
Other languages
Chinese (zh)
Other versions
CN107422872B (en)
Inventor
崔欣
张扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201610350134.5A
Publication of CN107422872A
Application granted
Publication of CN107422872B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the present invention provide an input method, an input apparatus, and a device for input. The input method specifically includes: obtaining a first vector sequence corresponding to an input string; calculating, according to a preset n-ary relation calculation rule, a first n-ary relation score corresponding to the first vector sequence; and determining, according to the first n-ary relation score, candidates corresponding to the input string. The embodiments need to store only the vectors used to obtain the first vector sequence and need not store all n-ary relations with n greater than or equal to 2, which can save a large amount of storage space.

Description

Input method, input apparatus, and device for input
Technical field
The present invention relates to the technical field of input methods, and in particular to an input method, an input apparatus, and a device for input.
Background
Users of languages such as Chinese, Japanese, and Korean typically interact with a computer through an input method system. For example, a user types an input string on a keyboard; the input method system converts the input string into candidates in the corresponding language according to its preset standard mapping rules and displays them, and then commits the candidate selected by the user to the screen.
With the continuous development of input method technology and the continuous improvement of the input experience, user demand for entering long words or sentences is also growing, for example long words such as "catching crabs at the seaside", "the provident fund drops every day", "United States of America Germany", and "the weather is really sunny today". To meet this demand under the traditional n-gram (n-ary relation) storage scheme, the system dictionary would need to store triples such as "seaside | catch | crab", or tuples of even higher order.
In practice, however, when n is greater than or equal to 3, the number of n-ary relations to be stored grows geometrically, and memory-limited input devices such as mobile phones and tablet computers clearly cannot hold a full n-gram storage structure. System dictionaries therefore usually store only the 2-gram (binary) relations. It can be seen that the existing n-gram storage scheme cannot meet the demand for n-ary relations when storage space is limited.
Summary of the invention
In view of the above problems, embodiments of the present invention are proposed to provide an input method, an input apparatus, and a device for input that overcome the above problems or at least partially solve them, and that can simplify the input process of mathematical expressions and improve input efficiency.
To solve the above problems, an embodiment of the present invention discloses an input method, including:
obtaining a first vector sequence corresponding to an input string;
calculating, according to a preset n-ary relation calculation rule, a first n-ary relation score corresponding to the first vector sequence; and
determining, according to the first n-ary relation score, candidates corresponding to the input string.
Optionally, the step of obtaining the first vector sequence corresponding to the input string includes:
segmenting the user's input string by meta-word to obtain a first character segmentation result;
obtaining a first meta-word sequence corresponding to the first character segmentation result;
querying an established word vector library to obtain the vector corresponding to each meta-word in the first meta-word sequence; and
concatenating, in order, the vectors corresponding to the meta-words in the first meta-word sequence to obtain the first vector sequence corresponding to the input string.
Optionally, the word vector library is established as follows:
obtaining the meta-word numbers corresponding to the meta-words in a dictionary;
generating a corresponding vector for each meta-word in the dictionary; and
establishing the word vector library according to the mapping between the meta-word numbers and the vectors.
Optionally, the method further includes:
obtaining a system word sequence corresponding to the input string; and
determining a second n-ary relation score corresponding to the system word sequence;
in which case the step of determining, according to the first n-ary relation score, the candidates corresponding to the input string includes:
determining the candidates corresponding to the input string according to a ranking of the first n-ary relation score and the second n-ary relation score.
Optionally, the step of obtaining the system word sequence corresponding to the input string includes:
segmenting the input string by system word to obtain a second character segmentation result; and
obtaining the system word sequence corresponding to the second character segmentation result.
Optionally, the step of determining the second n-ary relation score corresponding to the system word sequence includes:
querying a system dictionary for the word frequency of each system word in the system word sequence, and calculating a unigram score corresponding to the system word sequence;
when a binary relation exists in the system word sequence, calculating, according to the binary relation, a bigram score corresponding to the system word sequence; and
determining the second n-ary relation score corresponding to the system word sequence according to the unigram score and the bigram score.
Optionally, the method further includes:
obtaining a second meta-word sequence corresponding to the preceding text and/or following text of the input string;
querying the established word vector library to obtain the vector corresponding to each meta-word in the second meta-word sequence;
concatenating, in order, the vectors corresponding to the meta-words in the second meta-word sequence to obtain a second vector sequence; and
calculating a third n-ary relation score between the first vector sequence and the second vector sequence, and adjusting the ranking of the candidates corresponding to the input string according to the third n-ary relation score.
Optionally, the method further includes:
obtaining, according to the preceding text and/or following text of the input string, association candidates corresponding to the input;
obtaining a third vector sequence corresponding to the association candidates; and
calculating a fourth n-ary relation score between the second vector sequence and the third vector sequence, and ranking and displaying the association candidates according to the fourth n-ary relation score.
In another aspect, an embodiment of the present invention discloses an input apparatus, including:
a first vector sequence obtaining module, configured to obtain a first vector sequence corresponding to an input string;
a first n-ary relation calculation module, configured to calculate, according to a preset n-ary relation calculation rule, a first n-ary relation score corresponding to the first vector sequence; and
a candidate determining module, configured to determine, according to the first n-ary relation score, candidates corresponding to the input string.
In yet another aspect, an embodiment of the present invention discloses a device for input, including a memory and one or more programs, where the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs containing instructions for performing the following operations:
obtaining a first vector sequence corresponding to an input string;
calculating, according to a preset n-ary relation calculation rule, a first n-ary relation score corresponding to the first vector sequence; and
determining, according to the first n-ary relation score, candidates corresponding to the input string.
The embodiments of the present invention have the following advantages:
In the embodiments of the present invention, the first vector sequence corresponding to an input string can be evaluated according to a preset n-ary relation calculation rule to obtain the n-ary relation score corresponding to the first vector sequence, and the candidates corresponding to the input string can be determined according to that score, so that the candidates obtained reflect the n-ary relations in the input string. Because the n-ary relation score is calculated from the first vector sequence rather than looked up in a dictionary, the embodiments need to store only the vectors used to obtain the first vector sequence and need not store all n-ary relations with n greater than or equal to 2, which can save a large amount of storage space.
Brief description of the drawings
Fig. 1 is a flowchart of the steps of Embodiment 1 of an input method of the present invention;
Fig. 2 is a flowchart of the steps of an embodiment of a method for generating a word vector library of the present invention;
Fig. 3 is a flowchart of the steps of Embodiment 3 of an input method of the present invention;
Fig. 4 is a flowchart of the steps of Embodiment 4 of an input method of the present invention;
Fig. 5 is a flowchart of the steps of Embodiment 5 of an input method of the present invention;
Fig. 6 is a structural block diagram of an embodiment of an input apparatus of the present invention;
Fig. 7 is a block diagram of a device 800 for input of the present invention; and
Fig. 8 is a schematic structural diagram of a server of the present invention.
Detailed description of the embodiments
To make the above objects, features, and advantages of the present invention easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
One of the core ideas of the embodiments of the present invention is a scheme in which, during input with an input method, an n-ary relation score is calculated by a preset n-ary relation calculation rule, and the candidates corresponding to the input string are determined according to that score, so that the candidates obtained reflect the n-ary relations in the input string. In this scheme, because the n-ary relation score is calculated from the first vector sequence rather than looked up in a dictionary, the embodiments need to store only the vectors used to obtain the first vector sequence and need not store all n-ary relations with n greater than or equal to 2, which can save a large amount of storage space.
Method embodiment 1
Referring to Fig. 1, a flowchart of the steps of Embodiment 1 of an input method of the present invention is shown, which may specifically include the following steps:
Step 101: obtain a first vector sequence corresponding to an input string;
Step 102: calculate, according to a preset n-ary relation calculation rule, a first n-ary relation score corresponding to the first vector sequence;
Step 103: determine, according to the first n-ary relation score, candidates corresponding to the input string.
The embodiments of the present invention can be applied to input method systems with various input modes, such as pinyin input, English input, stroke input, voice input, and handwriting input. The user can enter the input string through any of these input modes; that is, the user can input through a physical keyboard, a virtual keyboard, a handwriting pad, a touch screen, a voice capture device, and so on. The input string may consist of any one or more of digits, symbols, pinyin, English letters, and the like. For ease of description, the embodiments are illustrated with a pinyin string as the input string; other types of input string can be handled by analogy.
During input with an input method, obtaining more n-ary relations (n greater than or equal to 2) requires a considerable amount of storage space for the n-gram structure. For terminal devices with limited storage space (such as mobile phones), it is difficult to store n-ary relations for all n greater than or equal to 2, so usually only the 2-gram structure, that is, the binary relation, is stored; yet storing only binary relations makes it difficult to meet user demand for entering long words or sentences. To solve this problem, the embodiments propose a scheme that obtains n-ary relations by computing on the input vectors corresponding to the input string. With this scheme, user demand for entering long words or sentences can be met without storing all n-ary relations with n greater than or equal to 2, and storage space can thus be saved. Here, "all n-ary relations with n greater than or equal to 2" may specifically include the complete n-gram structures, such as the 2-gram, 3-gram, and 4-gram structures up to the n-gram structure, or long words or sentences that carry n-ary relations, such as "we eat together" and "the weather is really sunny today".
In an optional embodiment of the present invention, the step of obtaining the first vector sequence corresponding to the input string may specifically include:
Sub-step S11: segment the user's input string by meta-word to obtain a first character segmentation result;
Sub-step S12: obtain a first meta-word sequence corresponding to the first character segmentation result;
Sub-step S13: query the established word vector library to obtain the vector corresponding to each meta-word in the first meta-word sequence;
Sub-step S14: concatenate, in order, the vectors corresponding to the meta-words in the first meta-word sequence to obtain the first vector sequence corresponding to the input string.
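Sub-steps S12 to S14 can be illustrated with a minimal Python sketch. It assumes the segmentation of sub-step S11 has already been done, and the two lookup tables, the two-dimensional vectors, and the function name are hypothetical toy values chosen only for illustration; they are not data or an interface from the patent.

```python
from itertools import product

# Hypothetical toy tables (assumptions for illustration only):
# pinyin segment -> candidate meta-words, and meta-word -> vector.
SEGMENT_TO_META_WORDS = {
    "mantian": ["boundless", "all over the sky"],
    "daxue": ["heavy snow", "university"],
}
WORD_VECTORS = {
    "boundless": [0.5, 0.1],
    "all over the sky": [0.4, 0.3],
    "heavy snow": [0.6, 0.2],
    "university": [-0.2, 0.9],
}

def first_vector_sequences(segments):
    """Return (meta-word sequence, concatenated vector) pairs for a segmented input."""
    results = []
    # S12: every choice of one meta-word per segment is one meta-word sequence.
    for words in product(*(SEGMENT_TO_META_WORDS[s] for s in segments)):
        vector = []
        for word in words:
            # S13 + S14: look up each vector and concatenate in order.
            vector.extend(WORD_VECTORS[word])
        results.append((words, vector))
    return results

sequences = first_vector_sequences(["mantian", "daxue"])
```

Each returned pair holds one meta-word sequence and its concatenated vector, which is what a scoring rule would then consume.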
In the embodiments of the present invention, a meta-word denotes a vocabulary item that is both conceptually independent and conceptually atomic. Conceptual independence means that the concept expressed by the item has a separate and complete meaning; conceptual atomicity means that the concept expressed by a meta-word is a most basic conceptual unit that cannot be split further, either in meaning or literally. For example, "mathematics" is a meta-word: it expresses an independent concept and is also a unit concept, and it cannot be split further into its individual characters. "Mathematical model", by contrast, is not a meta-word: although it expresses an independent concept, it can be further split into the two meta-words "mathematics" and "model".
The embodiments represent meta-words as vectors and establish a word vector library that stores the correspondence between meta-words and vectors; the vector corresponding to a meta-word can be obtained by querying the library. Because meta-words are represented as vectors, the correlation between meta-words can be obtained by numerical computation without storing the n-ary relations of the n-gram structure, and the n-ary relation between meta-words can then be calculated by a model.
In one application example of the present invention, suppose the received input string is "mantiandaxue". Segmenting the input string by meta-word gives the following character segmentation result: [mantian] [daxue]. The meta-word sequences corresponding to this segmentation result may include: boundless | heavy snow, all over the sky | heavy snow, boundless | university, all over the sky | university, and so on. Querying the established word vector library gives the vector for each meta-word in these sequences: V1 for "boundless", V2 for "all over the sky", V3 for "heavy snow", and V4 for "university". The first vector sequences corresponding to the input string "mantiandaxue" may then include (V1, V3), (V1, V4), (V2, V3), and (V2, V4). Next, the first n-ary relation score of each of these first vector sequences is calculated according to the preset n-ary relation calculation rule; here this is a binary relation score. For example, the calculated binary relation score is 90 for (V1, V3), 10 for (V1, V4), 60 for (V2, V3), and 2 for (V2, V4). The first vector sequence (V1, V3) has the highest binary relation score, that is, the connection between the meta-words "boundless" and "heavy snow" is the strongest, so "whirling snow" (the combination of "boundless" and "heavy snow") can be output as a candidate.
Optionally, the candidates whose n-ary relation score meets a preset threshold can be ranked and output. In the application example above, if the preset threshold is set to 55, both "whirling snow" and "heavy snow all over the sky" can be taken as candidates and output in order of their binary relation scores.
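The threshold step in this example can be written out as a short Python sketch using the scores quoted above (90, 10, 60, 2) and the threshold 55. The function name is hypothetical, and treating "meets the threshold" as "greater than or equal to" is an assumption; the text does not say whether the comparison is strict.

```python
def rank_candidates(scored, threshold):
    """Keep candidates whose score meets the threshold, highest score first."""
    # Assumption: "meets the threshold" is read as score >= threshold.
    kept = [(candidate, score) for candidate, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Binary relation scores from the application example.
scored = [
    ("boundless | heavy snow", 90),
    ("boundless | university", 10),
    ("all over the sky | heavy snow", 60),
    ("all over the sky | university", 2),
]
ranked = rank_candidates(scored, threshold=55)
```

With these inputs only the two "heavy snow" sequences survive, in descending order of score.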
In the embodiments of the present invention, a preset model can be used to calculate the first n-ary relation score of the first vector sequence. The preset model may specifically be a multilayer neural network. Its input may be a vector sequence and its output may be a probability value representing the n-ary relation score. To train the preset model, existing preceding text and the corresponding candidates can be used as the training set; the vector representation of each meta-word is obtained, training is performed against the desired output (0 or 1), and the parameters of all nodes in the multilayer neural network are finally obtained.
For example, concatenating the vectors corresponding to three meta-words gives the vector sequence (V1, V2, V3), which can be used as the input of this model. The output of the model is a probability value: the larger the probability value, the stronger the ternary relation among the three meta-words; conversely, the smaller the value, the weaker the ternary relation. It should be understood that obtaining a vector sequence by vector concatenation is only one application example of the present invention; the embodiments place no limit on the specific way of obtaining the vector sequence. For example, a convolution-like operation can also be used to obtain a vector sequence of fixed window size. Specifically, a CNN (Convolutional Neural Network) can be used to process several vectors: regardless of the number and size of the vectors fed into it, the CNN model is able to integrate the input vectors and output a vector sequence of fixed dimension.
In another application example of the present invention, suppose the user wants to enter "the weather is really sunny today". The meta-word sequence corresponding to the input string is obtained, and a sliding window of size 3 is slid to the right to obtain the ternary relation between every three adjacent meta-words in the input string. The meta-word sequence in the first window may be "today | weather | really". The vector corresponding to each of these meta-words is obtained by querying the word vector library, and the vectors are concatenated end to end to obtain the corresponding vector sequence, which is used as the input to the preset model; the output of the preset model is the calculated ternary relation score. The window is then slid to the right and the ternary relation score of "weather | really | sunny" is calculated, and the sequences with high ternary relation scores are output as candidates.
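The sliding-window step of this example can be sketched in Python as follows. The function name is hypothetical, and the sketch covers only the windowing; scoring each window with the preset model is left out.

```python
def sliding_windows(meta_words, size=3):
    """Return every run of `size` adjacent meta-words, moving left to right."""
    return [
        tuple(meta_words[i:i + size])
        for i in range(len(meta_words) - size + 1)
    ]

windows = sliding_windows(["today", "weather", "really", "sunny"])
```

For the four meta-words of the example this yields the two windows "today | weather | really" and "weather | really | sunny"; an input shorter than the window yields no windows.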
In summary, in the embodiments of the present invention, the first vector sequence corresponding to an input string can be evaluated according to a preset n-ary relation calculation rule to obtain the n-ary relation score corresponding to the first vector sequence, and the candidates corresponding to the input string can be determined according to that score, so that the candidates obtained reflect the n-ary relations in the input string. Because the n-ary relation score is calculated from the first vector sequence rather than looked up in a dictionary, the embodiments need to store only the vectors used to obtain the first vector sequence and need not store all n-ary relations with n greater than or equal to 2, which can save a large amount of storage space.
Method embodiment 2
Building on Embodiment 1 above, this embodiment describes in detail the process of generating the word vector library. Referring to Fig. 2, a flowchart of the steps of an embodiment of a method for generating a word vector library of the present invention is shown, which may specifically include:
Step 201: obtain the meta-word numbers corresponding to the meta-words in a dictionary;
Step 202: generate a corresponding word vector for each meta-word in the dictionary;
Step 203: establish the word vector library according to the mapping between the meta-word numbers and the word vectors.
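Steps 201 to 203 can be sketched in Python as below. The text does not say how the vectors are generated, so the random initialisation here is only a placeholder assumption standing in for trained vectors; the function name and the four-dimensional size are likewise hypothetical. The sketch skips entries numbered 0, following the convention used later in the description for system words that are not meta-words.

```python
import random

def build_word_vector_library(dictionary, dim=4, seed=0):
    """Map each meta-word number to a vector (placeholder random values)."""
    rng = random.Random(seed)
    library = {}
    for word, number in dictionary.items():  # Step 201: meta-word numbers
        if number == 0:
            continue  # number 0 marks an entry that is not a meta-word
        # Step 202: generate a vector (assumption: random stand-in for training).
        vector = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        library[number] = vector  # Step 203: number -> vector mapping
    return library

dictionary = {
    "all over the sky": 39,
    "boundless": 23,
    "university": 67,
    "heavy snow": 89,
    "United States of America": 0,
}
library = build_word_vector_library(dictionary)
```

Keying the library by meta-word number rather than by the word itself matches the mapping named in step 203.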
The dictionaries in the embodiments of the present invention may specifically include a system dictionary, a user dictionary, a system n-gram library, and a word vector library. The system dictionary may be a unigram vocabulary of frequently input words obtained from corpus statistics; the user dictionary is a unigram or multi-gram vocabulary, collected from the user's input behavior, that matches the user's input habits; the system n-gram library may be an n-gram vocabulary obtained by counting the connection relations between two or more words in a corpus, usually a 2-gram (binary) vocabulary; and the word vector library may be a vector vocabulary obtained by representing the meta-words in the system dictionary as vectors.
It should be understood that the word vector library in the embodiments can be established from any of the above dictionaries. For ease of description, the embodiments are illustrated by establishing the word vector library from the meta-words in the system dictionary; establishing it from the meta-words in other dictionaries can be handled by analogy.
To save the storage space taken up by the dictionaries, the system n-gram library used in the embodiments may store only binary relations, while n-ary relations of order three and above are calculated from vectors. Of course, in practice the n-ary relations to store can be chosen according to the processing or storage capacity of the system; for example, relations of order three and above may also be stored, and relations that are not stored can be calculated with the vectors of the present invention. In short, the embodiments place no limit on the specific content stored in the system n-gram library.
In a specific application, the n-gram vocabulary can be used to represent the connection relation between two or more words. Taking the binary relation "all over the sky | heavy snow" as an example, in the n-gram vocabulary the bigram frequency of the two words can represent the strength of the connection between them. The embodiments instead represent the meta-words in the system dictionary as vectors and calculate a score from two vectors to represent the strength of the connection between two meta-words. In this way, only the word vectors corresponding to the meta-words need to be stored, n-ary relations can be obtained by calculation, and the actual n-ary relations need not be stored, which saves a large amount of storage space.
In the embodiments of the present invention, the system dictionary can store each system word together with its word frequency, meta-word number, and other information. Specifically, system words can be stored in the following format: system entry i | word frequency i | meta-word number i. The meta-word number can be a positive integer, that is, a system entry is represented by an integer. For example, the system dictionary stores the following system words: all over the sky | 506 | 39, boundless | 501 | 23, university | 701 | 67, heavy snow | 302 | 89, and so on. Here the word frequency of the system word "all over the sky" is 506, and because "all over the sky" is itself a meta-word, its meta-word number is 39. In another example of the present invention, the system dictionary stores the following system word: United States of America | 368 | 0. The word frequency of "United States of America" is 368, and because it is not a meta-word, its meta-word number can be set to 0.
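The storage format just described can be read with a few lines of Python. This is a sketch that assumes the three fields are separated by "|" exactly as in the examples; the function and key names are hypothetical.

```python
def parse_system_entry(line):
    """Split 'entry | word frequency | meta-word number' into typed fields."""
    word, freq, number = (field.strip() for field in line.split("|"))
    return {"word": word, "freq": int(freq), "meta_word_number": int(number)}

entries = [
    parse_system_entry(text)
    for text in ("all over the sky | 506 | 39", "United States of America | 368 | 0")
]
# A meta-word number of 0 marks a system word that is not a meta-word.
meta_words = [e["word"] for e in entries if e["meta_word_number"] != 0]
```

Filtering on the number 0 separates meta-words such as "all over the sky" from longer system words such as "United States of America".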
In the embodiments of the present invention, the system n-gram library can store the n-ary relations between meta-word numbers. Taking the binary relation as an example, binary relations between meta-word numbers can be stored in the following format: meta-word i | meta-word j | bigram frequency. The bigram frequency represents the strength of the connection between meta-word i and meta-word j. For example, the system n-gram library stores the following binary relations: 23 | 89 | 8 and 23 | 67 | 1. Querying the system dictionary shows that meta-word number 23 corresponds to "boundless" and meta-word number 89 to "heavy snow", so the bigram frequency between "boundless" and "heavy snow" is 8; meta-word number 67 corresponds to "university", so the bigram frequency between "boundless" and "university" is 1. It can be seen that the connection between "boundless" and "heavy snow" is stronger than that between "boundless" and "university".
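The lookup in this example can be sketched with a Python dictionary keyed by ordered pairs of meta-word numbers. The table names and the choice to return 0 for a pair that is not stored are assumptions made for illustration.

```python
# Values taken from the example: number -> meta-word, and (i, j) -> bigram frequency.
NUMBER_TO_WORD = {23: "boundless", 89: "heavy snow", 67: "university"}
BIGRAMS = {(23, 89): 8, (23, 67): 1}

def bigram_frequency(first, second):
    """Return the stored bigram frequency, or 0 if the ordered pair is absent."""
    return BIGRAMS.get((first, second), 0)

# The pair with the highest frequency has the strongest connection.
strongest = max(BIGRAMS, key=BIGRAMS.get)
strongest_words = tuple(NUMBER_TO_WORD[n] for n in strongest)
```

The key is an ordered pair, so "heavy snow" followed by "boundless" is a different relation from the stored one and has no entry here.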
In a specific application, because storage space is limited, the system n-gram library usually stores only binary relations. To obtain more n-ary relations, the embodiments of the present invention represent the meta-words in the system dictionary as vectors and calculate the n-ary relations between meta-words by computing on the vectors. Specifically, the vector corresponding to a meta-word can be stored in the following format: meta-word i | vector <v1, v2, …, vd>. The vector can be a multi-dimensional vector; for example, the vector <v1, v2, …, vd> above is a d-dimensional vector.
In one application example of the present invention, the word vector library stores the following vectors: 39 | <0.5, 0.97, …, 0.65> and 89 | <0.43, 0.67, …, 0.12>. Querying the system dictionary shows that meta-word number 39 corresponds to "all over the sky", so the meta-word "all over the sky" can be represented by the vector <0.5, 0.97, …, 0.65>; meta-word number 89 corresponds to "heavy snow", so the meta-word "heavy snow" can be represented by the vector <0.43, 0.67, …, 0.12>. Computing on these two vectors gives the strength of the binary relation between the meta-words "all over the sky" and "heavy snow".
The vector corresponding to a meta-word can be obtained by the distributed representation method for words, that is, a word is represented by a multi-dimensional vector. In the example above, the word "all over the sky" is represented by the vector <0.5, 0.97, …, 0.65>.
Once words are represented as vectors, the strength of the connection between words can be calculated from the vectors corresponding to those words. Specifically, the calculation over multiple vectors can use the vector inner product or other model-based calculation methods; it should be understood that the embodiments place no limit on how the vectors are calculated.
In a kind of alternative embodiment of the present invention, following vectorial calculation can be provided:
Mode one
Compute by vector inner product. For example, the n-tuple relation score between the vectors $d_1 = (d_{11}, d_{12}, d_{13}, \dots, d_{1n})$ and $d_2 = (d_{21}, d_{22}, d_{23}, \dots, d_{2n})$ is computed with the following formula:
$$\mathrm{Result} = d_{11} d_{21} + d_{12} d_{22} + d_{13} d_{23} + d_{14} d_{24} + \dots + d_{1n} d_{2n} \tag{1}$$
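Formula (1) can be written as a short function. This is a sketch only; the name `inner_product_score` and the dimension check are additions for illustration and do not come from the source.

```python
def inner_product_score(d1, d2):
    """Formula (1): sum of the products of matching components."""
    if len(d1) != len(d2):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(d1, d2))

# Toy vectors chosen so the result is easy to check by hand: 4 + 10 + 18.
example_score = inner_product_score([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```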
Mode two
Compute with an NNLM (Neural Network Language Model). Specifically, each NNLM can fix the number of words fed to its input layer. For example, if the input word window of the NNLM is set to 3, the input layer has 3 × D nodes, where D is the vector dimension. The vectors V1, V2 and V3 of the three most recent words are concatenated end to end into the vector sequence V(V1, V2, V3), which is fed into the NNLM; the output gives the n-tuple relation score of the vectors V1, V2 and V3.
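The fixed-window input described here can be sketched as follows. The text does not specify the network's layers or weights, so the single tanh hidden layer, the softmax output and all weight values below are assumptions chosen only to show how three D-dimensional vectors become one 3 × D input.

```python
import math

def concat(vectors):
    """End-to-end concatenation: three D-dim vectors become one 3*D-dim input."""
    return [x for v in vectors for x in v]

def forward(x, w_hidden, w_out):
    """One tanh hidden layer followed by a softmax over the output units."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

D = 2                                             # toy vector dimension
v1, v2, v3 = [0.1, 0.2], [0.3, 0.4], [0.5, 0.6]
x = concat([v1, v2, v3])                          # input layer of 3 * D nodes
w_hidden = [[0.1] * (3 * D), [-0.1] * (3 * D)]    # 2 hidden units, toy weights
w_out = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]      # 3 output units, toy weights
probs = forward(x, w_hidden, w_out)
```

Here the output is read as a probability for each of three hypothetical output words; a trained model would learn the weights rather than use fixed toy values.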
Mode three
Compute with an RNN (Recurrent Neural Network). Specifically, the number of words in the input window need not be limited. The vector of each word is fed into the RNN to obtain a hidden-layer representation; this hidden-layer representation is then combined with the next input to form the next input to the RNN. The number of neurons in the output layer equals the size of the meta-word vocabulary, and the output of each neuron is the predicted probability of the corresponding word.
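The recurrence described here can be sketched with a plain tanh cell. The cell type, the weight matrices `w_xh`, `w_hh`, `w_hy` and their values are assumptions for illustration; the source only states that the hidden representation is carried into the next step and that the output layer has one neuron per vocabulary word.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rnn_step(x, h_prev, w_xh, w_hh):
    """Combine the current word vector with the previous hidden state."""
    return [math.tanh(dot(w_xh[i], x) + dot(w_hh[i], h_prev))
            for i in range(len(h_prev))]

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_word_probabilities(word_vectors, w_xh, w_hh, w_hy):
    """Feed any number of word vectors, then score every vocabulary word."""
    h = [0.0] * len(w_hh)
    for x in word_vectors:          # the window length is not fixed
        h = rnn_step(x, h, w_xh, w_hh)
    return softmax([dot(row, h) for row in w_hy])

w_xh = [[0.5, -0.5], [0.25, 0.25]]                          # 2 hidden units, 2-dim input
w_hh = [[0.1, 0.0], [0.0, 0.1]]
w_hy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]   # toy vocabulary of 4 words
probs = next_word_probabilities(
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]], w_xh, w_hh, w_hy)
```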
In an optional embodiment of the present invention, one meta-word can correspond to several different vector representations, so that the n-tuple relation computed from the vectors is more accurate in different input scenes. For example, different vector representations can be used for input scenes such as QQ (instant messaging software), maps, games and Word (a word processing program). A given meta-word may denote a place name in a map scene yet carry a different meaning in other scenes. Assigning several different vectors to a meta-word therefore allows the vector matching the current input scene to be retrieved, which improves the accuracy of the vector representation.
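One way to hold several vectors per word is to key the library by word number and scene. The sketch below is hypothetical throughout: the scene names, the vector values and the fallback to a default vector when no scene-specific vector exists are assumptions, not behaviour stated in the source.

```python
DEFAULT_SCENE = "default"

# Hypothetical library: the same word number has a different vector in the map scene.
scene_vectors = {
    (39, DEFAULT_SCENE): [0.5, 0.97, 0.65],
    (39, "map"): [0.9, 0.1, 0.2],
}

def lookup_vector(word_number, scene):
    """Return the scene-specific vector, falling back to the default one."""
    key = (word_number, scene)
    if key in scene_vectors:
        return scene_vectors[key]
    return scene_vectors[(word_number, DEFAULT_SCENE)]
```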
It will be understood that, in practice, the embodiment of the present invention does not limit how the word vector library is used. For example, during input, the word vector library can be used on its own to compute n-tuple relations; alternatively, the system dictionary, the system n-gram library and the word vector library can be used in combination to obtain n-tuple relations by a combined computation.
The embodiment of the present invention can query the word vector library for the vector of each meta-word and compute the n-tuple relation score between meta-words from those vectors, which represents the strength of the connection between the meta-words. Because the word vector library only needs to store the vector of each meta-word and need not store the actual n-ary relations, more n-ary relations can be obtained when the library size is limited, so the coverage of n-ary relations is wider.
Method embodiment three
Building on embodiment two above, this embodiment determines the candidates for an input string by combining the n-tuple relations in the system n-gram library with the n-tuple relations computed from vectors. This exploits the strength of the system n-gram library for high-frequency words and the strength of the word vector library in n-tuple relation coverage, so that the word-composition result is more accurate.
Referring to Fig. 3, a flow chart of the steps of input method embodiment three of the present invention is shown, which may specifically include the following steps:
Step 301, obtain the first vector sequence corresponding to the input string;
Step 302, compute the first n-tuple relation score corresponding to the first vector sequence according to a preset n-tuple relation computation rule;
Step 303, obtain the system word sequence corresponding to the input string;
Step 304, determine the second n-tuple relation score corresponding to the system word sequence;
Step 305, determine the candidates corresponding to the input string according to the ranking of the first n-tuple relation score and the second n-tuple relation score.
In the embodiment of the present invention, after the user's input string is received, the first vector sequence and the system word sequence corresponding to the input string are obtained first; next, the first n-tuple relation score corresponding to the first vector sequence and the second n-tuple relation score corresponding to the system word sequence are computed separately; finally, the candidates corresponding to the input string are determined according to the ranking of the first n-tuple relation score and the second n-tuple relation score. The step of obtaining the system word sequence corresponding to the input string may specifically include:
Step S21, segment the input string by system words to obtain a second character segmentation result;
Step S22, obtain the system word sequence corresponding to the second character segmentation result.
The system words can be those stored in the existing system dictionary. In an application example of the present invention, the received input string is "meilijianhezhongguodezhou". By querying the system dictionary, the input string can be segmented by system words, and the resulting system word sequence can be "United States of America | Dezhou". In the system dictionary, a system word can be a meta-word or a compound word. Because the compound word "United States of America" is a common proper noun that takes on a specific meaning as a whole, it can also be stored as a single system word. If the input string is instead segmented by meta-words, the compound word is split into its component meta-words and the meta-word sequence for "meilijian | hezhongguo | dezhou" is obtained, that is, "America | united states | Dezhou".
The embodiment of the present invention segments the input string in two ways to obtain the meta-word sequence and the system word sequence corresponding to the input string, computes the n-tuple relation score between the meta-words in the meta-word sequence and the n-tuple relation score between the system words in the system word sequence, and finally determines the candidates for the input string according to the ranking of the scores, so that the candidates obtained are more accurate. Specifically, the second n-tuple relation score corresponding to the system word sequence can be determined as follows:
Step S31, query the system dictionary for the word frequency of each system word in the system word sequence, and compute the one-tuple word-composition score corresponding to the system word sequence;
In practice, the system dictionary stores system words together with their word frequencies. Querying the system dictionary gives the word frequency of each system word in the system word sequence, and the product of these word frequencies gives the one-tuple word-composition score corresponding to the system word sequence.
In an application example of the present invention, assume the received input string is "gongjijintiantianjiang" (in Chinese: the common reserve fund drops everyday). Segmenting the input string by system words can yield several syllable sequences, such as gongji | jintian | tianjiang, gongjijin | tiantian | jiang, and gong | jijin | tiantian | jiang. Each syllable sequence can correspond to one or more system word sequences. For example, "gongji | jintian | tianjiang" may correspond to system word sequences such as "attack | today | day drops" and "cock | today | day drops", while "gongjijin | tiantian | jiang" can correspond to system word sequences such as "common reserve fund | everyday | drop".
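Enumerating the syllable sequences of this example can be sketched with a small recursive function. The list `units` is a hypothetical stand-in for the pinyin keys of the system dictionary, and exhaustive recursion is only one possible strategy; the source does not say how segmentation is implemented.

```python
def segmentations(text, units):
    """Return every way to split `text` into consecutive entries of `units`."""
    if not text:
        return [[]]
    results = []
    for unit in units:
        if text.startswith(unit):
            # Keep this unit and segment whatever is left of the string.
            for rest in segmentations(text[len(unit):], units):
                results.append([unit] + rest)
    return results

# Hypothetical pinyin keys taken from the example above.
units = ["gong", "gongji", "gongjijin", "jijin", "jintian",
         "tiantian", "tianjiang", "jiang"]
syllable_sequences = segmentations("gongjijintiantianjiang", units)
```

Each resulting syllable sequence would then be mapped to one or more system word sequences by looking up the words stored under each pinyin key.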
From the word frequency of each system word in the system word sequences above, the one-tuple word-composition score of each system word sequence is computed. For example, for the system word sequence "common reserve fund | everyday | drop", querying the system dictionary gives the word frequencies p(common reserve fund), p(everyday) and p(drop); the product of these word frequencies is the one-tuple word-composition score scoreA of the system word sequence, that is, scoreA = p(common reserve fund) × p(everyday) × p(drop).
Step S32, when a binary relation exists in the system word sequence, compute the two-tuple word-composition score corresponding to the system word sequence according to the binary relation;
After the one-tuple word-composition score has been computed, it can be determined whether a binary relation exists in the system word sequence. Specifically, the system words in the system word sequence can be used to query the system n-gram library. For example, the query finds that the system words "everyday" and "drop" have a binary relation whose score is scoreB, and that no binary relation exists between the system words "common reserve fund" and "everyday"; the two-tuple word-composition score of the system word sequence is therefore scoreB.
Step S33, determine the second n-tuple relation score corresponding to the system word sequence according to the one-tuple word-composition score and the two-tuple word-composition score.
In an optional embodiment of the present invention, the second n-tuple relation score, score, of the system word sequence can be determined as the product of the one-tuple word-composition score and the two-tuple word-composition score, that is, score = scoreA × scoreB. Through the above steps, the second n-tuple relation scores score1, score2, …, scoreN of all system word sequences can be computed.
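Steps S31 to S33 can be put together in a few lines. This is a sketch under stated assumptions: the frequency and score values are invented, several binary relations are multiplied together, and a sequence with no binary relation uses a factor of 1.0; the source gives neither the numbers nor the handling of these two cases.

```python
import math

def second_score(sequence, word_freq, bigram_scores):
    """score = scoreA * scoreB for one system word sequence."""
    # Step S31: product of the word frequencies.
    score_a = math.prod(word_freq[w] for w in sequence)
    # Step S32: scores of adjacent pairs found in the n-gram library.
    score_b = 1.0  # assumption: neutral factor when no pair is stored
    for pair in zip(sequence, sequence[1:]):
        if pair in bigram_scores:
            score_b *= bigram_scores[pair]
    # Step S33: combine the two scores.
    return score_a * score_b

# Hypothetical values for the example sequence.
word_freq = {"common reserve fund": 0.5, "everyday": 0.25, "drop": 0.5}
bigram_scores = {("everyday", "drop"): 0.5}
score = second_score(["common reserve fund", "everyday", "drop"],
                     word_freq, bigram_scores)
```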
In the above application example, the input string can correspond to the following first meta-word sequences: "attack | today | day drops", "cock | today | day drops" and "common reserve fund | everyday | drop". The vector of each meta-word is obtained by querying the word vector library, which gives the first vector sequence corresponding to each first meta-word sequence; computing on each first vector sequence gives the first n-tuple relation scores score1', score2', …, scoreN'.
Finally, the computed first n-tuple relation scores and second n-tuple relation scores are ranked together, and the candidates corresponding to the input string are output in order of score. For example, if the first n-tuple relation score of the first meta-word sequence "common reserve fund | everyday | drop" is the highest and the second n-tuple relation score of the system word sequence "common reserve fund | say everyday" is the second highest, then "common reserve fund drops everyday" and "common reserve fund says everyday" can be output as candidates, with "common reserve fund drops everyday" ranked ahead of "common reserve fund says everyday".
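The joint ranking can be sketched as a merge of two score tables. The sketch assumes the two kinds of score are already on a comparable scale and that a candidate appearing in both tables keeps its higher score; neither point is stated in the source, and the score values are invented.

```python
def rank_candidates(first_scores, second_scores):
    """Rank candidates from both sources by score, highest first."""
    best = dict(second_scores)
    for candidate, score in first_scores.items():
        # Assumption: a candidate found by both paths keeps its higher score.
        best[candidate] = max(score, best.get(candidate, score))
    return sorted(best, key=best.get, reverse=True)

# Hypothetical scores for the example above.
first_scores = {"common reserve fund drops everyday": 0.9,
                "attack today day drops": 0.2}
second_scores = {"common reserve fund says everyday": 0.6}
ranking = rank_candidates(first_scores, second_scores)
```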
This completes the word-composition process that combines the system dictionary, the system n-gram library and the word vector library, exploiting the strength of the system n-gram library for high-frequency words and the strength of the word vector library in n-tuple relation coverage, so that the word-composition result is more accurate.
It will be understood that, in the above application example, the input string is segmented both by meta-words and by system words, the corresponding first and second n-tuple relation scores are computed separately, and the candidates are then determined. This process compares the n-tuple relations in the system n-gram library with the n-tuple relations computed from vectors; although it makes the final candidates more accurate, it also requires more computation. In an actual input process, the system dictionary, the system n-gram library and the word vector library can therefore be used flexibly as needed. For example, the input string can first be segmented by system words to obtain the system word sequence, and the second n-tuple relation score can be computed by querying the system n-gram library. If the second n-tuple relation score is high enough, for example greater than a preset threshold, the candidates can be determined from the system word sequence alone, without segmenting the input string by meta-words and computing the first n-tuple relation score. This saves part of the computation and further improves input efficiency.
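The threshold shortcut can be sketched as follows. The scorer functions, the threshold values and the candidate names are all hypothetical placeholders; the point of the sketch is only that the vector-based path is skipped when the system-word path already scores above the threshold.

```python
def determine_candidates(input_string, score_system_words, score_vectors, threshold):
    """Use the vector-based path only when the system-word score is too low."""
    second = score_system_words(input_string)
    if second and max(second.values()) > threshold:
        return sorted(second, key=second.get, reverse=True)
    combined = dict(second)
    for candidate, score in score_vectors(input_string).items():
        combined[candidate] = max(score, combined.get(candidate, score))
    return sorted(combined, key=combined.get, reverse=True)

vector_calls = []  # records how often the costly path actually runs

def toy_system_scorer(input_string):
    return {"candidate A": 0.8}

def toy_vector_scorer(input_string):
    vector_calls.append(input_string)
    return {"candidate B": 0.9}

fast = determine_candidates("px", toy_system_scorer, toy_vector_scorer, 0.5)
full = determine_candidates("px", toy_system_scorer, toy_vector_scorer, 0.85)
```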
In summary, in the embodiment of the present invention, after the user's input string is received, the first vector sequence and the system word sequence corresponding to the input string are obtained first; next, the first n-tuple relation score corresponding to the first vector sequence and the second n-tuple relation score corresponding to the system word sequence are computed separately; finally, the candidates corresponding to the input string are determined according to the ranking of the first n-tuple relation score and the second n-tuple relation score. The embodiment of the present invention can thus combine the system n-gram library and the word vector library to compute n-tuple relation scores, exploiting the strength of the system n-gram library for high-frequency words and the strength of the word vector library in n-tuple relation coverage, so that the word-composition result is more accurate.
Method embodiment four
Building on embodiment two above, this embodiment describes in detail how the established word vector library is used for frequency adjustment during input. Referring to Fig. 4, a flow chart of the steps of input method embodiment four of the present invention is shown, which may specifically include the following steps:
Step 401, obtain the first vector sequence corresponding to the input string;
Step 402, compute the first n-tuple relation score corresponding to the first vector sequence according to a preset n-tuple relation computation rule;
Step 403, determine the candidates corresponding to the input string according to the first n-tuple relation score;
Step 404, obtain the second meta-word sequence corresponding to the preceding text and/or following text of the input string;
Step 405, query the established word vector library to obtain the vector corresponding to each meta-word in the second meta-word sequence;
Step 406, concatenate in order the vectors corresponding to the meta-words in the second meta-word sequence to obtain the second vector sequence;
Step 407, compute a third n-tuple relation score between the first vector sequence and the second vector sequence, and adjust the ranking of the candidates corresponding to the input string according to the third n-tuple relation score.
The embodiment of the present invention can also adjust the ranking of the candidates according to the established word vector library. In an application example of the present invention, the current input string is "px" and the preceding text of the input string is "go to the beach and grab". Looking up the input string "px" in dictionaries such as the system dictionary and the user dictionary gives candidates that can include "sorting", "leather shoes", "training", "crab" and so on.
First, the second meta-word sequence corresponding to the preceding text "go to the beach and grab" is obtained as "go | seashore | grab". Querying the word vector library gives the vector of each meta-word in the second meta-word sequence "go | seashore | grab": the vector for "go" is V1, the vector for "seashore" is V2, and the vector for "grab" is V3. Concatenating these vectors in order gives the second vector sequence (V1, V2, V3).
It will be understood that obtaining the second vector sequence by concatenating the vectors is only one application example of the present invention. In practice, the second vector sequence can also be obtained in other ways; for example, an RNN (Recurrent Neural Network) model can be used to turn the vectors of the individual meta-words into a second vector sequence that represents the preceding text as a whole.
Then, the first vector sequence corresponding to the input string "px" is obtained. Because each candidate for the input string "px" is itself a meta-word, no meta-word segmentation is needed. Querying the word vector library gives the vector V4 for "sorting", V5 for "leather shoes", V6 for "training" and V7 for "crab". That is, the first vector sequence can be V4, V5, V6, V7 and so on.
Next, the third n-tuple relation score between the first vector sequence and the second vector sequence is computed, and the ranking of the candidates corresponding to the input string is adjusted according to the third n-tuple relation score. Specifically, the binary relation score between V4 and (V1, V2, V3), the binary relation score between V5 and (V1, V2, V3), the binary relation score between V6 and (V1, V2, V3), and so on, are computed. If the binary relation score between V7 and (V1, V2, V3) is the highest, the candidate "crab" corresponding to V7 can be ranked first.
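The re-ranking in this example can be sketched with a simple stand-in for the scoring model. The sketch averages the context vectors and scores each candidate by inner product with that average; this is a substituted, simplified technique, not the preset model of the source, and the two-dimensional vectors are invented so that "crab" lies closest to the context.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean_vector(vectors):
    """Component-wise average of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def rerank(candidates, context_words, library):
    """Order candidates by inner product with the averaged context vector."""
    context = mean_vector([library[w] for w in context_words])
    return sorted(candidates, key=lambda c: dot(library[c], context), reverse=True)

# Hypothetical 2-dimensional word vector library.
library = {
    "go": [1.0, 0.0], "seashore": [0.0, 1.0], "grab": [1.0, 1.0],
    "sorting": [-1.0, 0.0], "leather shoes": [0.0, -0.5],
    "training": [0.5, -0.5], "crab": [1.0, 1.0],
}
ranking = rerank(["sorting", "leather shoes", "training", "crab"],
                 ["go", "seashore", "grab"], library)
```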
In the embodiment of the present invention, a preset model can be used to compute the third n-tuple relation score between the first vector sequence and the second vector sequence, and the word frequencies of the candidates can be changed according to the third n-tuple relation score so as to re-rank the candidate list.
In the above example, "go to the beach and grab" is taken as the preceding text of the input string. It will be understood that, in practice, the embodiment of the present invention does not limit the length of the preceding text obtained. For example, only "grab" can be taken as the preceding text, in which case the binary relation scores between "grab" and candidates such as "sorting", "leather shoes", "training" and "crab" are computed; alternatively, "grab on the seashore" can be taken as the preceding text, and so on. The embodiment of the present invention preferably takes several concatenated preceding meta-words as the preceding text, such as "go to the beach and grab" in the above example, which consists of three meta-words, so that the preceding text is considered more fully during frequency adjustment and the candidates that better fit the context are ranked first. For example, in the prior art only "grab" is taken as the preceding text; the candidate words that have a binary relation with "grab" may include "crab", "thief" and so on, and "grab | thief" may even have a higher binary relation score than "grab | crab". When "go to the beach and grab" is taken as the preceding text as a whole, however, "go | seashore | grab | crab" is clearly more reasonable than "go | seashore | grab | thief" and is likely to have a higher binary relation score, so that the candidate "crab" can be ranked nearer the front.
The embodiment of the present invention can obtain the first vector sequence corresponding to the input string and the second vector sequence corresponding to the preceding text and/or following text of the input string, compute the binary relation score between the first vector sequence and the second vector sequence according to a preset n-tuple relation computation rule, and adjust the ranking of the candidates corresponding to the input string according to that score. Compared with the prior art, more of the preceding text can be considered during frequency adjustment, which improves the accuracy of frequency adjustment.
Method embodiment five
Building on embodiment two above, this embodiment describes in detail how the established word vector library is used for association during input. Referring to Fig. 5, a flow chart of the steps of input method embodiment five of the present invention is shown, which may specifically include the following steps:
Step 501, obtain the association candidates corresponding to the input string according to the preceding text and/or following text of the input string;
Step 502, obtain the third vector sequence corresponding to each association candidate;
Step 503, compute a fourth n-tuple relation score between the second vector sequence and the third vector sequence, and rank and display the association candidates according to the fourth n-tuple relation score.
In an application example of the present invention, assume the user's current input string is "p" and the preceding text is "go to the beach and grab". The most likely candidates can then be associated from the input string "p" entered so far and its preceding text. Specifically, the second meta-word sequence corresponding to the preceding text, "go | seashore | grab", is obtained first, and querying the word vector library gives the second vector sequence (V1, V2, V3) corresponding to the preceding text. Next, a target candidate set is traversed to find the association candidates in it that match the input string "p"; the association candidates can include, for example, "crab", "catching" and "broken". The target candidate set can specifically be the system dictionary, the user dictionary and so on. The vector of each association candidate is looked up in the word vector library to obtain the corresponding third vector sequence, denoted Ui. Finally, (V1, V2, V3) and Ui are fed into a preset model, the binary relation score between (V1, V2, V3) and Ui is computed, and the association candidates are ranked by that binary relation score.
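The association step can be sketched as a prefix filter followed by a ranking. The pinyin spellings, the candidate set and the context scores below are hypothetical, and the scorer stands in for the preset model of the text, which is not specified.

```python
def associate(prefix, pinyin_of, scorer):
    """Keep the candidates whose pinyin starts with `prefix`, best score first."""
    matches = [word for word, pinyin in pinyin_of.items()
               if pinyin.startswith(prefix)]
    return sorted(matches, key=scorer, reverse=True)

# Hypothetical target candidate set with assumed pinyin spellings.
pinyin_of = {"crab": "pangxie", "thief": "xiaotou", "broken": "po"}

# Hypothetical scores of each candidate against the preceding text.
context_scores = {"crab": 0.9, "thief": 0.4, "broken": 0.1}

suggestions = associate("p", pinyin_of, context_scores.get)
no_input_suggestions = associate("", pinyin_of, context_scores.get)
```

With an empty prefix every candidate matches, which corresponds to associating from the preceding text alone before the user has typed anything.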
In practice, when the preceding text is "go to the beach and grab" and the user has not yet entered any input string, the association function can likewise be used to obtain the most likely candidates for the preceding text "go to the beach and grab".
During user input, the embodiment of the present invention can obtain the second vector sequence corresponding to the preceding text and the third vector sequence corresponding to each association candidate, and compute the binary relation score between the preceding text and each association candidate, so that the association candidates can be determined by score. The embodiment of the present invention can take n-tuple relations beyond 2-gram into account during association, so that the association candidates obtained are more accurate.
It should be noted that, for brevity, the method embodiments are described as a series of action combinations, but those skilled in the art should know that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention some steps can be performed in another order or simultaneously. Those skilled in the art should also know that the embodiments described in this specification are preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Device embodiment
Referring to Fig. 6, a structural block diagram of an input device embodiment of the present invention is shown, which may specifically include the following modules:
A first vector sequence acquisition module 601, for obtaining the first vector sequence corresponding to the input string;
A first n-tuple relation computing module 602, for computing the first n-tuple relation score corresponding to the first vector sequence according to a preset n-tuple relation computation rule; and
A candidate determining module 603, for determining the candidates corresponding to the input string according to the first n-tuple relation score.
In an optional embodiment of the present invention, the first vector sequence acquisition module 601 may specifically include:
A first segmentation submodule, for segmenting the user's input string by meta-words to obtain a first character segmentation result;
A first meta-word sequence acquisition submodule, for obtaining the first meta-word sequence corresponding to the first character segmentation result;
A first query submodule, for querying the established word vector library to obtain the vector corresponding to each meta-word in the first meta-word sequence;
A first vector sequence determination submodule, for concatenating in order the vectors corresponding to the meta-words in the first meta-word sequence to obtain the first vector sequence corresponding to the input string.
In another optional embodiment of the present invention, the word vector library can be established as follows:
Obtain the meta-word number corresponding to each meta-word in the dictionary;
Generate a corresponding vector for each meta-word in the dictionary;
Establish the word vector library according to the mapping between the meta-word numbers and the vectors.
In another optional embodiment of the present invention, the device may further include:
A system word sequence acquisition module, for obtaining the system word sequence corresponding to the input string;
A second n-tuple relation score determining module, for determining the second n-tuple relation score corresponding to the system word sequence;
The candidate determining module 603 may then specifically include:
A candidate determination submodule, for determining the candidates corresponding to the input string according to the ranking of the first n-tuple relation score and the second n-tuple relation score.
In another optional embodiment of the present invention, the system word sequence acquisition module may specifically include:
A second segmentation submodule, for segmenting the input string by system words to obtain a second character segmentation result;
A system word sequence determination submodule, for obtaining the system word sequence corresponding to the second character segmentation result.
In another optional embodiment of the present invention, the second n-tuple relation score determining module may specifically include:
A one-tuple word-composition score calculating submodule, for querying the system dictionary for the word frequency of each system word in the system word sequence and computing the one-tuple word-composition score corresponding to the system word sequence;
A two-tuple word-composition score calculating submodule, for computing, when a binary relation exists in the system word sequence, the two-tuple word-composition score corresponding to the system word sequence according to the binary relation;
A second n-tuple relation score calculating submodule, for determining the second n-tuple relation score corresponding to the system word sequence according to the one-tuple word-composition score and the two-tuple word-composition score.
In another optional embodiment of the present invention, the device may further include:
A second meta-word sequence acquisition module, for obtaining the second meta-word sequence corresponding to the preceding text and/or following text of the input string;
A second query module, for querying the established word vector library to obtain the vector corresponding to each meta-word in the second meta-word sequence;
A second vector sequence determining module, for concatenating in order the vectors corresponding to the meta-words in the second meta-word sequence to obtain the second vector sequence;
A ranking adjustment module, for computing a third n-tuple relation score between the first vector sequence and the second vector sequence and adjusting the ranking of the candidates corresponding to the input string according to the third n-tuple relation score.
In another optional embodiment of the present invention, the device may further include:
An association candidate acquisition module, for obtaining the association candidates corresponding to the input string according to the preceding text and/or following text of the input string;
A third vector sequence acquisition module, for obtaining the third vector sequence corresponding to each association candidate;
An association candidate ranking module, for computing a fourth n-tuple relation score between the second vector sequence and the third vector sequence and ranking and displaying the association candidates according to the fourth n-tuple relation score.
Because the device embodiment is substantially similar to the method embodiments, it is described relatively briefly; for the relevant details, refer to the corresponding parts of the description of the method embodiments.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments can be referred to one another.
As for the device in the above embodiment, the specific manner in which each module performs its operation has been described in detail in the embodiments of the method and will not be elaborated here.
Fig. 7 is a block diagram of a device 800 for input according to an exemplary embodiment. For example, the device 800 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Fig. 7, the device 800 can include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls the overall operation of the device 800, such as operations associated with display, telephone calls, data communication, camera operation and recording. The processing component 802 can include one or more processors 820 to execute instructions so as to complete all or some of the steps of the method described above. In addition, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation of the device 800. Examples of such data include instructions for any application or method operated on the device 800, contact data, phone book data, messages, pictures, videos and so on. The memory 804 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disc.
Power supply module 806 provides electric power for the various assemblies of device 800.Power supply module 806 can include power management system System, one or more power supplys, and other components associated with generating, managing and distributing electric power for device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC), which is configured to receive external audio signals when the device 800 is in an operating mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signal may be further stored in the memory 804 or sent via the communication component 816. In some embodiments, the audio component 810 also includes a loudspeaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the device 800. For example, the sensor component 814 may detect the open/closed state of the device 800 and the relative positioning of components, such as the display and keypad of the device 800. The sensor component 814 may also detect a change in position of the device 800 or of one component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in temperature of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other equipment. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 also includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 804 including instructions, where the instructions can be executed by the processor 820 of the device 800 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
A non-transitory computer-readable storage medium is provided such that, when the instructions in the storage medium are executed by the processor of a mobile terminal, the mobile terminal is able to perform an input method, the method comprising: obtaining a first vector sequence corresponding to an input string; calculating, according to a preset n-tuple relation calculation rule, a first n-tuple relation score corresponding to the first vector sequence; and determining, according to the first n-tuple relation score, candidates corresponding to the input string.
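As a rough illustration of the three steps above, the following Python sketch looks up a vector for each unit of a segmented input string, scores the resulting vector sequence, and ranks candidate segmentations by that score. The table `TERM_VECTORS`, the toy vectors, and all function names are hypothetical. The scoring rule used here (mean cosine similarity of adjacent vectors) is only an assumed stand-in, since this passage does not specify the preset n-tuple relation calculation rule.

```python
import math

# Hypothetical vector table: segmentation unit -> vector (toy values).
TERM_VECTORS = {
    "ni": [0.9, 0.1],
    "hao": [0.8, 0.2],
    "ma": [0.1, 0.9],
}


def vector_sequence(units):
    """Look up each unit and keep the vectors in input order."""
    return [TERM_VECTORS[u] for u in units]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relation_score(vectors):
    """Assumed rule: mean cosine similarity over adjacent vector pairs."""
    if len(vectors) < 2:
        return 0.0
    pairs = list(zip(vectors, vectors[1:]))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def rank_candidates(segmentations):
    """Score every candidate segmentation and return them best-first."""
    scored = [(relation_score(vector_sequence(s)), s) for s in segmentations]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [s for _, s in scored]


if __name__ == "__main__":
    print(rank_candidates([["ni", "ma"], ["ni", "hao"]]))
```

With these toy vectors, the segmentation whose adjacent vectors point in similar directions is ranked first. A real system would replace both the table and the scoring rule with whatever the trained model defines.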
Fig. 8 is a schematic structural diagram of a server in an embodiment of the present invention. The server 1900 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 1922 (for example, one or more processors), a memory 1932, and one or more storage media 1930 (for example, one or more mass storage devices) storing application programs 1942 or data 1944. The memory 1932 and the storage medium 1930 may provide transient or persistent storage. The program stored in the storage medium 1930 may include one or more modules (not shown in the figure), each of which may include a series of instruction operations on the server. Further, the central processing unit 1922 may be configured to communicate with the storage medium 1930 and to execute, on the server 1900, the series of instruction operations in the storage medium 1930.
The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input/output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present invention is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the invention being indicated by the following claims.
It should be understood that the present invention is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
The input method, the input device, and the device for input provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. At the same time, a person of ordinary skill in the art may, in accordance with the idea of the invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims (10)

  1. An input method, characterized by comprising:
    obtaining a first vector sequence corresponding to an input string;
    calculating, according to a preset n-tuple relation calculation rule, a first n-tuple relation score corresponding to the first vector sequence; and
    determining, according to the first n-tuple relation score, candidates corresponding to the input string.
  2. The method according to claim 1, characterized in that the step of obtaining a first vector sequence corresponding to an input string comprises:
    segmenting the user's input string by meta words to obtain a first character segmentation result;
    obtaining a first meta-word sequence corresponding to the first character segmentation result;
    querying an established word vector library to obtain the vector corresponding to each meta word in the first meta-word sequence; and
    concatenating, in order, the vectors corresponding to the meta words in the first meta-word sequence to obtain the first vector sequence corresponding to the input string.
  3. The method according to claim 2, characterized in that the word vector library is established as follows:
    obtaining the meta-word number corresponding to each meta word in a dictionary;
    generating a corresponding vector for each meta word in the dictionary; and
    establishing the word vector library according to the mapping relationship between the meta-word numbers and the vectors.
  4. The method according to claim 1, characterized in that the method further comprises:
    obtaining a system word sequence corresponding to the input string; and
    determining a second n-tuple relation score corresponding to the system word sequence;
    wherein the step of determining, according to the first n-tuple relation score, candidates corresponding to the input string comprises:
    determining the candidates corresponding to the input string according to a ranking of the first n-tuple relation score and the second n-tuple relation score.
  5. The method according to claim 4, characterized in that the step of obtaining a system word sequence corresponding to the input string comprises:
    segmenting the input string by system words to obtain a second character segmentation result; and
    obtaining the system word sequence corresponding to the second character segmentation result.
  6. The method according to claim 5, characterized in that the step of determining a second n-tuple relation score corresponding to the system word sequence comprises:
    querying a system dictionary for the word frequency corresponding to each system word in the system word sequence, and calculating a one-tuple (unigram) word score corresponding to the system word sequence;
    when a binary relation exists in the system word sequence, calculating, according to the binary relation, a two-tuple (bigram) word score corresponding to the system word sequence; and
    determining, according to the one-tuple word score and the two-tuple word score, the second n-tuple relation score corresponding to the system word sequence.
  7. The method according to claim 2, characterized in that the method further comprises:
    obtaining a second meta-word sequence corresponding to the preceding and/or following context of the input string;
    querying the established word vector library to obtain the vector corresponding to each meta word in the second meta-word sequence;
    concatenating, in order, the vectors corresponding to the meta words in the second meta-word sequence to obtain a second vector sequence; and
    calculating a third n-tuple relation score between the first vector sequence and the second vector sequence, and adjusting the ranking of the candidates corresponding to the input string according to the third n-tuple relation score.
  8. The method according to claim 7, characterized in that the method further comprises:
    obtaining, according to the preceding and/or following context of the input string, association candidates corresponding to the input;
    obtaining a third vector sequence corresponding to the association candidates; and
    calculating a fourth n-tuple relation score between the second vector sequence and the third vector sequence, and ranking and displaying the association candidates according to the fourth n-tuple relation score.
  9. An input device, characterized by comprising:
    a first vector sequence acquisition module, configured to obtain a first vector sequence corresponding to an input string;
    a first n-tuple relation calculation module, configured to calculate, according to a preset n-tuple relation calculation rule, a first n-tuple relation score corresponding to the first vector sequence; and
    a candidate determination module, configured to determine, according to the first n-tuple relation score, candidates corresponding to the input string.
  10. A device for input, characterized by comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs containing instructions for performing the following operations:
    obtaining a first vector sequence corresponding to an input string;
    calculating, according to a preset n-tuple relation calculation rule, a first n-tuple relation score corresponding to the first vector sequence; and
    determining, according to the first n-tuple relation score, candidates corresponding to the input string.
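
As an informal illustration of the system-word scoring recited in claim 6, the following Python sketch computes a one-tuple score from word frequencies, adds a two-tuple score only where a binary relation is recorded, and ranks system word sequences by the combined value. The dictionaries `WORD_FREQ` and `BIGRAM_FREQ`, the counts in them, and the function names are hypothetical. The use of log relative frequency, the `log1p` bonus, and the simple sum are assumptions made for the sketch; the claims do not fix these formulas.

```python
import math

# Hypothetical system dictionary: word -> frequency (toy counts).
WORD_FREQ = {"今天": 50, "天气": 30, "田七": 2}
# Hypothetical binary relations: (previous word, next word) -> co-occurrence count.
BIGRAM_FREQ = {("今天", "天气"): 12}
TOTAL = sum(WORD_FREQ.values())


def unigram_score(words):
    """Sum of log relative frequencies; unknown words get a floor count of 1."""
    return sum(math.log(WORD_FREQ.get(w, 1) / TOTAL) for w in words)


def bigram_score(words):
    """Bonus for each adjacent pair that has a recorded binary relation."""
    score = 0.0
    for pair in zip(words, words[1:]):
        count = BIGRAM_FREQ.get(pair, 0)
        if count:  # only pairs with a binary relation contribute
            score += math.log1p(count)
    return score


def second_relation_score(words):
    """Assumed combination: plain sum of the two partial scores."""
    return unigram_score(words) + bigram_score(words)


def rank_sequences(sequences):
    """Return the system word sequences sorted best-first."""
    return sorted(sequences, key=second_relation_score, reverse=True)


if __name__ == "__main__":
    for seq in rank_sequences([["今天", "田七"], ["今天", "天气"]]):
        print(seq, round(second_relation_score(seq), 3))
```

In this toy data the sequence with a recorded binary relation outranks the one built from a rare word. A weighted sum or a back-off language model would be equally plausible readings of the claim language.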
CN201610350134.5A 2016-05-24 2016-05-24 Input method, input device and input device Active CN107422872B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610350134.5A CN107422872B (en) 2016-05-24 2016-05-24 Input method, input device and input device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610350134.5A CN107422872B (en) 2016-05-24 2016-05-24 Input method, input device and input device

Publications (2)

Publication Number Publication Date
CN107422872A CN107422872A (en) 2017-12-01
CN107422872B CN107422872B (en) 2021-11-30

Family

ID=60422811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610350134.5A Active CN107422872B (en) 2016-05-24 2016-05-24 Input method, input device and input device

Country Status (1)

Country Link
CN (1) CN107422872B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020128831A1 (en) * 2001-01-31 2002-09-12 Yun-Cheng Ju Disambiguation language model
CN101013443A (en) * 2007-02-13 2007-08-08 北京搜狗科技发展有限公司 Intelligent word input method and input method system and updating method thereof
CN101644961A (en) * 2009-08-14 2010-02-10 北京搜狗科技发展有限公司 Encoded string sequencing method, device and character input method and device
CN101697109A (en) * 2009-10-26 2010-04-21 北京搜狗科技发展有限公司 Method and system for acquiring candidates of input method
CN102455845A (en) * 2010-10-14 2012-05-16 北京搜狗科技发展有限公司 Character entry method and device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110110292A (en) * 2018-01-29 2019-08-09 北京搜狗科技发展有限公司 A kind of data processing method, device and the device for data processing
CN110110292B (en) * 2018-01-29 2023-11-14 北京搜狗科技发展有限公司 Data processing method and device for data processing
CN110244861A (en) * 2018-03-09 2019-09-17 北京搜狗科技发展有限公司 Data processing method and device
CN110244861B (en) * 2018-03-09 2024-02-02 北京搜狗科技发展有限公司 Data processing method and device
CN111752397A (en) * 2019-03-29 2020-10-09 北京搜狗科技发展有限公司 Candidate word determination method and device
CN111752397B (en) * 2019-03-29 2024-06-04 北京搜狗科技发展有限公司 Candidate word determining method and device
CN112684909A (en) * 2020-12-29 2021-04-20 科大讯飞股份有限公司 Input method association effect evaluation method and device, electronic equipment and storage medium
CN112684909B (en) * 2020-12-29 2024-05-31 科大讯飞股份有限公司 Input method association effect evaluation method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN107422872B (en) 2021-11-30

Similar Documents

Publication Publication Date Title
CN107608532B (en) Association input method and device and electronic equipment
CN107436691B (en) Method, client, server and device for correcting errors of input method
CN110008401B (en) Keyword extraction method, keyword extraction device, and computer-readable storage medium
CN107544684B (en) Candidate word display method and device
CN107844199B (en) Input method, system and device for inputting
CN110781305A (en) Text classification method and device based on classification model and model training method
CN107291690A (en) Punctuate adding method and device, the device added for punctuate
CN108008832A (en) A kind of input method and device, a kind of device for being used to input
CN107221330A (en) Punctuate adding method and device, the device added for punctuate
CN108073303B (en) Input method and device and electronic equipment
CN110162600B (en) Information processing method, session response method and session response device
CN111368541A (en) Named entity identification method and device
CN107870677A (en) A kind of input method, device and the device for input
CN107305438A (en) The sort method and device of candidate item, the device sorted for candidate item
CN109710732A (en) Information query method, device, storage medium and electronic equipment
CN107797676B (en) Single character input method and device
CN108803890A (en) A kind of input method, input unit and the device for input
CN107422872A (en) A kind of input method, device and the device for input
CN113190752A (en) Information recommendation method, mobile terminal and storage medium
CN112328783A (en) Abstract determining method and related device
CN108628461A (en) A kind of input method and device, a kind of method and apparatus of update dictionary
CN108073293A (en) A kind of definite method and apparatus of target phrase
CN113821609A (en) Answer text acquisition method and device, computer equipment and storage medium
CN113505596A (en) Topic switching marking method and device and computer equipment
CN110362686B (en) Word stock generation method and device, terminal equipment and server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant