CN101216854A - Computer words input method and system and its word library maintenance method and device - Google Patents


Info

Publication number
CN101216854A
Authority
CN
China
Prior art keywords
literal
input
word frequency
user
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2008100562380A
Other languages
Chinese (zh)
Other versions
CN101216854B (en)
Inventor
陈丽菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN2008100562380A
Publication of CN101216854A
Application granted
Publication of CN101216854B
Active
Anticipated expiration

Abstract

The invention discloses a computer text input method and system, together with a method and device for maintaining its word library. The method includes: storing a function word library in advance; recording text entered through the computer text input system in a user thesaurus and counting its input frequency; checking whether the user thesaurus contains any word identical to a function word in the function word library and, if so, deleting that word from the user thesaurus; and analyzing the word frequencies in the user thesaurus and merging two or more words whose collocation frequency meets a specified requirement. By maintaining the user thesaurus in this way, the invention reduces the storage and computing resources the thesaurus occupies and improves input efficiency and accuracy. The invention can also select candidate words from the maintained user thesaurus according to word frequency and offer them for input, further improving input efficiency and accuracy.

Description

Computer text input method and system, and word library maintenance method and device thereof
Technical field
The present invention relates to computer text processing technology, and in particular to a computer text input method and text input system, and to a word library maintenance method and maintenance device.
Background art
There are many computer text input methods. Most work by entering a specific text code through the computer keyboard and generating the corresponding characters from that code to complete the input. Many such codes exist; for Chinese characters, the main ones are Pinyin codes, Wubi (five-stroke) codes, and stroke codes.
A text input system generally has a user thesaurus that records the text the user has entered. The next time the user enters text, the words with higher word frequency in the user thesaurus are displayed first as candidates for the user to select, which increases the user's input speed.
Fig. 1 is a schematic flow diagram of an existing Pinyin input method. Referring to Fig. 1, the flow comprises three stages: syllable segmentation, candidate word generation, and intelligent phrase composition.
The user first enters a Pinyin string (that is, the text code). The text input system segments the string into syllables according to a segmentation strategy. Based on the segmentation result, it generates the candidate words corresponding to the syllable sequence, orders them by word frequency or by a most-recent-use memory function, and displays them for the user to choose from. If no candidate word corresponds to the whole syllable sequence, intelligent phrase composition is needed: words are combined dynamically by some algorithm to form a complete sentence result, which is then displayed to the user. The user completes an input by selecting the desired text from the candidate words or the sentence result. Each time a word or sentence is entered, the text input system records the entered text in the user thesaurus and counts its input frequency, that is, its word frequency.
In existing text input systems, however, the user thesaurus simply records all text the user has entered and counts word frequencies. Yet most languages share an objective regularity: they contain both notional (content) words and function words. For the user of an input method, what materially affects input speed is notional words, and in particular frequently used ones. Notional words can also combine in various ways to yield phrases with different meanings. For example, suppose the word frequency of "like" as entered by a user reaches 100, but in 90 of those cases it is preceded by "not"; the user's habitual expression is then "dislike" (that is, "not" plus "like") rather than "like".
Existing text input systems store all text the user enters in the user thesaurus, but only the notional words in it are actually useful; the large number of function words is redundant information with no practical significance. Nor do these systems analyze the intent behind combinations of notional words so as to correctly reflect the user's habitual expressions. As a result, the user thesaurus occupies considerable storage space, operations such as word frequency sorting require considerable computation, and the processing speed of the text input system suffers. Moreover, the candidate words output often fail to reflect the user's habitual expressions accurately, which slows the user's text input.
In addition, networked text input systems and corresponding input methods have appeared. In such a method, a server on the network side holds a core word bank (see the core word bank in Fig. 1). The text input system on each terminal can report locally entered text, or its local user thesaurus, to the server. The server records the text from each user thesaurus in the core word bank and performs word frequency counting and sorting on each word. A terminal user can also download the core word bank from the server and use it as the local user thesaurus for text input.
Because user thesauri in the prior art contain much redundant information, the redundancy in the core word bank becomes even greater once terminals report their user thesauri to the server, which both occupies excessive network storage and wastes considerable computing resources. Furthermore, users of a networked text input system generally communicate in writing over the network. Users in the same period or the same network community need to communicate with one another, so their habitual expressions are very similar; yet the existing core word bank cannot accurately reflect the habitual expressions of the user population as a whole, which slows both text input and written communication.
Summary of the invention
In view of this, one technical problem to be solved by the present invention is to provide a word library maintenance method and maintenance device for a computer text input system, so as to save storage and computing resources and improve the efficiency and accuracy of text input.
Another technical problem to be solved by the present invention is to provide a computer text input method and input system, so as to save storage and computing resources and improve the efficiency and accuracy of text input.
To achieve the above objects, the main technical solutions of the present invention are as follows:
A word library maintenance method for a computer text input system, in which a function word library is stored in advance, the method comprising:
recording the text entered into the computer in a user thesaurus, and counting its input word frequency;
checking whether the user thesaurus contains text identical to a function word in the function word library and, if so, deleting that text from the user thesaurus;
analyzing the word frequencies of the text in the user thesaurus, and merging two or more words whose collocation word frequency meets a specified requirement.
Preferably, the method sets up in advance analysis report templates corresponding to parts of speech, and further comprises:
sorting the user thesaurus by word frequency to generate a word frequency list;
selecting from the word frequency list the text that meets a specified word frequency condition;
determining the part of speech of the selected text, selecting the analysis report template corresponding to that part of speech, and filling the selected text into that template to generate an analysis report.
Preferably, the text input system has a network account, and after generating the analysis report the method further comprises:
detecting an upload command entered by the user;
when an upload command is detected, determining whether the network account of the current text input system is associated with a network account on the service server specified by the user; if so, performing the next step; otherwise, ending the flow;
uploading the analysis report to the service server corresponding to the network account.
Preferably, the method sets up in advance upload templates corresponding to the service types of the service server;
the analysis report is uploaded as follows: the upload template corresponding to the service type associated with the text input system is read; the user's network account and the content of the analysis report are filled into the upload template; and the content of the analysis report is uploaded to the corresponding network server in the specific format defined by the upload template.
Preferably, the word frequency analysis and merging are performed as follows: it is determined whether the collocation word frequency of one word with another word exceeds a specified ratio of the total word frequency of the collocated word; if so, the mutually collocated words are merged.
Preferably, when text entered into the computer is recorded in the user thesaurus, the corresponding input time is also recorded. In the subsequent maintenance of the user thesaurus, a pre-stored word frequency analysis time period is first read, the text in the user thesaurus whose input time falls within that period is selected, and the selected text is then treated as the object processed by the subsequent steps.
A word library maintenance device for a computer text input system, comprising:
a word library input module, configured to record text entered into the computer through the text input system in a user thesaurus, and to count word frequencies;
a first maintenance module, which stores a function word library and is configured to determine whether the text in the user thesaurus includes a word identical to a function word in the preset function word library and, if so, to delete that word from the user thesaurus;
a second maintenance module, configured to analyze the word frequencies of the text in the user thesaurus, and to merge two or more words whose collocation word frequency meets a specified requirement.
Preferably, the word library maintenance device further comprises a sorting module, configured to sort the text in the user thesaurus by word frequency and generate a word frequency list.
Preferably, the word library maintenance device further comprises an automatic analysis report generation module, which stores analysis report templates corresponding to parts of speech and is configured to select from the word frequency list the text that meets a specified word frequency condition, determine the part of speech of the selected text, select the analysis report template corresponding to that part of speech, and fill the selected text into that template to generate an analysis report.
Preferably, the word library maintenance device further comprises a one-key upload module, configured to detect an upload command entered by the user; once an upload command is detected, to determine whether the network account of the local text input system is associated with a network account on the service server specified by the user; and, when they are associated, to upload the analysis report to the service server corresponding to that network account.
Preferably, the user thesaurus further comprises an input time recording module, configured to record the corresponding input time when text entered into the computer is recorded in the user thesaurus;
the word library maintenance device further comprises a third maintenance module, which stores a word frequency analysis time period and is configured to select from the user thesaurus the text that falls within that time period, and to supply the selected text to the first maintenance module and the second maintenance module as the objects they process.
A computer text input method, comprising:
a. recording the text entered into the computer in a user thesaurus, and counting word frequencies;
b. determining whether the text in the user thesaurus includes a word identical to a function word in a preset function word library and, if so, deleting that word from the user thesaurus; and analyzing the word frequencies of the text in the user thesaurus and merging two or more words whose collocation word frequency meets a specified requirement;
c. when an input code entered by the user is detected, searching the user thesaurus for text that matches the input code;
d. sorting the text found by word frequency, and displaying the text whose word frequency meets a specified word frequency condition as input candidates;
e. determining the final text from the candidates according to a selection instruction entered by the user, completing the input, and returning to step a.
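Steps c and d above can be illustrated with a minimal Python sketch. The data layout (a dictionary mapping each text to an input code and a word frequency), the function name `find_candidates`, the exact-match rule for codes, and the "top k by frequency" condition are all assumptions made for illustration; the source does not specify them.

```python
def find_candidates(user_thesaurus, input_code, top_k=5):
    """Return up to top_k texts matching input_code, highest frequency first.

    user_thesaurus maps text -> (input_code, word_frequency); this layout is
    a hypothetical one chosen for the sketch.
    """
    # Step c: find the texts whose stored code equals the entered code.
    matches = [
        (text, frequency)
        for text, (code, frequency) in user_thesaurus.items()
        if code == input_code
    ]
    # Step d: sort by word frequency and keep those meeting the condition,
    # assumed here to be "the top_k most frequent".
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in matches[:top_k]]


thesaurus = {
    "是": ("shi", 50),
    "时": ("shi", 20),
    "事": ("shi", 35),
    "你": ("ni", 40),
}
candidates = find_candidates(thesaurus, "shi", top_k=2)
```

A real Pinyin system would also need syllable segmentation and prefix matching, which this sketch leaves out.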
Preferably, the method stores in advance a specified word frequency input time period. In step a, when text entered into the computer is recorded in the user thesaurus, the corresponding input time is also recorded. In step c, when an input code entered by the user is detected, the user thesaurus is searched for text that both falls within the specified word frequency input time period and matches the input code.
A computer text input system, comprising:
a user thesaurus, configured to store the user's text;
a code input matching module, configured to detect an input code entered by the user and to search the user thesaurus for text that matches the input code;
a candidate display module, configured to sort the text found by the code input matching module by word frequency, and to display the text whose word frequency meets a specified word frequency condition as input candidates;
a text input module, configured to determine the final text from the candidates according to a selection instruction entered by the user and to complete the input;
a word library input module, configured to record the text entered through the text input module in the user thesaurus, and to count word frequencies;
a word library maintenance module, which stores a function word library and is configured to determine whether the text in the user thesaurus includes a word identical to a function word in the preset function word library and, if so, to delete that word from the user thesaurus; and to analyze the word frequencies of the text in the user thesaurus and merge two or more words whose collocation word frequency meets a specified requirement.
Preferably, the word library input module further comprises an input time recording module, configured to record the corresponding input time when text entered into the computer is recorded in the user thesaurus;
the code input matching module further comprises a word frequency input time storage unit; according to the information stored in that unit, the code input matching module searches the user thesaurus for text that both falls within the specified word frequency input time period and matches the input code.
Compared with the prior art, the present invention filters function words out of the user thesaurus and performs word frequency intent analysis so as to correctly reflect the user's habitual expressions. The invention can therefore simplify and consolidate the text in the user thesaurus, reduce the amount stored in it, occupy fewer computer storage resources, and reduce the computation required of the input system. It can also accurately capture the user's actual habitual expressions and offer them as candidate words, which significantly improves the efficiency and accuracy of text input.
The invention can also set a time period for word frequency input and simplify, analyze, and sort only the thesaurus content from that period, so that the user can learn exactly what the habitual expressions in a particular period were. Users in certain professions, such as professional document entry staff, often meet the following situation: the documents entered in the first month mainly concern the Internet field, so the habitual expressions produced in that period are mainly Internet terms; the documents entered in the second month mainly concern mechanical manufacturing, so the habitual expressions produced in that period are mainly mechanical manufacturing terms; and in the third month documents concerning the Internet field may need to be entered again. In that case the invention can be set to simplify, analyze, and sort only the user thesaurus content from the first month, so that the habitual expressions output by the text input system are again mainly Internet terms. This makes it much easier for the user to enter specialized terms and improves text input efficiency. The user thesaurus of the prior art, by contrast, can only give a rough picture of overall past habitual expressions and cannot accurately output the habitual expressions of a given past period.
For a networked text input system, because the user thesaurus is simplified, the content of the corresponding network core word bank is simplified as well, and the core word bank can reflect the habitual expressions of the terminal users as a whole in a given period, making it easier for users to enter text using those expressions. This applies especially to written communication between different users: the topic under discussion is objectively the same, and by the objective regularities of language, when the topic is the same the participants are more likely to use the same habitual expressions. By updating the core word bank according to the present invention, the participants in a discussion can select and enter text containing those habitual expressions in less time, which improves both their text input speed and the efficiency of their communication.
In addition, the present invention can sort the user thesaurus by word frequency, add the text that meets a word frequency condition to the analysis report template corresponding to its part of speech, and generate an analysis report. This helps users understand their own language habits better and makes the text input system more convenient to use.
For a text input system associated with a dedicated service server on the network, the present invention can also upload the analysis report to the corresponding service server with one key and publish it in that service server's particular presentation format, so that the user can conveniently view the analysis report on the service server.
Description of drawings
Fig. 1 is a schematic flow diagram of an existing Pinyin input method;
Fig. 2 is a flow chart of one implementation of word library maintenance according to the present invention;
Fig. 3 is a flow chart of one implementation of automatically generating an analysis report on the user's habitual expressions according to the present invention;
Fig. 4 is a schematic flow diagram of a specific implementation of one-key upload according to the present invention;
Fig. 5 is a flow chart of a computer text input method according to the present invention;
Fig. 6 is a schematic structural diagram of one word library maintenance device of a computer text input system according to the present invention;
Fig. 7 is a schematic structural diagram of another word library maintenance device of a computer text input system according to the present invention;
Fig. 8 is a schematic structural diagram of a computer text input system according to the present invention.
Embodiment
The present invention is described in further detail below with reference to specific embodiments and the drawings.
The method of the present invention is suitable for any text input system that has a user thesaurus. For Chinese characters, for example, it can be applied to Pinyin input systems, Wubi (five-stroke) input systems, stroke input systems, and so on. The embodiments below use the commonly used Pinyin input system as an example.
The core technical solution of the present invention is as follows: a function word library is stored in advance; text entered into the computer through the computer text input system is recorded in the user thesaurus, and its input word frequency is counted; the user thesaurus then undergoes later maintenance, which includes at least function word filtering and intent analysis with merging. Function word filtering means: for the text in the user thesaurus, checking whether any of it is identical to a function word in the function word library and, if so, deleting that text from the user thesaurus. Intent analysis with merging means: analyzing the word frequencies of the text in the user thesaurus, and merging two or more words whose collocation word frequency meets a specified requirement.
Text can be recorded in the user thesaurus and its word frequency counted as follows, for example: each time the input of a piece of text is completed, it is determined whether the user thesaurus already contains that text; if not, the text is added to the thesaurus with its word frequency set to 1; if so, its word frequency is increased by 1. Text in the sense of the present invention may be a single character, a word, or a whole sentence.
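The recording rule just described can be sketched in a few lines of Python. Representing the user thesaurus as a plain dictionary from text to word frequency, and the name `record_input`, are assumptions made for this illustration.

```python
def record_input(user_thesaurus, text):
    """Record one completed input in the user thesaurus (a dict: text -> frequency)."""
    if text not in user_thesaurus:
        # First occurrence: add the text with word frequency 1.
        user_thesaurus[text] = 1
    else:
        # Already present: increase its word frequency by 1.
        user_thesaurus[text] += 1


user_thesaurus = {}
for entered in ["computer", "like", "computer"]:
    record_input(user_thesaurus, entered)
```

The same effect could be had with `collections.Counter`; the explicit branch is kept here only to mirror the two cases in the description.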
Function word filtering and intent analysis with merging may be triggered by an instruction entered by the user, or may be triggered automatically each time a piece of text is recorded in the user thesaurus.
Fig. 2 is a flow chart of one implementation of word library maintenance according to the present invention. Referring to Fig. 2, in this embodiment the later maintenance of the user thesaurus is triggered by a trigger instruction from the user; that is, the user thesaurus 200 in Fig. 2 has already recorded the text entered by the user and counted its word frequencies. The flow comprises:
Step 201: the text in the user thesaurus is checked against a time period.
To implement this step, the corresponding input time must also be recorded when text entered into the computer is recorded in the user thesaurus. The user can set a word frequency analysis time period in the text input system in advance, for example the most recent week, the most recent month, or even a year (the user can choose the setting). In step 201, the word frequency analysis time period set by the user is first read, the input time of each piece of text in the user thesaurus is checked, and the text that falls within the time period is selected for the subsequent steps; text that does not fall within the time period is not processed by the subsequent steps. Setting a time period makes it possible to simplify and sort only the user thesaurus content from that period, so that the user can learn exactly what the habitual expressions in a particular period were.
Of course, step 201 is optional; step 202 and the subsequent steps may instead be applied directly to the whole user thesaurus 200.
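The time period selection of step 201 might look like the following Python sketch. The record layout (a list of `(text, input_time)` pairs) and the name `select_in_period` are assumptions for illustration; the dates are made-up sample data.

```python
from datetime import datetime, timedelta


def select_in_period(records, period_start, period_end):
    """Return the texts whose recorded input time falls inside the period."""
    return [
        text
        for text, input_time in records
        if period_start <= input_time <= period_end
    ]


# Hypothetical records: each entry pairs a text with the time it was entered.
now = datetime(2008, 1, 31)
records = [
    ("network", datetime(2008, 1, 28)),
    ("gear", datetime(2007, 11, 2)),
    ("server", datetime(2008, 1, 30)),
]
# Analysis period assumed to be "the most recent week".
recent = select_in_period(records, now - timedelta(days=7), now)
```

Passing explicit start and end times, rather than only "the last N days", also covers the case where the period of interest lies further in the past.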
Step 202: the text in the user thesaurus that falls within the word frequency analysis time period is sorted by word frequency; if step 201 was not performed, all text in the user thesaurus is sorted. The sort produces a word frequency list, in which the text is ordered from highest to lowest word frequency. The word frequency of a piece of text is simply its input frequency.
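As a small illustration of step 202, the word frequency list can be built with a single sort. The dictionary layout and the name `build_frequency_list` are assumptions for this sketch.

```python
def build_frequency_list(user_thesaurus):
    """Return (text, frequency) pairs ordered from highest to lowest frequency."""
    # sorted() is stable, so texts with equal frequency keep their original order.
    return sorted(user_thesaurus.items(), key=lambda item: item[1], reverse=True)


frequency_list = build_frequency_list({"like": 100, "computer": 250, "meeting": 40})
```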
Step 203: function word filtering is applied to the text in the word frequency list. Specifically, it is determined whether the word frequency list contains text identical to a function word in the preset function word library; if so, that text is deleted from the user thesaurus; if not, the text is kept. For example, following the objective regularities of the particular language, the function word library can be populated with standalone function words that carry no concrete meaning, such as auxiliary particles, so that the standalone notional words that do carry concrete meaning remain in the user thesaurus.
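The filtering in step 203 amounts to a membership test against the function word library. The sketch below assumes the library is a set and the user thesaurus a dictionary; the English function words are placeholders chosen for readability, not entries from the source.

```python
def filter_function_words(user_thesaurus, function_words):
    """Delete every entry of user_thesaurus that is also in function_words."""
    # Iterate over a copy of the keys so entries can be deleted safely.
    for text in list(user_thesaurus):
        if text in function_words:
            del user_thesaurus[text]
    return user_thesaurus


# Placeholder function word library (hypothetical contents).
function_words = {"the", "of", "a"}
user_thesaurus = {"computer": 250, "the": 900, "like": 100, "of": 400}
filter_function_words(user_thesaurus, function_words)
```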
Step 204: intent analysis is applied to the text; that is, the word frequencies of the text in the user thesaurus are analyzed, and two or more words whose collocation word frequency meets a specified requirement are merged. Specifically, the analysis checks whether, when using a given word, the user also used other words alongside it that describe and determine its true intent. If the collocated use of two words reaches a specified probability, the two words need to be merged, and the position of the merged entry in the word frequency ordering is determined by the word frequency of the head word.
One possible procedure is as follows. Suppose the total word frequency of word A is N, the total word frequency of word B is M, and the collocation word frequency of A+B is X. X is compared with N and M. If X is less than a specified ratio of N (for example N × 50%) and X is also less than a specified ratio of M (for example M × 50%), then A and B remain separate words and are not merged. If X is greater than the specified ratio of N, or X is greater than the specified ratio of M, then A and B need to be merged. The specified ratios can be set in advance and stored in the text input system. The word frequency of the merged A+B can be taken to be X, or it can be taken to be the word frequency of whichever of A and B is the head word.
For example, the user may enter "like", but if in more than 90% of those cases "not" is added in front of "like", then what the user actually uses habitually is "dislike". "Not" and "like" therefore need to be merged into "dislike"; because "like" is the head word, the word frequency of "like" is carried over as the word frequency of the combined word "dislike".
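The merge rule of step 204 can be sketched in Python under a few stated assumptions: the collocation frequency X is supplied by the caller; the merged entry is formed by joining the two words with a space; the merged entry replaces the head word and inherits its frequency (one of the two options the description allows); and the modifier's own entry is left untouched, since the source does not say what happens to it. The function name and the 50% default ratio are illustrative.

```python
def merge_collocation(freqs, modifier, head, pair_freq, ratio=0.5):
    """Return a thesaurus in which modifier+head is merged if the rule is met.

    freqs maps text -> total word frequency; pair_freq is the collocation
    frequency X of modifier followed by head.
    """
    n_total = freqs[modifier]  # N: total frequency of word A
    m_total = freqs[head]      # M: total frequency of word B
    # Merge when X exceeds the specified ratio of N or of M.
    if pair_freq > n_total * ratio or pair_freq > m_total * ratio:
        merged = dict(freqs)
        # The merged entry takes over the head word's frequency (assumption).
        merged[f"{modifier} {head}"] = merged.pop(head)
        return merged
    # Otherwise A and B stay separate.
    return freqs


freqs = {"not": 300, "like": 100}
merged = merge_collocation(freqs, "not", "like", pair_freq=90)
unchanged = merge_collocation(freqs, "not", "like", pair_freq=40)
```

With a collocation frequency of 90, the test against "like" (90 > 100 × 50%) succeeds even though the test against "not" (90 > 300 × 50%) fails, so the pair is merged; with 40, neither test succeeds. The case where X equals the threshold exactly is not addressed in the description; the sketch treats it as no merge.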
Step 205: the positions in the word frequency list are adjusted, producing the final word frequency list 300.
Because of the processing in steps 203 and 204, some text in the word frequency list may have been deleted or merged, leaving gaps in the ordering. Starting from the last entry, the system therefore automatically checks whether the position above is vacant and, if so, moves the entry up, until no gaps remain.
Of course, steps 202 and 205 are also optional. They prepare for the subsequent generation of the analysis report or of candidate input text; if the only aim is to reduce the size of the user thesaurus, steps 202 and 205 may be omitted.
Based on the user thesaurus described above, the present invention can also automatically generate an analysis report on the user's habitual expressions, so that users can conveniently review their own language habits. To generate the analysis report automatically, the invention needs to store analysis report templates corresponding to parts of speech in advance, and needs to sort the user thesaurus by word frequency to generate a word frequency list (this step is completed in the flow shown in Fig. 2). For example, separate analysis report templates can be set up for nouns, verbs, adjectives, and other parts of speech. Each template can be preset (either by default or by the user) with wording that suits its part of speech, with blank positions reserved for the text selected from the word frequency list. For a noun, for example, the wording of a fairly simple analysis report template could be "You have often mentioned __ recently; do you need a __?"
Fig. 3 is a kind of implementing procedure figure of automatic generation user idiom analysis report of the present invention.Referring to Fig. 3,
Step 301: analyze the sorted word-frequency list 300 and select from it the words that meet a specified word-frequency condition. The specified condition may be, for example, the word with the highest frequency, the words ranked in the top ten by frequency, or the words whose frequency reaches a particular value. The condition can be specified and stored in the text input system in advance.
Step 302: determine the part of speech of each selected word and read the analysis report template corresponding to that part of speech.
The part of speech can be determined as follows: a grammar library storing the part-of-speech information of all words is set up in advance, and the part of speech of the selected word is looked up in it.
If the selected word is a compound merged in step 204 above, its part of speech follows that of the head word after merging. Head-word information can also be stored in the grammar library, so that the head word is determined by querying the library. For example, the head word of the compound "good happy" is "happy", and "good" only modifies it; the part of speech of "happy", namely verb, therefore governs, and the preset sentence matched to verbs is retrieved.
Step 303: fill the selected word into the analysis report template corresponding to its part of speech to generate the analysis report.
A concrete example follows. Suppose the word with the highest frequency in the word-frequency list is "computer", and analysis shows that it is a noun. If the analysis report template for nouns is "You have often mentioned __ recently. Do you need a __?", the generated analysis report reads "You have often mentioned computer recently. Do you need a computer?"
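Steps 301 to 303 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the template texts, the contents of the grammar library, and the names `REPORT_TEMPLATES`, `GRAMMAR_LIBRARY`, `select_words`, `part_of_speech` and `generate_report` are all assumptions made for illustration. The frequency condition is modeled here as "top N with a minimum count", which is only one of the conditions the text allows.

```python
from collections import Counter

# Hypothetical templates keyed by part of speech; {word} marks the reserved blanks.
REPORT_TEMPLATES = {
    "noun": "You have often mentioned {word} recently. Do you need a {word}?",
    "verb": "You have often said you {word} recently.",
}

# Hypothetical grammar library: word -> (part of speech, head word).
# A merged compound stores only its head word and inherits that word's part of speech.
GRAMMAR_LIBRARY = {
    "computer": ("noun", None),
    "happy": ("verb", None),
    "good happy": (None, "happy"),
}


def select_words(freq, top_n=1, min_count=1):
    """Step 301: pick the top-N words by frequency that also reach min_count."""
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, count in ranked[:top_n] if count >= min_count]


def part_of_speech(word):
    """Step 302: look up the part of speech; compounds defer to their head word."""
    pos, head = GRAMMAR_LIBRARY.get(word, (None, None))
    if pos is None and head is not None:
        pos, _ = GRAMMAR_LIBRARY.get(head, (None, None))
    return pos


def generate_report(freq):
    """Step 303: fill each selected word into the template for its part of speech."""
    lines = []
    for word in select_words(freq):
        template = REPORT_TEMPLATES.get(part_of_speech(word))
        if template is not None:
            lines.append(template.format(word=word))
    return lines


frequencies = Counter({"computer": 12, "happy": 5, "good happy": 3})
print(generate_report(frequencies))
```

Words whose part of speech is unknown, or for which no template exists, are simply skipped in this sketch; the source does not say how such cases are handled.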
In addition, the method of the present invention also applies to network text input systems. Many current network text input systems have a network account that can be associated with a service server providing a specific network application service. That is, the network account of the text input system can log in to the associated service server and use the associated service there. For example, there is currently an instant-messaging text input method that can be associated with a mail service providing network mail: with the same network account, the user can both update the network user thesaurus of the instant-messaging text input method and send and receive mail. Correspondingly, the input system can also be associated with a server providing a posting function (for example, current blog servers, forum servers, and group space servers), so that the same network account can log in to the corresponding server and use its posting service.
For a network text input system, the present invention can also provide a one-key upload function: triggered by a single key, the analysis report 400 is automatically uploaded to the service server associated with the network text input system. Upload templates corresponding to the service types of the service servers can be set in advance, and the upload uses the format corresponding to the associated service type. For example, upload templates can be set separately for a mailbox server, a blog server, a forum server, and a group space server, with the format of each template matching its service type. For the mail service of a mailbox server, the upload template format includes blank fields such as mail subject and mail content; for the posting service of a forum server, it includes blank fields such as post subject, author, and content.
Fig. 4 is a schematic flowchart of a specific implementation of the one-key upload of the present invention. Referring to Fig. 4, the flow comprises:
Step 401: detect the trigger event of the one-key upload, that is, detect whether the user has entered an upload command. The upload command can be bound to a keyboard shortcut that triggers it.
Step 402: when the upload command is detected, judge whether the network account of the current text input system is associated with a network account of the specified service server. If so, go to the next step; otherwise, remind the user to check the association, then return to step 401 to wait for the upload command to be triggered again once the network accounts have been associated.
Step 403: analyze the service type associated with the user and retrieve the upload template corresponding to that service.
Step 404: apply the format of the upload template and upload the content of the analysis report to the corresponding server.
For example, for a post-publishing service type, the format applied by the upload template is the one used for publishing a post. It includes blank fields such as title, author, and content, as well as the address of the destination server, and the final upload format is a posting request packet. After detecting the upload command, the text input system automatically identifies the associated server and service type. If it finds that the associated service type is posting, it reads the corresponding upload template and inserts the content of the analysis report into the content field; the title field can be filled with preset content such as "User thesaurus analysis report", and the author field with the network account information. The filled-in content of the upload template is then packaged into a posting request packet, which is uploaded to the corresponding server according to the destination server address in the upload template. After receiving the packet, the server parses it as a posting request and then stores and publishes the data in it in the prescribed format.
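Steps 402 to 404 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the patent's packet format: the `UploadTemplate` structure, the association table, the placeholder server addresses and the dictionary used as a "request packet" are all hypothetical, and no network transfer is performed.

```python
from dataclasses import dataclass


@dataclass
class UploadTemplate:
    service_type: str
    server_address: str  # destination server address kept in the template
    fields: tuple        # names of the blank fields to be filled in


# Hypothetical upload templates, one per service type.
UPLOAD_TEMPLATES = {
    "forum_post": UploadTemplate(
        "forum_post", "forum.example.invalid", ("title", "author", "content")
    ),
    "mail": UploadTemplate("mail", "mail.example.invalid", ("subject", "body")),
}

# Hypothetical association table: input-system account -> associated service type.
ACCOUNT_ASSOCIATIONS = {"user123": "forum_post"}


def build_upload_request(account, report, title="User thesaurus analysis report"):
    """Steps 402-404: check the association, pick the template, package the request."""
    service_type = ACCOUNT_ASSOCIATIONS.get(account)
    if service_type is None:
        # Step 402, negative branch: the caller should remind the user
        # to check the account association.
        return None
    template = UPLOAD_TEMPLATES[service_type]  # step 403
    values = {
        "title": title, "subject": title,
        "author": account,
        "content": report, "body": report,
    }
    payload = {name: values[name] for name in template.fields}  # step 404
    return {
        "destination": template.server_address,
        "type": service_type,
        "payload": payload,
    }


print(build_upload_request("user123", "You have often mentioned computer recently."))
```

Keeping the destination address inside the template, as the text describes, means the upload routine itself needs no per-service logic; supporting another service type only requires adding a template.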
Fig. 5 is a flowchart of a computer text input method of the present invention. Referring to Fig. 5, the flow not only performs the maintenance described above on the user thesaurus, but also generates candidate input words from the maintained user thesaurus to complete text input. It comprises the following steps:
Step 501: record the text entered into the computer in the user thesaurus and perform word-frequency statistics.
Step 502: perform function-word filtering, that is, judge whether any word in the user thesaurus is identical to a function word in the preset function-word library; if so, delete that word from the user thesaurus. The specific method is the same as in step 203 above.
Step 503: analyze the word frequencies in the user thesaurus and merge more than one word whose collocation frequency meets a particular requirement. The specific method is the same as in step 204 above.
Step 504: when the user's input code is detected (for example a pinyin code, a Wubi (five-stroke) code, or a handwriting-recognition input code), search the user thesaurus for words matching the input code.
Step 505: sort the words found by word frequency, and display those whose frequency meets the specified word-frequency condition as input candidates.
Step 506: according to the selection instruction entered by the user, determine the final word from the candidates to complete the input, then return to step 501.
With this input method, once the text input system detects the user's input code, every candidate it outputs is a habitual expression that has been simplified and analyzed. The candidates therefore reflect the user's input intention more accurately and improve input efficiency.
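The loop of steps 501 to 506 can be sketched in Python as below. This is a simplified illustration, not the patent's implementation: the class `UserThesaurus`, its methods, the one-entry function-word set, and the exact-match lookup of input codes are assumptions, and the merging of step 503 is left out to keep the sketch short.

```python
from collections import Counter

FUNCTION_WORDS = {"的"}  # hypothetical function-word library


class UserThesaurus:
    def __init__(self):
        self.freq = Counter()  # word -> number of times entered
        self.codes = {}        # word -> input code (here: pinyin)

    def record(self, word, code):
        """Step 501: record the entered word and update its frequency."""
        self.freq[word] += 1
        self.codes[word] = code

    def filter_function_words(self):
        """Step 502: delete words that also appear in the function-word library."""
        for word in [w for w in self.freq if w in FUNCTION_WORDS]:
            del self.freq[word]
            del self.codes[word]

    def candidates(self, code, limit=5):
        """Steps 504-505: find words matching the code, most frequent first."""
        matches = [w for w, c in self.codes.items() if c == code]
        matches.sort(key=lambda w: (-self.freq[w], w))
        return matches[:limit]


thesaurus = UserThesaurus()
for word, code in [("事", "shi"), ("时", "shi"), ("事", "shi"), ("的", "de")]:
    thesaurus.record(word, code)
thesaurus.filter_function_words()

shown = thesaurus.candidates("shi")  # candidates offered to the user
thesaurus.record(shown[0], "shi")    # step 506: the user picks one; back to step 501
print(shown, thesaurus.freq)
```

Because the chosen word is recorded again, each selection raises its frequency and therefore its rank the next time the same code is entered.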
In addition, the text input method can also store in advance a specified word-frequency input time period. In step 501, when the text entered into the computer is recorded in the user thesaurus, the corresponding input time is also recorded. In step 504, when the user's input code is detected, the user thesaurus is searched for words that fall within the specified word-frequency input time period and match the input code; these are displayed as input candidates, and the word finally entered is determined from them according to the user's selection instruction to complete the input. In this way the text input system can accurately output the habitual expressions of a particular time period as candidates for an input code, which further serves the input needs of professional typists.
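The time-period variant can be sketched as a lookup over timestamped records. The function name `candidates_in_period` and the record layout are hypothetical, and counting frequency only from entries inside the period is an assumption; the source only says that the words found must fall within the specified period.

```python
from datetime import datetime


def candidates_in_period(records, code, start, end, limit=5):
    """Return words matching `code` that were entered within [start, end],
    ordered by how often they were entered inside that period."""
    counts = {}
    for word, word_code, entered_at in records:
        if word_code == code and start <= entered_at <= end:
            counts[word] = counts.get(word, 0) + 1
    return sorted(counts, key=lambda w: (-counts[w], w))[:limit]


# Each record: (word, input code, input time).
records = [
    ("事", "shi", datetime(2008, 1, 10, 9, 0)),
    ("时", "shi", datetime(2008, 1, 10, 9, 5)),
    ("时", "shi", datetime(2008, 1, 10, 9, 6)),
    ("事", "shi", datetime(2007, 6, 1, 12, 0)),  # outside the period, ignored
    ("事", "shi", datetime(2007, 6, 2, 12, 0)),  # outside the period, ignored
]
start, end = datetime(2008, 1, 1), datetime(2008, 1, 31, 23, 59, 59)
print(candidates_in_period(records, "shi", start, end))
```

Over the whole history "事" is the more frequent word, but within January 2008 "时" ranks first, which is the effect the time period is meant to achieve.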
Based on the methods above, the present invention also discloses a thesaurus maintenance device of a computer text input system that can carry out the above thesaurus maintenance method, and a computer text input system that can carry out the above text input method.
Fig. 6 is a structural schematic of a thesaurus maintenance device of the computer text input system of the present invention. Referring to Fig. 6, the thesaurus maintenance device 600 mainly comprises:
Thesaurus input module 601, used to record the text entered into the computer through the text input system in the user thesaurus 610, perform word-frequency statistics, and record the statistics in the user thesaurus.
First maintenance module 602, which stores the function-word library and implements function-word filtering: it judges whether any word in the user thesaurus is identical to a function word in the preset function-word library and, if so, deletes that word from the user thesaurus.
Second maintenance module 603, which implements meaning analysis and merging: it analyzes the word frequencies in the user thesaurus and merges more than one word whose collocation frequency meets a particular requirement.
In one optional embodiment, the thesaurus maintenance device can further comprise a sorting module 604, used to sort the words in the user thesaurus by word frequency and generate a word-frequency list.
In another optional embodiment, the thesaurus maintenance device can further comprise an analysis report auto-generation module 605, which stores the analysis report templates corresponding to parts of speech. It selects from the word-frequency list the words that meet the specified word-frequency condition, judges the part of speech of each selected word, selects the analysis report template corresponding to that part of speech, and fills the selected word into the template to generate the analysis report.
The thesaurus maintenance device can further comprise a one-key upload module 606, used to detect an upload command entered by the user. Once the upload command is detected, the module judges whether the network account of the local text input system is associated with a network account of the service server specified by the user; when it is, the module uploads the analysis report to the service server corresponding to that network account.
Fig. 7 is another structural schematic of the thesaurus maintenance device of the computer text input system of the present invention. Referring to Fig. 7, the thesaurus input module 601 further comprises an input time recording module 611, which records the corresponding input time when text entered into the computer is recorded in the user thesaurus 610. The thesaurus maintenance device 600 further comprises a third maintenance module 607, which stores a word-frequency analysis time period and selects from the user thesaurus 610 the words that fall within that period; the selected words are output as the processing objects of the first maintenance module 602 and the second maintenance module 603. The first maintenance module 602 and the second maintenance module 603 process the words within the analysis period in turn and then pass them to the sorting module 604 for sorting. The analysis report auto-generation module 605 then selects from the sorted word-frequency list the words that meet the predetermined word-frequency condition and generates the analysis report, which can finally be uploaded to the associated server by the one-key upload module 606.
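The data flow of Fig. 7 (module 607, then 602, 603 and 604) can be sketched as a chain of functions. Everything named here is hypothetical, including the English sample words and the `ratio` default. The merge test follows the criterion stated in claim 5 (collocation frequency greater than a specific ratio of a word's total frequency), with the assumption that "the collocated word" is the second word of an adjacent pair; how adjacency is counted is also an assumption.

```python
from collections import Counter
from datetime import date

FUNCTION_WORDS = {"the", "of"}  # hypothetical function-word library (module 602)


def third_module(records, start, end):
    """Module 607: keep only records entered within the analysis period."""
    return [(w, d) for w, d in records if start <= d <= end]


def first_module(records):
    """Module 602: drop function words."""
    return [(w, d) for w, d in records if w not in FUNCTION_WORDS]


def second_module(words, ratio=0.8):
    """Module 603: merge adjacent pairs whose collocation count exceeds
    ratio * (total count of the second word)."""
    freq = Counter(words)
    pairs = Counter(zip(words, words[1:]))
    mergeable = {p for p, n in pairs.items() if n > ratio * freq[p[1]]}
    merged, i = [], 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in mergeable:
            merged.append(words[i] + " " + words[i + 1])
            i += 2
        else:
            merged.append(words[i])
            i += 1
    return merged


def sorting_module(words):
    """Module 604: word-frequency list, most frequent first."""
    return sorted(Counter(words).items(), key=lambda kv: (-kv[1], kv[0]))


def maintain(records, start, end):
    kept = first_module(third_module(records, start, end))
    return sorting_module(second_module([w for w, _ in kept]))


day = date(2008, 1, 15)
records = [("printer", date(2007, 12, 1))] + [
    (w, day)
    for w in ["very", "happy", "the", "computer", "very",
              "happy", "of", "computer", "computer"]
]
print(maintain(records, date(2008, 1, 1), date(2008, 1, 31)))
```

Note that this sketch counts adjacency after function words have been removed, so words separated only by a function word become neighbors; a real implementation might count collocations on the raw input instead.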
Fig. 8 is a structural schematic of a computer text input system of the present invention. Referring to Fig. 8, the computer text input system comprises:
User thesaurus 610, used to store the user's words.
Code input matching module 801, used to detect the code entered by the user through the keyboard and to search the user thesaurus for words matching the input code.
Candidate display module 802, used to sort the words found by the code input matching module by word frequency and to display those whose frequency meets the specified word-frequency condition as input candidates.
Text input module 803, used to determine the final word from the candidates according to the selection instruction entered by the user and to enter that word into the computer.
Thesaurus input module 804, used to record the text entered by the text input module 803 in the user thesaurus and perform word-frequency statistics; the counted frequencies can be recorded in the user thesaurus 610.
Thesaurus maintenance module 600, which stores the function-word library. It judges whether any word in the user thesaurus is identical to a function word in the preset function-word library and, if so, deletes that word from the user thesaurus; it also analyzes the word frequencies in the user thesaurus and merges more than one word whose collocation frequency meets a particular requirement. For the detailed structure of this thesaurus maintenance module, see the description of Fig. 6 and Fig. 7.
In addition, the thesaurus input module 804 further comprises an input time recording module (not marked in Fig. 8), which records the corresponding input time when text entered into the computer is recorded in the user thesaurus. The code input matching module further comprises a word-frequency input time storage unit (not marked in Fig. 8). Using the information held in this storage unit, the code input matching module searches the user thesaurus for words that fall within the specified word-frequency input time period and match the input code.
The above are merely preferred embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention.

Claims (15)

1. A thesaurus maintenance method for a computer text input system, characterized in that the method stores a function-word library in advance and comprises:
recording the text information entered into the computer in a user thesaurus, and counting the word frequency of the input;
searching the user thesaurus for words identical to function words in the function-word library and, if any are found, deleting them from the user thesaurus;
analyzing the word frequencies in the user thesaurus, and merging more than one word whose collocation frequency meets a particular requirement.
2. The method according to claim 1, characterized in that the method sets in advance analysis report templates corresponding to parts of speech, and further comprises:
sorting the user thesaurus by word frequency to generate a word-frequency list;
selecting from the word-frequency list the words that meet a specified word-frequency condition;
judging the part of speech of each selected word, selecting the analysis report template corresponding to that part of speech, and filling the selected word into the corresponding analysis report template to generate an analysis report.
3. The method according to claim 2, characterized in that the text input system has a network account, and the method further comprises, after generating the analysis report:
detecting an upload command entered by the user;
when the upload command is detected, judging whether the network account of the current text input system is associated with a network account of the service server specified by the user; if so, performing the next step; otherwise, ending the flow;
uploading the analysis report to the service server corresponding to the network account.
4. The method according to claim 3, characterized in that the method sets in advance upload templates corresponding to the service types of the service server;
the analysis report is uploaded as follows: reading the upload template corresponding to the service type associated with the text input system, filling the user's network account and the content of the analysis report into the upload template, and uploading the content of the analysis report to the corresponding network server in the specific format of the upload template.
5. The method according to claim 1, characterized in that the word-frequency analysis and merging are performed as follows: judging whether the collocation frequency of a word with another word is greater than a specific ratio of the total word frequency of the collocated word; if so, merging the words that collocate with each other.
6. The method according to any one of claims 1 to 5, characterized in that,
when the text entered into the computer is recorded in the user thesaurus, the corresponding input time is also recorded; and, in the subsequent maintenance of the user thesaurus, a pre-stored word-frequency analysis time period is read first, the words in the user thesaurus that fall within the word-frequency analysis time period are selected, and the selected words are then handled as the processing objects of the subsequent steps.
7. A thesaurus maintenance device of a computer text input system, characterized by comprising:
a thesaurus input module, used to record the text entered into the computer through the text input system in a user thesaurus and to perform word-frequency statistics;
a first maintenance module, which stores a function-word library and is used to judge whether any word in the user thesaurus is identical to a function word in the preset function-word library and, if so, to delete that word from the user thesaurus;
a second maintenance module, used to analyze the word frequencies in the user thesaurus and to merge more than one word whose collocation frequency meets a particular requirement.
8. The thesaurus maintenance device according to claim 7, characterized in that the thesaurus maintenance device further comprises a sorting module, used to sort the words in the user thesaurus by word frequency and generate a word-frequency list.
9. The thesaurus maintenance device according to claim 8, characterized in that the thesaurus maintenance device further comprises an analysis report auto-generation module, which stores analysis report templates corresponding to parts of speech and is used to select from the word-frequency list the words that meet a specified word-frequency condition, judge the part of speech of each selected word, select the analysis report template corresponding to that part of speech, and fill the selected word into the corresponding analysis report template to generate an analysis report.
10. The thesaurus maintenance device according to claim 9, characterized in that the thesaurus maintenance device further comprises a one-key upload module, used to detect an upload command entered by the user; once the upload command is detected, to judge whether the network account of the local text input system is associated with a network account of the service server specified by the user; and, when it is, to upload the analysis report to the service server corresponding to the network account.
11. The thesaurus maintenance device according to claim 10, characterized in that,
the user thesaurus further comprises an input time recording module, used to record the corresponding input time when text entered into the computer is recorded in the user thesaurus;
the thesaurus maintenance device further comprises a third maintenance module, which stores a word-frequency analysis time period and is used to select the words in the user thesaurus that fall within the word-frequency analysis time period, the selected words serving as the processing objects of the first maintenance module and the second maintenance module.
12. A computer text input method, characterized by comprising:
a. recording the text entered into the computer in a user thesaurus, and performing word-frequency statistics;
b. judging whether any word in the user thesaurus is identical to a function word in a preset function-word library and, if so, deleting that word from the user thesaurus; and analyzing the word frequencies in the user thesaurus and merging more than one word whose collocation frequency meets a particular requirement;
c. when the user's input code is detected, searching the user thesaurus for words matching the input code;
d. sorting the words found by word frequency, and displaying those whose frequency meets a specified word-frequency condition as input candidates;
e. determining the final word from the candidates according to the selection instruction entered by the user to complete the input, and returning to step a.
13. The computer text input method according to claim 12, characterized in that the method stores in advance a specified word-frequency input time period; in step a, when the text entered into the computer is recorded in the user thesaurus, the corresponding input time is also recorded; and in step c, when the user's input code is detected, the user thesaurus is searched for words that fall within the specified word-frequency input time period and match the input code.
14. A computer text input system, characterized in that the system comprises:
a user thesaurus, used to store the user's words;
a code input matching module, used to detect the user's input code and to search the user thesaurus for words matching the input code;
a candidate display module, used to sort the words found by the code input matching module by word frequency and to display those whose frequency meets a specified word-frequency condition as input candidates;
a text input module, used to determine the final word from the candidates according to the selection instruction entered by the user to complete the input;
a thesaurus input module, used to record the text entered by the text input module in the user thesaurus and to perform word-frequency statistics;
a thesaurus maintenance module, which stores a function-word library and is used to judge whether any word in the user thesaurus is identical to a function word in the preset function-word library and, if so, to delete that word from the user thesaurus; and to analyze the word frequencies in the user thesaurus and merge more than one word whose collocation frequency meets a particular requirement.
15. The computer text input system according to claim 14, characterized in that,
the thesaurus input module further comprises an input time recording module, used to record the corresponding input time when text entered into the computer is recorded in the user thesaurus;
the code input matching module further comprises a word-frequency input time storage unit; using the information held in the word-frequency input time storage unit, the code input matching module searches the user thesaurus for words that fall within the specified word-frequency input time period and match the input code.
CN2008100562380A 2008-01-15 2008-01-15 Computer words input method and system and its word library maintenance method and device Active CN101216854B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008100562380A CN101216854B (en) 2008-01-15 2008-01-15 Computer words input method and system and its word library maintenance method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008100562380A CN101216854B (en) 2008-01-15 2008-01-15 Computer words input method and system and its word library maintenance method and device

Publications (2)

Publication Number Publication Date
CN101216854A true CN101216854A (en) 2008-07-09
CN101216854B CN101216854B (en) 2010-07-14

Family

ID=39623286

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008100562380A Active CN101216854B (en) 2008-01-15 2008-01-15 Computer words input method and system and its word library maintenance method and device

Country Status (1)

Country Link
CN (1) CN101216854B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1045342C (en) * 1993-12-07 1999-09-29 张飞鹏 Chinese character input method and keyboard thereof
CN1115887A (en) * 1994-07-23 1996-01-31 王希曾 Sentence input method in Chinese character input system of computer
CN101064018A (en) * 2006-08-31 2007-10-31 中华人民共和国上海国际机场出入境检验检疫局 HSEncoding computer automatically enquiring system

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101694608B (en) * 2008-12-04 2012-07-04 北京搜狗科技发展有限公司 Input method and system of same
CN102141886B (en) * 2010-01-29 2016-04-20 诺基亚技术有限公司 Method for editing text and equipment
WO2011091603A1 (en) * 2010-01-29 2011-08-04 Nokia Corporation Method and device for facilitating text editing and related computer program product and computer readable medium
US10534445B2 (en) 2010-01-29 2020-01-14 Nokia Technologies Oy Method and device for facilitating text editing and related computer program product and computer readable medium
CN102141886A (en) * 2010-01-29 2011-08-03 诺基亚公司 Method and equipment for text edition and computer program product and computer readable medium
CN103257718A (en) * 2012-02-17 2013-08-21 腾讯科技(深圳)有限公司 Chinese character input method, device and system
CN103257718B (en) * 2012-02-17 2018-05-29 深圳市世纪光速信息技术有限公司 Chinese character input method, equipment and system
CN103631387A (en) * 2012-08-27 2014-03-12 百度国际科技(深圳)有限公司 Candidate word display time acquiring method and device and input method testing method and device
CN103246703A (en) * 2013-04-03 2013-08-14 百度在线网络技术(北京)有限公司 Method and equipment for determining application word banks
CN104536976A (en) * 2014-12-05 2015-04-22 苏州沃斯麦机电科技有限公司 Associating input system based on Sudoku input mode
CN107665206A (en) * 2016-07-27 2018-02-06 北京搜狗科技发展有限公司 Clear up method, system and the device for clearing up user thesaurus of user thesaurus
CN106406565A (en) * 2016-09-29 2017-02-15 维沃移动通信有限公司 Vocabulary input method for mobile terminal and mobile terminal
CN106951104A (en) * 2017-02-13 2017-07-14 北京奇虎科技有限公司 A kind of entry processing method and device based on dictionary
CN108628461A (en) * 2017-03-16 2018-10-09 北京搜狗科技发展有限公司 A kind of input method and device, a kind of method and apparatus of update dictionary
CN108628461B (en) * 2017-03-16 2022-07-08 北京搜狗科技发展有限公司 Input method and device and method and device for updating word stock
CN109408796A (en) * 2017-08-17 2019-03-01 北京搜狗科技发展有限公司 A kind of information processing method, device and electronic equipment
CN109408796B (en) * 2017-08-17 2022-11-01 北京搜狗科技发展有限公司 Information processing method and device and electronic equipment
CN109271040A (en) * 2018-08-06 2019-01-25 北京三个逗号科技有限公司 Construct method, input method, computer storage medium and the terminal of user thesaurus
CN111722730A (en) * 2020-06-23 2020-09-29 平安医疗健康管理股份有限公司 Character input method, device and equipment based on all-in-one machine and readable storage medium
CN113378539A (en) * 2021-06-29 2021-09-10 华南理工大学 Template recommendation method for standard document compiling

Also Published As

Publication number Publication date
CN101216854B (en) 2010-07-14

Similar Documents

Publication Publication Date Title
CN101216854B (en) Computer words input method and system and its word library maintenance method and device
CN110362370B (en) Webpage language switching method and device and terminal equipment
CN101183281B (en) Method for inputting word related to candidate word in input method and system
JP3175399B2 (en) Card data management device
CN101566984B (en) Search engine used in personal hand-held equipment and resource search method
CN101167075B (en) Characteristic expression extracting device, method, and program
CN101246494B (en) Internet web page conversion method, system and equipment
CN110377908B (en) Semantic understanding method, semantic understanding device, semantic understanding equipment and readable storage medium
Müller et al. Multi-level annotation in MMAX
CN101645087A (en) Classified word bank system and updating and maintaining method thereof and client side
CN103268313A (en) Method and device for semantic analysis of natural language
CN102163213B (en) Voice browsing method and browser
CN101887414A (en) Server for automatically scoring the evaluation conveyed by a text message containing pictorial symbols
CN101645086B (en) Retrieval method
CN103914513A (en) Entity input method and device
WO2001022277A2 (en) Language translation using a constrained grammar in the form of structured sentences
CN101561725B (en) Method and system of fast handwriting input
CN103365992A (en) Method for realizing dictionary search of Trie tree based on one-dimensional linear space
JP2015187831A (en) Information processing system and information processing method for character input prediction
CN109508448A (en) Method, medium, device and computing equipment for generating short information based on a long article
CN114064851A (en) Multi-machine retrieval method and system for government office documents
CN106294460A (en) Chinese speech keyword retrieval method based on a character and word hybrid language model
JP5323652B2 (en) Similar word determination method and system
US9495352B1 (en) Natural language determiner to identify functions of a device equal to a user manual
WO2011067463A1 (en) Weight-ordered enumeration of referents and cutting off lengthy enumerations

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant