CN108874175A - A kind of data processing method, device, equipment and medium - Google Patents

Publication number
CN108874175A
Authority
CN
China
Prior art keywords
word
dictionary
upper screen
time
period
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810637434.0A
Other languages
Chinese (zh)
Inventor
孟可丰
贺亮
马鸣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810637434.0A
Publication of CN108874175A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods

Abstract

Embodiments of the invention disclose a data processing method, apparatus, device and medium, and relate to the technical fields of computers and information retrieval. The method includes: determining an attenuation ratio for each word in a dictionary according to the dictionary upper-screen count (the number of times words in the dictionary are committed to the screen) within the upper-screen period of that word, where the upper-screen period is the period between at least one most recent upper-screen time of the word and the most recent upper-screen time of the dictionary; and updating the weight of each word in the dictionary according to the attenuation ratio of the word. The embodiments provide a data processing method, apparatus, device and medium that manage old words in a dictionary and solve the problem of old words interfering with the user's normal input.

Description

A data processing method, apparatus, device and medium
Technical field
Embodiments of the present invention relate to the technical fields of computers and information retrieval, and in particular to a data processing method, apparatus, device and medium.
Background
With the rise of the digital age, people are increasingly accustomed to handling data, information and documents electronically, and daily communication increasingly takes place through email and instant messaging software. The input method, as the tool users "write" with on electronic devices, therefore plays an ever more important role in people's study, work and life.
To improve input efficiency, mainstream input methods learn the words a user has entered (commonly called self-created words) and record them for later use. With this technique the user does not have to spell out a self-created word character by character again. Self-created words with high weights are also placed at the front of the candidate word sequence so that they are easy to select, which greatly reduces the user's input cost. The weight of a self-created word is determined by how frequently the word is used.
However, the longer a user uses an input method, the more self-created words are created and pile up at the front of the candidate list. Some old self-created words come to rank ahead of the word the user actually wants to enter, which interferes with normal input and reduces input efficiency.
Summary of the invention
Embodiments of the present invention provide a data processing method, apparatus, device and medium to manage old words in a dictionary and solve the problem of old words interfering with the user's normal input.
In a first aspect, an embodiment of the invention provides a data processing method, including:
determining the attenuation ratio of each word in a dictionary according to the dictionary upper-screen count within the upper-screen period of the word, where the upper-screen period is the period between at least one most recent upper-screen time of the word and the most recent upper-screen time of the dictionary;
updating the weight of each word in the dictionary according to the attenuation ratio of the word.
In a second aspect, an embodiment of the invention further provides a data processing apparatus, including:
an attenuation ratio determining module, configured to determine the attenuation ratio of each word in a dictionary according to the dictionary upper-screen count within the upper-screen period of the word, where the upper-screen period is the period between at least one most recent upper-screen time of the word and the most recent upper-screen time of the dictionary;
a weight update module, configured to update the weight of each word in the dictionary according to the attenuation ratio of the word.
In a third aspect, an embodiment of the invention further provides a device, including:
one or more processors;
a storage device for storing one or more programs,
where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data processing method described in any embodiment of the invention.
In a fourth aspect, an embodiment of the invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the data processing method described in any embodiment of the invention.
Embodiments of the invention attenuate the weight of each word in the dictionary according to the number of times words in the dictionary went on screen in the period between the word's most recent upper-screen time and the dictionary's most recent upper-screen time. This prevents old self-created words from being placed at the front of the candidate word sequence and affecting the user's input.
In addition, counting the words in the dictionary that went on screen within a period requires less computation than calculating the time span between a word's most recent upper-screen time and the current time. Because a dictionary stores a large number of words, attenuating word weights based on this count reduces the computational load on the system.
Brief description of the drawings
Fig. 1 is a flowchart of a data processing method provided by Embodiment one of the invention;
Fig. 2 is a flowchart of a data processing method provided by Embodiment two of the invention;
Fig. 3 is a flowchart of a data processing method provided by Embodiment three of the invention;
Fig. 4 is a flowchart of a data processing method provided by Embodiment four of the invention;
Fig. 5 is a schematic structural diagram of a data processing apparatus provided by Embodiment five of the invention;
Fig. 6 is a schematic structural diagram of a device provided by Embodiment six of the invention.
Detailed description
The invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention rather than the entire structure.
Embodiment one
Fig. 1 is a flowchart of a data processing method provided by Embodiment one of the invention. This embodiment is applicable to managing old words in a dictionary; typically, the words are self-created words. The method can be executed by a data processing apparatus, which can be implemented in software and/or hardware. Referring to Fig. 1, the data processing method provided by this embodiment includes:
S110: determine the attenuation ratio of each word in the dictionary according to the dictionary upper-screen count within the upper-screen period of the word.
Here, the upper-screen period is the period between at least one most recent upper-screen time of the word and the most recent upper-screen time of the dictionary. The dictionary upper-screen count is the number of times words in the dictionary go on screen.
Upper screen (going on screen) means that, in response to the user's selection signal, a candidate word chosen from the candidate word sequence is entered into the edited text displayed on the screen. A candidate word is a word in a preset dictionary that matches the key sequence to be matched, and the key sequence to be matched is generated from the key signals entered by the user. The candidate word sequence is formed by sorting the candidate words by weight.
Specifically, the dictionary upper-screen count within a word's upper-screen period is positively correlated with the word's attenuation ratio: the larger the count, the longer the word has gone unused, the larger its attenuation ratio and the stronger the attenuation. This keeps the word from sitting at the front of the candidate word sequence and interfering with the user's normal input.
Specifically, determining the attenuation ratio of each word according to the dictionary upper-screen count within the word's upper-screen period includes:
determining the word's most recent upper-screen time from at least the two most recent upper-screen times of the word;
determining the attenuation ratio of the word according to the dictionary upper-screen count within the upper-screen period, where the upper-screen period is the period between the word's most recent upper-screen time and the dictionary's most recent upper-screen time.
Optionally, the word's most recent upper-screen time can be determined by choosing the latest of its at least two most recent upper-screen times.
To judge the attenuation ratio more accurately, the word's most recent upper-screen time can instead be determined as the mean of its at least two most recent upper-screen times.
Determining the time this way means the attenuation ratio also takes into account how frequently the word went on screen before its latest upper screen, which reflects the word's actual usage more accurately and thus improves the accuracy of the attenuation ratio.
For example, suppose the current time is moment 6, the first word went on screen once at moment 1 and once at moment 5, and the second word went on screen once at moment 3 and once at moment 4. If the upper-screen period is taken as the period between the first word's most recent upper-screen time (moment 5) and the dictionary's most recent upper-screen time (moment 5), the dictionary upper-screen count within that period is 0.
If instead the upper-screen period is taken as the period between the mean of the first word's two most recent upper-screen times (moments 1 and 5, whose mean is moment 3) and the dictionary's most recent upper-screen time (moment 5), that is, the period between moment 3 and moment 5, the dictionary upper-screen count within that period is 1.
It follows that if the word went on screen infrequently before its latest upper screen, that is, with long intervals between upper screens, the upper-screen period determined from the mean time becomes longer and the dictionary upper-screen count within it becomes larger; otherwise the period becomes shorter and the count smaller. Since a larger dictionary upper-screen count within the period means a larger attenuation ratio, a word with a lower upper-screen frequency before its latest upper screen gets a larger count and hence a larger attenuation ratio.
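The two ways of choosing the start of the upper-screen period can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and counting only events strictly between the two endpoints is an assumption chosen because it reproduces the counts 0 and 1 in the worked example.

```python
from bisect import bisect_left, bisect_right
from statistics import mean


def reference_time(word_times, use_mean=False):
    """Return the word's recent upper-screen time: the latest one, or the mean."""
    return mean(word_times) if use_mean else max(word_times)


def dictionary_upper_screen_count(dictionary_times, start, end):
    """Count dictionary upper-screen events strictly between start and end.

    dictionary_times must be sorted. Excluding both endpoints is an assumption
    made so that the counts match the worked example in the text.
    """
    if end <= start:
        return 0
    return bisect_left(dictionary_times, end) - bisect_right(dictionary_times, start)


# Worked example: word 1 goes on screen at moments 1 and 5, word 2 at moments 3 and 4.
dictionary_times = [1, 3, 4, 5]
dictionary_last = dictionary_times[-1]
word1_times = [1, 5]

count_latest = dictionary_upper_screen_count(
    dictionary_times, reference_time(word1_times), dictionary_last)
count_mean = dictionary_upper_screen_count(
    dictionary_times, reference_time(word1_times, use_mean=True), dictionary_last)
print(count_latest, count_mean)  # 0 1
```

Keeping the dictionary's upper-screen times sorted lets each count be found with two binary searches instead of a scan.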
S120: update the weight of each word in the dictionary according to the attenuation ratio of the word.
Specifically, updating the weight of each word in the dictionary according to the attenuation ratio of the word includes:
determining the weight of the word from its word frequency, where the word frequency is the number of times the word went on screen within a set period;
attenuating the determined weight by the word's attenuation ratio;
storing the attenuated weight as the word's new weight, thereby completing the weight update for each word in the dictionary.
For example, again with the current time at moment 6, the first word on screen once each at moments 1 and 5, and the second word on screen once each at moments 3 and 4, the update of the second word under this embodiment proceeds as follows: the upper-screen period is determined from the second word's most recent upper-screen time (moment 4) and the dictionary's most recent upper-screen time (moment 5), that is, the period between moment 4 and moment 5; the dictionary upper-screen count within that period is determined to be 0; because the count is 0, the attenuation ratio of the second word is determined to be 0, so its weight is not attenuated; the weight of the second word is then updated according to this attenuation ratio.
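The update in S120 can be sketched as below. The text only requires that the attenuation ratio grow with the dictionary upper-screen count, so the linear mapping, the parameter full_decay_count and both function names are assumptions made for illustration.

```python
def attenuation_ratio(count, full_decay_count=10):
    """Map a dictionary upper-screen count to a ratio in [0, 1].

    The linear mapping and full_decay_count are assumptions; the text only
    states that a larger count gives a larger ratio.
    """
    return min(1.0, count / full_decay_count)


def updated_weight(word_frequency, count):
    """Use the word frequency as the base weight and attenuate it by the ratio."""
    return word_frequency * (1.0 - attenuation_ratio(count))


print(updated_weight(8, 0))  # 8.0: a count of 0 leaves the weight unchanged
print(updated_weight(8, 5))  # 4.0
```

With a count of 0, as for the second word in the example, the ratio is 0 and the weight stays equal to the frequency-based weight.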
The technical solution of this embodiment attenuates the weight of each word in the dictionary according to the number of times words in the dictionary went on screen in the period between the word's most recent upper-screen time and the dictionary's most recent upper-screen time. This prevents old self-created words from being placed at the front of the candidate word sequence and affecting the user's input.
In addition, counting the words in the dictionary that went on screen within a period requires less computation than calculating the time span between a word's most recent upper-screen time and the current time. Because a dictionary stores a large number of words, attenuating word weights based on this count reduces the computational load on the system.
To clean old words out of the dictionary, after the weight of each word in the dictionary is updated according to the attenuation ratio of the word, the method further includes:
if the attenuated weight of a word is less than a set removal weight threshold, deleting the word from the dictionary.
The removal weight threshold can be set according to actual needs.
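The cleanup step can be sketched as a simple filter. The function name, the dictionary layout (word mapped to weight) and the threshold value are hypothetical.

```python
def prune(weights, removal_threshold):
    """Return a copy of the dictionary without words whose weight is below the threshold."""
    return {word: w for word, w in weights.items() if w >= removal_threshold}


weights = {"alpha": 4.0, "beta": 0.2, "gamma": 1.0}
print(prune(weights, removal_threshold=0.5))  # {'alpha': 4.0, 'gamma': 1.0}
```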
Further, the method also includes:
attenuating the weight of a candidate word in the candidate word sequence according to the dictionary upper-screen count within the candidate word's upper-screen period, and/or according to the time span between at least one most recent upper-screen time of the candidate word and the current time;
determining the position of the candidate word in the candidate word sequence according to the attenuated weight.
Specifically, the weight of the candidate word in the dictionary could be updated with the attenuated weight. However, because frequent updates would make candidate word weights decay too fast, the attenuated weight recomputed before the candidate word sequence is built is used only for sorting, and the weight of the candidate word in the dictionary is not updated.
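A minimal sketch of sorting with attenuated weights while leaving the stored weights untouched is given below. The function name and the example ratios are hypothetical; how the ratios are obtained is left to the earlier steps.

```python
def rank_candidates(candidates, stored_weights, ratios):
    """Sort candidates by attenuated weight, highest first, without modifying stored_weights."""
    def attenuated(word):
        return stored_weights[word] * (1.0 - ratios.get(word, 0.0))
    return sorted(candidates, key=attenuated, reverse=True)


stored = {"old": 10.0, "new": 6.0}
ratios = {"old": 0.8, "new": 0.0}  # hypothetical attenuation ratios
print(rank_candidates(["old", "new"], stored, ratios))  # ['new', 'old']
print(stored)  # {'old': 10.0, 'new': 6.0}
```

The attenuated weight exists only inside the sort key, so repeated lookups of the same candidate do not compound the decay.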
Embodiment two
Fig. 2 is a flowchart of a data processing method provided by Embodiment two of the invention. This embodiment is an optional solution proposed on the basis of the above embodiment. Referring to Fig. 2, the data processing method provided by this embodiment includes:
S210: determine the attenuation ratio of each word in the dictionary according to the dictionary upper-screen count within the upper-screen period of the word and the time span between at least one most recent upper-screen time of the word and the current time.
Similarly, the time span between at least one most recent upper-screen time of a word and the current time is positively correlated with the word's attenuation ratio: the longer the time span, the longer the word has gone unused and the larger its attenuation ratio.
Specifically, the time span between at least the two most recent upper-screen times of a word and the current time can be determined by choosing the latest of those upper-screen times as the most recent upper-screen time, and then determining the time span between that time and the current time.
To judge the attenuation ratio more accurately, the time span can instead be determined by taking the mean of the at least two most recent upper-screen times as the most recent upper-screen time, and then determining the time span between that time and the current time.
Determining the time span this way means the attenuation ratio considers not only the time elapsed between the word's latest upper screen and the current time but also the word's upper-screen frequency before its latest upper screen, which reflects the word's actual usage more accurately and improves the accuracy of the attenuation ratio.
Specifically, determining the attenuation ratio of each word according to the dictionary upper-screen count within the word's upper-screen period and the time span between at least one most recent upper-screen time of the word and the current time includes:
if the dictionary upper-screen count within the word's upper-screen period is greater than a first set count threshold, and the time span between at least one most recent upper-screen time of the word and the current time is greater than a first set time-span threshold, determining that the attenuation ratio of the word is relatively large;
if the dictionary upper-screen count within the word's upper-screen period is less than a second set count threshold, and the time span between at least one most recent upper-screen time of the word and the current time is greater than a second set time-span threshold, determining that the attenuation ratio of the word is 100%, that is, the weight of the word becomes 0, so that the candidate word sequence is reordered according to new input to match the user's new input preferences.
Here, the first set count threshold is greater than the second set count threshold, and the second set time-span threshold is greater than or equal to the first set time-span threshold.
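The two threshold rules can be sketched as a single function. Every numeric value below is a placeholder, the function and parameter names are hypothetical, and the fallback ratio for cases that match neither rule is an assumption, since the text does not specify it.

```python
def combined_attenuation_ratio(count, idle_time, *,
                               first_count=100, second_count=10,
                               first_time=30.0, second_time=90.0,
                               large_ratio=0.5, default_ratio=0.1):
    """Return an attenuation ratio from the upper-screen count and the idle time span.

    All threshold and ratio values are placeholders. The text requires
    first_count > second_count and second_time >= first_time.
    """
    if first_count <= second_count or second_time < first_time:
        raise ValueError("thresholds violate the ordering required by the text")
    if count < second_count and idle_time > second_time:
        return 1.0  # few upper screens and a long idle span: weight drops to 0
    if count > first_count and idle_time > first_time:
        return large_ratio  # the "relatively large" ratio of the first rule
    return default_ratio  # assumption: mild decay when neither rule applies


print(combined_attenuation_ratio(5, 120.0))   # 1.0
print(combined_attenuation_ratio(200, 45.0))  # 0.5
```

Because the first count threshold is greater than the second, the two rules can never both match, so the order of the checks does not affect the result.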
The above method effectively solves the following problem:
When the user shelves the data processing apparatus for a long time, no input is performed during that time, so the dictionary upper-screen count within the upper-screen period does not increase, and the weights of the old words that were in the apparatus before it was shelved cannot be attenuated.
The method can also identify different usage habits, and different attenuation strategies can be designed for words according to the user's habits.
Specifically, if the dictionary upper-screen count within the upper-screen period is large while the time span between at least one most recent upper-screen time of the word and the current time is short, the user performs input operations frequently, and the attenuation of word weights can be increased appropriately so that the weights of words in the dictionary are updated in time.
Based on this technical teaching, those skilled in the art can readily conceive of various methods for attenuating old words in a dictionary, and this embodiment places no restriction on them.
S220: update the weight of each word in the dictionary according to the attenuation ratio of the word.
In the technical solution of this embodiment, combining the dictionary upper-screen count within each word's upper-screen period with the time span between at least one most recent upper-screen time of the word and the current time allows the word's actual usage to be judged accurately. The weight of the word is attenuated accordingly, which achieves accurate management of old self-created words in the dictionary.
Embodiment three
Fig. 3 is a flowchart of a data processing method provided by Embodiment three of the invention. This embodiment is an optional solution proposed on the basis of the above embodiments. Referring to Fig. 3, the data processing method provided by this embodiment includes:
S310: if the total number of times words in the dictionary have gone on screen is greater than a set upper-screen count threshold, obtain, for each word in the dictionary, the dictionary upper-screen count within the upper-screen period of the word.
Optionally, attenuation of the words in the dictionary can be triggered by many conditions, for example when a set time interval elapses or when the number of words in the dictionary exceeds a set threshold.
Here, management of the self-created word dictionary is triggered when the total number of times words in the dictionary have gone on screen exceeds the set upper-screen count threshold. Compared with triggering on the number of words in the dictionary, triggering on the upper-screen count also allows old words to be managed in a dictionary that holds few words.
S320: determine the attenuation ratio of each word in the dictionary according to the dictionary upper-screen count within the upper-screen period of the word.
S330: update the weight of each word in the dictionary according to the attenuation ratio of the word.
In the technical solution of this embodiment, management of the self-created word dictionary is triggered when the total number of times words in the dictionary have gone on screen exceeds the set upper-screen count threshold. Compared with triggering on the number of words in the dictionary, triggering on the upper-screen count allows old words to be managed accurately and promptly even in a dictionary that holds few words.
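The trigger in S310 can be sketched as a small counter. The class name is hypothetical, and resetting the total after each decay pass is an assumption; the text does not say whether the total restarts.

```python
class DecayTrigger:
    """Signal a decay pass once the total upper-screen count exceeds a set threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.total = 0

    def record_upper_screen(self):
        """Record one upper-screen event; return True when a decay pass should run."""
        self.total += 1
        if self.total > self.threshold:
            self.total = 0  # assumption: restart counting after each pass
            return True
        return False


trigger = DecayTrigger(threshold=3)
results = [trigger.record_upper_screen() for _ in range(5)]
print(results)  # [False, False, False, True, False]
```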
Embodiment four
Fig. 4 is a flowchart of a data processing method provided by Embodiment four of the invention. This embodiment is an optional solution proposed on the basis of the above embodiments, taking self-created words as an example. Referring to Fig. 4, the data processing method provided by this embodiment includes:
obtaining the key sequence entered by the user and judging whether the key sequence meets the self-creation condition, where the self-creation condition is the condition for judging whether the word corresponding to the key sequence is a self-created word;
if so, storing the self-created word in association with its related information in the self-created word dictionary, where the related information includes at least the word frequency of the self-created word, and the associated storage can take the form of a tuple;
if the self-created word attenuation condition is met, attenuating the weights of the self-created words in the self-created word dictionary according to their related information, and cleaning up the self-created word dictionary according to the attenuation result.
Specifically, attenuating the weights of the self-created words in the self-created word dictionary according to their related information includes:
if the current value of the self-incrementing counter is greater than a set count threshold, enumerating the self-created words in the self-created word dictionary and determining the attenuated weight of each self-created word from its current word frequency, its time index and the current value of the self-incrementing counter.
Here, each time a self-created word in the self-created word dictionary goes on screen, the self-incrementing counter is automatically incremented by one, and the value of the counter at that moment is stored as the time index of that self-created word. The current value of the counter therefore represents the total upper-screen count of the self-created word dictionary at the current time, and the time index of a self-created word represents the total upper-screen count of the self-created word dictionary when that word went on screen.
Determining the attenuated weight of a self-created word from its current word frequency, its time index and the current value of the self-incrementing counter includes:
determining the weight of the self-created word from its current word frequency;
determining the attenuation ratio from the difference between the counter value recorded in the self-created word's time index at its latest upper screen and the current value of the self-incrementing counter;
attenuating the weight of the self-created word by the determined attenuation ratio.
Here, the difference between the counter value recorded in the time index at the latest upper screen and the current value of the self-incrementing counter reflects the upper-screen count of the self-created word dictionary within the upper-screen period of the self-created word.
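The counter and time index scheme can be sketched as follows. The class name, the storage layout and the linear mapping from counter difference to attenuation ratio (with the hypothetical parameter full_decay_count) are assumptions; the text only says the ratio is determined from the difference.

```python
class SelfCreatedLexicon:
    """Self-created word store holding, per word, its frequency and time index."""

    def __init__(self, full_decay_count=10):
        self.counter = 0   # self-incrementing counter: total upper-screen count
        self.entries = {}  # word -> [frequency, time_index]
        self.full_decay_count = full_decay_count  # hypothetical tuning value

    def upper_screen(self, word):
        """Record one upper-screen event and stamp the word with the counter value."""
        self.counter += 1
        entry = self.entries.setdefault(word, [0, 0])
        entry[0] += 1
        entry[1] = self.counter

    def attenuated_weight(self, word):
        """Frequency-based weight, attenuated by how far the counter has advanced."""
        frequency, time_index = self.entries[word]
        gap = self.counter - time_index
        ratio = min(1.0, gap / self.full_decay_count)
        return frequency * (1.0 - ratio)


lexicon = SelfCreatedLexicon()
for word in ["a", "a", "b", "b", "b", "b", "b"]:
    lexicon.upper_screen(word)
print(lexicon.attenuated_weight("a"))  # 1.0
print(lexicon.attenuated_weight("b"))  # 5.0
```

Storing one integer per word avoids keeping a log of upper-screen events: the count within a word's upper-screen period is a single subtraction.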
Optionally, the self-created word attenuation condition includes but is not limited to: the attenuated weight being too small, or the difference between the time index and the current value of the self-incrementing counter being too large.
Further, when a self-created word in the self-created word dictionary meets the user's current input condition and thus becomes a candidate word, the current word frequency and time index of the candidate word, stored as a triple in the self-created word dictionary, and the current value of the self-incrementing counter are obtained; the current word frequency, the time index and the current counter value are fitted to determine the attenuated weight of the candidate word; and the position of the candidate word in the candidate sequence is determined according to the attenuated weight.
Here, fitting means a calculation of any form based on at least one of the self-created word's current word frequency, its time index and the current value of the self-incrementing counter, used to determine the attenuated weight of the self-created word.
In practical applications, the attenuation of self-created word weights can also be described as follows:
if the self-created word attenuation condition is met (for example, the number of self-created words in the self-created word dictionary exceeds a set threshold), adjusting the weights of all self-created words in the self-created word dictionary so as to attenuate them, where the adjustment includes but is not limited to additive adjustment, multiplicative adjustment, exponential replacement, power adjustment and mixed adjustment.
This method has the following effect: each time the attenuation condition is met, the weights of the self-created words in the dictionary are attenuated once. If a self-created word is used frequently after attenuation, its weight rises again with its higher word frequency. If it is used rarely or not at all after attenuation, its weight does not rise, and after repeated attenuation the word may be deleted from the self-created word dictionary because its weight is too low.
In practical applications, the attenuation of self-created word weights can also be described as follows:
if the self-created word attenuation condition is met, enumerating each self-created word in the self-created word dictionary and determining its attenuated weight from the fitting result of its current word frequency and its recent upper-screen timestamps.
Here, the recent upper-screen timestamps can be the sequence of timestamps of several recent upper screens, or the timestamp of the latest upper screen alone.
Specifically, determining the attenuated weight of a self-created word from the fitting result of its current word frequency and its recent upper-screen timestamps includes:
determining the weight of the self-created word from its current word frequency;
determining the attenuation ratio from the difference between the timestamp of the self-created word's latest upper screen and the current time;
attenuating the weight of the self-created word by the determined attenuation ratio.
Correspondingly, the condition cleared up self-word creation library includes but is not limited to:After decaying the weight of self-word creation it is too small or The difference of the timestamp and current time that shield on self-word creation the last time is excessive etc..
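The three steps above, together with the clean-up conditions, can be sketched as follows. This is a minimal illustration only: the class name, the use of the raw word frequency as the base weight, the exponential half-life form of the attenuation ratio, and all constants are assumptions of the sketch and are not specified in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SelfCreatedWord:
    text: str
    frequency: int          # current word frequency
    last_commit_ts: float   # timestamp of the most recent commit, in seconds


HALF_LIFE_SECONDS = 30 * 24 * 3600.0   # assumed: the weight halves every 30 days
MIN_WEIGHT = 0.5                        # assumed clean-up weight threshold
MAX_AGE_SECONDS = 365 * 24 * 3600.0     # assumed clean-up age threshold


def decayed_weight(word: SelfCreatedWord, now: float) -> float:
    # Step 1: weight from the current word frequency (assumed identity mapping).
    base_weight = float(word.frequency)
    # Step 2: attenuation ratio from the time since the most recent commit.
    age = max(0.0, now - word.last_commit_ts)
    ratio = 0.5 ** (age / HALF_LIFE_SECONDS)
    # Step 3: decay the weight by the ratio.
    return base_weight * ratio


def clean_up(library: List[SelfCreatedWord], now: float) -> List[SelfCreatedWord]:
    # Keep a word only if its decayed weight is large enough and it is not too old.
    return [
        w for w in library
        if decayed_weight(w, now) >= MIN_WEIGHT
        and now - w.last_commit_ts <= MAX_AGE_SECONDS
    ]
```

A word with frequency 8 that was last committed exactly one half-life ago would, under these assumptions, receive a decayed weight of 4.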
Similarly, when a self-created word in the self-created word library matches the user's current input and thus becomes a candidate word, the current word frequency and timestamp sequence of the self-created word, which is stored as a triple in the self-created word library, are obtained together with the current system time. The current word frequency, the timestamp sequence, and the current system time are fitted to determine the weight of the self-created word after decay, and the position of the candidate word in the candidate sequence is determined according to this weight.
The technical solution of this embodiment of the present invention decays the weights of the self-created words in the self-created word library according to information related to those words, and cleans up the self-created word library according to the decay results. This achieves automatic management and removal of old self-created words, so that self-created words a user has long since abandoned no longer interfere with normal input, which improves the user's input efficiency. It should be noted that, given the technical teaching of this embodiment, those skilled in the art would be motivated to combine the solutions of any of the implementations described in the above embodiments in order to manage old words in the dictionary.
Embodiment five
Fig. 5 is a schematic structural diagram of a data processing device provided by Embodiment 5 of the present invention. This embodiment is an optional solution proposed on the basis of the above embodiments. Referring to Fig. 5, the data processing device provided by this embodiment includes an attenuation ratio determining module 10 and a weight update module 20.
The attenuation ratio determining module 10 is configured to determine the attenuation ratio of each word in the dictionary according to the number of dictionary commits (upper-screen events) within the commit period of the word, where the commit period is the period between at least one most recent commit time of the word and the most recent commit time of the dictionary.
The weight update module 20 is configured to update the weight of each word in the dictionary according to the attenuation ratio of the word.
In the technical solution of this embodiment of the present invention, the weight of a word in the dictionary is decayed according to the number of word commits in the dictionary during the period between the most recent commit time of the word and the most recent commit time of the dictionary. This prevents old self-created words from being ranked at the front of the candidate word sequence and interfering with the user's input.
In addition, computing the number of word commits in the dictionary within a period requires less computation than computing the time span between a word's most recent commit time and the current time. Because the dictionary stores a large number of words, decaying the weights of words in the dictionary based on the number of word commits within the period reduces the system's computational load.
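One way to picture the commit-count-based decay is to let the dictionary keep a running commit counter and let each word remember the counter value at its own most recent commit; the number of dictionary commits within the word's commit period is then a single subtraction. The following Python sketch illustrates this under stated assumptions: the class and method names, the per-commit weight gain, and the half-life form of the attenuation ratio are hypothetical and are not taken from the text.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Entry:
    weight: float
    last_commit_index: int   # dictionary counter value at this word's last commit


@dataclass
class CountingDictionary:
    entries: Dict[str, Entry] = field(default_factory=dict)
    total_commits: int = 0   # stands in for the dictionary's most recent commit time

    def commit(self, word: str, gain: float = 1.0) -> None:
        # Record one commit of `word` and advance the dictionary counter.
        self.total_commits += 1
        entry = self.entries.setdefault(word, Entry(0.0, self.total_commits))
        entry.weight += gain
        entry.last_commit_index = self.total_commits

    def commits_in_period(self, word: str) -> int:
        # Dictionary commits since this word was last committed.
        return self.total_commits - self.entries[word].last_commit_index

    def attenuation_ratio(self, word: str, half_life_commits: float = 100.0) -> float:
        # Assumed form: the ratio halves for every `half_life_commits` commits.
        return 0.5 ** (self.commits_in_period(word) / half_life_commits)

    def decay_all(self, half_life_commits: float = 100.0) -> None:
        # Update every weight once; repeated calls compound with earlier ones.
        for word, entry in self.entries.items():
            entry.weight *= self.attenuation_ratio(word, half_life_commits)
```

In this sketch a word that has just been committed has a commit count of zero in its period and therefore keeps its full weight, while a word that has not been committed during the last 100 dictionary commits has its weight halved.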
Further, the attenuation ratio determining module includes a commit time determination unit and an attenuation ratio determination unit.
The commit time determination unit is configured to determine the most recent commit time of each word in the dictionary according to at least two most recent commit times of the word.
The attenuation ratio determination unit is configured to determine the attenuation ratio of the word according to the number of dictionary commits within the commit period, where the commit period is the period between the most recent commit time of the word and the most recent commit time of the dictionary.
Further, the attenuation ratio determining module includes a time-combined attenuation unit.
The time-combined attenuation unit is configured to determine the attenuation ratio of each word in the dictionary according to the number of dictionary commits within the commit period of the word and the length of time between at least one most recent commit time of the word and the current time.
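As a hedged illustration of how the commit count and the elapsed time might be combined into one attenuation ratio, the sketch below multiplies two half-life factors. The multiplicative combination, the function name, and both half-life constants are assumptions of the sketch; the text only states that both quantities are used.

```python
def combined_attenuation_ratio(
    commits_in_period: int,
    seconds_since_commit: float,
    half_life_commits: float = 100.0,
    half_life_seconds: float = 30 * 24 * 3600.0,
) -> float:
    """Combine a commit-count factor and an elapsed-time factor (assumed product form)."""
    count_ratio = 0.5 ** (commits_in_period / half_life_commits)
    time_ratio = 0.5 ** (seconds_since_commit / half_life_seconds)
    return count_ratio * time_ratio
```

With this form, a word ages both when the user types many other words and when real time passes without any input at all.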
Further, the data processing device also includes a trigger condition judgment module.
The trigger condition judgment module is configured to, before the attenuation ratio of each word in the dictionary is determined according to the number of dictionary commits within the commit period of the word, obtain the number of dictionary commits within the commit period of each word in the dictionary if the total number of word commits in the dictionary is greater than a set commit count threshold.
Further, the data processing device also includes a word cleaning module.
The word cleaning module is configured to, after the weight of each word in the dictionary is updated according to the attenuation ratio of the word, delete a word from the dictionary if the weight of the word after decay is less than a set removal weight threshold.
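The trigger condition and the subsequent clean-up can be pictured together as one decay pass. In the following sketch the function name, the plain-dictionary data layout, and the threshold values used in the example are hypothetical; the attenuation ratios are taken as given inputs.

```python
from typing import Dict


def decay_pass(
    weights: Dict[str, float],
    ratios: Dict[str, float],
    total_commits: int,
    commit_threshold: int,
    removal_threshold: float,
) -> Dict[str, float]:
    """Run one decay pass and return the surviving words with their new weights."""
    # Trigger condition: do nothing until enough commits have accumulated.
    if total_commits <= commit_threshold:
        return dict(weights)
    # Weight update: scale each weight by that word's attenuation ratio.
    decayed = {word: weight * ratios[word] for word, weight in weights.items()}
    # Clean-up: drop words whose decayed weight falls below the removal threshold.
    return {word: weight for word, weight in decayed.items() if weight >= removal_threshold}
```

For instance, with weights `{"a": 4.0, "b": 1.0}`, ratios `{"a": 1.0, "b": 0.25}`, a commit threshold of 1000 and a removal threshold of 0.5, a pass run at 1001 total commits keeps only `"a"`.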
Further, the data processing device also includes a weight attenuation module and a candidate word sorting module.
The weight attenuation module is configured to decay the weight of a candidate word in the candidate word sequence according to the number of dictionary commits within the commit period of the candidate word.
The candidate word sorting module is configured to determine the order of the candidate words in the candidate word sequence according to the decayed weights.
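A minimal sketch of candidate ordering by decayed weight is given below. The tuple layout of a candidate, the function name, and the half-life form of the decay are assumptions made for illustration only.

```python
from typing import List, Tuple


def rank_candidates(
    candidates: List[Tuple[str, float, int]],
    half_life_commits: float = 100.0,
) -> List[str]:
    """Order candidates given as (word, weight, dictionary commits in the word's period)."""
    scored = [
        (weight * 0.5 ** (commits / half_life_commits), word)
        for word, weight, commits in candidates
    ]
    # Highest decayed weight first; ties keep their original relative order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in scored]
```

Under these assumptions, a heavily weighted but long-unused word can fall behind a lightly weighted word that was committed recently.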
Embodiment six
Fig. 6 is a schematic structural diagram of an equipment provided by Embodiment 6 of the present invention. Fig. 6 shows a block diagram of an exemplary equipment 12 suitable for implementing embodiments of the present invention. The equipment 12 shown in Fig. 6 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in Fig. 6, the equipment 12 takes the form of a general-purpose computing device. The components of the equipment 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The equipment 12 typically includes a variety of computer system readable media. These media may be any available media that can be accessed by the equipment 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. The equipment 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, a storage system 34 may be provided for reading from and writing to a non-removable, non-volatile magnetic medium (not shown in Fig. 6, commonly called a "hard disk drive"). Although not shown in Fig. 6, a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (such as a "floppy disk") and an optical disk drive for reading from and writing to a removable, non-volatile optical disk (such as a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data, each of which, or some combination of which, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present invention.
The equipment 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, and so on), with one or more devices that enable a user to interact with the equipment 12, and/or with any device (such as a network card or a modem) that enables the equipment 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. The equipment 12 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 20. As shown, the network adapter 20 communicates with the other modules of the equipment 12 through the bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the equipment 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example implementing the data processing method provided by the embodiments of the present invention.
Embodiment seven
Embodiment 7 of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the program implements the data processing method described in any embodiment of the present invention, the method including:
determining the attenuation ratio of each word in the dictionary according to the number of dictionary commits (upper-screen events) within the commit period of the word, where the commit period is the period between at least one most recent commit time of the word and the most recent commit time of the dictionary;
updating the weight of each word in the dictionary according to the attenuation ratio of the word.
The computer storage medium of the embodiments of the present invention may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to the above embodiments; it may include further equivalent embodiments without departing from the inventive concept, and the scope of the present invention is determined by the scope of the appended claims.

Claims (14)

1. A data processing method, characterized by comprising:
determining the attenuation ratio of each word in a dictionary according to the number of dictionary commits (upper-screen events) within the commit period of the word, wherein the commit period is the period between at least one most recent commit time of the word and the most recent commit time of the dictionary;
updating the weight of each word in the dictionary according to the attenuation ratio of the word.
2. The method according to claim 1, characterized in that determining the attenuation ratio of each word in the dictionary according to the number of dictionary commits within the commit period of the word comprises:
determining the most recent commit time of each word in the dictionary according to at least two most recent commit times of the word;
determining the attenuation ratio of the word according to the number of dictionary commits within the commit period, wherein the commit period is the period between the most recent commit time of the word and the most recent commit time of the dictionary.
3. The method according to claim 1, characterized in that determining the attenuation ratio of each word in the dictionary according to the number of dictionary commits within the commit period of the word comprises:
determining the attenuation ratio of each word in the dictionary according to the number of dictionary commits within the commit period of the word and the length of time between at least one most recent commit time of the word and the current time.
4. The method according to claim 1, characterized in that, before determining the attenuation ratio of each word in the dictionary according to the number of dictionary commits within the commit period of the word, the method further comprises:
if the total number of word commits in the dictionary is greater than a set commit count threshold, obtaining the number of dictionary commits within the commit period of each word in the dictionary.
5. The method according to claim 1, characterized in that, after updating the weight of each word in the dictionary according to the attenuation ratio of the word, the method further comprises:
if the weight of a word after decay is less than a set removal weight threshold, deleting the word from the dictionary.
6. The method according to any one of claims 1 to 5, characterized by further comprising:
decaying the weight of a candidate word in a candidate word sequence according to the number of dictionary commits within the commit period of the candidate word;
determining the order of the candidate words in the candidate word sequence according to the decayed weights.
7. A data processing device, characterized by comprising:
an attenuation ratio determining module, configured to determine the attenuation ratio of each word in a dictionary according to the number of dictionary commits (upper-screen events) within the commit period of the word, wherein the commit period is the period between at least one most recent commit time of the word and the most recent commit time of the dictionary;
a weight update module, configured to update the weight of each word in the dictionary according to the attenuation ratio of the word.
8. The device according to claim 7, characterized in that the attenuation ratio determining module comprises:
a commit time determination unit, configured to determine the most recent commit time of each word in the dictionary according to at least two most recent commit times of the word;
an attenuation ratio determination unit, configured to determine the attenuation ratio of the word according to the number of dictionary commits within the commit period, wherein the commit period is the period between the most recent commit time of the word and the most recent commit time of the dictionary.
9. The device according to claim 7, characterized in that the attenuation ratio determining module comprises:
a time-combined attenuation unit, configured to determine the attenuation ratio of each word in the dictionary according to the number of dictionary commits within the commit period of the word and the length of time between at least one most recent commit time of the word and the current time.
10. The device according to claim 7, characterized by further comprising:
a trigger condition judgment module, configured to, before the attenuation ratio of each word in the dictionary is determined according to the number of dictionary commits within the commit period of the word, obtain the number of dictionary commits within the commit period of each word in the dictionary if the total number of word commits in the dictionary is greater than a set commit count threshold.
11. The device according to claim 7, characterized by further comprising:
a word cleaning module, configured to, after the weight of each word in the dictionary is updated according to the attenuation ratio of the word, delete a word from the dictionary if the weight of the word after decay is less than a set removal weight threshold.
12. The device according to any one of claims 7 to 11, characterized by further comprising:
a weight attenuation module, configured to decay the weight of a candidate word in a candidate word sequence according to the number of dictionary commits within the commit period of the candidate word;
a candidate word sorting module, configured to determine the order of the candidate words in the candidate word sequence according to the decayed weights.
13. An equipment, characterized in that the equipment comprises:
one or more processors;
a storage device, configured to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the data processing method according to any one of claims 1 to 6.
14. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the data processing method according to any one of claims 1 to 6.
CN201810637434.0A 2018-06-20 2018-06-20 A kind of data processing method, device, equipment and medium Pending CN108874175A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810637434.0A CN108874175A (en) 2018-06-20 2018-06-20 A kind of data processing method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN108874175A true CN108874175A (en) 2018-11-23

Family

ID=64340095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810637434.0A Pending CN108874175A (en) 2018-06-20 2018-06-20 A kind of data processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN108874175A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181123