CN110362686B - Word stock generation method and device, terminal equipment and server - Google Patents

Word stock generation method and device, terminal equipment and server

Info

Publication number
CN110362686B
Authority
CN
China
Prior art keywords
word stock
attribute
word
cell
stock
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810284199.3A
Other languages
Chinese (zh)
Other versions
CN110362686A (en
Inventor
罗治凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd filed Critical Beijing Sogou Technology Development Co Ltd
Priority to CN201810284199.3A priority Critical patent/CN110362686B/en
Publication of CN110362686A publication Critical patent/CN110362686A/en
Application granted granted Critical
Publication of CN110362686B publication Critical patent/CN110362686B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374Thesaurus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/242Dictionaries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the invention provides a method, a device, a terminal device and a server for generating a word stock, wherein the method comprises the following steps: receiving the cell word stocks uploaded by each terminal, wherein each cell word stock corresponds to an attribute; generating a public word stock according to the cell word stocks; and extracting word stock update information from the public word stock according to the attribute, and returning the word stock update information to the corresponding terminal so as to update the cell word stock of the terminal. By improving the efficiency with which the server generates the public word stock, the efficiency with which the terminal updates its word stock is improved; in addition, corresponding word stock update information can be generated for terminals with different attributes, which reduces the amount of data processed when the terminal updates its cell word stock and further improves the word stock update rate.

Description

Word stock generation method and device, terminal equipment and server
Technical Field
The present invention relates to the field of input methods, and in particular, to a method, an apparatus, a terminal device, and a server for generating a word stock.
Background
With the development of computer technology, electronic devices such as mobile phones and tablet computers are becoming more popular and bring great convenience to people's life, study and work. These electronic devices typically have an input method application (abbreviated as input method) installed so that a user can use the input method to input information; the candidate information that the input method recommends to the user is obtained by matching against the word stock of the input method.
Generally, after an initial word stock is generated, it needs to be optimized: the terminal collects the input information of the user and uploads it to the server; the server analyzes and processes a large amount of input information, optimizes the word frequencies of the words in the initial word stock according to the words obtained by the analysis, and then sends update information to each terminal; and each terminal updates its local word stock according to the received update information. Because the server takes a long time to process such a large amount of input information, the word stock update efficiency is low.
Disclosure of Invention
The embodiment of the invention provides a method for generating a word stock, which is used for improving the updating efficiency of the word stock.
Correspondingly, the embodiment of the invention also provides a word stock generating device, a terminal device and a server, which are used for guaranteeing the realization and the application of the method.
In order to solve the above problems, the embodiment of the invention discloses a method for generating a word stock, which specifically comprises the following steps: receiving the cell word stocks uploaded by each terminal, wherein each cell word stock corresponds to an attribute; generating a public word stock according to the cell word stocks; and extracting word stock update information from the public word stock according to the attribute, and returning the word stock update information to the corresponding terminal so as to update the cell word stock of the terminal.
Optionally, the cell word stock includes words and attribute values of the words, and generating the public word stock according to the cell word stocks includes: for each attribute, selecting words from the cell word stocks corresponding to that attribute and calculating the attribute values corresponding to the words; and generating the public word stock according to the words and the corresponding attribute values.
Optionally, calculating the attribute value corresponding to a word includes: for each word, acquiring its attribute values from the cell word stocks corresponding to the attribute; calculating, from these attribute values, the average attribute value of the word for the attribute; and taking the average attribute value as the attribute value of the word for the attribute.
Optionally, extracting word stock update information from the public word stock according to the attribute and returning it to the corresponding terminal includes: for each terminal, determining the attributes corresponding to the terminal according to the attributes of the cell word stocks uploaded by the terminal; and extracting, from the public word stock, the words corresponding to those attributes and the attribute values of the words, generating word stock update information according to the extracted words and attribute values, and returning the word stock update information to the terminal.
Optionally, the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
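The server-side steps above (fusing cell word stocks per attribute by averaging, then extracting per-terminal update information) can be sketched as follows. This is only an illustrative sketch: the function names, the tuple layout of an upload, and the sample words and values are assumptions made for the example and do not come from the patent text.

```python
from collections import defaultdict
from statistics import mean


def build_public_stock(uploads):
    """Fuse uploaded cell word stocks into a public word stock.

    `uploads` is a list of (terminal_id, attribute, cell_stock) tuples,
    where cell_stock maps word -> attribute value (hypothetical layout).
    """
    collected = defaultdict(lambda: defaultdict(list))
    for _terminal, attribute, cell_stock in uploads:
        for word, value in cell_stock.items():
            collected[attribute][word].append(value)
    # The public attribute value of a word is the average of its values
    # over all cell word stocks of that attribute that contain the word.
    return {
        attribute: {word: mean(values) for word, values in words.items()}
        for attribute, words in collected.items()
    }


def extract_update(public_stock, uploads, terminal_id):
    """Return the public entries for the attributes this terminal uploaded."""
    attributes = {attr for term, attr, _ in uploads if term == terminal_id}
    return {attr: dict(public_stock.get(attr, {})) for attr in attributes}


# Made-up sample data: two terminals share the "rainy day" attribute.
uploads = [
    ("A", "rainy day", {"umbrella": 0.75, "taxi": 0.4}),
    ("B", "rainy day", {"umbrella": 0.25}),
    ("B", "residence", {"parcel": 0.5}),
]
public = build_public_stock(uploads)
update_for_a = extract_update(public, uploads, "A")
```

In this sketch terminal A receives only the "rainy day" entries, because that is the only attribute it uploaded; this mirrors the stated goal of reducing the amount of data each terminal has to process.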
The embodiment of the invention also discloses a device for generating a word stock, which comprises: a word stock receiving module, used for receiving the cell word stocks uploaded by each terminal, wherein each cell word stock corresponds to an attribute; a word stock generating module, used for generating a public word stock according to the cell word stocks; and an extraction module, used for extracting word stock update information from the public word stock according to the attribute and returning the word stock update information to the corresponding terminal so as to update the cell word stock of the terminal.
Optionally, the word stock generating module includes: the attribute value determining submodule is used for selecting words from each cell word stock corresponding to each attribute according to each attribute, and calculating attribute values corresponding to the words; and the public word stock generating sub-module is used for generating a public word stock according to the words and the corresponding attribute values.
Optionally, the attribute value determining submodule is specifically configured to obtain, for each word, each corresponding attribute value from each cell word stock corresponding to the attribute; calculating an average attribute value corresponding to the attribute of the word according to the attribute values; and taking the average attribute value as an attribute value corresponding to the attribute of the word.
Optionally, the extraction module is specifically configured to determine, for each terminal, each attribute corresponding to the terminal according to the attribute corresponding to each cell word stock uploaded by the terminal; extracting words corresponding to all attributes and attribute values corresponding to the words from the public word stock, generating word stock update information according to the extracted words and the corresponding attribute values, and returning the word stock update information to the terminal.
Optionally, the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
The embodiment of the invention also discloses a server, which comprises a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs comprise instructions for: receiving a cell word stock uploaded by each terminal, wherein the cell word stock corresponds to the attribute; generating a public word stock according to the cell word stock; extracting word stock updating information from the public word stock according to the attribute, and returning the word stock updating information to the corresponding terminal so as to update the cell word stock of the terminal.
Optionally, the cell word stock includes words and attribute values of the words, and the generating the public word stock according to the cell word stock includes: selecting words from each cell word stock corresponding to each attribute aiming at each attribute, and calculating attribute values corresponding to the words; and generating a public word stock according to the words and the corresponding attribute values.
Optionally, the calculating the attribute value corresponding to the word includes: for each word, acquiring corresponding attribute values from each cell word stock corresponding to the attribute; calculating an average attribute value corresponding to the attribute of the word according to the attribute values; and taking the average attribute value as an attribute value corresponding to the attribute of the word.
Optionally, extracting word stock update information from the public word stock according to the attribute, and returning the information to the corresponding terminal, including: aiming at each terminal, determining each attribute corresponding to the terminal according to the attribute corresponding to each cell word stock uploaded by the terminal; extracting words corresponding to all attributes and attribute values corresponding to the words from the public word stock, generating word stock update information according to the extracted words and the corresponding attribute values, and returning the word stock update information to the terminal.
Optionally, the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
The embodiment of the invention also discloses a method for generating the word stock, which specifically comprises the following steps: acquiring input information and associated information, and determining a cell word stock corresponding to each attribute; uploading the cell word stock, wherein the cell word stock corresponds to the attribute; and receiving word stock updating information fed back by the server, and updating the corresponding attribute cell word stock according to the word stock updating information, wherein the word stock updating information is determined according to a public word stock of the server.
Optionally, the determining the cell word stock corresponding to each attribute includes: word segmentation processing is carried out on the input information to obtain corresponding words, and corresponding attributes are determined according to the associated information corresponding to the words; and generating a cell word stock corresponding to each attribute according to the words and the corresponding attributes.
Optionally, the generating a cell word stock corresponding to each attribute according to the word and each attribute, includes: aiming at each attribute, determining each word corresponding to the attribute and counting the frequency of each word; according to the frequency of each word, determining the correlation coefficient of each word and the attribute, and taking the correlation coefficient as the attribute value of the corresponding word; and generating a cell word stock corresponding to the attribute according to each word and the corresponding attribute value.
Optionally, the method further comprises the step of filtering the cell word stock: inquiring the cell word stock, and determining words with privacy labels in the cell word stock; and filtering words with privacy labels in the cell word stock.
Optionally, the updating the corresponding attribute cell word stock according to the word stock updating information includes: selecting words corresponding to the attributes from the word stock updating information according to the attributes of the cell word stock; and updating the attribute value of the corresponding word in the cell word stock by adopting the corresponding attribute value of the selected word, and/or adding the selected word and the corresponding attribute value into the cell word stock.
Optionally, the method further comprises: receiving an input sequence; acquiring association information corresponding to the input sequence, and determining one or more attributes according to the association information; matching candidate information corresponding to the input sequence according to the cell word stock corresponding to the one or more attributes; and displaying the candidate information.
Optionally, the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
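The terminal-side generation and filtering of cell word stocks described above can be sketched as follows. The source only states that the correlation coefficient is determined from the word frequency; the relative-frequency formula used here is an assumption, as are the function name, the `private_words` set standing in for privacy labels, and the sample observations.

```python
from collections import Counter, defaultdict


def build_cell_stocks(observations, private_words=frozenset()):
    """Build one cell word stock per attribute.

    `observations` is an iterable of (word, attribute) pairs, i.e. the
    result of word segmentation plus attribute lookup (hypothetical layout).
    """
    counts = defaultdict(Counter)
    for word, attribute in observations:
        counts[attribute][word] += 1

    stocks = {}
    for attribute, counter in counts.items():
        total = sum(counter.values())
        # Assumed correlation coefficient: the word's share of all words
        # observed under this attribute.
        stocks[attribute] = {
            word: freq / total
            for word, freq in counter.items()
            if word not in private_words  # filter privacy-labelled words
        }
    return stocks


# Made-up sample observations.
observations = [
    ("umbrella", "rainy day"),
    ("umbrella", "rainy day"),
    ("taxi", "rainy day"),
    ("password123", "rainy day"),
    ("parcel", "residence"),
]
stocks = build_cell_stocks(observations, private_words={"password123"})
```

Filtering before upload keeps privacy-labelled words on the terminal, so they never reach the server in this sketch.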
The embodiment of the invention also discloses a device for generating a word stock, which specifically comprises: a word stock determining module, used for acquiring input information and associated information and determining the cell word stock corresponding to each attribute; an uploading module, used for uploading the cell word stocks, wherein each cell word stock corresponds to an attribute; and an updating module, used for receiving word stock update information fed back by the server and updating the cell word stock of the corresponding attribute according to the word stock update information, wherein the word stock update information is determined according to the public word stock of the server.
Optionally, the word stock determining module includes: the analysis submodule is used for carrying out word segmentation processing on the input information to obtain corresponding words, and determining corresponding attributes according to the associated information corresponding to the words; and the cell word stock generation submodule is used for generating a cell word stock corresponding to each attribute according to the words and the corresponding attributes.
Optionally, the cell word stock generation submodule is specifically configured to determine, for each attribute, each word corresponding to the attribute and count the frequency of each word; according to the frequency of each word, determining the correlation coefficient of each word and the attribute, and taking the correlation coefficient as the attribute value of the corresponding word; and generating a cell word stock corresponding to the attribute according to each word and the corresponding attribute value.
Optionally, the device further comprises: a filtering module, used for querying the cell word stock and determining the words with privacy labels in the cell word stock; and filtering out the words with privacy labels from the cell word stock.
Optionally, the updating module is specifically configured to select, according to an attribute of a cell word stock, a word corresponding to the attribute from the word stock updating information; and updating the attribute value of the corresponding word in the cell word stock by adopting the corresponding attribute value of the selected word, and/or adding the selected word and the corresponding attribute value into the cell word stock.
Optionally, the device further comprises: a matching module, used for receiving an input sequence; acquiring the associated information corresponding to the input sequence, and determining one or more attributes according to the associated information; matching candidate information corresponding to the input sequence according to the cell word stocks corresponding to the one or more attributes; and displaying the candidate information.
Optionally, the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
A terminal device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for: acquiring input information and associated information, and determining a cell word stock corresponding to each attribute; uploading the cell word stock, wherein the cell word stock corresponds to the attribute; and receiving word stock updating information fed back by the server, and updating the corresponding attribute cell word stock according to the word stock updating information, wherein the word stock updating information is determined according to a public word stock of the server.
Optionally, the determining the cell word stock corresponding to each attribute includes: word segmentation processing is carried out on the input information to obtain corresponding words, and corresponding attributes are determined according to the associated information corresponding to the words; and generating a cell word stock corresponding to each attribute according to the words and the corresponding attributes.
Optionally, the generating a cell word stock corresponding to each attribute according to the word and each attribute, includes: aiming at each attribute, determining each word corresponding to the attribute and counting the frequency of each word; according to the frequency of each word, determining the correlation coefficient of each word and the attribute, and taking the correlation coefficient as the attribute value of the corresponding word; and generating a cell word stock corresponding to the attribute according to each word and the corresponding attribute value.
Optionally, the one or more programs further comprise instructions for filtering the cell word stock: querying the cell word stock, and determining the words with privacy labels in the cell word stock; and filtering out the words with privacy labels from the cell word stock.
Optionally, the updating the corresponding attribute cell word stock according to the word stock updating information includes: selecting words corresponding to the attributes from the word stock updating information according to the attributes of the cell word stock; and updating the attribute value of the corresponding word in the cell word stock by adopting the corresponding attribute value of the selected word, and/or adding the selected word and the corresponding attribute value into the cell word stock.
Optionally, further comprising instructions for: receiving an input sequence; acquiring association information corresponding to the input sequence, and determining one or more attributes according to the association information; matching candidate information corresponding to the input sequence according to the cell word stock corresponding to the one or more attributes; and displaying the candidate information.
Optionally, the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
A readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method for generating a word stock according to the embodiments of the present invention.
The embodiment of the invention has the following advantages:
The embodiment of the invention can receive the cell word stocks uploaded by each terminal and then generate the public word stock according to the cell word stocks; that is, the server only needs to fuse the cell word stocks uploaded by the terminals and does not need to analyze and process a large amount of input information, which greatly shortens the time the server takes to generate the public word stock and improves the efficiency of generating it. Because each cell word stock corresponds to an attribute, the server can extract word stock update information from the public word stock according to the attribute, that is, generate the word stock update information that matches the attributes of a terminal's cell word stocks, and then return it to the corresponding terminal. By improving the efficiency with which the server generates the public word stock, the efficiency with which the terminal updates its word stock is improved; in addition, corresponding word stock update information can be generated for terminals with different attributes, which reduces the amount of data processed when the terminal updates its cell word stock and further improves the word stock update rate.
Drawings
FIG. 1 is a flowchart of the steps of an embodiment of a terminal-side word stock generation method of the present invention;
FIG. 2 is a flowchart of the steps of an embodiment of a server-side word stock generation method of the present invention;
FIG. 3 is a flowchart of the steps of an alternative embodiment of a terminal-side word stock generation method of the present invention;
FIG. 4 is a flowchart of the steps of an alternative embodiment of a server-side word stock generation method of the present invention;
FIG. 5 is a flowchart of the steps of an embodiment of matching candidate information of the present invention;
FIG. 6 is a schematic diagram of an embodiment of data interaction between a terminal and a server of the present invention;
FIG. 7 is a block diagram of an embodiment of a terminal-side word stock generating device of the present invention;
FIG. 8 is a block diagram of an alternative embodiment of a terminal-side word stock generating device of the present invention;
FIG. 9 is a block diagram of an embodiment of a server-side word stock generating device of the present invention;
FIG. 10 is a block diagram of an alternative embodiment of a server-side word stock generating device of the present invention;
FIG. 11 is a block diagram of a terminal device for generating a word stock according to an exemplary embodiment;
FIG. 12 is a schematic structural diagram of a server for generating a word stock according to another exemplary embodiment of the present invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
One of the core ideas of the embodiment of the invention is that the terminal analyzes and processes the input information of the user to obtain cell word stocks and uploads them to the server; the server then fuses the cell word stocks uploaded by the terminals to generate a public word stock and returns word stock update information to the terminal, and the terminal updates its local cell word stocks according to the word stock update information. Because the amount of input information on any single terminal is far smaller than the amount of input information of all terminals combined, a terminal can analyze and process its own input information far more efficiently than a server can analyze and process the input information of all terminals; this improves the efficiency with which the server generates the public word stock and, in turn, the efficiency of updating the local word stock. The processing procedures of the terminal and of the server are described below.
The processing procedure of the terminal is described first.
Referring to FIG. 1, a flowchart of the steps of an embodiment of a terminal-side word stock generation method of the present invention is shown, which may specifically include the following steps:
Step 102, obtaining input information and associated information, and determining a cell word stock corresponding to each attribute.
Step 104, uploading the cell word stock, wherein the cell word stock corresponds to the attribute.
Step 106, receiving word stock updating information fed back by the server, and updating the corresponding attribute cell word stock according to the word stock updating information.
In the embodiment of the invention, the information that a user inputs is likely to differ from scene to scene. For example, after inputting "you", the user may habitually input "slightly waiting" during working time but a different phrase during non-working time; likewise, after inputting "you", the user may habitually input "drop bar" in game A but a different phrase in game B. Therefore, to ensure more accurate matching of candidate information against the word stock, the associated information corresponding to the input information can be acquired when the input information is acquired. The associated information describes the environment in which the input information was entered, such as the application currently in use, the current location, the current time, and the like. The input information may be information entered by the user through the input method, such as a candidate that the user commits to the screen, or information entered by a paste operation in an edit box of the application, which is not limited in the embodiment of the present invention.
Then, the input information and the associated information can be analyzed to determine the words corresponding to the input information (for example, the words corresponding to the input information "you eat" can be "you" and "you eat") and the attributes corresponding to the associated information. For example, if the associated information is the current weather: light rain turning to moderate rain, the corresponding attribute may be: rainy day; if the associated information is the current location: * Cell, the corresponding attribute is: residence; and if the associated information is the current date: 2018/1/1, the corresponding attribute is: New Year's Day; and so on. The attribute value of each word for each attribute is then determined, and a cell word stock corresponding to each attribute is generated according to the words and the corresponding attribute values; that is, a cell word stock can comprise words and the attribute values corresponding to the words. The cell word stocks correspond to attributes, and each cell word stock can correspond to one attribute. The attributes of the cell word stocks of different terminals can be the same or different, and cell word stocks with the same attribute on different terminals can be the same or different. Cell word stocks with different attributes can contain the same word; that is, the same word can correspond to multiple attributes. Tables 1 to 3 show the cell word stocks of three terminals. To represent the cell word stocks of a terminal intuitively, this example places the cell word stocks corresponding to multiple attributes in one data table and shows the words they have in common together with the corresponding attribute values, specifically as follows:
Cell word stock corresponding to terminal a:
TABLE 1
Cell word stock corresponding to terminal B:
TABLE 2
Cell word stock corresponding to terminal C:
TABLE 3
Wherein a10-a16, a20-a26 and a30-a36 in Table 1, b10-b15, b40-b45 and b70-b75 in Table 2, and c30-c34, c40-c44, c50-c54 and c80-c84 in Table 3 are all attribute values.
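The mapping from associated information to attributes that precedes the tables can be illustrated with a small sketch. The rules below simply encode the three examples given in the text (rainy weather, a residential location, and the date 2018/1/1, which is New Year's Day); the function name, parameter names and matching rules are assumptions made for illustration, not the patent's method.

```python
from datetime import date


def attributes_for(weather=None, location_type=None, today=None):
    """Map associated information to attribute labels (illustrative rules)."""
    attributes = []
    if weather is not None and "rain" in weather:
        attributes.append("rainy day")  # weather class
    if location_type == "residential community":
        attributes.append("residence")  # location class
    if today is not None and (today.month, today.day) == (1, 1):
        attributes.append("New Year's Day")  # time class
    return attributes


attrs = attributes_for(
    weather="light rain turning to moderate rain",
    location_type="residential community",
    today=date(2018, 1, 1),
)
```

Each returned label would select one cell word stock on the terminal, so a single piece of input can contribute to several cell word stocks at once.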
After the terminal of the embodiment of the invention generates the cell word stock corresponding to each attribute, it can upload the generated cell word stocks to the server, so that the server can generate the public word stock according to the cell word stocks uploaded by the terminals; how the server generates the public word stock is described in the following embodiments. After the server extracts word stock update information from the public word stock according to the attribute and returns it to the terminal, the terminal receives the word stock update information, compares it with its cell word stocks, and updates the cell word stock of the corresponding attribute accordingly, thereby completing the update of the local cell word stocks.
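The terminal-side update step can be sketched as follows: for each local cell word stock, the entries of the same attribute in the update information either overwrite the attribute value of an existing word or add a new word. The function name, the dictionary layout and the sample values are hypothetical.

```python
def apply_update(cell_stocks, update_info):
    """Update local cell word stocks in place from server update information.

    Both arguments map attribute -> {word: attribute value}. Only attributes
    for which the terminal already has a cell word stock are touched.
    """
    for attribute, local in cell_stocks.items():
        for word, value in update_info.get(attribute, {}).items():
            # Overwrites the value of an existing word, or adds a new word.
            local[word] = value
    return cell_stocks


# Made-up sample data.
local = {"rainy day": {"umbrella": 0.5, "taxi": 0.25}}
update = {
    "rainy day": {"umbrella": 0.6, "raincoat": 0.3},
    "residence": {"parcel": 0.9},  # ignored: no local stock for this attribute
}
apply_update(local, update)
```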
In summary, in the embodiment of the invention, the terminal can acquire the input information and the associated information, determine the cell word stock corresponding to each attribute according to them, and upload the cell word stocks to the server; after the server generates word stock update information, the terminal receives it and updates the corresponding cell word stock accordingly. Because the input information is analyzed and processed on the terminal, and only the resulting cell word stocks are uploaded to the server, the load on the server is reduced, the server generates the public word stock faster, and the terminal updates its local word stock more efficiently. In addition, because the terminal stores multiple cell word stocks, more accurate candidate information can be matched based on them, which improves the accuracy of the candidate information.
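How several cell word stocks might be combined when matching candidates can be sketched as follows. The source does not specify the matching or ranking rule; this sketch assumes simple prefix matching on the typed text and ranks by the highest attribute value among the currently active attributes. All names and sample values are hypothetical.

```python
def match_candidates(input_sequence, active_attributes, cell_stocks, limit=5):
    """Return candidate words for the input, highest attribute value first."""
    scores = {}
    for attribute in active_attributes:
        for word, value in cell_stocks.get(attribute, {}).items():
            # Assumed matching rule: the word starts with the typed text.
            if word.startswith(input_sequence):
                scores[word] = max(value, scores.get(word, 0.0))
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]


# Made-up sample data; only two of the three attributes are active.
cell_stocks = {
    "rainy day": {"umbrella": 0.5, "umpire": 0.1},
    "residence": {"umbrella": 0.2, "unit": 0.4},
    "working hours": {"umlaut": 0.9},
}
candidates = match_candidates("um", ["rainy day", "residence"], cell_stocks)
```

Words from cell word stocks whose attribute is not currently active ("working hours" here) are not offered, which is the sense in which attribute-specific stocks make the candidates more accurate.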
The processing procedure of the server is described as follows:
Referring to FIG. 2, a flowchart of the steps of an embodiment of a server-side word stock generation method according to the present invention is shown, which may specifically include the following steps:
Step 202, receiving a cell word stock uploaded by each terminal, wherein the cell word stock corresponds to an attribute.
Step 204, generating a public word stock according to the cell word stock.
Step 206, extracting word stock update information from the public word stock according to the attribute, and returning the word stock update information to the corresponding terminal.
In the embodiment of the invention, after each terminal uploads the corresponding cell word stock, the server can receive the cell word stock uploaded by each terminal, wherein the cell word stock corresponds to the attribute, and each cell word stock can correspond to one attribute. In the embodiment of the invention, the number of cell word banks uploaded by different terminals can be the same or different, for example, terminal A uploads 3 cell word banks and terminal C uploads 4 cell word banks; the attributes of the cell word libraries uploaded by the same terminal are mutually different, for example, 3 cell word libraries uploaded by the terminal A correspond to the attributes 1, 2 and 3 respectively, and the attributes of the cell word libraries uploaded by different terminals can be the same or different; for example, terminal a uploads 3 cell word banks, and the corresponding attributes are attribute 1, attribute 2, and attribute 3, respectively, and terminal B uploads 3 cell word banks, and the corresponding attributes are attribute 1, attribute 4, and attribute 7, respectively. And then the server can generate a public word stock according to the received cell word stock, namely the cell word stock can be classified according to the attribute of each cell word stock, the cell word stocks with the same attribute are fused, and the public word stock is generated according to the information after corresponding fusion of each attribute.
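As a non-limiting illustration of the classification step described above, the following Python sketch groups the uploaded cell word stocks by attribute before any fusion takes place. The data layout (a dictionary keyed by terminal and then by attribute), the function name `group_by_attribute` and the numeric attribute values are assumptions made for illustration and are not specified by the embodiment.

```python
from collections import defaultdict


def group_by_attribute(uploads):
    """Group cell word stocks by attribute.

    uploads: {terminal_id: {attribute: {word: attribute_value}}}
    (hypothetical layout, assumed for this sketch).
    Returns {attribute: [cell word stock, ...]}.
    """
    grouped = defaultdict(list)
    for cell_word_stocks in uploads.values():
        for attribute, cell_word_stock in cell_word_stocks.items():
            grouped[attribute].append(cell_word_stock)
    return dict(grouped)


# Terminal A uploads attributes 1, 2 and 3; terminal B uploads 1, 4 and 7.
# The numeric values are placeholders, not data from the embodiment.
uploads = {
    "A": {
        "attribute 1": {"good": 0.6},
        "attribute 2": {"good": 0.2},
        "attribute 3": {"good": 0.1},
    },
    "B": {
        "attribute 1": {"good": 0.4},
        "attribute 4": {"input": 0.3},
        "attribute 7": {"input": 0.5},
    },
}
grouped = group_by_attribute(uploads)
```

In this sketch only attribute 1 ends up with two cell word stocks to fuse; every other attribute has a single cell word stock that is carried over as it is.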
In one example of the present invention, the cell word stocks received by the server may be those corresponding to Tables 1 to 3 above, and a public word stock is then generated from them: the cell word stocks corresponding to attribute 1 are selected from terminal A and terminal B and fused, the cell word stocks corresponding to attribute 4 are selected from terminal B and terminal C and fused, and the cell word stocks corresponding to attribute 3 are selected from terminal A and terminal C and fused; then, a public word stock is generated according to the fused information and the cell word stocks of the other attributes of terminals A, B and C, as shown in Table 4:
TABLE 4
Wherein, G and G are attribute values, and the attribute value of each word corresponding to each attribute in the public word stock is determined according to the attribute value of the word in each cell word stock corresponding to the attribute.
In the embodiment of the invention, the public word stock contains the words in the cell word stocks uploaded by all terminals, but the attributes corresponding to the cell word stocks uploaded by different terminals may not be identical; therefore, the corresponding word stock update information can be extracted from the public word stock according to the attributes of the cell word stocks uploaded by a terminal, and then returned to that terminal. For example, if the attributes of the cell word stocks of terminal A are attribute 1, attribute 2 and attribute 3, the word stock update information extracted from the public word stock may be as shown in Table 5, where G10 may represent the attribute value of the word "good" for attribute 1, G20 the attribute value of "good" for attribute 2, G30 the attribute value of "good" for attribute 3, and so on:
TABLE 5
If the attributes of the cell word libraries uploaded by the two terminals are the same, the word library updating information returned to the two terminals is the same, and if the attributes of the cell word libraries uploaded by the two terminals are different, the word library updating information returned to the two terminals is different; and further, after the terminal receives the word stock updating information, the cell word stock with the corresponding attribute can be updated according to the word stock updating information.
In summary, the embodiment of the invention can receive the cell word stock uploaded by each terminal, and then generate the public word stock according to each cell word stock, namely the server only needs to fuse the cell word stock uploaded by each terminal without analyzing and processing a large amount of input information, thereby greatly shortening the time of the server for generating the public word stock and improving the efficiency of generating the public word stock; the cell word stock corresponds to the attribute, so that the server can extract word stock update information from the public word stock according to the attribute, namely, generate corresponding word stock update information according to the attribute of the cell word stock, and then return the corresponding word stock update information to the corresponding terminal; and the efficiency of updating the word stock by the terminal is improved by improving the efficiency of generating the public word stock by the server, corresponding word stock updating information can be generated aiming at terminals with different attributes, the data volume processed in the process of updating the word stock by the terminal cells is reduced, and the word stock updating rate is further improved.
In one embodiment of the invention, the cell word stock may include words and attribute values corresponding to the words, where the attribute values are correlation coefficients of the words and attributes corresponding to the cell word stock; the following describes the process of generating the cell word stock at the terminal side in detail, specifically as follows:
Referring to FIG. 3, a flowchart of the steps of an alternative embodiment of a terminal-side word stock generation method of the present invention is shown; the method specifically comprises the following steps:
Step 302, acquiring input information and associated information.
Step 304, word segmentation processing is carried out on the input information to obtain corresponding words, and corresponding attributes are determined according to the associated information corresponding to the words.
In the embodiment of the invention, a terminal can acquire input information and associated information, and then determine a cell word stock corresponding to each attribute according to the input information and the associated information; the association information may be obtained once when the input information is obtained once, or the association information may be obtained according to a set time length, and the input information obtained in the set time length corresponds to the association information obtained at this time. In the embodiment of the present invention, the association information may include multiple dimensions: weather dimension, location dimension, application dimension, traffic dimension, time dimension, and user dimension, although other dimensions such as application frequency, etc., are not limited herein; and the environment information corresponding to the input information can be comprehensively described from multiple dimensions.
After the input information and the corresponding associated information are obtained, they can be analyzed to determine the cell word stock corresponding to each attribute. Specifically, word segmentation can be performed on the input information to obtain the corresponding words. For example, syntactic and semantic analysis can be performed on the input information, and the resulting syntactic and semantic information used for segmentation: if the input information is "have you eaten", the words obtained after segmentation include "you" and "eaten". As another example, a binary word segmentation method can be applied to the input information: for the input information "tomorrow early", the words obtained by binary word segmentation include "tomorrow" and "early onset". The associated information corresponding to the input information is then used as the associated information of each word obtained from that input information, and the attribute corresponding to each word is determined according to the associated information of the word. The associated information of one dimension may correspond to one type of attribute; for example, the associated information of the weather dimension corresponds to attributes of the weather class, the associated information of the application dimension corresponds to attributes of the application class, the associated information of the time dimension corresponds to attributes of the time class, the associated information of the user dimension corresponds to attributes of the user class, and so on. Thus, the attributes include at least one of the following types: weather class, location class, application class, traffic class, time class and user class.
One type of attribute may include at least one attribute. For example, the associated information corresponding to the weather dimension may include: sunny, cloudy, overcast, light rain, moderate rain, heavy rain, light snow, moderate snow, heavy snow, and so on; sunny, cloudy and overcast can be grouped into one attribute, sunny day; light rain, moderate rain and heavy rain into one attribute, rainy day; and light snow, moderate snow and heavy snow into one attribute, snowy day. As another example, attributes of the location class may include: residence, office and on the road; attributes of the application class may include: chat application A, chat application B, game application A and question-answering application. As yet another example, the associated information corresponding to the user dimension may include the user's age, the user's occupation, and user tags such as hobby tags (e.g., game tags, music tags); for user age, the attribute of users under 20 years old may be determined as teenager, that of users aged 20-40 as young, that of users aged 40-60 as middle-aged, and that of users over 60 as old; for user occupation, attributes of the user class such as lawyer, doctor and teacher may be determined; for user tags, attributes of the user class such as game lover and music lover may be determined. Then, a cell word stock corresponding to each attribute is generated according to the words and the corresponding attribute values, specifically as in steps 306-310:
Step 306, for each attribute, determining each word corresponding to the attribute and counting the frequency of each word.
Step 308, determining a correlation coefficient between each word and the attribute according to the frequency of each word, and taking the correlation coefficient as an attribute value of the corresponding word.
Step 310, generating a cell word stock corresponding to the attribute according to each word and the corresponding attribute value.
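Before steps 306-310 can be applied, the raw associated information has to be mapped to attributes as described above. A minimal sketch of such a mapping for the weather dimension and the user-age dimension is given below; the function names, the English labels and the treatment of the boundary ages 20, 40 and 60 (assigned here to the older group) are assumptions for illustration only.

```python
# Hypothetical grouping of weather readings into weather-class attributes.
WEATHER_GROUPS = {
    "sunny day": {"sunny", "cloudy", "overcast"},
    "rainy day": {"light rain", "moderate rain", "heavy rain"},
    "snowy day": {"light snow", "moderate snow", "heavy snow"},
}


def weather_attribute(weather):
    """Return the weather-class attribute for a raw reading, or None if unknown."""
    for attribute, readings in WEATHER_GROUPS.items():
        if weather in readings:
            return attribute
    return None


def age_attribute(age):
    """Return the user-class attribute for an age in years.

    The embodiment only states 'under 20', '20-40', '40-60' and 'over 60';
    half-open intervals are an assumption of this sketch.
    """
    if age < 20:
        return "teenager"
    if age < 40:
        return "young"
    if age < 60:
        return "middle-aged"
    return "old"


attributes = [weather_attribute("light rain"), age_attribute(35)]
```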
In the embodiment of the invention, for each attribute, the words corresponding to the attribute are selected according to the attribute corresponding to each word, and the frequency of each word is counted; the correlation coefficient between each word and the attribute is then determined according to the frequency of the word, wherein the correlation coefficient represents the degree of correlation between the word and the attribute, and the higher the frequency of the word, the larger its correlation coefficient with the attribute. The correlation coefficient of the word is then used as the attribute value of the word for that attribute. For example, if the correlation coefficient between the word "eaten" and the attribute chat application A is 0.8, the attribute value of "eaten" for chat application A is 0.8; if the correlation coefficient between "eaten" and the attribute game application B is 0.1, the attribute value of "eaten" for game application B is 0.1. A cell word stock corresponding to the attribute is then generated according to each word and the corresponding attribute value; in a concrete implementation, the words corresponding to each attribute and their attribute values can be added to the cell word stock corresponding to that attribute, that is, the cell word stock comprises words and the attribute values corresponding to the words. Through steps 306-310, a cell word stock corresponding to each attribute is thus determined. For example, terminal A generates cell word stocks for three attributes: rainy day, residence and chat application A; reference can be made to Table 6:
TABLE 6
In an optional embodiment of the invention, when determining the cell word stock corresponding to each attribute, the first N words with the highest attribute value (N is a positive integer) can be selected, and then the cell word stock corresponding to the attribute is generated according to the selected words and the corresponding attribute values; namely, selecting words with high association degree with the attribute, and generating a cell word stock of the attribute.
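The counting, scoring and optional top-N selection described above can be sketched as follows. The embodiment only requires that a higher frequency yields a larger correlation coefficient; using the relative frequency of the word within the attribute as the coefficient is an assumption of this sketch, as are the function name `build_cell_word_stocks` and the sample observations.

```python
from collections import Counter, defaultdict


def build_cell_word_stocks(observations, top_n=None):
    """Build one cell word stock per attribute.

    observations: iterable of (word, [attribute, ...]) pairs.
    Returns {attribute: {word: attribute_value}}.
    """
    counts = defaultdict(Counter)
    for word, attributes in observations:
        for attribute in attributes:
            counts[attribute][word] += 1  # frequency of the word under the attribute

    cell_word_stocks = {}
    for attribute, counter in counts.items():
        total = sum(counter.values())
        # Assumed correlation coefficient: relative frequency within the attribute.
        ranked = sorted(
            ((word, count / total) for word, count in counter.items()),
            key=lambda item: item[1],
            reverse=True,
        )
        if top_n is not None:
            ranked = ranked[:top_n]  # keep only the N highest attribute values
        cell_word_stocks[attribute] = dict(ranked)
    return cell_word_stocks


observations = [
    ("eaten", ["chat application A", "rainy day"]),
    ("eaten", ["chat application A"]),
    ("tomorrow", ["chat application A"]),
    ("hello", ["chat application A"]),
]
cell_word_stocks = build_cell_word_stocks(observations, top_n=2)
```

Any other monotonically increasing function of the frequency could replace the relative frequency without changing the rest of the sketch.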
In an optional embodiment of the present invention, after word segmentation is performed on the input information to obtain the corresponding words, the terminal may record the above information (the preceding text) of each word. For example, if the input information "have you eaten" corresponds to the words "you" and "eaten", the above information of "eaten" is "you", that is, "eaten" is the following text of "you"; likewise, if another piece of input information corresponds to the words "your" and "where", the above information of "where" is "your", that is, "where" is the following text of "your". Then, the cell word stock corresponding to each attribute is determined according to each word, the corresponding attribute value and the corresponding above information. Specifically, after the words corresponding to an attribute are determined, the frequency with which each word appears as the following text of each piece of above information is counted according to the above information of the word; for example, for the word "eaten", the frequency of "eaten" as the following text of the above information "you" is counted, the frequency of "eaten" as the following text of the above information "your" is counted, and so on. The correlation coefficient of the word as the following text of each piece of above information is then determined according to these frequencies, and the cell word stock corresponding to each attribute is generated according to steps 308-310. In one example of the present invention, to better illustrate how the attribute values of a word differ when it serves as the following text of different pieces of above information in the cell word stock of each attribute, two tables may be used to show the attribute values of the same words as the following text of the two pieces of above information "you" and "your"; see Table 7 and Table 8:
TABLE 7
TABLE 8
In an optional embodiment of the invention, the terminal may further obtain an input sequence corresponding to the input information when obtaining the input information, then perform word segmentation on the input information to obtain corresponding words, and record an input sequence corresponding to each word segmentation, and then determine a cell word stock corresponding to each attribute according to each word, a corresponding attribute value and a corresponding input sequence; this is similar to the above method for determining the cell word stock corresponding to each attribute according to each word, the corresponding attribute value and the corresponding above information, and will not be described here again.
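For the variant in which attribute values depend on the above information, a minimal sketch is given below for the inputs observed under a single attribute. It counts how often each word follows each preceding word and, as an assumption of this sketch, uses the relative frequency as the attribute value; the function name `following_text_values` and the sample inputs are hypothetical.

```python
from collections import Counter, defaultdict


def following_text_values(segmented_inputs):
    """Compute attribute values keyed by above information.

    segmented_inputs: list of word lists, one list per piece of input information.
    Returns {above_information: {following_word: attribute_value}}.
    """
    counts = defaultdict(Counter)
    for words in segmented_inputs:
        # Pair every word with the word immediately before it.
        for above, following in zip(words, words[1:]):
            counts[above][following] += 1

    values = {}
    for above, counter in counts.items():
        total = sum(counter.values())
        values[above] = {word: count / total for word, count in counter.items()}
    return values


segmented_inputs = [
    ["you", "eaten"],
    ["you", "eaten"],
    ["you", "where"],
    ["your", "where"],
]
values = following_text_values(segmented_inputs)
```

Keying the same structure by input sequence instead of by above information would give the input-sequence variant mentioned above.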
In another embodiment of the invention, in order to protect the user's personal privacy, the privacy information in the cell word stock can be filtered out before the cell word stock is uploaded. When input information is acquired, whether the input information is privacy information can be detected, and if so, a privacy label is added to the input information; for example, if the current page is a login page, the input information is likely to be an account number, a password or the like, and the input information is determined to be privacy information. When the input information is segmented, whether the input information is a character string can also be judged; if so, the input information can be determined to be privacy information and a privacy label added to it. The privacy label is used to identify the input information as privacy information. After the input information is segmented, the privacy label of the input information is used as the privacy label of each word obtained by segmentation, so that privacy information can be filtered according to the privacy label.
Step 312, inquiring the cell word stock, and determining words with privacy labels in the cell word stock.
Step 314, filtering the words with privacy labels in the cell word stock.
Step 316, uploading the cell word stock.
According to the embodiment of the invention, each cell word stock can be queried, the words with privacy labels in each cell word stock are determined, and then the words with privacy labels are filtered out from the corresponding cell word stock; then uploading the filtered cell word stock to a server; and privacy information can be limited in the terminal, so that privacy disclosure is prevented.
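Steps 312-316 can be sketched as a filter applied to every cell word stock before upload. Representing the privacy labels as a set of labelled words, and the names `filter_private_words` and `private_words`, are assumptions of this sketch; the actual upload call is omitted because the embodiment does not specify a transport.

```python
def filter_private_words(cell_word_stocks, private_words):
    """Return copies of the cell word stocks without privacy-labelled words.

    cell_word_stocks: {attribute: {word: attribute_value}}
    private_words: set of words carrying a privacy label (hypothetical representation).
    """
    return {
        attribute: {
            word: value
            for word, value in cell_word_stock.items()
            if word not in private_words
        }
        for attribute, cell_word_stock in cell_word_stocks.items()
    }


local_stocks = {
    "chat application A": {"eaten": 0.8, "hunter2": 0.1},
    "rainy day": {"umbrella": 0.6},
}
private_words = {"hunter2"}  # e.g. text typed on a login page

to_upload = filter_private_words(local_stocks, private_words)
```

Because the function returns filtered copies, the local cell word stocks keep the labelled words while only the filtered copies leave the terminal.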
In summary, in the embodiment of the present invention, the terminal may obtain the input information and the associated information, determine the cell word stock corresponding to each attribute according to the input information and the associated information, and then upload the cell word stock to the server; after the server generates word stock updating information, receiving word stock updating information fed back by the server, and updating a corresponding cell word stock according to the word stock updating information; and then the input information is analyzed and processed by the terminal to generate a cell word stock and then is uploaded to the server, so that the load of the server is reduced, the speed of the server for generating the public word stock is increased, the efficiency of the server for generating word stock update information is improved, and the efficiency of the terminal for updating the local word stock is improved.
Further, the terminal stores a cell lexicon of a plurality of attributes including at least one of the following types: weather class, position class, application class, traffic class, time class and user class, more accurate candidate information can be matched based on various cell word libraries, and the accuracy of the candidate information is improved; and the user experience can be improved.
Thirdly, before the cell word stock is uploaded, the words with privacy labels in the cell word stock are found and filtered out, so that the user's privacy information stays within the terminal, privacy disclosure is prevented, and privacy security is guaranteed.
The server may further receive the cell word stock uploaded by each terminal, and generate a corresponding public word stock according to the cell word stock, and in one embodiment of the present invention, a process of generating the public word stock at the server side may be described in detail, which is specifically as follows:
Referring to FIG. 4, a flowchart of the steps of an alternative embodiment of a server-side word stock generation method of the present invention is shown; the method specifically comprises the following steps:
Step 402, receiving the cell word stocks uploaded by each terminal.
In the embodiment of the invention, the server can receive the cell word stock uploaded by each terminal, and the attribute of the received cell word stock can also comprise at least one of the following types: weather class, location class, application class, traffic class, time class, and user class; the cell word libraries uploaded by each terminal can be one or a plurality of, and the attributes of all cell word libraries uploaded by any two terminals can be completely the same or partially the same or completely different. Then, a public word stock can be generated according to the cell word stock, which is specifically as follows:
Step 404, selecting a word from each cell word stock corresponding to each attribute according to each attribute, and calculating an attribute value corresponding to the word.
Step 406, generating a public word stock according to the words and the corresponding attribute values.
In the embodiment of the invention, the cell word stock is generated according to the attribute, so that when the public word stock is generated according to the cell word stock, the cell word stock with the same attribute can be fused according to the attribute of the cell word stock, and then the public word stock is generated according to the fused information; the cell word stock comprises words and attribute values corresponding to the words. Determining each cell word stock corresponding to each attribute according to the attribute of each cell word stock, selecting words from each cell word stock corresponding to the attribute, and calculating attribute values corresponding to the selected words; then, a public word stock can be generated according to all words and corresponding attribute values, namely, each word and the corresponding attribute value are added into the public word stock.
Optionally, after selecting a word from each cell word stock corresponding to the attribute and calculating an attribute value corresponding to the selected word, a public cell word stock corresponding to the attribute can be generated according to the word corresponding to the attribute and the attribute value corresponding to the word, and further the public cell word stock corresponding to each attribute can be generated, and then the public cell word stock corresponding to each attribute is adopted to form a set to generate the public word stock.
Wherein, calculating the attribute value corresponding to the word may include the following sub-steps:
Substep 41, for each word, acquiring the corresponding attribute values from the cell word stocks corresponding to the attribute.
Substep 42, calculating an average attribute value of the word for the attribute according to the attribute values.
Substep 43, taking the average attribute value as the attribute value of the word for the attribute.
In the embodiment of the invention, each cell word stock corresponding to the attribute possibly contains the same word, so that each attribute value corresponding to each word is obtained from each cell word stock corresponding to the attribute; and then calculating an average attribute value corresponding to the attribute of the word according to each attribute value corresponding to the word, and taking the average attribute value as the attribute value corresponding to the attribute of the word.
For example, taking the cell word stocks corresponding to Tables 1 to 3 above as the cell word stocks received by the server, and taking attribute 1 as an example, words may be selected from the cell word stocks corresponding to attribute 1 in Table 1 and Table 2, and the attribute value of each word then calculated. Specifically, the attribute values of each word may be obtained from the cell word stocks corresponding to attribute 1 in Table 1 and Table 2; for example, the two attribute values of the word "good" are a10 and b10, so the average attribute value (a10+b10)/2 is calculated and taken as the attribute value of "good" for attribute 1. As another example, the word "input" exists only in Table 2, so its attribute value for attribute 1, b15, is obtained, and the corresponding average attribute value is b15. By analogy, the attribute values of the other words in the cell word stocks corresponding to attribute 1, and of the words in the cell word stocks corresponding to the other attributes, can be determined.
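The averaging of substeps 41-43 can be sketched as follows for the cell word stocks of one attribute. The numeric values stand in for a10, b10 and b15, which the embodiment leaves symbolic, and the function name `fuse_cell_word_stocks` is hypothetical.

```python
from collections import defaultdict


def fuse_cell_word_stocks(cell_word_stocks):
    """Fuse the cell word stocks of one attribute by averaging attribute values.

    cell_word_stocks: list of {word: attribute_value}, one per uploading terminal.
    Returns {word: average attribute_value}.
    """
    collected = defaultdict(list)
    for cell_word_stock in cell_word_stocks:
        for word, value in cell_word_stock.items():
            collected[word].append(value)  # substep 41: gather values per word
    # Substeps 42-43: the mean over the stocks that contain the word.
    return {word: sum(vals) / len(vals) for word, vals in collected.items()}


terminal_a_attr1 = {"good": 0.6}                # placeholder for a10
terminal_b_attr1 = {"good": 0.4, "input": 0.3}  # placeholders for b10 and b15

public_attr1 = fuse_cell_word_stocks([terminal_a_attr1, terminal_b_attr1])
```

A word that occurs in only one cell word stock, such as "input" here, keeps its single value, which matches the b15 case described above.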
Optionally, if the attribute values of the words in the cell word stock depend on the above information, then for each piece of above information of each word, the attribute values of the word as the following text of that above information are obtained from the cell word stocks corresponding to the attribute, and the corresponding average attribute value is calculated. Similarly, if the attribute values of the words in the cell word stock depend on the input sequence, the attribute value of each word is determined in the same way. The public word stock is then generated according to the words and the corresponding attribute values.
For example, referring to table 9, table 9 is a specific example of a common word stock corresponding to when the above information is "you":
TABLE 9
For example, referring to table 10, table 10 is a specific example of a common thesaurus corresponding to when the input sequence is "ni":
TABLE 10
Step 408, for each terminal, determining each attribute corresponding to the terminal according to the attribute corresponding to each cell word stock uploaded by the terminal.
Step 410, extracting words and corresponding attribute values of words corresponding to each attribute from the public word stock, generating word stock update information according to the extracted words and the corresponding attribute values, and returning to the terminal.
In the embodiment of the invention, as the cell word libraries uploaded by each terminal are likely to be not identical, the corresponding word library update information is also likely to be not identical; therefore, the word stock updating information corresponding to each terminal can be extracted from the public word stock according to the attribute. Specifically, for each terminal, each attribute corresponding to the terminal can be determined according to the attribute corresponding to each cell word stock uploaded by the terminal; extracting words corresponding to all attributes from a public word stock and attribute values corresponding to the words, taking the extracted words and the corresponding attribute values as word stock updating information corresponding to the terminal, and returning to the terminal; that is, the word bank update information corresponding to each terminal may include words of each attribute and attribute values of the words.
Optionally, if the public word stock includes a plurality of public cell word stocks, the public cell word stock corresponding to each attribute may be extracted from the public word stock according to each attribute corresponding to the terminal, and then the extracted public cell word stock is used as word stock update information corresponding to the terminal and sent to the terminal.
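A minimal sketch of steps 408-410 is given below for the variant in which the public word stock is a set of public cell word stocks, one per attribute. The dictionary layout, the function name `extract_update_info` and the sample values are assumptions for illustration.

```python
def extract_update_info(public_word_stock, terminal_attributes):
    """Extract the word stock update information for one terminal.

    public_word_stock: {attribute: {word: attribute_value}}
    terminal_attributes: attributes of the cell word stocks the terminal uploaded.
    Returns only the public cell word stocks whose attribute the terminal uses.
    """
    return {
        attribute: dict(public_word_stock[attribute])
        for attribute in terminal_attributes
        if attribute in public_word_stock
    }


public_word_stock = {
    "attribute 1": {"good": 0.5, "input": 0.3},
    "attribute 2": {"good": 0.2},
    "attribute 4": {"input": 0.4},
}

# Terminal A uploaded cell word stocks for attributes 1 and 2 only.
update_for_a = extract_update_info(public_word_stock, ["attribute 1", "attribute 2"])
```

Two terminals with the same attribute list receive the same update information, while attribute 4 is never sent to terminal A in this example.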
In summary, the embodiment of the invention can receive the cell word stock uploaded by each terminal, and then generate the public word stock according to each cell word stock, namely the server only needs to fuse the cell word stock uploaded by each terminal without analyzing and processing a large amount of input information, thereby greatly shortening the time of the server for generating the public word stock and improving the efficiency of generating the public word stock; the cell word stock corresponds to the attribute, so that the server can extract word stock update information from the public word stock according to the attribute, namely, generate corresponding word stock update information according to the attribute of the cell word stock, and then return the corresponding word stock update information to the corresponding terminal; and further, the efficiency of updating the word stock by the terminal is improved by improving the efficiency of generating the public word stock by the server.
Further, in the embodiment of the present invention, when word stock update information is extracted from the public word stock according to attributes and returned to a corresponding terminal, specifically, for each terminal, each attribute corresponding to the terminal may be determined according to the attribute corresponding to each cell word stock uploaded by the terminal; extracting words corresponding to all attributes and attribute values corresponding to the words from the public word stock, generating word stock update information according to the extracted words and the corresponding attribute values, and returning the word stock update information to the terminal; generating corresponding word stock updating information according to the attribute of each terminal, and further reducing the transmission of updating information irrelevant to the terminal, so that the terminal does not need to search updating data from a large amount of data, and further improving the updating efficiency of the word stock of the terminal; and also can meet the personalized demands of users.
The terminal can further receive word stock updating information fed back by the server, and then update the corresponding cell word stock according to the word stock updating information; the method comprises the following steps:
(1) Selecting words corresponding to the attribute from the word stock update information according to the attribute of the cell word stock.
(2) Updating the attribute values of the corresponding words in the cell word stock with the attribute values of the selected words, and/or adding the selected words and their attribute values to the cell word stock.
In the embodiment of the invention, the word bank updating information comprises words and attribute values corresponding to the words, and the same word can correspond to a plurality of different attributes and attribute values of the attributes, so that for each cell word bank, the words corresponding to the attributes can be selected from the word bank updating information according to the attributes of the cell word bank, and then the cell word bank is updated according to the selected words and the corresponding attribute values. The selected words can be compared with the words in the cell word stock, and when the corresponding words exist in the cell word stock, the attribute values of the corresponding words in the cell word stock can be updated by adopting the corresponding attribute values of the selected words; when the selected word is determined to be an incremental word relative to the cell word stock, the selected word and the corresponding attribute value can be added into the cell word stock; and updating the cell word stock.
In an optional embodiment of the present invention, if the word stock update information includes a public cell word stock corresponding to each attribute, the terminal may determine the corresponding public cell word stock in the word stock update information according to the attribute of the cell word stock, and then update the cell word stock according to that public cell word stock; that is, the attribute values of the words in the public cell word stock are used to update the attribute values of the corresponding words in the cell word stock, and/or the words in the public cell word stock that are incremental relative to the cell word stock are added to the cell word stock together with their attribute values.
The terminal can thus match candidate information according to the local cell word stocks, as follows:
Referring to Fig. 5, a flowchart of the steps of an embodiment of matching candidate information according to the present invention is shown, which specifically includes the following steps:
Step 502, receiving an input sequence.
Step 504, obtaining association information corresponding to the input sequence, and determining one or more attributes according to the association information.
Step 506, matching candidate information corresponding to the input sequence according to the cell word stock corresponding to the one or more attributes.
Step 508, displaying the candidate information.
In the embodiment of the invention, an input sequence can be received and then converted, and the corresponding candidate information is matched. When matching candidate information for an input sequence, the association information corresponding to the input sequence is acquired, and one or more corresponding attributes are determined according to the association information; candidate information corresponding to the input sequence is then matched according to the cell word stocks corresponding to the one or more attributes, and the candidate score of each piece of candidate information is determined. To determine the candidate scores, the attribute value of each piece of candidate information may be obtained from the cell word stock of each of the one or more attributes, and the candidate score of each piece of candidate information is determined from its attribute values under those attributes; for example, the candidate score may be the sum of the attribute values of the candidate information under each attribute. For instance, for the input sequence "ni", the corresponding candidates are "You", "Nylon", "Woolen cloth", "Greasiness", and "Reverse direction", and the attributes corresponding to the input sequence are rainy day, office, and question-and-answer application A; the attribute values of each candidate can therefore be obtained from the cell word stocks corresponding to these 3 attributes, as shown in Table 11:
| Candidate | Rainy day | Office | Question-and-answer application A | Candidate score |
|---|---|---|---|---|
| You | 0.5 | 0.6 | 0.7 | 1.2 |
| Nylon | 0.1 | 0.0 | 0.5 | 0.2 |
| Woolen cloth | 0.3 | 0.5 | 0.9 | 1.5 |
| Greasiness | 0.1 | 0.1 | 0.0 | 0.2 |
| Reverse direction | 0.2 | 0.0 | 0.1 | 0.3 |

Table 11
Thus, it may be determined that "You" corresponds to a candidate score of 1.2, "Nylon" to 0.2, "Woolen cloth" to 1.5, "Greasiness" to 0.2, and "Reverse direction" to 0.3. The candidate information may then be presented according to the candidate scores.
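Steps 502 to 508 can be sketched as a small pipeline. Everything below is hypothetical and chosen only to illustrate the flow: the candidate words, the attribute values, the lookup tables `CANDIDATES_BY_SEQUENCE` and `CELL_LEXICONS`, and the helper names are not taken from the source, and the scoring uses the summation given above as one example of how a candidate score may be determined.

```python
from typing import Dict, List, Tuple

# All data below is hypothetical and chosen only to illustrate the flow.
CANDIDATES_BY_SEQUENCE: Dict[str, List[str]] = {"ni": ["word_a", "word_b", "word_c"]}
CELL_LEXICONS: Dict[str, Dict[str, float]] = {
    "rainy day": {"word_a": 0.5, "word_b": 0.125, "word_c": 0.25},
    "office": {"word_a": 0.25, "word_c": 0.5},
    "question-and-answer application A": {"word_a": 0.25, "word_b": 0.5, "word_c": 0.5},
}


def determine_attributes(association_info: Dict[str, str]) -> List[str]:
    """Step 504: keep the context values for which a cell word stock exists."""
    return [value for value in association_info.values() if value in CELL_LEXICONS]


def match_candidates(input_sequence: str, attributes: List[str]) -> List[Tuple[str, float]]:
    """Step 506: score each candidate by summing its attribute values."""
    scored = []
    for word in CANDIDATES_BY_SEQUENCE.get(input_sequence, []):
        score = sum(CELL_LEXICONS[a].get(word, 0.0) for a in attributes)
        scored.append((word, score))
    # Highest score first; sorted() is stable, so ties keep their original order.
    return sorted(scored, key=lambda item: item[1], reverse=True)


context = {"weather": "rainy day", "location": "office",
           "application": "question-and-answer application A"}
ranked = match_candidates("ni", determine_attributes(context))  # steps 502 to 506
for word, score in ranked:                                      # step 508
    print(word, score)
```

A word missing from one of the cell word stocks simply contributes 0.0 for that attribute in this sketch; the source does not say how missing entries are treated.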
The input sequence is the user input content directly received by the input method system; the input method system can convert the input content into word candidates of the input text according to the user's current input mode and provide the word candidates for the user to select. In actual processing, the user can use various input modes, for example pinyin input, stroke input, five-stroke input, voice input, handwriting input, or editing operations such as copy and paste. The user can complete the input of the input sequence through any of these input modes. For input modes such as pinyin input, stroke input, and five-stroke input, the input sequence is usually a coded character string entered by the user through a keyboard, a touch screen, or the like; for handwriting input, the input sequence may be a movement track entered by the user through a handwriting pad, a touch screen, or the like. The present application does not limit the input mode of the user, and the user may use any input mode.
In an optional embodiment of the present invention, association may also be performed according to the cell word stocks. That is, the above information (the preceding text) may be obtained; when association is performed on the above information, the associated information corresponding to the above information may be acquired, and one or more attributes may be determined according to the associated information; association information corresponding to the above information is then matched according to the cell word stocks corresponding to the one or more attributes and displayed. The association score of each piece of association information is determined in a manner similar to the candidate score. For example, for the above information "you", the corresponding attributes are rainy day, home, and chat application B, and the corresponding association information and association scores are as shown in Table 12:
| Association information | Rainy day | Office | Chat application B | Association score |
|---|---|---|---|---|
| O good | 0.1 | 0.7 | 0.7 | 1.5 |
| All take the lead to the fact | 0.1 | 0.3 | 0.4 | 0.8 |
| Where is located | 0.4 | 0.5 | 0.7 | 1.6 |
| Etc | 0.5 | 0.7 | 0.2 | 1.4 |
| Slightly wait for | 0.4 | 0.9 | 0.2 | 1.5 |
| Good beauty | 0.8 | 0.5 | 0.7 | 2.0 |
| Defeated by a person | 0.6 | 0.7 | 0.3 | 1.6 |
| Who is | 0.6 | 0.4 | 0.5 | 1.5 |
| Severe true | 0.3 | 0.7 | 0.4 | 1.4 |
| Which go | 0.5 | 0.7 | 0.2 | 1.4 |
| Landing bar | 0.7 | 0.0 | 0.1 | 0.8 |
| Does not eat | 0.5 | 0.8 | 0.5 | 1.8 |
| In a dry prayer wheel | 0.3 | 0.6 | 0.9 | 1.8 |

Table 12
Accordingly, the association scores corresponding to the respective association information can be determined according to the table 12, and then the association information can be presented according to the association scores.
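The association scores in Table 12 are consistent with summing the three attribute values of each row. The sketch below reproduces that arithmetic for three rows copied from the table; the function name `association_scores` and the rounding to one decimal place (used only to avoid floating-point noise) are assumptions made here.

```python
from typing import Dict, List, Tuple

# Three rows copied from Table 12; the columns are the table's three attribute columns,
# in order, ending with chat application B.
TABLE_12_ROWS: Dict[str, Tuple[float, float, float]] = {
    "Where is located": (0.4, 0.5, 0.7),
    "Good beauty": (0.8, 0.5, 0.7),
    "Landing bar": (0.7, 0.0, 0.1),
}


def association_scores(rows: Dict[str, Tuple[float, ...]]) -> List[Tuple[str, float]]:
    """Sum each row's attribute values and order the rows by descending score."""
    scored = [(label, round(sum(values), 1)) for label, values in rows.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


scores = association_scores(TABLE_12_ROWS)
```

Ordering by descending score is one plausible reading of "presented according to the association scores"; the source does not state the display order explicitly.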
Referring to FIG. 6, a schematic diagram of one embodiment of a terminal and server data interaction of the present invention is shown; namely, the terminal generates a cell word stock and then uploads the cell word stock to a server; the server generates a corresponding public word stock according to the received cell word stock, extracts word stock updating information from the public word stock, and then returns the word stock updating information to the terminal; and the terminal updates the local cell word stock according to the word stock updating information.
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
Referring to Fig. 7, a block diagram of an embodiment of a terminal-side word stock generation device according to the present invention is shown, which may specifically include the following modules: a word stock determining module 702, an uploading module 704, and an updating module 706, wherein:
The word stock determining module 702 is configured to acquire input information and associated information, and determine a cell word stock corresponding to each attribute;
The uploading module 704 is configured to upload the cell word stock, where the cell word stock corresponds to an attribute;
The updating module 706 is configured to receive word stock update information fed back by the server, and update the cell word stock of the corresponding attribute according to the word stock update information, where the word stock update information is determined according to a public word stock of the server.
Referring to Fig. 8, a block diagram of an alternative embodiment of a terminal-side word stock generation device of the present invention is shown. In an alternative embodiment of the present invention, the device further comprises a filtering module 708 and a matching module 710, wherein:
The filtering module 708 is configured to query the cell word stock, determine the words having privacy tags in the cell word stock, and filter out the words having privacy tags from the cell word stock.
The matching module 710 is configured to receive an input sequence; acquire association information corresponding to the input sequence, and determine one or more attributes according to the association information; match candidate information corresponding to the input sequence according to the cell word stocks corresponding to the one or more attributes; and display the candidate information.
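The behaviour of the filtering module 708 can be sketched as follows. The `Entry` record, its `private` flag standing in for the privacy tag, and the sample words are hypothetical; the source only states that words carrying a privacy tag are identified and filtered out of the cell word stock.

```python
from typing import Dict, NamedTuple


class Entry(NamedTuple):
    value: float    # attribute value of the word
    private: bool   # hypothetical representation of the privacy tag


def filter_private_words(cell_lexicon: Dict[str, Entry]) -> Dict[str, Entry]:
    """Query the cell word stock and drop every word that carries a privacy tag."""
    return {word: entry for word, entry in cell_lexicon.items() if not entry.private}


# Illustrative data only.
office_lexicon = {
    "meeting": Entry(0.5, False),
    "home-address": Entry(0.25, True),
}
filtered = filter_private_words(office_lexicon)
```

The original word stock is left unchanged in this sketch, so a terminal could keep private words locally while working with the filtered copy.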
In an alternative embodiment of the present invention, the word stock determining module 702 includes an analysis submodule 7022 and a cell word stock generation submodule 7024, wherein:
The analysis submodule 7022 is configured to perform word segmentation on the input information to obtain the corresponding words, and determine the corresponding attributes according to the associated information corresponding to the words;
The cell word stock generation submodule 7024 is configured to generate a cell word stock corresponding to each attribute according to the words and the corresponding attributes.
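The analysis step can be sketched as below. Whitespace splitting is used purely as a stand-in for a real word segmenter, and the record format (a text paired with the attributes derived from its associated information) is an assumption; the source does not name a segmentation algorithm or a data format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def segment(text: str) -> List[str]:
    """Stand-in for a real word segmenter: split on whitespace (an assumption)."""
    return text.split()


def group_words_by_attribute(
        records: Iterable[Tuple[str, List[str]]]) -> Dict[str, List[str]]:
    """Assign every segmented word to each attribute of the record it came from."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for text, attributes in records:
        for word in segment(text):
            for attribute in attributes:
                grouped[attribute].append(word)
    return dict(grouped)


# Illustrative records: (input information, attributes from its associated information).
records = [
    ("bring an umbrella", ["rainy day", "office"]),
    ("umbrella please", ["rainy day"]),
]
words_by_attribute = group_words_by_attribute(records)
```

The per-attribute word lists produced here are the raw material from which a cell word stock for each attribute would then be generated.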
In an optional embodiment of the present invention, the cell word stock generation submodule 7024 is specifically configured to determine, for each attribute, the words corresponding to the attribute and count the frequency of each word; determine the correlation coefficient between each word and the attribute according to the frequency of the word, and take the correlation coefficient as the attribute value of the word; and generate the cell word stock corresponding to the attribute according to the words and their attribute values.
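A minimal sketch of this generation step follows. The source does not say how the correlation coefficient is derived from the frequency; the relative frequency of the word within the attribute is used here purely as an assumed stand-in, and the function name `build_cell_lexicon` is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def build_cell_lexicon(words: List[str]) -> Dict[str, float]:
    """Count word frequencies for one attribute and turn them into attribute values.

    Assumption: the correlation coefficient is the word's relative frequency.
    """
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


# Illustrative data only: words observed under the "rainy day" attribute.
rainy_day_words = ["umbrella", "hello", "umbrella", "raincoat"]
rainy_day_lexicon = build_cell_lexicon(rainy_day_words)
```

Any other monotonic mapping from frequency to a coefficient could be substituted without changing the surrounding flow.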
In an alternative embodiment of the present invention, the updating module 706 is specifically configured to select, according to an attribute of a cell word stock, a word corresponding to the attribute from the word stock updating information; and updating the attribute value of the corresponding word in the cell word stock by adopting the corresponding attribute value of the selected word, and/or adding the selected word and the corresponding attribute value into the cell word stock.
In an alternative embodiment of the invention, the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
In the embodiment of the invention, the terminal can acquire input information and associated information, determine the cell word stock corresponding to each attribute according to the input information and the associated information, and upload the cell word stocks to the server; after the server generates word stock update information, the terminal receives the word stock update information fed back by the server and updates the corresponding cell word stock accordingly. Because the input information is analyzed and processed by the terminal to generate the cell word stocks before they are uploaded to the server, the load on the server is reduced, the server generates the public word stock faster, and the terminal updates its local word stock more efficiently. In addition, the terminal stores a plurality of cell word stocks, so that more accurate candidate information can be matched based on them, improving the accuracy of the candidate information.
Referring to Fig. 9, a block diagram of an embodiment of a server-side word stock generation device according to the present invention is shown, which may specifically include the following modules: a word stock receiving module 902, a word stock generating module 904, and an extracting module 906, wherein:
The word stock receiving module 902 is configured to receive the cell word stocks uploaded by each terminal, where each cell word stock corresponds to an attribute;
The word stock generating module 904 is configured to generate a public word stock according to the cell word stocks;
The extracting module 906 is configured to extract word stock update information from the public word stock according to the attribute, and return the word stock update information to the corresponding terminal, so as to update the cell word stock of the terminal.
Referring to Fig. 10, a block diagram of an alternative embodiment of a server-side word stock generation device according to the present invention is shown.
In an alternative embodiment of the present invention, the word stock generating module 904 includes an attribute value determining submodule 9042 and a public word stock generating submodule 9044, wherein:
The attribute value determining submodule 9042 is configured to select, for each attribute, words from the cell word stocks corresponding to the attribute, and calculate the attribute values corresponding to the words;
The public word stock generating submodule 9044 is configured to generate a public word stock according to the words and the corresponding attribute values.
In an optional embodiment of the present invention, the attribute value determining submodule 9042 is specifically configured to obtain, for each word, each corresponding attribute value from each cell word stock corresponding to the attribute; calculating an average attribute value corresponding to the attribute of the word according to the attribute values; and taking the average attribute value as an attribute value corresponding to the attribute of the word.
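The averaging performed by the attribute value determining submodule 9042 can be sketched as follows. The function name `merge_cell_lexicons` and the sample values are hypothetical, and one detail is an assumption: a word is averaged only over the cell word stocks that actually contain it, since the source does not say how absent words are handled.

```python
from collections import defaultdict
from typing import Dict, List


def merge_cell_lexicons(cell_lexicons: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each word's attribute value over the uploaded word stocks of one attribute.

    Assumption: word stocks that do not contain a word are excluded from its average.
    """
    values_by_word: Dict[str, List[float]] = defaultdict(list)
    for lexicon in cell_lexicons:
        for word, value in lexicon.items():
            values_by_word[word].append(value)
    return {word: sum(values) / len(values) for word, values in values_by_word.items()}


# Illustrative data only: "rainy day" cell word stocks uploaded by three terminals.
uploaded_rainy_day = [
    {"umbrella": 0.5, "hello": 0.25},
    {"umbrella": 0.75},
    {"umbrella": 0.25, "hello": 0.75},
]
public_rainy_day = merge_cell_lexicons(uploaded_rainy_day)
```

Running the same merge once per attribute would yield the per-attribute sections of the public word stock.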
In an optional embodiment of the present invention, the extracting module 906 is specifically configured to determine, for each terminal, each attribute corresponding to the terminal according to the attribute corresponding to each cell word stock uploaded by the terminal; extracting words corresponding to all attributes and attribute values corresponding to the words from the public word stock, generating word stock update information according to the extracted words and the corresponding attribute values, and returning the word stock update information to the terminal.
In an alternative embodiment of the invention, the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
In the embodiment of the invention, the cell word stocks uploaded by each terminal can be received, and a public word stock can then be generated from them; that is, the server only needs to fuse the cell word stocks uploaded by the terminals and does not need to analyze and process a large amount of input information, which greatly shortens the time the server takes to generate the public word stock and improves the efficiency of generating it. Because each cell word stock corresponds to an attribute, the server can extract word stock update information from the public word stock according to the attribute, that is, generate word stock update information according to the attributes of the cell word stocks, and return it to the corresponding terminal. Improving the efficiency with which the server generates the public word stock in turn improves the efficiency with which the terminal updates its word stock; moreover, word stock update information can be generated specifically for terminals with different attributes, which reduces the amount of data processed when the terminal updates its cell word stocks and further improves the word stock update rate.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
Fig. 11 shows a block diagram of a terminal device 1100 for generating word stock according to an exemplary embodiment. For example, terminal device 1100 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 11, a terminal device 1100 may include one or more of the following components: a processing component 1102, a memory 1104, a power component 1106, a multimedia component 1108, an audio component 1110, an input/output (I/O) interface 1112, a sensor component 1114, and a communication component 1116.
The processing component 1102 generally controls the overall operation of the terminal device 1100, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 1102 may include one or more processors 1120 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 1102 can include one or more modules that facilitate interactions between the processing component 1102 and other components. For example, the processing component 1102 may include a multimedia module to facilitate interaction between the multimedia component 1108 and the processing component 1102.
The memory 1104 is configured to store various types of data to support operations at the terminal device 1100. Examples of such data include instructions for any application or method operating on the terminal device 1100, contact data, phonebook data, messages, pictures, video, and the like. The memory 1104 may be implemented by any type of volatile or nonvolatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
The power component 1106 provides power to the various components of the terminal device 1100. The power component 1106 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the terminal device 1100.
Multimedia component 1108 includes a screen between the terminal device 1100 and the user that provides an output interface. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, multimedia component 1108 includes a front camera and/or a rear camera. When the terminal device 1100 is in an operation mode, such as a photographing mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 1110 is configured to output and/or input an audio signal. For example, the audio component 1110 includes a Microphone (MIC) configured to receive external audio signals when the terminal device 1100 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 1104 or transmitted via the communication component 1116. In some embodiments, the audio component 1110 further comprises a speaker for outputting audio signals.
The I/O interface 1112 provides an interface between the processing component 1102 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 1114 includes one or more sensors for providing status assessments of various aspects of the terminal device 1100. For example, the sensor assembly 1114 may detect the on/off state of the terminal device 1100 and the relative positioning of components, such as the display and keypad of the terminal device 1100; the sensor assembly 1114 may also detect a change in position of the terminal device 1100 or of a component of the terminal device 1100, the presence or absence of user contact with the terminal device 1100, the orientation or acceleration/deceleration of the terminal device 1100, and a change in temperature of the terminal device 1100. The sensor assembly 1114 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact. The sensor assembly 1114 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 1114 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1116 is configured to facilitate wired or wireless communication between the terminal device 1100 and other devices. The terminal device 1100 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 1116 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 1116 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the terminal device 1100 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as a memory 1104 including instructions executable by the processor 1120 of the terminal device 1100 to perform the above-described method. For example, the non-transitory computer readable storage medium may be ROM, Random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
A non-transitory computer readable storage medium, wherein the instructions in the storage medium, when executed by a processor of a terminal device, cause the terminal device to perform a method of generating a word stock, the method comprising: acquiring input information and associated information, and determining a cell word stock corresponding to each attribute; uploading the cell word stock, wherein the cell word stock corresponds to an attribute; and receiving word stock update information fed back by the server, and updating the cell word stock of the corresponding attribute according to the word stock update information, wherein the word stock update information is determined according to a public word stock of the server.
Optionally, the determining the cell word stock corresponding to each attribute includes: word segmentation processing is carried out on the input information to obtain corresponding words, and corresponding attributes are determined according to the associated information corresponding to the words; and generating a cell word stock corresponding to each attribute according to the words and the corresponding attributes.
Optionally, the generating a cell word stock corresponding to each attribute according to the word and each attribute, includes: aiming at each attribute, determining each word corresponding to the attribute and counting the frequency of each word; according to the frequency of each word, determining the correlation coefficient of each word and the attribute, and taking the correlation coefficient as the attribute value of the corresponding word; and generating a cell word stock corresponding to the attribute according to each word and the corresponding attribute value.
Optionally, the storage medium further comprises instructions for filtering the cell word stock: querying the cell word stock, and determining the words having privacy tags in the cell word stock; and filtering out the words having privacy tags from the cell word stock.
Optionally, the updating the corresponding attribute cell word stock according to the word stock updating information includes: selecting words corresponding to the attributes from the word stock updating information according to the attributes of the cell word stock; and updating the attribute value of the corresponding word in the cell word stock by adopting the corresponding attribute value of the selected word, and/or adding the selected word and the corresponding attribute value into the cell word stock.
Optionally, further comprising instructions for: receiving an input sequence; acquiring association information corresponding to the input sequence, and determining one or more attributes according to the association information; matching candidate information corresponding to the input sequence according to the cell word stock corresponding to the one or more attributes; and displaying the candidate information.
Optionally, the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
Fig. 12 is a schematic diagram showing the structure of a server 1200 for generating a word stock according to another exemplary embodiment of the present invention. The server 1200 may vary considerably in configuration or performance and may include one or more central processing units (CPUs) 1222 (e.g., one or more processors), memory 1232, and one or more storage media 1230 (e.g., one or more mass storage devices) storing applications 1242 or data 1244. The memory 1232 and the storage medium 1230 may be transitory or persistent storage. The program stored on the storage medium 1230 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Still further, the central processing unit 1222 may be configured to communicate with the storage medium 1230 and to execute, on the server 1200, the series of instruction operations stored in the storage medium 1230.
The server 1200 may also include one or more power supplies 1226, one or more wired or wireless network interfaces 1250, one or more input/output interfaces 1258, one or more keyboards 1256, and/or one or more operating systems 1241, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
A non-transitory computer readable storage medium, wherein the instructions in the storage medium, when executed by a processor of a server, cause the server to perform a method of generating a word stock, the method comprising: receiving the cell word stocks uploaded by each terminal, wherein each cell word stock corresponds to an attribute; generating a public word stock according to the cell word stocks; and extracting word stock update information from the public word stock according to the attribute, and returning the word stock update information to the corresponding terminal so as to update the cell word stock of the terminal.
Optionally, the cell word stock includes words and attribute values of the words, and the generating the public word stock according to the cell word stock includes: selecting words from each cell word stock corresponding to each attribute aiming at each attribute, and calculating attribute values corresponding to the words; and generating a public word stock according to the words and the corresponding attribute values.
Optionally, the calculating the attribute value corresponding to the word includes: for each word, acquiring corresponding attribute values from each cell word stock corresponding to the attribute; calculating an average attribute value corresponding to the attribute of the word according to the attribute values; and taking the average attribute value as an attribute value corresponding to the attribute of the word.
Optionally, extracting word stock update information from the public word stock according to the attribute, and returning the information to the corresponding terminal, including: aiming at each terminal, determining each attribute corresponding to the terminal according to the attribute corresponding to each cell word stock uploaded by the terminal; extracting words corresponding to all attributes and attribute values corresponding to the words from the public word stock, generating word stock update information according to the extracted words and the corresponding attribute values, and returning the word stock update information to the terminal.
Optionally, the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the invention may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should be noted that relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or terminal device comprising the element.
The word stock generation method, word stock generation device, terminal device, and server provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present invention, and the above description of the embodiments is intended only to aid understanding of the method and core ideas of the present invention. Meanwhile, those skilled in the art may, in accordance with the ideas of the present invention, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present invention.

Claims (32)

1. A word stock generation method, comprising:
receiving cell word stocks uploaded by each terminal, wherein each cell word stock corresponds to an attribute;
generating a public word stock according to the cell word stock;
extracting word stock updating information from the public word stock according to the attribute, and returning the information to the corresponding terminal so as to update the cell word stock of the terminal;
the cell word stock comprises words and attribute values of the words, and the generating of the public word stock according to the cell word stock comprises the following steps:
for each attribute, selecting words from the cell word stocks corresponding to that attribute, and calculating attribute values corresponding to the words;
and generating the public word stock according to the words and the corresponding attribute values.
2. The method of claim 1, wherein the calculating the attribute value corresponding to the word comprises:
for each word, acquiring corresponding attribute values from each cell word stock corresponding to the attribute;
calculating an average attribute value corresponding to the attribute of the word according to the attribute values;
and taking the average attribute value as an attribute value corresponding to the attribute of the word.
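As a non-limiting illustration of the averaging recited in claims 1 and 2, the following minimal sketch merges uploaded cell word stocks into a public word stock. The data layout (a list of `(attribute, {word: attribute_value})` pairs) and the name `build_public_word_stock` are assumptions made for this example and are not taken from the specification.

```python
from collections import defaultdict


def build_public_word_stock(cell_word_stocks):
    """Average each word's attribute value over all cell word stocks that share an attribute.

    cell_word_stocks: iterable of (attribute, {word: attribute_value}) pairs
    (a hypothetical layout chosen for this sketch).
    """
    collected = defaultdict(lambda: defaultdict(list))
    for attribute, words in cell_word_stocks:
        for word, value in words.items():
            collected[attribute][word].append(value)
    # One averaged attribute value per (attribute, word) pair.
    return {
        attribute: {word: sum(values) / len(values) for word, values in words.items()}
        for attribute, words in collected.items()
    }


uploads = [
    ("weather", {"rain": 0.5, "umbrella": 0.25}),  # from one terminal
    ("weather", {"rain": 1.0}),                    # from another terminal
    ("location", {"station": 0.5}),
]
public_word_stock = build_public_word_stock(uploads)
```

In this sketch a word missing from a given cell word stock simply does not contribute to the average; other treatments (for example, counting it as zero) are equally possible.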
3. The method of claim 1, wherein extracting word stock update information from the public word stock according to attributes and returning the information to the corresponding terminal comprises:
for each terminal, determining the attributes corresponding to the terminal according to the attributes corresponding to the cell word stocks uploaded by the terminal;
extracting words corresponding to all attributes and attribute values corresponding to the words from the public word stock, generating word stock update information according to the extracted words and the corresponding attribute values, and returning the word stock update information to the terminal.
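The per-terminal extraction of claim 3 could be sketched as follows. The function names `attributes_of` and `extract_update_info`, and the dictionary layout of the public word stock, are hypothetical choices for this example only.

```python
def attributes_of(uploaded_cell_word_stocks):
    """Collect the attributes of the cell word stocks a terminal uploaded."""
    return {attribute for attribute, _words in uploaded_cell_word_stocks}


def extract_update_info(public_word_stock, terminal_attributes):
    """Keep only the parts of the public word stock that match the terminal's attributes."""
    return {
        attribute: dict(public_word_stock[attribute])
        for attribute in terminal_attributes
        if attribute in public_word_stock
    }


public = {"weather": {"rain": 0.75}, "location": {"station": 0.5}}
terminal_uploads = [("weather", {"rain": 1.0})]
update_info = extract_update_info(public, attributes_of(terminal_uploads))
```

Here the terminal uploaded only a weather-class cell word stock, so the location-class entries are not returned to it.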
4. The method according to any one of claims 1-3, wherein the attributes comprise at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
5. A word stock generation method, comprising:
acquiring input information and associated information, and determining a cell word stock corresponding to each attribute;
uploading the cell word stock, wherein the cell word stock corresponds to the attribute;
receiving word stock updating information fed back by a server, and updating a corresponding attribute cell word stock according to the word stock updating information, wherein the word stock updating information is determined according to a public word stock of the server;
the updating the corresponding attribute cell word stock according to the word stock updating information comprises the following steps:
selecting words corresponding to the attributes from the word stock updating information according to the attributes of the cell word stock;
and updating the attribute value of the corresponding word in the cell word stock by adopting the corresponding attribute value of the selected word, and/or adding the selected word and the corresponding attribute value into the cell word stock.
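The terminal-side update of claim 5 might look like the sketch below. It takes the "and" branch of the claim's "and/or" (existing words are overwritten and new words are added); the name `apply_update` and the nested-dictionary layout are assumptions for illustration.

```python
def apply_update(cell_word_stocks, update_info):
    """Update each local cell word stock from the update information for its attribute.

    cell_word_stocks: {attribute: {word: attribute_value}} held by the terminal
    update_info:      {attribute: {word: attribute_value}} received from the server
    """
    for attribute, cell in cell_word_stocks.items():
        for word, value in update_info.get(attribute, {}).items():
            # Overwrites the value of a known word, or adds a word not yet present.
            cell[word] = value
    return cell_word_stocks


local = {"weather": {"rain": 1.0}}
update = {
    "weather": {"rain": 0.75, "umbrella": 0.25},
    "location": {"station": 0.5},  # ignored: this terminal has no location cell word stock
}
apply_update(local, update)
```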
6. The method of claim 5, wherein determining the cell word stock corresponding to each attribute comprises:
performing word segmentation on the input information to obtain corresponding words, and determining corresponding attributes according to the associated information corresponding to the words;
and generating a cell word stock corresponding to each attribute according to the words and the corresponding attributes.
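The segmentation and attribute assignment of claim 6 can be illustrated with the sketch below. Whitespace splitting stands in for a real word segmenter, and the record layout, in which the associated information is a mapping from attribute to context value, is purely an assumption of this example.

```python
def group_words_by_attribute(input_records):
    """Segment each input and file its words under every attribute present in its associated information.

    input_records: iterable of (input_text, associated_info), where associated_info
    is a hypothetical {attribute: context_value} mapping captured at input time.
    """
    grouped = {}
    for text, associated_info in input_records:
        words = text.split()  # stand-in for a real word segmenter
        for attribute in associated_info:
            grouped.setdefault(attribute, []).extend(words)
    return grouped


records = [
    ("bring umbrella", {"weather": "rainy"}),
    ("next station", {"location": "subway", "time": "morning"}),
]
words_by_attribute = group_words_by_attribute(records)
```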
7. The method of claim 6, wherein generating a cell word stock corresponding to each attribute based on the word and each corresponding attribute comprises:
for each attribute, determining each word corresponding to the attribute and counting the frequency of each word;
according to the frequency of each word, determining the correlation coefficient of each word and the attribute, and taking the correlation coefficient as the attribute value of the corresponding word;
and generating a cell word stock corresponding to the attribute according to each word and the corresponding attribute value.
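Claim 7 does not fix a formula for the correlation coefficient. The sketch below assumes, for illustration only, that it is the word's relative frequency among all words observed under the attribute; `build_cell_word_stock` is a hypothetical name.

```python
from collections import Counter


def build_cell_word_stock(words_for_attribute):
    """Turn the words observed under one attribute into {word: attribute_value}.

    Assumption: the correlation coefficient is the relative frequency of the word.
    """
    counts = Counter(words_for_attribute)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


cell_word_stock = build_cell_word_stock(["rain", "rain", "umbrella", "rain"])
```

Any other monotone function of the frequency could be substituted without changing the surrounding flow.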
8. The method of claim 5, further comprising the step of filtering the cell word stock:
querying the cell word stock, and determining words with privacy labels in the cell word stock;
and filtering words with privacy labels in the cell word stock.
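The privacy filtering of claim 8 could be sketched as follows, assuming each entry of the cell word stock carries a boolean `private` flag as its privacy label. That entry layout is an assumption of this example, not something the claims specify.

```python
def filter_private_words(cell_word_stock):
    """Return a copy of the cell word stock without the entries that carry a privacy label.

    cell_word_stock: {word: {"value": float, "private": bool}} (hypothetical layout)
    """
    return {
        word: entry
        for word, entry in cell_word_stock.items()
        if not entry.get("private", False)
    }


cell_word_stock = {
    "rain": {"value": 0.75, "private": False},
    "my-home-address": {"value": 0.25, "private": True},
}
uploadable = filter_private_words(cell_word_stock)
```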
9. The method of claim 5, wherein the method further comprises:
receiving an input sequence;
acquiring association information corresponding to the input sequence, and determining one or more attributes according to the association information;
matching candidate information corresponding to the input sequence according to the cell word stock corresponding to the one or more attributes;
and displaying the candidate information.
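The candidate matching of claim 9 is sketched below under two simplifying assumptions: the input sequence is treated as a literal prefix of the candidate word (a real input method would typically map, for example, pinyin to words), and candidates are ranked by their highest attribute value across the active attributes. The name `match_candidates` and the `limit` parameter are hypothetical.

```python
def match_candidates(input_sequence, active_attributes, cell_word_stocks, limit=5):
    """Return candidate words for the input, drawn from the cell word stocks of the active attributes."""
    best = {}
    for attribute in active_attributes:
        for word, value in cell_word_stocks.get(attribute, {}).items():
            if word.startswith(input_sequence):
                # Keep the highest attribute value seen for this word.
                best[word] = max(value, best.get(word, 0.0))
    # Highest attribute value first; ties broken alphabetically.
    return sorted(best, key=lambda word: (-best[word], word))[:limit]


stocks = {
    "weather": {"rain": 0.75, "rainbow": 0.25},
    "location": {"railway": 0.5, "rain": 0.25},
}
candidates = match_candidates("rai", ["weather", "location"], stocks)
```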
10. The method according to any of claims 5-9, wherein the attributes comprise at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
11. A word stock generation device, comprising:
the word stock receiving module is used for receiving cell word stocks uploaded by each terminal, wherein the cell word stocks correspond to the attributes;
the word stock generating module is used for generating a public word stock according to the cell word stock;
the extraction module is used for extracting word stock update information from the public word stock according to the attribute and returning the word stock update information to the corresponding terminal so as to update the cell word stock of the terminal;
the word stock generation module comprises:
the attribute value determining submodule is used for selecting, for each attribute, words from the cell word stocks corresponding to that attribute, and calculating attribute values corresponding to the words;
and the public word stock generating sub-module is used for generating a public word stock according to the words and the corresponding attribute values.
12. The apparatus of claim 11, wherein
the attribute value determining submodule is specifically used for acquiring, for each word, corresponding attribute values from each cell word stock corresponding to the attribute; calculating an average attribute value corresponding to the attribute of the word according to the attribute values; and taking the average attribute value as an attribute value corresponding to the attribute of the word.
13. The apparatus of claim 11, wherein
the extraction module is specifically configured to determine, for each terminal, each attribute corresponding to the terminal according to the attribute corresponding to each cell word stock uploaded by the terminal; extract words corresponding to each of those attributes, together with the attribute values corresponding to the words, from the public word stock; generate word stock update information according to the extracted words and the corresponding attribute values; and return the word stock update information to the terminal.
14. The apparatus of any of claims 11-13, wherein the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
15. A word stock generation device, comprising:
the word stock determining module is used for acquiring input information and associated information and determining a cell word stock corresponding to each attribute;
the uploading module is used for uploading the cell word stock, wherein the cell word stock corresponds to the attribute;
the updating module is used for receiving word stock updating information fed back by the server and updating the corresponding attribute cell word stock according to the word stock updating information, wherein the word stock updating information is determined according to a public word stock of the server;
the updating module is specifically used for selecting words corresponding to the attributes from the word stock updating information according to the attributes of the cell word stock; and updating the attribute value of the corresponding word in the cell word stock by adopting the corresponding attribute value of the selected word, and/or adding the selected word and the corresponding attribute value into the cell word stock.
16. The apparatus of claim 15, wherein the word stock determining module comprises:
the analysis submodule is used for carrying out word segmentation processing on the input information to obtain corresponding words, and determining corresponding attributes according to the associated information corresponding to the words;
and the cell word stock generation submodule is used for generating a cell word stock corresponding to each attribute according to the words and the corresponding attributes.
17. The apparatus of claim 16, wherein
the cell word stock generation submodule is specifically used for determining each word corresponding to each attribute and counting the frequency of each word; according to the frequency of each word, determining the correlation coefficient of each word and the attribute, and taking the correlation coefficient as the attribute value of the corresponding word; and generating a cell word stock corresponding to the attribute according to each word and the corresponding attribute value.
18. The apparatus as recited in claim 15, further comprising:
the filtering module is used for querying the cell word stock and determining words with privacy labels in the cell word stock; and filtering words with privacy labels in the cell word stock.
19. The apparatus as recited in claim 15, further comprising:
the matching module is used for receiving an input sequence; acquiring association information corresponding to the input sequence, and determining one or more attributes according to the association information; matching candidate information corresponding to the input sequence according to the cell word stock corresponding to the one or more attributes; and displaying the candidate information.
20. The apparatus of any of claims 15-19, wherein the attribute comprises at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
21. A readable storage medium, characterized in that instructions in said storage medium, when executed by a processor of a server, enable the server to perform the word stock generation method according to any of the method claims 1-6.
22. A server comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
receiving a cell word stock uploaded by each terminal, wherein the cell word stock corresponds to the attribute;
generating a public word stock according to the cell word stock;
extracting word stock updating information from the public word stock according to the attribute, and returning the information to the corresponding terminal so as to update the cell word stock of the terminal;
the cell word stock comprises words and attribute values of the words, and the generating of the public word stock according to the cell word stock comprises the following steps:
for each attribute, selecting words from the cell word stocks corresponding to that attribute, and calculating attribute values corresponding to the words;
and generating a public word stock according to the words and the corresponding attribute values.
23. The server of claim 22, wherein the calculating the attribute value corresponding to the word comprises:
for each word, acquiring corresponding attribute values from each cell word stock corresponding to the attribute;
calculating an average attribute value corresponding to the attribute of the word according to the attribute values;
and taking the average attribute value as an attribute value corresponding to the attribute of the word.
24. The server according to claim 22, wherein extracting the word stock update information from the public word stock according to the attribute and returning the information to the corresponding terminal includes:
for each terminal, determining the attributes corresponding to the terminal according to the attributes corresponding to the cell word stocks uploaded by the terminal;
extracting words corresponding to all attributes and attribute values corresponding to the words from the public word stock, generating word stock update information according to the extracted words and the corresponding attribute values, and returning the word stock update information to the terminal.
25. The server according to any of the claims 22-24, wherein the attributes comprise at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
26. A terminal device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
acquiring input information and associated information, and determining a cell word stock corresponding to each attribute;
uploading the cell word stock, wherein the cell word stock corresponds to the attribute;
receiving word stock updating information fed back by a server, and updating a corresponding attribute cell word stock according to the word stock updating information, wherein the word stock updating information is determined according to a public word stock of the server;
the updating the corresponding attribute cell word stock according to the word stock updating information comprises the following steps:
selecting words corresponding to the attributes from the word stock updating information according to the attributes of the cell word stock;
and updating the attribute value of the corresponding word in the cell word stock by adopting the corresponding attribute value of the selected word, and/or adding the selected word and the corresponding attribute value into the cell word stock.
27. The terminal device of claim 26, wherein the determining the cell word stock corresponding to each attribute comprises:
word segmentation processing is carried out on the input information to obtain corresponding words, and corresponding attributes are determined according to the associated information corresponding to the words;
and generating a cell word stock corresponding to each attribute according to the words and the corresponding attributes.
28. The terminal device of claim 27, wherein the generating the cell word stock corresponding to each attribute according to the word and each corresponding attribute comprises:
for each attribute, determining each word corresponding to the attribute and counting the frequency of each word;
according to the frequency of each word, determining the correlation coefficient of each word and the attribute, and taking the correlation coefficient as the attribute value of the corresponding word;
and generating a cell word stock corresponding to the attribute according to each word and the corresponding attribute value.
29. The terminal device of claim 26, further comprising instructions for filtering the cell word stock:
querying the cell word stock, and determining words with privacy labels in the cell word stock;
and filtering words with privacy labels in the cell word stock.
30. The terminal device of claim 26, further comprising instructions for:
receiving an input sequence;
acquiring association information corresponding to the input sequence, and determining one or more attributes according to the association information;
matching candidate information corresponding to the input sequence according to the cell word stock corresponding to the one or more attributes;
and displaying the candidate information.
31. The terminal device according to any of claims 26-30, wherein the attributes comprise at least one of the following types: weather class, location class, application class, traffic class, time class, user class.
32. A readable storage medium, characterized in that instructions in said storage medium, when executed by a processor of a terminal device, enable the terminal device to perform the word stock generation method as claimed in any one of the method claims 5-10.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810284199.3A CN110362686B (en) 2018-04-02 2018-04-02 Word stock generation method and device, terminal equipment and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810284199.3A CN110362686B (en) 2018-04-02 2018-04-02 Word stock generation method and device, terminal equipment and server

Publications (2)

Publication Number Publication Date
CN110362686A (en) 2019-10-22
CN110362686B (en) 2024-02-06

Family

ID=68213444

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810284199.3A Active CN110362686B (en) 2018-04-02 2018-04-02 Word stock generation method and device, terminal equipment and server

Country Status (1)

Country Link
CN (1) CN110362686B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021092848A1 (en) * 2019-11-14 2021-05-20 Citrix Systems, Inc. Text classification for input method editor

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101030157A (en) * 2007-04-20 2007-09-05 北京搜狗科技发展有限公司 Method and system for updating user vocabulary synchronouslly
CN101645093A (en) * 2009-09-02 2010-02-10 腾讯科技(深圳)有限公司 Method of realizing classified lexicon and input method client end
CN102209083A (en) * 2010-03-31 2011-10-05 北京搜狗科技发展有限公司 Method and server for synchronous update of user lexicon and input method system
CN105677841A (en) * 2016-01-04 2016-06-15 北京新美互通科技有限公司 Method and device for pushing geographic word bank
CN107729420A (en) * 2017-09-27 2018-02-23 维沃移动通信有限公司 A kind of update method and mobile terminal of input method dictionary

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9336299B2 (en) * 2009-04-20 2016-05-10 Microsoft Technology Licensing, Llc Acquisition of semantic class lexicons for query tagging
JP5576003B1 (en) * 2013-09-30 2014-08-20 楽天株式会社 Corpus generation device, corpus generation method, and corpus generation program

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
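CCSS: a complete solution to Chinese word segmentation; Li Wentao; Journal of Hubei Radio & Television University; pp. 131-132 *
Knowledge discovery in large text databases using the MST algorithm; Romanov, V. et al.; Data Mining VI: Data Mining, Text Mining and Their Business Applications; pp. 153-162 *
Attribute extraction method based on the fusion of multiple knowledge bases; Zhou Xiaoqiang; Wanfang; pp. 7-32 *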

Also Published As

Publication number Publication date
CN110362686A (en) 2019-10-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant