WO2012139394A1 - Method, apparatus, and device for determining a ranking result of resource candidates - Google Patents

Method, apparatus, and device for determining a ranking result of resource candidates

Info

Publication number
WO2012139394A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
user
input
determining
sorting
Prior art date
Application number
PCT/CN2011/083406
Inventor
赵正雄
李彦宏
Original Assignee
北京百度网讯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司
Publication of WO2012139394A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying

Definitions

  • the present invention relates to the field of computers, and in particular, to a method, apparatus, and device for determining a ranking result of a resource candidate.
  • after the input sequence entered by the user is acquired, the retrieval device performs retrieval based on the entire input sequence, sorts the resource candidates obtained by the retrieval, and provides the sorting result to the user.
  • because the input sequence often contains both information that the user focuses on and information that the user does not focus on, the prior-art method of searching based on the entire input sequence has difficulty distinguishing the key content that the user wants to retrieve.
  • a method for determining a ranking result of a resource candidate implemented by a computer device is provided, wherein the method comprises the following steps:
  • a sort determining apparatus for determining a sorting result of a resource candidate, wherein the sort determining means comprises:
  • a first obtaining means for obtaining search information and adjustment information from an input sequence from a user; a search means for performing a search according to the search information to obtain a plurality of resource candidates; a sorting means for determining, according to the adjustment information, a sorting result of the plurality of resource candidates; and a providing means for generating presentation information according to the sorting result and providing it to the user.
  • a computer apparatus comprising the sort determining means.
  • the present invention has the following advantages: 1) according to the method of the present invention, search information is selected from the input sequence to perform the search, so that the influence of non-focused information on the search result is avoided; 2) the method can determine the sorting result of the retrieved resource candidates according to the adjustment information obtained from the input sequence, further improving the likelihood that the user obtains the required resource candidate; 3) the method is applicable wherever a search is performed according to a user's input sequence, for example, in a B2B/B2C website to provide resource candidates for the items corresponding to the input sequence entered by the user, or in a search engine to provide resource candidates corresponding to the input sequence entered by the user.
  • FIG. 1 is a flowchart of a method for determining a ranking result of resource candidates according to one aspect of the present invention;
  • FIG. 2 is a flowchart of a method for determining a ranking result of resource candidates according to another preferred embodiment of the present invention;
  • FIG. 3 is a flowchart of a method for determining a ranking result of resource candidates according to still another preferred embodiment of the present invention;
  • FIG. 4 is a flowchart of a method for determining a ranking result of resource candidates according to still another preferred embodiment of the present invention;
  • FIG. 5 is a schematic diagram of a sort determining apparatus for determining a ranking result of resource candidates according to one aspect of the present invention;
  • FIG. 6 is a schematic diagram of a sort determining apparatus for determining a ranking result of resource candidates according to a preferred embodiment of the present invention;
  • FIG. 7 is a schematic diagram of a sort determining apparatus for determining a ranking result of resource candidates according to another preferred embodiment of the present invention;
  • FIG. 8 is a schematic diagram of a sort determining apparatus for determining a ranking result of resource candidates according to still another preferred embodiment of the present invention.
  • the user equipment 2 can be any electronic product that can interact with the user through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, including but not limited to a computer, a smart phone, a PDA, or an IPTV;
  • the computer device 3 can be any electronic product that can communicate with the user device 2, including but not limited to: a single network server, a server group composed of multiple network servers, or a cloud computing based computer or network.
  • the method for sorting search results according to the present invention is mainly performed by an operating system of the sort determining device or by a processing controller installed therein; for the sake of brevity, the operating system or processing controller in the sort determining device is hereinafter collectively referred to as the sort determining device.
  • in step S1, the user equipment 2 acquires the input sequence entered by the user 1 through any interaction device capable of human-computer interaction with the user 1, such as a keyboard, a mouse, a remote controller, a touch pad, or a voice control device.
  • the user 1 uses the keyboard to enter the information to be retrieved in the information input field of the search page displayed by the user equipment 2, for example, "TV drama starring Li Hongji".
  • the user equipment 2 transmits the input sequence entered by the user 1, for example, "TV drama starring Li Hongji", to the computer device 3.
  • the user equipment 2 can send an input sequence to the computer device 3 through a network, including but not limited to: the Internet, a wide area network, a metropolitan area network, a local area network, a VPN network, a wireless ad hoc network (Ad Hoc network), and the like.
  • the manner in which the user equipment 2 sends an input sequence to the computer device 3 includes but is not limited to: 1) transmitting the input sequence directly to the computer device 3 through a network; 2) sending the input sequence to the computer device 3 via one or more devices in the network, and the like.
  • in step S3, the sort determining means obtains the search information and the adjustment information from the input sequence received by the computer device 3.
  • the ranking determining device queries a predetermined common vocabulary according to the input sequence and finds that "TV drama" and "starring" are common words; it also analyzes the input sequence and determines the particle "of" to be an auxiliary word. The ranking determining device thereby determines that the search information obtained from the input sequence "TV drama starring Li Hongji" includes "Li Hongji", and the adjustment information includes "starring" and "TV drama".
  • the predetermined common vocabulary includes a plurality of common words.
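The common-vocabulary lookup described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the input sequence has already been segmented into information units, and the names `COMMON_WORDS`, `AUXILIARY_WORDS`, and `split_by_common_vocabulary` are hypothetical.

```python
# Hypothetical vocabulary contents; the text does not list what the
# predetermined common vocabulary actually contains.
COMMON_WORDS = {"TV drama", "starring"}
AUXILIARY_WORDS = {"of"}


def split_by_common_vocabulary(units):
    """Split pre-segmented information units into (search, adjustment) lists."""
    search, adjustment = [], []
    for unit in units:
        if unit in AUXILIARY_WORDS:
            continue  # auxiliary words contribute to neither list
        if unit in COMMON_WORDS:
            adjustment.append(unit)  # common words only adjust the ranking
        else:
            search.append(unit)  # everything else drives the retrieval
    return search, adjustment


search_info, adjustment_info = split_by_common_vocabulary(
    ["Li Hongji", "starring", "of", "TV drama"]
)
print(search_info)      # ['Li Hongji']
print(adjustment_info)  # ['starring', 'TV drama']
```

Treating every unit outside the common vocabulary as search information is a design assumption made here for brevity.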
  • the ranking determining means performs a search based on the retrieval information to obtain a plurality of resource candidates.
  • one resource candidate corresponds to one or more links.
  • the resource candidate includes description information of the resources provided by the one or more websites to which the one or more links point, where the description information includes but is not limited to: the title of the resource, a summary of the content of the resource, the full text of the resource, etc.
  • the ranking determining means performs a search based on the search information "Li Hongji" and "TV drama", and the obtained resource candidates include: resource candidate A and resource candidate B.
  • in step S6, the ranking determining means determines the ranking result of the plurality of resource candidates based on the adjustment information.
  • the ranking determining means finds through analysis that the adjustment information "starring" is contained in resource candidate A and is not contained in resource candidate B.
  • the ranking determining means sorts the two resource candidates as follows:
  • in step S7, the ranking determining means generates presentation information based on the sorting result and provides it to the user 1.
  • the ranking determining apparatus determines, based on the ranking results of resource candidate A and resource candidate B, that the presentation information A corresponding to resource candidate A and the presentation information B corresponding to resource candidate B are ordered as follows, and the sorted presentation information is provided to the user 1 through the user equipment 2:
  • the ranking determining device may, according to actual conditions, select only some of the resource candidates to generate the presentation information provided to the user 1, for example, when the number of presentation items requested by the user equipment 2 is smaller than the number of resource candidates.
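Steps S6 and S7 can be illustrated with a short sketch: candidates that contain the adjustment information are ranked first, and only as many presentation items as were requested are returned. The candidate texts, field names, and function names below are assumptions made for illustration.

```python
def rank_by_adjustment(candidates, adjustment_info):
    """Rank candidates so that those containing more adjustment terms come first.

    Python's sort is stable, so candidates with equal hit counts keep
    their retrieval order.
    """
    def hits(candidate):
        return sum(term in candidate["text"] for term in adjustment_info)

    return sorted(candidates, key=hits, reverse=True)


def build_presentation(ranked, requested_count):
    """Keep only as many presentation items as the user equipment requested."""
    return [candidate["title"] for candidate in ranked[:requested_count]]


# Hypothetical retrieval results for the search information "Li Hongji".
candidates = [
    {"title": "B", "text": "Li Hongji TV drama news"},
    {"title": "A", "text": "TV drama starring Li Hongji"},
]
ranked = rank_by_adjustment(candidates, ["starring"])
presentation = build_presentation(ranked, requested_count=1)
print([c["title"] for c in ranked])  # ['A', 'B']
print(presentation)                  # ['A']
```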
  • the foregoing step S3 further includes: the sort determining device first acquires first type determination information used for determining the search information and the adjustment information, and then, according to the first type determination information, obtains the search information and the adjustment information from the input sequence from the user.
  • the first type determination information includes but is not limited to:
  • the predetermined keyword type library includes a plurality of information units, and each information unit corresponds to one type.
  • the sort determining device queries a predetermined keyword type library according to the input sequence "TV drama starring Li Hongji" and obtains the information units "Li Hongji", "starring", and "TV drama", where the type of the information units "Li Hongji" and "TV drama" is the search type and the type of the information unit "starring" is the adjustment type. Based on the information units and types returned by the query, the sort determining device determines that the search information for the input sequence "TV drama starring Li Hongji" includes "Li Hongji" and "TV drama", and the adjustment information includes "starring".
  • the sort determining means determines, based on the information units and types obtained by querying the predetermined keyword type library, that the search information of the input sequence "TV drama starring Li Hongji" includes "Li Hongji starring", and the adjustment information includes "starring TV drama".
  • the search information and the adjustment information may partially overlap.
  • for example, "starring" in the search information "Li Hongji starring" also appears in the adjustment information "starring TV drama".
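The keyword type library lookup can be sketched as a mapping from information unit to type. The library contents and the substring-matching strategy are assumptions for illustration; the text does not specify how information units are matched against the input sequence.

```python
# Hypothetical keyword type library: each information unit maps to one type.
KEYWORD_TYPE_LIBRARY = {
    "Li Hongji": "search",
    "TV drama": "search",
    "starring": "adjustment",
}


def classify_with_type_library(input_sequence, library=KEYWORD_TYPE_LIBRARY):
    """Return (search_info, adjustment_info) for units found in the input."""
    search, adjustment = [], []
    for unit, unit_type in library.items():
        if unit in input_sequence:  # simple substring match (assumption)
            if unit_type == "search":
                search.append(unit)
            else:
                adjustment.append(unit)
    return search, adjustment


search_info, adjustment_info = classify_with_type_library(
    "TV drama starring Li Hongji"
)
print(search_info)      # ['Li Hongji', 'TV drama']
print(adjustment_info)  # ['starring']
```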
  • the semantic analysis result includes but is not limited to:
  • the result of semantic analysis based on part of speech includes but is not limited to: nouns, adjectives, adverbs, verbs, and the like.
  • the sort determining device analyzes the four words "Li Hongji", "starring", "of", and "TV drama" obtained by segmenting the input sequence "TV drama starring Li Hongji", and obtains the semantic analysis result that "Li Hongji" and "TV drama" are nouns, "starring" is a verb, and "of" is an auxiliary word; the sort determining device then uses the nouns as the search information and the verb as the adjustment information, so that, based on the semantic analysis result, the search information includes "Li Hongji" and "TV drama", and the adjustment information includes "starring".
  • the sort determining device divides the input sequence into three parts based on the sentence pattern, "Li Hongji", "starring", and "TV drama"; it determines "Li Hongji" to be the subject because it is located at the beginning of the sentence, "starring" to be the predicate because it is located in the middle of the sentence, and "TV drama" to be the object because it is located at the end of the sentence. According to this semantic analysis result, the sort determining device uses the subject and the object as the search information and the predicate as the adjustment information, so that the search information includes "Li Hongji" and "TV drama", and the adjustment information includes "starring".
  • any manner of acquiring first type determination information for determining the search information and the adjustment information, and of obtaining the search information and the adjustment information from the user's input sequence according to that first type determination information, is included in the scope of the present invention.
  • the semantic analysis result may also combine part of speech and sentence pattern, for example: "Li Hongji" is a noun and is located at the beginning of the sentence, "starring" is a verb and is located after "Li Hongji", "TV drama" is a noun and is located after "starring", etc.
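The part-of-speech rule (nouns become search information, verbs become adjustment information) can be sketched as below. The tagged input stands in for the output of a word segmenter and tagger, which the text does not specify; the tag labels and the function name are hypothetical.

```python
def split_by_part_of_speech(tagged_units):
    """Use nouns as search information and verbs as adjustment information.

    Units with any other tag (for example, auxiliary words) are ignored.
    """
    search = [word for word, pos in tagged_units if pos == "noun"]
    adjustment = [word for word, pos in tagged_units if pos == "verb"]
    return search, adjustment


# Hypothetical tagger output for the segmented input sequence.
tagged = [
    ("Li Hongji", "noun"),
    ("starring", "verb"),
    ("of", "auxiliary"),
    ("TV drama", "noun"),
]
search_info, adjustment_info = split_by_part_of_speech(tagged)
print(search_info)      # ['Li Hongji', 'TV drama']
print(adjustment_info)  # ['starring']
```

The same shape works for the sentence-pattern variant by replacing the tags with "subject", "predicate", and "object".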
  • the solution according to the present invention further includes: after acquiring the input sequence from the user, first removing the invalid information in the input sequence to obtain available information, and then obtaining the search information and the adjustment information from the available information.
  • the invalid information includes but is not limited to: 1) auxiliary words; 2) spaces; 3) punctuation marks; 4) information units included in a predetermined invalid dictionary, and the like.
  • the sort determining means first removes the invalid information in the input sequence, for example, removes the auxiliary word "of" to obtain the available information "Li Hongji starring TV drama", and then obtains the search information and the adjustment information from the available information.
  • the manner of obtaining the search information and the adjustment information from the available information is the same as or similar to the manner of obtaining them from the input sequence in the foregoing step S3; it is incorporated herein by reference and is not repeated.
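A minimal sketch of the invalid-information filter, assuming the input has already been segmented into units. The auxiliary-word set and the invalid dictionary entries are invented for illustration.

```python
import string

AUXILIARY_WORDS = {"of"}          # hypothetical auxiliary-word list
INVALID_DICTIONARY = {"please"}   # hypothetical predetermined invalid dictionary


def remove_invalid_information(units):
    """Drop auxiliary words, blanks, punctuation, and invalid-dictionary units."""
    available = []
    for unit in units:
        cleaned = unit.strip()
        if not cleaned:
            continue  # spaces
        if all(ch in string.punctuation for ch in cleaned):
            continue  # punctuation marks
        if cleaned in AUXILIARY_WORDS or cleaned in INVALID_DICTIONARY:
            continue  # auxiliary words and invalid-dictionary entries
        available.append(cleaned)
    return available


available = remove_invalid_information(
    ["Li Hongji", "starring", "of", " ", "TV drama", "?"]
)
print(available)  # ['Li Hongji', 'starring', 'TV drama']
```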
  • any manner of acquiring first type determination information for determining the search information and the adjustment information, and of obtaining the search information and the adjustment information from the user's input sequence according to that first type determination information, is included in the scope of the present invention.
  • Figure 2 is a flow diagram of a method for determining the ranking results of resource candidates in accordance with a preferred embodiment of the present invention.
  • steps S1 and S2 have been described in detail in the embodiment shown in FIG. 1; they are incorporated herein by reference and are not described again.
  • in step S3', the sort determining means obtains the search information and the adjustment information from the input sequence received by the computer device 3.
  • the retrieval information includes one or more retrieval units, and the adjustment information includes one or more adjustment units.
  • the retrieval information obtained by the sort determining means from the input sequence includes a retrieval unit "Li Hongji" and a retrieval unit "TV drama", and the adjustment information includes an adjustment unit "starring" and an adjustment unit "participating".
  • the manner in which the sort determining device obtains the search information and the adjustment information from the input sequence has been described in detail in the embodiment shown in FIG. 1; it is incorporated herein by reference and is not described again.
  • in step S4, the sort determining means performs a search based on the search information to obtain a plurality of resource candidates.
  • the sort determining means searches based on the search information "Li Hongji" and "TV drama", and the obtained resource candidates include: resource candidate C, resource candidate D, resource candidate E, and the like.
  • the ranking determining means acquires first sorting auxiliary information for assisting in determining the sorting result.
  • the first sorting auxiliary information includes but is not limited to at least one of the following:
  • the sort determining means obtains the weight information of the adjustment unit "starring", for example, 5, and the weight information of the adjustment unit "participating", for example, 1. It should be understood by those skilled in the art that representing the weight information by numerical values is only an example and is not intended to limit the present invention; the weight information may also be expressed in other manners, for example, by level.
  • the adjustment unit distribution information includes but is not limited to at least one of the following:
  • the resource candidates obtained by the sort determining device include resource candidate C, resource candidate D, and resource candidate E, and the number of occurrences of each adjustment unit in each resource candidate is obtained as follows:
  • resource candidate C: the adjustment unit "starring" appears 2 times and the adjustment unit "participating" appears 0 times; resource candidate D: the adjustment unit "starring" appears 0 times and the adjustment unit "participating" appears 1 time; resource candidate E: the adjustment unit "starring" appears 0 times and the adjustment unit "participating" appears 0 times.
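Counting how often each adjustment unit occurs in each resource candidate can be sketched with plain substring counting. The candidate texts are invented so that the counts match the example above, and the second adjustment unit is written as "participating" on the assumption that this is the intended translation.

```python
def count_adjustment_units(candidates, adjustment_units):
    """Return {candidate name: {adjustment unit: number of occurrences}}."""
    return {
        name: {unit: text.count(unit) for unit in adjustment_units}
        for name, text in candidates.items()
    }


# Hypothetical description texts for resource candidates C, D, and E.
candidates = {
    "C": "Li Hongji starring in a TV drama; also starring in a film",
    "D": "TV drama with Li Hongji participating",
    "E": "Li Hongji TV drama news",
}
counts = count_adjustment_units(candidates, ["starring", "participating"])
print(counts["C"])  # {'starring': 2, 'participating': 0}
print(counts["D"])  # {'starring': 0, 'participating': 1}
print(counts["E"])  # {'starring': 0, 'participating': 0}
```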
  • the appearance position includes, but is not limited to: the title, the summary, the body, descriptive content of multimedia resources such as UGC, etc.; the appearance position may be identified by a tag of the information corresponding to the resource candidate or by text contained in that information, for example, <title>, "summary", etc.
  • the resource candidate obtained by the order determining device includes the resource candidate F and the resource candidate G, and obtains the appearance position of each adjusting unit in each resource candidate according to the label of the information corresponding to the resource candidate:
  • the title of resource candidate F contains the adjustment unit "starring";
  • the summary of resource candidate G contains the adjustment unit "participating".
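Locating where an adjustment unit appears can be sketched by checking each labeled field of a candidate. The field names, the field contents, and the placement of "starring" in the title of candidate F are assumptions for illustration.

```python
def find_positions(candidate, units, fields=("title", "summary", "body")):
    """Return {unit: [fields in which the unit appears]} for one candidate."""
    return {
        unit: [field for field in fields if unit in candidate.get(field, "")]
        for unit in units
    }


# Hypothetical labeled description information for candidates F and G.
candidate_f = {"title": "TV drama starring Li Hongji", "summary": "A family story."}
candidate_g = {"title": "Li Hongji TV drama", "summary": "Li Hongji participating as a guest."}

units = ["starring", "participating"]
positions_f = find_positions(candidate_f, units)
positions_g = find_positions(candidate_g, units)
print(positions_f)  # {'starring': ['title'], 'participating': []}
print(positions_g)  # {'starring': [], 'participating': ['summary']}
```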
  • the resource candidates obtained by the order determining device include resource candidate H and resource candidate I; the device finds that resource candidate H contains two adjustment units, "starring" and "participating", and that resource candidate I contains one adjustment unit, "stands in"; the order determining device then determines the number of different adjustment units in each resource candidate as:
  • the quality information includes but is not limited to at least one of the following:
  • the manner in which the order determining device obtains the authority of the resource candidate includes, but is not limited to, at least one of the following:
  • the resource candidates obtained by the ranking determining device include resource candidate J and resource candidate K, where resource candidate J corresponds to website J and resource candidate K corresponds to website K; the ranking determining device finds that website J is a predetermined authoritative website and website K is a predetermined ordinary website, and therefore determines that the authority of resource candidate J is the "authority" level and the authority of resource candidate K is the "normal" level.
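The authority lookup can be sketched as a membership test against a predetermined list of authoritative websites. The host names below are placeholders, and defaulting every unlisted website to the "normal" level is an assumption.

```python
# Hypothetical predetermined list of authoritative websites.
AUTHORITATIVE_SITES = {"website-j.example"}


def authority_level(site):
    """Return "authority" for predetermined authoritative sites, else "normal"."""
    return "authority" if site in AUTHORITATIVE_SITES else "normal"


candidate_sites = {"J": "website-j.example", "K": "website-k.example"}
levels = {name: authority_level(site) for name, site in candidate_sites.items()}
print(levels)  # {'J': 'authority', 'K': 'normal'}
```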
  • the manner in which the ranking determining apparatus obtains the quality of each resource candidate includes, but is not limited to, at least one of the following:
  • the quality information is obtained by analyzing the content information included in the website corresponding to the resource candidate.
  • the factors referenced in the analysis of the content information include at least one of the following: i) whether advertisement information is included; ii) the quality of the resources provided by the website, for example, picture definition, video definition, song sound quality, etc.; iii) the quantity of resources provided by the website, etc.
  • resource candidate L corresponds to website L, and resource candidate M corresponds to website M. The ranking determining device acquires the content information included in the website corresponding to resource candidate L and finds through analysis that the content information does not include advertisement information and that the average pixel count of the images provided by the website is higher than a first predetermined threshold; the ranking determining device therefore determines that the quality of website L is excellent and that the quality level of resource candidate L is "excellent". The ranking determining device likewise acquires the content information included in the website corresponding to resource candidate M and finds through analysis that the content information includes advertisement information and that the number of music resources provided by the website is higher than a second predetermined threshold; it then determines the quality of website M, and the quality level of resource candidate M, accordingly. It should be understood by those skilled in the art that the above manner of expressing quality is only an example and is not intended to limit the present invention; the quality may also be expressed in other ways, for example, by value.
  • in step S6', the sort determining means determines the sorting result of the plurality of resource candidates based on all the adjustment units, in combination with the first sorting auxiliary information.
  • the manner in which the order determining device determines the sorting result includes but is not limited to:
  • the ranking determining means determines the ranking result of each resource candidate based on all the adjustment units and the weight information of the adjustment units in the first sorting auxiliary information. For example, the ranking determining device obtains a weight of 5 for "starring" and a weight of 1 for "participating"; resource candidate C contains the adjustment unit "starring" and resource candidate D contains the adjustment unit "participating". According to the weight information of each adjustment unit, the sort determining device therefore ranks resource candidate C before resource candidate D.
  • the order determining device determines the sorting result of each resource candidate according to all the adjusting units, combined with the weight information of the adjusting unit in the first sorting auxiliary information and the adjusting unit distribution information of each resource candidate.
  • the ranking determining device obtains a weight of 5 for "starring" and a weight of 1 for "participating"; the title of resource candidate C contains the adjustment unit "starring", the summary of resource candidate D contains the adjustment unit "starring", and resource candidate E contains the adjustment unit "participating". According to the weight information of each adjustment unit, the ranking determining means first ranks resource candidate C and resource candidate D, which contain the adjustment unit "starring", before resource candidate E, which contains the adjustment unit "participating"; then, according to the appearance position of the adjustment unit "starring", it ranks resource candidate C, in whose title "starring" appears, before resource candidate D, in whose summary "starring" appears. The resulting order is resource candidate C, resource candidate D, resource candidate E.
  • 1) the sort determining means sorts based at least on the number of occurrences of the adjustment units; 2) the sort determining means sorts based at least on the number of adjustment units; 3) the sort determining means sorts the resource candidates from high to low by the authority or quality of the corresponding websites; 4) the sort determining means first sorts the resource candidates by the number of occurrences of the adjustment units from high to low, and then sorts the resource candidates having the same number of occurrences by quality from high to low, etc.; 5) when each item in the first sorting auxiliary information is represented by a value, the sort determining means obtains an evaluation value for each resource candidate from the values of the items in the first sorting auxiliary information, and sorts the resource candidates based on the evaluation value.
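Option 5) can be sketched as follows. The text does not give a formula for the evaluation value; the weighted sum used here (unit weight times occurrence count, plus a numeric authority value) is an assumption chosen only to show the shape of the computation, and the field names are hypothetical.

```python
def evaluation_value(candidate, weights):
    """Assumed formula: sum of weight * occurrences, plus an authority value."""
    unit_score = sum(
        weights[unit] * count for unit, count in candidate["counts"].items()
    )
    return unit_score + candidate["authority_value"]


def rank_by_evaluation(candidates, weights):
    """Sort candidates from the highest to the lowest evaluation value."""
    return sorted(
        candidates, key=lambda c: evaluation_value(c, weights), reverse=True
    )


weights = {"starring": 5, "participating": 1}  # example weights from the text
candidates = [
    {"name": "E", "counts": {"starring": 0, "participating": 0}, "authority_value": 1},
    {"name": "D", "counts": {"starring": 0, "participating": 1}, "authority_value": 1},
    {"name": "C", "counts": {"starring": 2, "participating": 0}, "authority_value": 1},
]
order = [c["name"] for c in rank_by_evaluation(candidates, weights)]
print(order)  # ['C', 'D', 'E']
```

Option 4) could be expressed with the same `sorted` call by returning a tuple such as (occurrences, quality) from the key function.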
  • in step S7', the sort determining means generates the presentation information based on the sorting result and provides it to the user 1 through the user equipment 2.
  • the ranking determining device determines, according to the sorting result of resource candidate C, resource candidate D, and resource candidate E, that the presentation information C corresponding to resource candidate C, the presentation information D corresponding to resource candidate D, and the presentation information E corresponding to resource candidate E are ordered as follows, and the sorted presentation information is provided to the user 1 through the user equipment 2:
  • the ranking determining device may, according to actual conditions, select only some of the resource candidates to generate the presentation information provided to the user 1, for example, when the number of presentation items requested by the user equipment 2 is smaller than the number of resource candidates.
  • Figure 3 is a flow chart showing a method for determining the ranking results of resource candidates in accordance with another preferred embodiment of the present invention.
  • steps S1 and S2 have been described in detail in the embodiment shown in FIG. 1 and are included herein by reference.
  • in step S3, the sort determining means obtains the search information and the adjustment information from the input sequence received by the computer device 3.
  • the sort determining means obtains the search information "Li Hongji" and "TV drama" and the adjustment information "starring" from the input sequence "TV drama starring Li Hongji" received by the computer device 3.
  • the manner in which the sorting determining device obtains the search information and the adjustment information from the input sequence has been described in detail in the embodiment shown in FIG. 1 and is hereby incorporated by reference.
  • in step S4", the sort determining means performs a search based on the search information to obtain a plurality of resource candidates.
  • the ranking determining device performs a search according to the retrieval unit "Li Hongji” and the retrieval unit "TV drama", and the obtained resource candidates include: resource candidate A1, resource candidate B1, and resource candidate C1.
  • in step S6, the ranking determining means determines the ranking result of the plurality of resource candidates based on the adjustment information and the retrieval information.
  • the sort determining means obtains resource candidate A1, resource candidate B1, and resource candidate C1, and finds that resource candidate A1 contains the search information "Li Hongji" and "TV drama" as well as the adjustment information "starring", while resource candidate B1 and resource candidate C1 contain only the retrieval information. The ranking determining means therefore ranks resource candidate A1, which contains both the retrieval information and the adjustment information, before resource candidate B1 and resource candidate C1, and orders resource candidate B1 and resource candidate C1 randomly, so that resource candidate A1 comes first in the sorting result.
  • in step S7, the ranking determining means generates presentation information according to the sorting result and provides it to the user 1.
  • the manner in which the ranking determining means generates presentation information according to the sorting result and provides it to the user 1 has been described in detail in step S7 of the embodiment shown in FIG. 1, and is hereby incorporated by reference.
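The ranking rule of this embodiment (candidates containing both the retrieval information and the adjustment information come first) can be sketched as below. The candidate texts are invented, and the sketch keeps the remaining candidates in retrieval order through a stable sort rather than ordering them randomly as the text allows.

```python
def rank_with_both(candidates, search_info, adjustment_info):
    """Rank candidates containing search AND adjustment information first."""
    def contains_both(name):
        text = candidates[name]
        has_search = all(term in text for term in search_info)
        has_adjustment = any(term in text for term in adjustment_info)
        return has_search and has_adjustment

    # sorted() is stable, so the other candidates keep their retrieval order.
    return sorted(candidates, key=contains_both, reverse=True)


# Hypothetical description texts for resource candidates A1, B1, and C1.
candidates = {
    "B1": "Li Hongji TV drama schedule",
    "A1": "TV drama starring Li Hongji",
    "C1": "Li Hongji talks about a new TV drama",
}
order = rank_with_both(candidates, ["Li Hongji", "TV drama"], ["starring"])
print(order)  # ['A1', 'B1', 'C1']
```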
  • Figure 4 is a flow chart showing a method for determining the ranking results of resource candidates in accordance with still another preferred embodiment of the present invention.
  • steps S1 and S2 have been described in detail in the embodiment shown in FIG. 1, and are included herein by reference, and are not described again.
  • in step S3', the sort determining means obtains the search information and the adjustment information from the input sequence received by the computer device 3.
  • the adjustment information includes one or more adjustment units; the search information includes one or more search units.
  • the sort determining means obtains, from the input sequence "Li Hongji starring and participating in the TV drama" received by the computer device 3, search information that includes the search unit "Li Hongji" and the search unit "TV drama", and adjustment information that includes the adjustment unit "starring" and the adjustment unit "participating".
  • the manner in which the sort determining device acquires the search information and the adjustment information from the input sequence received by the computer device 3 is the same as or similar to the manner of obtaining the search information and the adjustment information in step S3 shown in FIG. 1; it is incorporated herein by reference and is not described again.
  • in step S4, the sort determining means performs a search based on the search information to obtain a plurality of resource candidates.
  • the ranking determining device searches according to the retrieval unit "Li Hongji" and the retrieval unit "TV drama", and the obtained resource candidates include: resource candidate C1, resource candidate D1, resource candidate E1, and the like.
  • the sort determining means acquires second sorting auxiliary information for assisting in determining the sorting result.
  • the second sorting auxiliary information includes but is not limited to at least one of the following:
  • step S5 shown in Fig. 2 and is hereby incorporated by reference.
  • the ranking determining means 3 acquires the weight information of the retrieval unit "Li Hongji", for example, 5, and obtains the weight of the retrieval unit "TV drama", for example, 1. It should be understood by those skilled in the art that the above-mentioned numerical values are only used for indicating weight information, and are not intended to limit the present invention. In fact, the weight information may also be expressed in other manners, for example, by level.
  • the retrieval unit distribution information includes but is not limited to at least one of the following:
  • For example, the resource candidates obtained by the sort determining device include resource candidate C1, resource candidate D1, and resource candidate E1, and the number of occurrences of each search unit in each resource candidate is obtained by counting as follows:
  • Resource candidate C1: the search unit "Li Hongji" appears twice, and the search unit "TV drama" appears twice; resource candidate D1: the search unit "Li Hongji" appears once, and the search unit "TV drama" appears once; resource candidate E1: the search unit "Li Hongji" appears once, and the search unit "TV drama" appears once.
  • The appearance position includes, but is not limited to: a title, a summary, a body, descriptive content of a multimedia resource such as UGC, etc. The position may be identified by a label or by text information in the information corresponding to the resource candidate, for example, <title>, "summary", etc.
  • the resource candidate obtained by the order determining device includes the resource candidate F1 and the resource candidate G1, and obtains the appearance position of each of the search units in each resource candidate according to the label of the information corresponding to the resource candidate:
  • The title of the resource candidate F1 includes the search unit "Li Hongji" and the retrieval unit "TV drama"; the title of the resource candidate G1 includes the retrieval unit "Li Hongji", and its abstract includes the retrieval unit "TV drama".
  • For example, the resource candidates obtained by the order determining device include the resource candidate H1 and the resource candidate I1; the resource candidate H1 includes the retrieval units "Li Hongji" and "TV drama", and the resource candidate I1 includes the retrieval unit "Li Hongji"; the ranking determining device then determines the number of different retrieval units in each resource candidate as:
  • predetermined quality information of each of the plurality of resource candidates.
  • the predetermined quality information has been described in detail in step S5 of the embodiment shown in Fig. 2, and is hereby incorporated by reference.
  • In step S6', the sort determining device determines the sorting result of the plurality of resource candidates according to all the adjusting units and all the searching units, in combination with the second sorting auxiliary information.
  • the ranking determining means determines the ranking result of the plurality of resource candidates based on at least one of the second sorting assistance information.
  • the manner in which the order determining device determines the sorting result includes but is not limited to:
  • For example, in step S5' the ranking determining device obtains that the retrieval unit "Li Hongji" appears twice in the resource candidate C1, once in the resource candidate D1, and once in the resource candidate E1, and that the retrieval unit "TV drama" appears twice in the resource candidate C1, once in the resource candidate D1, and once in the resource candidate E1. The ranking determining device therefore determines that the two retrieval units appear four times in total in the resource candidate C1, twice in the resource candidate D1, and twice in the resource candidate E1. The resource candidate D1 contains two adjustment units, and the resource candidate E1 contains adjustment unit information. The sort determining device first sorts by the number of occurrences of the retrieval units to obtain an initial sorting result, and then adjusts the initial sorting result according to the number of adjustment units to obtain the sorting result of the resource candidate C1, the resource candidate D1, and the resource candidate E1 as follows:
  • 1) the sort determining device sorts based on the appearance position of the retrieval unit in each resource candidate, for example, placing resource candidates in which the retrieval unit appears in the title before resource candidates in which the retrieval unit appears in the summary;
  • 2) the sort determining device sorts based on the number of different search units in each resource candidate, for example, placing resource candidates that contain more different search units before resource candidates that contain fewer;
  • 3) the sort determining device sorts based on the quality information of each resource candidate, for example, placing resource candidates corresponding to an authoritative website or a high-quality website first;
  • 4) the sort determining device sorts based on the weight information of both the searching units and the adjusting units; for example, where the weight information includes a weight value, the sort determining device multiplies the search unit weight values and the adjustment unit weight values contained in each resource candidate to obtain a total weight value, and then sorts the resource candidates based on the total weight value;
  • 5) the sort determining device first sorts the resource candidates based on the retrieval unit distribution information, and then sorts those resource candidates having the same distribution information based on the quality information or the adjustment unit distribution information, and the like.
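  • The two-stage ordering in the C1/D1/E1 example (an initial sort on retrieval unit occurrences, then an adjustment by the number of adjustment units) can be sketched as below. This is a minimal illustration and not the claimed implementation: the `Candidate` class and the `rank` function are hypothetical names, the adjustment-unit counts used for C1 and E1 are assumed because the text does not state them, and the adjustment step is read here as a tie-break among candidates with equal occurrence counts.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    search_hits: int       # total occurrences of all retrieval units
    adjustment_units: int  # number of different adjustment units contained


def rank(candidates):
    # Primary key: more retrieval-unit occurrences first.
    # Secondary key: among equal counts, more adjustment units first.
    return sorted(candidates, key=lambda c: (-c.search_hits, -c.adjustment_units))


candidates = [
    Candidate("E1", search_hits=2, adjustment_units=1),  # count of 1 is assumed
    Candidate("D1", search_hits=2, adjustment_units=2),
    Candidate("C1", search_hits=4, adjustment_units=0),  # count of 0 is assumed
]
ordered = [c.name for c in rank(candidates)]
print(ordered)
```

  • Under these assumed counts, C1 leads on occurrences, and D1 is placed before E1 because it contains more adjustment units.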
  • In step S7"', the sort determining device generates the presentation information based on the sorting result and provides it to the user 1 through the user device 2.
  • The method according to the present invention further includes a step in which the order determining device acquires a keyword unit and its type, and establishes or updates the predetermined keyword type library according to the keyword unit and its type.
  • The step in which the order determining device acquires the keyword unit and its type further includes: the order determining device acquires the keyword unit, acquires second type determination information for determining the type of the keyword unit, and determines the type of the keyword unit according to the second type determination information.
  • The manner in which the order determining device acquires a keyword unit includes but is not limited to:
  • the sort determining means obtains a keyword unit from an input sequence input by a plurality of users.
  • the sort determining means obtains the same part of the two input sequences "TV drama” from the input sequence "TV drama A” input by the user A received by the computer device 3, and the input sequence "TV drama B” input by the user B, and The same part of the "TV series” is used as a keyword unit.
  • For example, the sorting determining device segments the input sequence "actor Li Hongji" into words, obtains "actor" and "Li Hongji", and uses "actor" and "Li Hongji" as keyword units.
  • For example, the sort determining device obtains "starring" and "participating" from the input method vocabulary, and takes "starring" and "participating" as keyword units.
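  • The first manner above, taking the part shared by input sequences from different users as a keyword unit, can be sketched with a longest-common-substring search. This is only one possible reading of "the same part"; the function name `common_keyword_unit` is hypothetical, and the standard-library `difflib.SequenceMatcher` is used purely for illustration.

```python
from difflib import SequenceMatcher


def common_keyword_unit(seq_a: str, seq_b: str) -> str:
    """Return the longest contiguous part shared by two input sequences."""
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    match = matcher.find_longest_match(0, len(seq_a), 0, len(seq_b))
    # Strip surrounding whitespace so "TV drama " becomes "TV drama".
    return seq_a[match.a:match.a + match.size].strip()


unit = common_keyword_unit("TV drama A", "TV drama B")
print(unit)
```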
  • the order determining means acquires second type determination information for determining the type of the keyword unit.
  • the second type determination information includes, but is not limited to, at least one of the following:
  • Distribution concentration of the keyword unit in a predetermined corpus, where the predetermined corpus includes a plurality of corpora.
  • The distribution concentration indicates how concentrated the distribution of the keyword unit is across the plurality of corpora of the predetermined corpus; it is obtained from the appearance information of the keyword unit in the predetermined corpus and from quantity information on the different corpora that include the keyword unit.
  • The appearance information is at least one of the following:
  • the quantity information includes at least one of the following:
  • Semantic analysis results obtained from the keyword unit. The semantic analysis result includes, but is not limited to, the part of speech of the keyword unit, such as a noun, a verb, an adjective, and the like.
  • the ranking determining means performs a part-of-speech analysis on the keyword unit "Li Hongji" to obtain a semantic analysis result as a noun.
  • User history input sequences matching the same corpus are user history input sequences whose search results contain the same corpus; for example, the search results of the three user history input sequences "iphone4 sign sale", "iphone4 sale", and "iphone4 sale" all contain the same corpus "Iphone4 sales have broken through".
  • For the keyword unit "iphone4", suppose the user history input sequences that include it are "iphone4 sign sale", "iphone4 sale", "iphone4 sale", "iphone4 game", and "iphone4 entertainment", where "iphone4 sign sale", "iphone4 sale", and "iphone4 sale" match the same corpus, and "iphone4 game" and "iphone4 entertainment" match the same corpus; then the number of user history input sequences that include the keyword unit "iphone4" and match the same corpus is 5.
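  • The count in the "iphone4" example can be reproduced with a short sketch. It assumes that each history input sequence has already been paired with an identifier of the corpus its search results contain; the function name, the corpus identifiers, and the rule that a group must hold at least two sequences to count as "matching the same corpus" are assumptions made for illustration.

```python
from collections import Counter


def count_sequences_matching_same_corpus(matches):
    """matches: list of (history_input_sequence, matched_corpus_id) pairs.

    A sequence is counted when at least one other sequence matched the
    same corpus.
    """
    per_corpus = Counter(corpus_id for _, corpus_id in matches)
    return sum(size for size in per_corpus.values() if size >= 2)


history = [
    ("iphone4 sign sale", "corpus-sales"),      # corpus ids are hypothetical
    ("iphone4 sale", "corpus-sales"),
    ("iphone4 sale", "corpus-sales"),
    ("iphone4 game", "corpus-entertainment"),
    ("iphone4 entertainment", "corpus-entertainment"),
]
total = count_sequences_matching_same_corpus(history)
print(total)
```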
  • the sort determining means determines the type of the keyword unit based on the second type determination information.
  • The type includes: a retrieval type, an adjustment type, and the like; preferably, the type further includes an invalid type, which marks a unit that needs to be removed from the input sequence.
  • For example, the sort determining device obtains a distribution concentration of 6.5 for the keyword unit "Li Hongji" in the predetermined corpus, determines that the distribution concentration of 6.5 exceeds the predetermined distribution threshold of 4, and therefore determines that the type of the keyword unit "Li Hongji" is the retrieval type.
  • For example, the ranking determining device obtains the semantic analysis result of the keyword unit "TV drama" as a noun and, based on that result, determines the type of the keyword unit "TV drama" to be the retrieval type. As a further example, the ranking determining device obtains the semantic analysis result of the keyword unit "starring" as a verb and, based on that result, determines the keyword unit "starring" to be the adjustment type. As a further example, the sort determining device obtains the semantic analysis result of the keyword unit "of" as an auxiliary word and, based on that result, determines the keyword unit "of" to be the invalid type.
  • The ranking determining device determines that the type of the keyword unit "Li Hongji" is the retrieval type.
  • For example, the sort determining device obtains that the semantic analysis result of the keyword unit "Li Hongji" is a noun and that the number of user history input sequences matching the same corpus is 1000; according to a predetermined rule that a noun is determined to be of the retrieval type when its number of user history input sequences matching the same corpus exceeds 900, the sort determining device determines that the type of the keyword unit "Li Hongji" is the retrieval type.
  • For another example, if the ranking determining device obtains that the distribution concentration of the keyword unit "Li Hongji" is 6.5 and that the number of user history input sequences matching the same corpus is 1000, the ranking determining device first normalizes the distribution concentration and the number of user history input sequences matching the same corpus, then obtains a comprehensive evaluation value of 1.2 for the keyword unit "Li Hongji"; because this value is higher than the predetermined comprehensive threshold, the type of the keyword unit "Li Hongji" is determined to be the retrieval type.
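  • The type-determination examples above can be gathered into a small rule function. This is a sketch under stated assumptions: the function name `determine_type`, the order in which the rules are tried, and the "undetermined" fallback are hypothetical; only the thresholds 4 and 900 and the noun/verb/auxiliary outcomes come from the examples. The normalized comprehensive evaluation is not modeled because the text does not give its formula.

```python
def determine_type(part_of_speech=None, concentration=None,
                   matching_sequences=None,
                   concentration_threshold=4.0, sequence_threshold=900):
    """Classify a keyword unit as 'retrieval', 'adjustment', or 'invalid'."""
    if part_of_speech == "auxiliary":
        return "invalid"
    # Rule on distribution concentration alone.
    if concentration is not None and concentration > concentration_threshold:
        return "retrieval"
    if part_of_speech == "noun":
        # With a count available, apply the predetermined rule for nouns;
        # without one, fall back to the part of speech alone.
        if matching_sequences is None or matching_sequences > sequence_threshold:
            return "retrieval"
        return "undetermined"  # assumed fallback, not stated in the text
    if part_of_speech == "verb":
        return "adjustment"
    return "undetermined"


print(determine_type(concentration=6.5))
print(determine_type(part_of_speech="noun", matching_sequences=1000))
print(determine_type(part_of_speech="verb"))
```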
  • Fig. 5 is a diagram showing an arrangement of sorting determining means for determining the sorting result of resource candidates in an aspect of the present invention.
  • the sorting determining device includes: a first obtaining device 31, a searching device 32, a sorting device 33, and a providing device 34.
  • The user device 2 receives an input sequence through any interaction device capable of human-computer interaction with the user 1; the interaction device may be a keyboard, a mouse, a remote controller, a touch pad, or a voice control device.
  • the user 1 inputs information to be retrieved in the information input field of the search page displayed by the user device 2 through the keyboard, for example, input "TV drama starring Li Hongji".
  • the user device 2 transmits an input sequence input by the user 1, for example, "a TV show starring Li Hongji" to the computer device 3.
  • the user equipment 2 can send an input sequence to the computer device 3 through a network, including but not limited to: the Internet, a wide area network, a metropolitan area network, a local area network, a VPN network, a wireless ad hoc network (Ad Hoc network), and the like.
  • the manner in which the user equipment 2 sends an input sequence to the computer device 3 includes but is not limited to: 1) directly transmitting the input sequence to the computer device 3 through a network; 2) via one or more devices in the network The input sequence is sent to the computer device 3 or the like.
  • the first obtaining means 31 acquires the retrieval information and the adjustment information from the input sequence received by the computer device.
  • For example, the first obtaining device 31 queries a predetermined common vocabulary according to the input sequence and finds that "TV drama" and "starring" are common words; it also analyzes the input sequence and determines that "of" is an auxiliary word. The first obtaining device 31 then determines that the retrieval information obtained from the input sequence "TV drama starring Li Hongji" includes "Li Hongji", and that the adjustment information includes "starring" and "TV drama".
  • the predetermined common vocabulary includes a plurality of common words.
  • the search device 32 performs a search based on the search information to obtain a plurality of resource candidates.
  • One resource candidate corresponds to one or more links. The resource candidate includes description information of the resources provided by the one or more websites to which the one or more links point, where the description information includes but is not limited to: the title of the resource, a summary of the content of the resource, the full text content of the resource, etc.
  • the retrieval means 32 searches based on the retrieval information "Li Hongji" and "TV drama", and the obtained resource candidates include: resource candidate A and resource candidate B.
  • the sorting means 33 determines the sorting result of the plurality of resource candidates based on the adjustment information.
  • For example, the sorting device 33 finds by analysis that the adjustment information "starring" is included in the resource candidate A and is not included in the resource candidate B, whereby the sorting device 33 sorts the two resource candidates as follows: resource candidate A; resource candidate B.
  • the providing device 34 generates presentation information according to the sorting result to provide to the user.
  • For example, the providing device 34 determines, according to the ranking result of the resource candidate A and the resource candidate B, the ranking of the presentation information A corresponding to the resource candidate A and the presentation information B corresponding to the resource candidate B, and provides the sorted presentation information to the user 1 through the user equipment 2:
  • the providing device 34 may select part of the resource candidates to generate the presentation information and provide the information to the user 1 according to actual conditions, for example, the number of presentation information requested by the user equipment 2 is less than the number of resource candidates.
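  • The cooperation of the sorting device 33 and the providing device 34 described above can be sketched as follows: candidates already returned by retrieval are reordered by how many adjustment units they contain, and only the requested number of items is presented. The function name `rank_candidates`, the `limit` parameter, and the sample candidate texts are hypothetical and serve only to illustrate the flow.

```python
def rank_candidates(candidates, adjustment_units, limit=None):
    """candidates: list of (name, text) pairs in retrieval order.

    Candidates containing more adjustment units come first; sorted() is
    stable, so ties keep their retrieval order.
    """
    def hits(text):
        return sum(1 for unit in adjustment_units if unit in text)

    ordered = sorted(candidates, key=lambda item: -hits(item[1]))
    names = [name for name, _ in ordered]
    # The providing step may present only part of the candidates.
    return names if limit is None else names[:limit]


retrieved = [
    ("B", "Li Hongji talks about a new TV drama"),  # made-up text
    ("A", "TV drama starring Li Hongji"),           # made-up text
]
print(rank_candidates(retrieved, ["starring"]))
print(rank_candidates(retrieved, ["starring"], limit=1))
```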
  • the ranking determining apparatus further includes a fourth obtaining means (not shown); and the first obtaining means 31 further includes a first sub-acquisition means (not shown).
  • The fourth obtaining device acquires first type determination information for determining the search information and the adjustment information; the first sub-acquisition device obtains the search information and the adjustment information from the user's input sequence according to the first type determination information.
  • the first type determining information includes but is not limited to:
  • the predetermined keyword type library includes a plurality of information units, and each information unit corresponds to one type.
  • For example, the fourth obtaining device queries the predetermined keyword type library according to the input sequence "TV drama starring Li Hongji", and the obtained information units include "Li Hongji", "starring", and "TV drama", where the type of the information units "Li Hongji" and "TV drama" is the retrieval type and the type of the information unit "starring" is the adjustment type. Based on the information units and types obtained from the predetermined keyword type library, the first sub-acquisition device determines that the search information of the input sequence "TV drama starring Li Hongji" includes "Li Hongji" and "TV drama", and that the adjustment information includes "starring".
  • For another example, the information units that the fourth acquisition device obtains by querying the predetermined keyword type library are "Li Hongji starring" and "starring TV drama", where the type of the information unit "Li Hongji starring" is the retrieval type and the type of the information unit "starring TV drama" is the adjustment type. Based on the information units and types obtained from the predetermined keyword type library, the first sub-acquisition device determines that the search information of the input sequence "TV drama starring Li Hongji" includes "Li Hongji starring", and that the adjustment information includes "starring TV drama".
  • The search information and the adjustment information may partially overlap; for example, "starring" in the search information "Li Hongji starring" also appears in the adjustment information "starring TV drama".
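  • A lookup against a predetermined keyword type library can be sketched as a dictionary from information unit to type. The names `KEYWORD_TYPE_LIBRARY` and `split_by_type` are hypothetical, the library content mirrors the first example above, and plain substring matching stands in for whatever query mechanism an actual library would use.

```python
KEYWORD_TYPE_LIBRARY = {
    "Li Hongji": "retrieval",
    "TV drama": "retrieval",
    "starring": "adjustment",
}


def split_by_type(input_sequence, library=KEYWORD_TYPE_LIBRARY):
    """Return (search_information, adjustment_information) for an input sequence."""
    search_info, adjustment_info = [], []
    for unit, unit_type in library.items():
        if unit not in input_sequence:
            continue
        if unit_type == "retrieval":
            search_info.append(unit)
        elif unit_type == "adjustment":
            adjustment_info.append(unit)
        # Any other type (for example an invalid type) is ignored.
    return search_info, adjustment_info


search_info, adjustment_info = split_by_type("TV drama starring Li Hongji")
print(search_info, adjustment_info)
```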
  • the semantic analysis result includes but is not limited to:
  • the result of semantic analysis based on part of speech includes but is not limited to: nouns, adjectives, adverbs, verbs, and the like.
  • For example, the fourth obtaining device analyzes the four words "Li Hongji", "starring", "of", and "TV drama" obtained by segmenting the input sequence "TV drama starring Li Hongji", and obtains the following semantic analysis results: "Li Hongji" and "TV drama" are nouns, "starring" is a verb, and "of" is an auxiliary word. Based on these results, the first sub-acquisition device uses the nouns as the retrieval information and the verb as the adjustment information, so that the retrieval information includes "Li Hongji" and "TV drama" and the adjustment information includes "starring".
  • Semantic analysis results based on sentence patterns. For example, for the input sequence "TV drama starring Li Hongji", the fourth acquisition device divides it into three parts based on the sentence pattern: "Li Hongji", "starring", and "TV drama". It determines that "Li Hongji" is the subject because it is located at the beginning of the sentence, that "starring" is the predicate because it is located in the middle of the sentence, and that "TV drama" is the object because it is located at the end of the sentence. According to these semantic analysis results, the first sub-acquisition device uses the subject and the object as the search information and the predicate as the adjustment information, so that the search information includes "Li Hongji" and "TV drama" and the adjustment information includes "starring".
  • Any manner of obtaining first type determination information for determining the search information and the adjustment information, and of obtaining the search information and the adjustment information from the user's input sequence according to the first type determination information, is included in the scope of the present invention.
  • For example, the semantic analysis result may cover both the part of speech and the sentence pattern: "Li Hongji" is a noun located at the beginning of the sentence, "starring" is a verb located after "Li Hongji", "TV drama" is a noun located after "starring", and so on.
  • As another example, the sentence-based semantic analysis result includes not only a subject, a predicate, and an object but also the positional relationship between the components, for example, an attribute located before the subject, an attribute located before the object, an adverbial located before the predicate, and the like; or the ranking determining device uses only the subject as the retrieval information, and the predicate and the object as the adjustment information, and so on.
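  • The part-of-speech rule described above (nouns become retrieval information, verbs become adjustment information) can be sketched as below. The sketch assumes that word segmentation and tagging have already been done by some other component; the function name `classify_by_part_of_speech` and the tag strings are hypothetical.

```python
def classify_by_part_of_speech(tagged_words):
    """tagged_words: list of (word, part_of_speech) pairs.

    Nouns are used as retrieval information and verbs as adjustment
    information; other parts of speech (e.g. auxiliary words) are skipped.
    """
    search_info = [word for word, pos in tagged_words if pos == "noun"]
    adjustment_info = [word for word, pos in tagged_words if pos == "verb"]
    return search_info, adjustment_info


tagged = [
    ("Li Hongji", "noun"),
    ("starring", "verb"),
    ("of", "auxiliary"),
    ("TV drama", "noun"),
]
pos_search, pos_adjust = classify_by_part_of_speech(tagged)
print(pos_search, pos_adjust)
```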
  • the first acquisition device further includes an input sequence acquisition device (not shown), a removal device (not shown), and a second sub-acquisition device (not shown).
  • The input sequence obtaining device acquires an input sequence from a user; then, the removing device removes invalid information from the input sequence to obtain available information; and then, the second sub-acquisition device obtains the search information and the adjustment information from the available information.
  • the invalid information includes but is not limited to: 1) auxiliary words; 2) spaces; 3) punctuation marks; 4) information units and the like included in the predetermined invalid dictionary.
  • For example, the removal device removes the invalid information in the input sequence, for example, removes the auxiliary word "of", to obtain the available information "Li Hongji starring TV drama"; the second sub-acquisition device then acquires the search information and the adjustment information from the available information.
  • The manner of obtaining the search information and the adjustment information from the available information is the same as or similar to the manner of obtaining the search information and the adjustment information from the input sequence in the foregoing step S3; it is incorporated herein by reference and is not described again.
  • Any manner of obtaining first type determination information for determining the search information and the adjustment information, and of obtaining the search information and the adjustment information from the user's input sequence according to the first type determination information, is included in the scope of the present invention.
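  • The removal of invalid information (auxiliary words, spaces, punctuation marks, and units listed in a predetermined invalid dictionary) can be sketched as a filter over already segmented units. The name `remove_invalid`, the content of `INVALID_DICTIONARY`, and the use of ASCII punctuation only are assumptions for illustration.

```python
import string

# Hypothetical predetermined invalid dictionary; here it holds one auxiliary word.
INVALID_DICTIONARY = {"of"}


def remove_invalid(units):
    """Drop spaces, punctuation-only units, and units in the invalid dictionary."""
    available = []
    for unit in units:
        stripped = unit.strip()
        if not stripped:
            continue  # spaces
        if all(ch in string.punctuation for ch in stripped):
            continue  # punctuation marks
        if stripped in INVALID_DICTIONARY:
            continue  # auxiliary words and other listed units
        available.append(stripped)
    return available


available = remove_invalid(["Li Hongji", "starring", "of", " ", ",", "TV drama"])
print(available)
```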
  • FIG. 6 is a block diagram showing an apparatus for determining the ranking of resource candidates in accordance with a preferred embodiment of the present invention.
  • The sorting determining device includes: a first obtaining device 31, a searching device 32, a sorting device 33, a second obtaining device 35, and a providing device 34; the sorting device 33 further includes a first sub-sorting device 331.
  • the first obtaining means 31 acquires the retrieval information and the adjustment information from the input sequence received by the computer device.
  • the retrieval information includes one or more retrieval units
  • the adjustment information includes one or more adjustment units.
  • For example, the retrieval information acquired by the first acquisition device 31 from the input sequence includes the retrieval unit "Li Hongji" and the retrieval unit "TV drama", and the adjustment information includes the adjustment unit "starring" and the adjustment unit "participating".
  • The manner in which the first obtaining device 31 obtains the search information and the adjustment information from the input sequence has been described in detail in the embodiment shown in FIG. 5, and is hereby incorporated by reference.
  • the retrieval means 32 performs a search based on the retrieval information to obtain a plurality of resource candidates.
  • the retrieval device 32 searches based on the retrieval information "Li Hongji" and "TV drama", and the obtained resource candidates include: resource candidate C, resource candidate D, resource candidate E, and the like.
  • the second obtaining means 35 acquires first sorting auxiliary information for assisting in determining the sorting result.
  • the first sorting auxiliary information includes but is not limited to at least one of the following:
  • For example, the second obtaining device 35 acquires the weight information of the adjustment unit "starring", for example, 5, and the weight information of the adjustment unit "participating", for example, 1. It should be understood by those skilled in the art that representing the weight information by numerical values is only an example and is not intended to limit the present invention. In fact, the weight information may also be expressed in other manners, for example, by level.
  • the adjustment unit distribution information includes but is not limited to at least one of the following:
  • For example, the resource candidates obtained by the retrieval device 32 include resource candidate C, resource candidate D, and resource candidate E, and the second obtaining device 35 counts the number of occurrences of each adjustment unit in each resource candidate as follows: resource candidate C: the adjustment unit "starring" appears 2 times, and the adjustment unit "participating" appears 0 times; resource candidate D: the adjustment unit "starring" appears 0 times, and the adjustment unit "participating" appears 1 time; resource candidate E: the adjustment unit "starring" appears 0 times, and the adjustment unit "participating" appears 0 times.
  • The appearance position includes, but is not limited to: a title, a summary, a body, descriptive content of a multimedia resource such as UGC, etc. The appearance position may be identified by a label of the information corresponding to the resource candidate or by text information included in that information, for example, <title>, "summary", etc.
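  • Counting how often each adjustment unit occurs in a resource candidate can be sketched with plain substring counting. The function name `count_adjustment_units` and the sample text are hypothetical, and the unit names are illustrative English renderings; a real system would more likely count over segmented words than over raw substrings.

```python
def count_adjustment_units(text, adjustment_units):
    """Return the number of non-overlapping occurrences of each adjustment unit."""
    return {unit: text.count(unit) for unit in adjustment_units}


# Made-up description text for one resource candidate.
text_c = "Li Hongji starring in a TV drama; also starring in its sequel"
counts_c = count_adjustment_units(text_c, ["starring", "participating"])
print(counts_c)
```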
  • the resource candidate obtained by the retrieval device 32 includes the resource candidate F and the resource candidate G, and the second obtaining device 35 obtains the appearance position of each adjustment unit in each resource candidate according to the label of the information corresponding to the resource candidate:
  • The title of the resource candidate F contains the adjustment unit "starring"; the summary of the resource candidate G contains the adjustment unit "participating".
  • For example, the resource candidates obtained by the retrieval device 32 include resource candidate H and resource candidate I, and the second obtaining device 35 obtains that the resource candidate H includes two adjustment units, "starring" and "participating", and that the resource candidate I includes one adjustment unit, "participating"; the second obtaining device 35 then determines the number of different adjustment units in each resource candidate as:
  • the quality information includes but is not limited to at least one of the following:
  • the manner in which the second obtaining device 35 obtains the authority of the resource candidate includes, but is not limited to, at least one of the following:
  • For example, the resource candidates obtained by the retrieval device 32 include the resource candidate J corresponding to the website J and the resource candidate K corresponding to the website K. The second obtaining device 35 obtains that the website J is a predetermined authoritative website and that the website K is a predetermined ordinary website, and therefore determines that the authority of the resource candidate J is at the "authoritative" level and that the authority of the resource candidate K is at the "normal" level.
  • the manner in which the second obtaining device 35 obtains the high-quality of each resource candidate includes, but is not limited to, at least one of the following:
  • the quality is obtained by analyzing the content information included in the website corresponding to the resource candidate.
  • The factors considered in the analysis of the content information include at least one of the following: i) whether advertisement information is included; ii) the quality of the resources provided by the website, for example, picture definition, video definition, song sound quality, etc.; iii) the amount of resources provided by the website.
  • For example, the resource candidate L corresponds to the website L, and the resource candidate M corresponds to the website M. The second obtaining device 35 acquires the content information included in the website L corresponding to the resource candidate L and finds by analysis that the content information includes no advertisement information and that the average pixel count of the images provided by the website is higher than the first predetermined threshold; the second obtaining device 35 therefore determines that the quality of the website L is excellent and that the quality level of the resource candidate L is "excellent".
  • Likewise, the second obtaining device 35 acquires the content information included in the website M corresponding to the resource candidate M and finds by analysis that the content information includes advertisement information and that the number of music resources provided by the website is higher than the second predetermined threshold; it therefore determines that the quality of the website M is excellent and that the quality level of the resource candidate M is "excellent". It should be understood by those skilled in the art that the above manner of expressing the quality is only an example and does not limit the present invention. In fact, the quality can also be expressed in other ways, for example, by a numerical value.
  • the first sub-sorting device 331 determines the sorting result of the plurality of resource candidates according to all the adjusting units and in combination with the first sorting auxiliary information.
  • The manner in which the first sub-sorting device 331 determines the sorting result includes, but is not limited to: 1) determining an initial sorting result of the plurality of resource candidates according to one item of the first sorting auxiliary information, and then adjusting the initial sorting result according to at least one other item of the first sorting auxiliary information to obtain the sorting result;
  • the first sub-sorting device 331 determines the sorting result of each resource candidate according to all the adjusting units and the weight information of the adjusting unit in the first sorting auxiliary information.
  • For example, the second obtaining device 35 obtains that the weight information of "starring" is 5 and that the weight information of "participating" is 1. The resource candidate C includes the adjustment unit "starring", and the resource candidate D includes the adjustment unit "participating"; according to the weight information of each adjustment unit, the first sub-sorting device 331 determines the ranking result of the resource candidate C and the resource candidate D as:
  • the first sub-sorting device 331 determines each resource candidate according to all the adjusting units, combined with the weight information of the adjusting unit in the first sorting auxiliary information and the adjusting unit distribution information of each resource candidate. Sort results.
  • the second obtaining means 35 obtains the weight information of "starring" as 5 and the weight information of "participating" as 1; the first sub-sorting means 331 determines that the title of the resource candidate C includes the adjustment unit "starring", the summary of the resource candidate D includes the adjustment unit "starring", and the resource candidate E includes the adjustment unit "participating". According to the weight information, it sorts the resource candidate C and the resource candidate D, which contain the adjustment unit "starring", before the resource candidate E, which contains the adjustment unit "participating"; according to the appearance position information of the adjustment unit "starring", it sorts the resource candidate C, in which "starring" appears in the title, before the resource candidate D, in which "starring" appears in the summary. The following sorting result is obtained: resource candidate C;
  • 1) the first sub-sorting device 331 sorts based on at least the number of occurrences of the adjustment units; 2) the first sub-sorting device 331 sorts based on at least the number of adjustment units; 3) the first sub-sorting device 331 sorts according to the authority or quality of the website corresponding to each resource candidate, from high to low; 4) the first sub-sorting device 331 first sorts the resource candidates by the number of occurrences of the adjustment units from high to low, and then further sorts the resource candidates having the same number of occurrences by quality from high to low; 5) when each item in the first sorting auxiliary information is represented by a value, the first sub-sorting device 331 obtains an evaluation value for each resource candidate based on the value of each item in the first sorting auxiliary information, and sorts the resource candidates according to the evaluation value.
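The weight-and-position example above (candidates C, D, and E) can be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the names `Candidate`, `ADJUST_WEIGHTS`, `POSITION_RANK`, and `rank` are hypothetical, the position ordering (title before summary before body) is an assumption, and the position of "participating" in candidate E is assumed because the text does not state it.

```python
from dataclasses import dataclass

# Weights from the example in the text: "starring" = 5, "participating" = 1.
ADJUST_WEIGHTS = {"starring": 5, "participating": 1}
# Assumed position priority: a lower rank means a more prominent position.
POSITION_RANK = {"title": 0, "summary": 1, "body": 2}


@dataclass
class Candidate:
    name: str
    # Maps each adjustment unit found in the candidate to where it appears.
    adjust_positions: dict


def sort_key(c: Candidate):
    # Higher total weight sorts first, so the weight is negated.
    weight = sum(ADJUST_WEIGHTS.get(u, 0) for u in c.adjust_positions)
    # Among equal weights, the most prominent appearance position wins.
    worst = len(POSITION_RANK)
    best_pos = min(
        (POSITION_RANK.get(p, worst) for p in c.adjust_positions.values()),
        default=worst,
    )
    return (-weight, best_pos)


def rank(candidates):
    return [c.name for c in sorted(candidates, key=sort_key)]


candidates = [
    Candidate("E", {"participating": "body"}),  # position assumed
    Candidate("D", {"starring": "summary"}),
    Candidate("C", {"starring": "title"}),
]
print(rank(candidates))
```

With these assumptions the order agrees with the text: C and D precede E because of the weights, and C precedes D because of the title position.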
  • the providing device 34 generates presentation information according to the sorting result and provides it to the user 1 through the user equipment 2.
  • the providing device 34 determines, according to the sorting result of the resource candidate C, the resource candidate D, and the resource candidate E, the order of the presentation information C corresponding to the resource candidate C, the presentation information D corresponding to the resource candidate D, and the presentation information E corresponding to the resource candidate E as follows, and provides the sorted presentation information to the user 1 through the user equipment 2:
  • the providing device 34 may, according to actual conditions, select only part of the resource candidates to generate the presentation information provided to the user 1, for example, when the number of items of presentation information requested by the user equipment 2 is less than the number of resource candidates.
  • Fig. 7 is a block diagram showing an apparatus for determining the sorting result of resource candidates in another preferred embodiment of the present invention.
  • the sort determining device includes: a first obtaining device 31, a retrieving device 32, a sorting device 33, and a providing device 34.
  • the sorting device 33 also includes a second sub-sorting device 332.
  • the first obtaining means 31 acquires the retrieval information and the adjustment information from the input sequence received by the computer device.
  • the first acquisition means 31 obtains the retrieval information "Li Hongji” and “TV drama” and the adjustment information "starring” from the input sequence "TV drama starring Li Hongji” received by the computer device.
  • the manner in which the first obtaining device 31 obtains the retrieval information and the adjustment information from the input sequence has been described in detail in the embodiment shown in FIG. 5; it is incorporated herein by reference and is not repeated here.
  • the retrieval means 32 performs a search based on the retrieval information to obtain a plurality of resource candidates.
  • the retrieval device 32 searches according to the retrieval unit "Li Hongji" and the retrieval unit "TV drama", and the obtained resource candidates include: resource candidate A1, resource candidate B1, and resource candidate C1.
  • the second sub-sorting means 332 determines the ranking result of the plurality of resource candidates based on the adjustment information and the retrieval information.
  • the retrieval device 32 obtains the resource candidate A1, the resource candidate B1, and the resource candidate C1, and the second sub-sorting device 332 determines that the resource candidate A1 includes the retrieval information "Li Hongji" and "TV drama" and the adjustment information "starring", the resource candidate B1 includes the retrieval information "Li Hongji" and the adjustment information "starring", and the resource candidate C1 includes the retrieval information "Li Hongji".
  • the second sub-sorting means 332 determines the resource candidate A1, which includes both the retrieval information and the adjustment information.
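One way to read the comparison above is that candidates matching more of the retrieval and adjustment units rank higher. The sketch below assumes exactly that; the text only singles out A1 and does not state the full order, and the candidate texts and the names `match_count` and `rank_by_matches` are hypothetical.

```python
def match_count(text, units):
    """Count how many of the given units occur in the candidate text."""
    return sum(1 for u in units if u in text)


def rank_by_matches(candidates, retrieval_units, adjustment_units):
    """Order candidate names by the number of matched units, most first.

    Assumption: every matched unit counts equally. sorted() is stable, so
    candidates with equal counts keep their original retrieval order.
    """
    units = list(retrieval_units) + list(adjustment_units)
    return sorted(candidates, key=lambda name: -match_count(candidates[name], units))


# Hypothetical candidate texts consistent with the example.
candidates = {
    "C1": "Li Hongji biography",
    "B1": "Li Hongji starring in a film",
    "A1": "TV drama starring Li Hongji",
}
print(rank_by_matches(candidates, ["Li Hongji", "TV drama"], ["starring"]))
```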
  • the providing means 34 generates presentation information for providing to the user 1 based on the sorting result.
  • the manner in which the providing device 34 generates the presentation information according to the sorting result and provides it to the user 1 has been described in detail for the providing device 34 of the embodiment shown in FIG. 5; it is incorporated herein by reference and is not repeated here.
  • FIG. 8 is a flow chart showing a method for determining a ranking result of a resource candidate to sort a search result according to a preferred embodiment of the present invention.
  • the sort determining device includes: a first obtaining device 31, a retrieval device 32, a sorting device 33, a third obtaining device 36, and a providing device 34.
  • the sorting device 33 further includes a second sub-sorting device 332, and the second sub-sorting device 332 further includes a third sub-sorting device 333.
  • the first obtaining means 31 acquires the retrieval information and the adjustment information from the input sequence received by the computer device.
  • the adjustment information includes one or more adjustment units; the retrieval information includes one or more retrieval units.
  • the first obtaining means 31 obtains, from the input sequence "Li Hongji starring and participating in the TV drama" received by the computer device, the retrieval information including the retrieval unit "Li Hongji" and the retrieval unit "TV drama", and the adjustment information including the adjustment unit "starring" and the adjustment unit "participating".
  • the manner in which the first obtaining means 31 obtains the retrieval information and the adjustment information from the input sequence received by the computer device 3 is the same as or similar to the manner described in the embodiment above; it is incorporated herein by reference and is not described again.
  • the retrieval means 32 performs a search based on the retrieval information to obtain a plurality of resource candidates.
  • the retrieval device 32 searches according to the retrieval unit "Li Hongji" and the retrieval unit "TV drama", and the obtained resource candidates include: resource candidate C1, resource candidate D1, resource candidate E1, and the like.
  • the third obtaining means 36 acquires second sorting auxiliary information for assisting in determining the sorting result.
  • the second sorting auxiliary information includes but is not limited to at least one of the following:
  • the third obtaining means 36 obtains the weight information of the retrieval unit "Li Hongji", for example, 5, and obtains the weight of the retrieval unit "TV drama", for example, 1. It should be understood by those skilled in the art that the above-mentioned numerical values are only used for indicating weight information, and are not intended to limit the present invention. In fact, the weight information may also be expressed in other manners, for example, by level.
  • the retrieval unit distribution information includes but is not limited to at least one of the following:
  • the resource candidates obtained by the retrieval device 32 include the resource candidate C1, the resource candidate D1, and the resource candidate E1, and the third obtaining device 36 counts the number of occurrences of the retrieval units in each resource candidate as:
  • resource candidate C1: the retrieval unit "Li Hongji" appears twice, and the retrieval unit "TV drama" appears twice;
  • resource candidate E1: the retrieval unit "Li Hongji" appears once, and the retrieval unit "TV drama" appears once.
  • the appearance position includes, but is not limited to: a title, a summary, a body, descriptive content of multimedia resources such as UGC, etc.; the position may be identified by the tag or text information of the information corresponding to the resource candidate, for example, <title>, "summary", etc.
  • the resource candidates obtained by the retrieval device 32 include the resource candidate F1 and the resource candidate G1.
  • the third obtaining means 36 obtains the appearance positions of the respective retrieval units in the respective resource candidates according to the tags of the information corresponding to the resource candidates:
  • the title of the resource candidate F1 contains the retrieval unit "Li Hongji" and the retrieval unit "TV drama"; the resource candidate G1 contains the retrieval unit "Li Hongji", and its summary contains the retrieval unit "TV drama".
  • c) the number of different retrieval units contained in each resource candidate.
  • the resource candidates obtained by the retrieval device 32 include the resource candidate H1 and the resource candidate I1.
  • the third obtaining device 36 determines that the resource candidate H1 includes the retrieval units "Li Hongji" and "TV drama", and the resource candidate I1 includes the retrieval unit "Li Hongji"; the third obtaining means 36 then determines the number of different retrieval units in each resource candidate as:
  • the predetermined quality information of each of the plurality of resource candidates has been described in detail in the description of the second obtaining means 35 of the embodiment shown in Fig. 6; it is incorporated herein by reference and is not repeated here.
  • the third sub-sorting means 333 determines the sorting result of the plurality of resource candidates based on all the adjusting units and all the retrieving units in combination with the second sorting auxiliary information.
  • the third sub-sorting device 333 determines the sorting result of the plurality of resource candidates according to at least one of the second sorting auxiliary information.
  • the manner in which the third sub-sorting device 333 determines the sorting result includes but is not limited to:
  • the third obtaining means 36 obtains that the retrieval unit "Li Hongji" appears twice in the resource candidate C1, once in the resource candidate D1, and once in the resource candidate E1, and that the retrieval unit "TV drama" appears twice in the resource candidate C1, once in the resource candidate D1, and once in the resource candidate E1. The third sub-sorting means 333 then determines that the two retrieval units appear four times in total in the resource candidate C1, twice in the resource candidate D1, and twice in the resource candidate E1. The resource candidate D1 contains two adjustment units and the resource candidate E1 contains one adjustment unit. The third sub-sorting device 333 first sorts by the number of occurrences of the retrieval units to obtain an initial sorting result, and then adjusts the initial sorting result according to the number of adjustment units, obtaining the sorting result of the resource candidate C1, the resource candidate D1, and the resource candidate E1 as follows:
  • 1) the third sub-sorting device 333 sorts based on the appearance positions of the retrieval units in the resource candidates, for example, sorting the resource candidates in which a retrieval unit appears in the title before the resource candidates in which the retrieval unit appears in the summary;
  • 2) the third sub-sorting device 333 sorts based on the number of different retrieval units in each resource candidate, for example, sorting the resource candidates that contain more different retrieval units before those that contain fewer;
  • 3) the third sub-sorting device 333 sorts based on the quality information of each resource candidate, for example, sorting the resource candidates corresponding to authoritative or high-quality websites first;
  • 4) the third sub-sorting means 333 sorts based on the weight information of the retrieval units and the adjustment units at the same time; for example, the weight information includes weight values, and the third sub-sorting means 333 uses the retrieval unit weight values and the adjustment unit weight values respectively included in each resource candidate;
  • 5) the third sub-sorting means 333 first sorts the resource candidates based on the distribution information of the retrieval units, and then further sorts them based on the distribution information of the adjustment units, the quality information of the resource candidates, etc.
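The two-stage ordering in the C1/D1/E1 example (retrieval unit occurrences first, then adjustment unit count) can be sketched as below. The sketch assumes the adjustment unit count is used only to order candidates that tie on occurrences, which is one possible reading of "adjusts the initial sorting result". The adjustment unit count of C1 is not given in the text and is set to 0 as a placeholder; the name `rank_candidates` is hypothetical.

```python
def rank_candidates(stats):
    """stats maps a candidate name to a pair:
    (total retrieval unit occurrences, number of adjustment units).

    The tuple key sorts by occurrences first (descending) and uses the
    adjustment unit count (descending) only to separate ties.
    """
    return sorted(stats, key=lambda name: (-stats[name][0], -stats[name][1]))


stats = {
    "E1": (2, 1),  # two occurrences in total, one adjustment unit
    "D1": (2, 2),  # two occurrences in total, two adjustment units
    "C1": (4, 0),  # four occurrences; adjustment count assumed, not from the text
}
print(rank_candidates(stats))
```

Under these assumptions C1 leads on occurrences, and D1 precedes E1 because it contains more adjustment units.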
  • the providing device 34 generates the presentation information according to the sorting result and provides it to the user 1 through the user equipment 2.
  • the manner in which the providing device 34 generates the presentation information according to the sorting result and provides it to the user 1 through the user equipment 2 has been described in detail for the providing device 34 of the embodiment shown in FIG. 5; it is incorporated herein by reference and is not described again.
  • the ranking determining means further includes a fifth obtaining means (not shown) and an updating means (not shown).
  • the fifth obtaining device acquires a keyword unit and a type thereof; and then, the updating device establishes or updates the predetermined keyword type library according to the keyword unit and its type.
  • the fifth acquisition device further includes a keyword acquisition device (not shown), a sixth acquisition device (not shown), and a type determination device (not shown).
  • the keyword obtaining means acquires the keyword unit; then, the sixth obtaining means acquires second type determination information for determining the type of the keyword unit; and then, the type determining means determines the type of the keyword unit according to the second type determination information.
  • the manner in which the keyword acquiring device acquires the keyword unit includes but is not limited to:
  • the keyword acquisition means obtains a keyword unit from an input sequence input by a plurality of users.
  • the keyword acquisition means obtains the common part "TV drama" of the input sequence "TV drama A" input by the user A and the input sequence "TV drama B" input by the user B, both received by the computer device 3, and takes the common part "TV drama" as a keyword unit.
  • the keyword acquisition device segments the input sequence "actor Li Hongji", obtains "actor" and "Li Hongji", and uses "actor" and "Li Hongji" as keyword units.
  • the keyword acquisition means obtains "starring" and "participating" from the input method vocabulary, and takes "starring" and "participating" as keyword units.
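Taking the common part of two input sequences as a keyword unit can be sketched with a longest-common-substring search. The text does not say how the common part is found, so using `difflib.SequenceMatcher` from the Python standard library is an assumption, and the function name `common_keyword` is hypothetical.

```python
from difflib import SequenceMatcher


def common_keyword(seq_a, seq_b):
    """Return the longest contiguous part shared by two input sequences."""
    # autojunk=False disables the popularity heuristic, which is not wanted
    # for short query strings.
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    match = matcher.find_longest_match(0, len(seq_a), 0, len(seq_b))
    # Strip the whitespace left over at the boundary of the shared part.
    return seq_a[match.a:match.a + match.size].strip()


print(common_keyword("TV drama A", "TV drama B"))
```

For sequences from many users, the same function could be applied pairwise and the most frequent result kept; that extension is not described in the text.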
  • the sixth obtaining means acquires the second type determination information for determining the type of the keyword unit.
  • the second type determination information includes, but is not limited to, at least one of the following:
  • the predetermined corpus includes a plurality of corpora.
  • the distribution concentration indicates how concentrated the distribution of the keyword unit is across the plurality of corpora of the predetermined corpus; it is obtained from the presence information of the keyword unit in the predetermined corpus and the quantity information of the different corpora that include the keyword unit.
  • the presence information is at least one of the following:
  • the quantity information includes at least one of the following:
  • the semantic analysis result obtained from the keyword unit includes, but is not limited to, the part of speech of the keyword unit, such as a noun, a verb, an adjective, and the like.
  • the sixth obtaining means performs a part-of-speech analysis on the keyword unit "Li Hongji" to obtain a semantic analysis result as a noun.
  • the user history input sequences matching the same corpus are user history input sequences whose search results contain the same corpus; for example, the search results of the three user history input sequences "iphone4 sign sale", "iphone4 sale", and "iphone4 sale" all contain the same corpus "Iphone4 sales have broken through".
  • the sixth obtaining device determines that the number of user history input sequences that include the keyword unit "iphone4" and match the same corpus is 5.
  • the type determining means determines the type of the keyword unit based on the second type determination information.
  • the type includes: a retrieval type, an adjustment type, and the like; preferably, the type further includes an invalid type, which needs to be removed from the input sequence.
  • the sixth obtaining means obtains that the distribution concentration of the keyword unit "Li Hongji" in the predetermined corpus is 6.5; the type determining means determines that the distribution concentration 6.5 exceeds the predetermined distribution threshold 4, and determines the type of the keyword unit "Li Hongji" as the retrieval type.
  • the sixth obtaining means obtains the semantic analysis result of the keyword unit "television drama" as a noun, and the type determining means determines the type of the keyword unit "television drama” as the retrieval type based on the semantic analysis result.
  • the sixth obtaining means obtains the semantic analysis result of the keyword unit "prospect” as a verb, and the type determining means determines the keyword unit "prospect” as the adjustment type based on the semantic analysis result.
  • the sixth obtaining means obtains the semantic analysis result of a keyword unit as an auxiliary word, and the type determining means determines that keyword unit as the invalid type based on the semantic analysis result.
  • the sixth obtaining means obtains the keyword unit "Li Hongji" and the number of user history input sequences matching the same corpus is 1000, which is higher than a predetermined judgment threshold, and the type determining means determines that the type of the keyword unit "Li Hongji" is Search type.
  • the sixth obtaining means obtains that the semantic analysis result of the keyword unit "Li Hongji" is a noun and that the number of user history input sequences matching the same corpus is 1000; according to a predetermined rule that a noun is determined as the retrieval type when the number of user history input sequences matching the same corpus exceeds 900, the type determining device determines the type of the keyword unit "Li Hongji" as the retrieval type.
  • the sixth obtaining means obtains that the distribution concentration of the keyword unit "Li Hongji" is 6.5 and that the number of user history input sequences matching the same corpus is 1000; the sort determining device first normalizes the distribution concentration and the number of user history input sequences matching the same corpus, and then adds them. The resulting comprehensive evaluation value of the keyword unit "Li Hongji" is 1.2, which is higher than the comprehensive predetermined threshold, so the type of the keyword unit "Li Hongji" is determined as the retrieval type.
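The type determination rules above can be sketched as small functions. The part-of-speech mapping and the distribution threshold 4 follow the examples in the text. The normalization scheme is not specified there, so dividing by assumed maxima (`max_concentration`, `max_history`) and the combined threshold 1.0 are placeholders; with these placeholders the combined score for "Li Hongji" is 1.15 rather than the 1.2 quoted in the text. All function names are hypothetical.

```python
from typing import Optional

# Mapping taken from the examples: nouns are retrieval units, verbs are
# adjustment units, auxiliary words are invalid and can be dropped.
POS_TO_TYPE = {"noun": "retrieval", "verb": "adjustment", "auxiliary": "invalid"}


def type_from_pos(pos: str) -> Optional[str]:
    """Determine the type from the part of speech, if a rule exists."""
    return POS_TO_TYPE.get(pos)


def type_from_concentration(concentration: float, threshold: float = 4.0) -> Optional[str]:
    """Retrieval type when the distribution concentration exceeds the threshold."""
    return "retrieval" if concentration > threshold else None


def combined_score(concentration: float, history_count: int,
                   max_concentration: float = 10.0, max_history: int = 2000) -> float:
    """Normalize both signals to [0, 1] by assumed maxima, then add them."""
    return concentration / max_concentration + history_count / max_history


def type_from_combined(concentration: float, history_count: int,
                       threshold: float = 1.0) -> Optional[str]:
    """Retrieval type when the combined score exceeds an assumed threshold."""
    score = combined_score(concentration, history_count)
    return "retrieval" if score > threshold else None


print(type_from_pos("noun"), type_from_concentration(6.5), type_from_combined(6.5, 1000))
```

Each function returns `None` when its rule does not fire, since the text does not say what type is assigned in that case.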
  • the mouse and the keyboard are used as input devices for human-computer interaction, and the display of the user device is used as an output device for human-computer interaction; however, the present invention does not exclude the use of other input devices and output devices, for example, the user inputs through a tablet, the user device uses a speaker as an output device, and the like.
  • the present invention proposes a method, user equipment, network server and system for assisting text entry by using a network server.
  • the first embodiment of the method, device, server and system for assisting the user in inputting text: as shown in FIG. 4, the user device 140 of the first embodiment of the present invention stores a local corpus 1403; the local corpus 1403 stores a basic vocabulary set, a basic language model, and a vocabulary set generated by the user while using the input method.
  • the local corpus 1403 can also store some auxiliary information, for example: various input method settings chosen by the user, including but not limited to fuzzy sounds, simplified and traditional characters, double spelling, full spelling, simple spelling, etc.; and attribute information of the user, including but not limited to occupation, hobbies, areas of expertise, resume, age, etc. This auxiliary information helps to optimize the ordering of candidate terms.
  • User device 140 also has a keyboard 1401 for inputting pinyin letters or stroke sequences of text by the user.
  • the matching device 1402 in the user device 140 finds a matching local term option in the local corpus 1403 based on the input pinyin letter or stroke sequence and displays it through the display device 1406 for user selection.
  • the keyboard can be a pure numeric keyboard or a full alphanumeric keyboard (QWERTY keyboard), or it can be a physical keyboard or a virtual keyboard.
  • the network communication device 1404 and the aggregation device 1405 are added to the user device 140 of the present invention.
  • the network communication device 1404 communicates with the web server 150 via the Internet or a local area network, and transmits the pinyin letters and stroke sequences input through the keyboard 1401 to the web server 150.
  • Web server 150 utilizes a large network corpus and powerful processing power to find suitable entry options.
  • the network term option obtained by the web server 150 is returned to the network communication device 1404, and the received network term option is transmitted by the network communication device 1404 to the summary device 1405.
  • the summary device 1405 receives the local entry option from the matching device 1402 and the network entry option from the network communication device 1404, which is summarized and displayed on the display device 1406 for user selection.
  • the web server 150 may be a plurality of web servers 1501 to 150n distributed over the Internet. These web servers 1501 to 150n work together to form a server cloud that provides services to a large number of users. The web server 150 can also be one or more servers located on a corporate local area network.
  • in step S1101, the key input sequence of the user on the keyboard 1401 of the user device is detected.
  • the key sequence can be the simple spelling or full spelling of one or more phrases or even a whole sentence.
  • for example, to input "I like to use Baidu search engine", the user can enter the initial of each word, "wxhybdssyq", the full spelling of each word, "woxihuanyongbaidusousuoyinqing", or a mixed input such as "woxhuanybaidssyinq".
  • with full spelling, the candidate words are more precise, which reduces the number of page-turning searches, but more characters must be entered.
  • with simple spelling, there are more duplicate codes, which leads to longer page-turning searches and lower efficiency. Therefore, it is usually more effective to combine the full spelling and the simple spelling.
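The relation between simple spelling, full spelling, and mixed input can be illustrated by checking whether a key sequence splits into one non-empty prefix per syllable. This is only a sketch of the idea; the text does not describe the actual matching algorithm, and the names `SYLLABLES`, `simple_spelling`, `full_spelling`, and `matches` are hypothetical. Simple spelling is taken here as the first letter of each syllable.

```python
# Syllables of the example sentence "I like to use Baidu search engine".
SYLLABLES = ["wo", "xi", "huan", "yong", "bai", "du", "sou", "suo", "yin", "qing"]


def simple_spelling(syllables):
    """First letter of every syllable."""
    return "".join(s[0] for s in syllables)


def full_spelling(syllables):
    """All letters of every syllable."""
    return "".join(syllables)


def matches(keys, syllables):
    """True if keys can be split into one non-empty prefix per syllable, in order."""
    if not syllables:
        return keys == ""
    head, rest = syllables[0], syllables[1:]
    # Try every prefix length of the current syllable and backtrack on failure.
    for n in range(1, len(head) + 1):
        if keys.startswith(head[:n]) and matches(keys[n:], rest):
            return True
    return False


print(simple_spelling(SYLLABLES))
print(full_spelling(SYLLABLES))
print(matches("woxhuanybaidssyinq", SYLLABLES))
```

The three inputs from the example all match the same syllable sequence, which is why a mixed input can narrow the candidates without requiring every letter.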
  • the local corpus 1403 is synchronously updated with new terms based on the user's choice.
  • in step S1102, after the user's key input sequence is obtained, the input sequence is matched and searched in the local corpus 1403 of the user equipment 140 to obtain one or more matching local entry options.
  • in step S1103, the key input sequence is sent to the web server 150.
  • steps S1102 and S1103 may be performed sequentially or simultaneously.
  • in order to quickly display the obtained entry options, after the local entry options are obtained in step S1102, the method may proceed immediately to step S1105, in which the obtained local entry options are summarized and displayed to the user for selection.
  • web server 150 receives the key input sequence from user device 140 and looks for a matching network term option in the network corpus.
  • the user equipment 140 receives the network entry option from the web server 150 and sends it to the aggregation device 1405.
  • in step S1105, the local entry options from the matching device 1402 and the network entry options from the web server 150 are summarized in the summary device 1405 and provided to the display device 1406 for display and selection by the user.
  • the term option after summarizing with the network entry option is more precise.
  • step S1105 is followed by a return to step S1101 to continue detecting the key input on the user equipment.
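The flow of steps S1101 to S1105 can be sketched as follows: the network lookup is started in the background, the local options are displayed as soon as they are available, and the merged list is displayed once the network options arrive. The corpora, the lookup functions, and the `display` callback are hypothetical stand-ins; `lookup_network` replaces the real request to the web server 150, and the simple append-style merge is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical corpora keyed by key sequence.
LOCAL_CORPUS = {"wxh": ["no signal", "miniaturization"]}
NETWORK_CORPUS = {"wxh": ["I like", "no signal"]}


def lookup_local(keys):
    """S1102: match the key sequence against the local corpus."""
    return list(LOCAL_CORPUS.get(keys, []))


def lookup_network(keys):
    """S1103/S1104: stand-in for sending the keys to the web server."""
    return list(NETWORK_CORPUS.get(keys, []))


def handle_key_sequence(keys, display):
    """Process one detected key sequence (S1101) and return the final options."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(lookup_network, keys)  # S1103, in the background
        local = lookup_local(keys)                   # S1102
        display(local)                               # S1105 with local options only
        network = pending.result()                   # S1104, wait for the server
    # Keep the local order and append network options that are not duplicates.
    merged = local + [term for term in network if term not in local]
    display(merged)                                  # S1105 with summarized options
    return merged


shown = []
result = handle_key_sequence("wxh", shown.append)
print(shown)
```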
  • steps S1102 and S1103 may be reversed, and the detected key input sequence is first sent to the network server.
  • the summary device 1405 will generally receive the local entry option before receiving the network entry option.
  • the local entry options can be immediately provided to the display device 1406 for user selection without having to be displayed together with the network entry options.
  • in order to quickly display the candidate terms, the summary device 1405, before receiving the network entry options, displays the obtained local entry options to the user in the entry bar according to their priority order, wherein the higher the priority of an entry option, the earlier it is displayed.
  • the matching device 1402 can determine the priority level according to the selection frequency of each term option in the user input history and the semantic relevance between each term in each term option.
  • the matching device 1402 can also determine the priority level according to the input preference selection set by the user.
  • the user may have selected a partial entry in the previously displayed local entry option, or has flipped through the partial local entry option. Some of the entries in the network term options received at this time may be the same as the previously obtained local term options. It is therefore necessary to remove these entries that have been selected and/or repeated from the network entry options. Next, the remaining network entry options are inserted into the current and subsequent displayed local entry options according to the priority of the entry without changing the order in which the local candidate entries are arranged.
  • the advantage of this processing is that when adding a network candidate entry, the position of the term on the term option bar currently browsed by the user does not change much.
  • the entry “I like” is inserted into the appropriate position of the subsequently displayed entry option according to its priority.
  • the network entry option "I like” that has been inserted in the local entry option is displayed.
  • the whole entry "I like to use Baidu search engine" may be returned directly, so that the user need not turn pages and select words one by one; even if the network feedback is slightly delayed, input is still greatly sped up.
  • the network entry options supplement the local entry options. Only the network entry options that are not among the local entry options need to be selected and inserted in a certain order, possibly into the currently displayed entries. The order of the local entry options is not changed; only the network entry options missing from the local entry options are added.
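The insertion described for the first example can be sketched as below: network options that duplicate local options or were already selected are dropped, and the rest are inserted by priority into the part of the list from the current page onward, leaving the relative order of the local options unchanged. The priorities, the `first_visible` index, and the function name `insert_network_options` are hypothetical.

```python
def insert_network_options(local, network, first_visible=0, selected=()):
    """Merge network options into local options without reordering local ones.

    local, network: lists of (term, priority), highest priority first.
    first_visible: index of the first local option on the page currently
    shown; options before it were already viewed and are left untouched.
    selected: terms the user has already chosen.
    """
    known = {term for term, _ in local} | set(selected)
    merged = list(local)
    for term, priority in network:
        if term in known:
            continue  # skip repeated or already selected entries
        known.add(term)
        # Walk forward from the current page to the first lower-priority slot.
        pos = first_visible
        while pos < len(merged) and merged[pos][1] >= priority:
            pos += 1
        merged.insert(pos, (term, priority))
    return [term for term, _ in merged]


local = [("no signal", 0.9), ("miniaturization", 0.8),
         ("infinitely good", 0.5), ("joke", 0.4)]
network = [("I like", 0.95), ("no signal", 0.9), ("I like to use", 0.45)]
print(insert_network_options(local, network, first_visible=2))
```

Because insertion starts at `first_visible`, the entries on pages the user has already passed do not move, which keeps the currently browsed positions stable.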
  • the second example of the first embodiment of the present invention is similar to the first example.
  • after obtaining the local entry options and before receiving the network entry options, the summary device 1405 likewise first displays the obtained local entry options to the user in the entry bar according to their priority order.
  • the difference from the first example is that, after the network entry options are received, the network entry options remaining after the already displayed terms are removed are rearranged together with the currently displayed and not yet displayed local entry options according to the priority of these terms, and are displayed for the user to select.
  • the already displayed terms referred to here are the entry options that were viewed by page turning before the currently displayed terms; the currently displayed terms themselves are not included.
  • for example, the user is browsing the first few terms of the local entry options, which are irrelevant terms such as "no signal, miniaturization, infinitely good, joke". Since none of the displayed terms has been selected by the user, the previously displayed terms are first removed from the received network entry options, and the remaining network entry options are rearranged together with the currently displayed and not yet displayed local entry options according to the priority of these terms. Because the correct entry "I like" has a higher priority among the network entry options, it is moved into the currently displayed entry options after the rearrangement.
  • An advantage of this example is the ability to quickly move the correct term to the front or top of the entry options. Since the same term may have different priorities in the local entry options and the network entry options, the new order may be determined according to the weighted average of the term's two priorities when rearranging.
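The rearrangement of the second example, including the weighted average for terms that appear in both sources, can be sketched as follows. The priority values, the weight `local_weight`, the alphabetical tie-break, and the function name `rearrange` are hypothetical; the text only says that a weighted average of the two priorities may be used.

```python
def rearrange(local_remaining, network, viewed=(), local_weight=0.5):
    """Re-sort the current and not yet displayed local options together with
    the network options that were not already viewed.

    local_remaining, network: dicts mapping a term to its priority.
    viewed: terms shown on earlier pages; they are removed from the network
    options before rearranging.
    """
    viewed = set(viewed)
    candidates = {t: p for t, p in network.items() if t not in viewed}
    scores = {}
    for term in set(local_remaining) | set(candidates):
        if term in local_remaining and term in candidates:
            # The term has two priorities: combine them by weighted average.
            scores[term] = (local_weight * local_remaining[term]
                            + (1 - local_weight) * candidates[term])
        elif term in local_remaining:
            scores[term] = local_remaining[term]
        else:
            scores[term] = candidates[term]
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))


local_remaining = {"infinitely good": 0.5, "joke": 0.4}
network = {"I like": 0.95, "no signal": 0.9, "joke": 0.2}
print(rearrange(local_remaining, network, viewed={"no signal", "miniaturization"}))
```

Unlike the insertion approach of the first example, this changes the order of the local options, which is the trade-off the text describes.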
  • the network entry options received later are rearranged according to certain rules with the currently displayed and undisplayed local entry options. This will also change the order in which local entry options are displayed.
  • the third example of the first embodiment of the present invention is similar to the second example. The only difference is that the rearrangement is performed according to the priority of the network entry option, and details are not described herein again.
  • the local entry option is no longer displayed after the received network entry option, but the network entry option is not displayed previously.
  • the fourth example of the first embodiment of the present invention is different from the first to third examples, and after obtaining the local entry option, the obtained local entry option is not immediately displayed to the user according to its priority for its Selecting, but waiting to receive the network entry option, after being arranged according to the priority of these terms, is displayed to the user for selection.
  • the term option displayed in this embodiment is relatively accurate, especially for the case where the entire sentence is continuously input and then the term selection is more advantageous, because the continuous input time of the whole sentence is relatively long, and the delay of the network response does not cause too much Impact, and the matching of the whole sentence requires the support of a larger corpus, language matching model, and processing power, so waiting to receive the network entry option is displayed to the user selection will provide more accurate results.
  • the matching device 1402 may be based on whether the term was previously selected, the time in which the term was previously selected, the number of times the term was previously selected, the user-preset input preference option, and/or the number of times the term was searched on the network. To determine the priority of local entry options.
  • The web corpus on the web server 150 can include a user network corpus 1501 corresponding to each user and a public network corpus 1504.
  • The user network corpus 1501 is a backup, kept on the web server 150, of the local corpus 1403 of each registered user.
  • The user device 140 also includes a local synchronization device (not shown) for synchronizing the local corpus 1403 with the user network corpus 1501 retained for the user on the web server 150 when the registered user logs into the web server 150.
  • The user device 140 also includes local update means (not shown) for updating the local corpus 1403 based on the user's selection of entries and for sending the selection to the web server 150 to update the user network corpus 1501. Since some words are input repeatedly within a context, the local corpus 1403 and the user network corpus 1501 need to be updated promptly to raise the priority of recently entered entries and speed up input.
  • The user network corpus 1501 also stores a basic vocabulary set, a basic language model, and a vocabulary set generated by the user while using the input method.
  • Some auxiliary information may also be stored, for example the user's settings for the input method, including but not limited to fuzzy sounds, simplified and traditional characters, double spelling, full spelling, spelling, etc.; and attribute information of the user, including but not limited to occupation, hobbies, areas of expertise, resume, age, etc.
  • Because the user network corpus is stored on the web server 150, whichever terminal device the user uses, as long as it can connect to the web server 150, the user can input quickly by synchronizing the local corpus 1403 after login or by using the user network corpus 1501 online.
  • The public network corpus 1504 is formed based on analysis and statistics of public documents, publications, the input of a large number of users, the search vocabulary of a large number of users on web search engines, the index keywords of a large number of web pages, and/or keyword advertisement information, and reflects the commonalities or hot topics of the user group.
  • The network server 150 of the first embodiment of the present invention includes a user network corpus 1501, a matching device 1502, a network communication device 1503, a public network corpus 1504, a corpus update device 1505, a keyword advertisement library 1506, and a synchronization device 1507.
  • The network communication device 1503 is connected to one or more user devices 140 via a network. It receives the user's key input sequence on the user device via the network and feeds the network entry options obtained based on the key input sequence back to the user device for the user to choose from.
  • The matching device 1502 is connected to the user network corpus 1501, the public network corpus 1504, and the network communication device 1503. It performs matching queries in the user network corpus 1501 and the public network corpus 1504 based on the key input sequence to obtain one or more matching network entry options.
  • The matching device 1502 further includes priority determining means (not shown) for determining the priority of each matching entry option depending on whether the entry was previously selected, the time at which it was previously selected, the number of times it was previously selected, the user-preset input preference options, and/or the number of times the entry has been searched on the network.
  • The synchronization device 1507 is connected to the user network corpus 1501 and the network communication device 1503. When the user logs in to the network server 150 through the user device 140 and a corpus synchronization instruction is received from the user device, it synchronizes the user's user network corpus 1501 with the local corpus 1403 in the user device.
  • The corpus update device 1505 is coupled to the user network corpus 1501 and the public network corpus 1504. It updates the user network corpus 1501 according to the user's input and entry selection, and updates the public network corpus 1504 by analyzing and compiling statistics on the input of a large number of users, the search vocabulary of a large number of users on web search engines, the index keywords of a large number of web pages, and/or keyword advertisement information.
  • The keyword advertisement library 1506 provides advertisement links related to keywords. A vendor can buy a number of keywords or letter combinations. For example, Baidu can buy keywords such as "Baidu" and "Search Engine", or letter combinations such as "bd", "baidu", and "ssyq". When the matching device 1502 matches the entry "Baidu" or "Search Engine" in the network corpus based on the key sequence from the user device 140, or receives a letter combination such as "bd", "baidu", or "ssyq", the corresponding keyword advertisement information "Baidu.com" and its link are found in the keyword advertisement library 1506. The keyword advertisement information is returned to the user device 140 via the network communication device 1503 and displayed among the entry options.
  • the user can select the advertisement information to jump to the corresponding website link.
  • the "selection” here includes mouse clicks, and also includes selecting the corresponding number selection button directly through the keyboard.
  • the priority determining means in the matching means 1502 may assign a higher priority to the keyword advertisement information to ensure that the advertisement information is arranged to be displayed in the first round of entry options displayed to the user or in the currently displayed options.
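  • The keyword advertisement lookup described above can be sketched as a dictionary keyed by purchased keywords and letter combinations. This is an illustrative sketch only: the names `Ad`, `PURCHASED`, `find_ad`, and `with_ad_first` are hypothetical, and the link is a placeholder, not an address given in the description.

```python
from typing import List, NamedTuple, Optional


class Ad(NamedTuple):
    text: str   # advertisement text shown among the entry options
    link: str   # link followed when the user selects the advertisement


# Placeholder link; the description only says the advertisement carries "its link".
_BAIDU_AD = Ad("Baidu.com", "https://example.invalid/baidu")

# Purchased keywords and letter combinations from the example, lower-cased.
PURCHASED = {
    "baidu": _BAIDU_AD,
    "search engine": _BAIDU_AD,
    "bd": _BAIDU_AD,
    "ssyq": _BAIDU_AD,
}


def find_ad(key_sequence: str, matched_entries: List[str]) -> Optional[Ad]:
    """Look up an ad by the raw key sequence first, then by any matched entry."""
    ad = PURCHASED.get(key_sequence.lower())
    if ad is not None:
        return ad
    for entry in matched_entries:
        ad = PURCHASED.get(entry.lower())
        if ad is not None:
            return ad
    return None


def with_ad_first(options: List[str], ad: Optional[Ad]) -> List[str]:
    """Give the advertisement the highest priority so it appears in the first round."""
    if ad is None:
        return list(options)
    return [ad.text] + [o for o in options if o != ad.text]
```

  • Placing the advertisement first is one way to realize the "higher priority" mentioned above; an implementation could equally place it elsewhere within the first displayed round of options.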
  • the keyword advertisement library 1506 can be incorporated into the public network corpus 1504 or incorporated into the user network corpus 1501 as needed.
  • the user network corpus 1501 is downloaded to the local corpus 1403 of the user device 140 in a synchronized manner.
  • the local corpus 1403 includes a keyword advertisement library.
  • the keyword advertisement information is displayed as a local entry option, and the keyword advertisement information has a link, when the user selects the location.
  • the entry option of the advertisement information may appear when the text is input, thereby increasing the exposure rate of the advertisement.
  • FIG. 2 is a flow chart of a method for assisting a user to perform text input on a web server in accordance with a first embodiment of the present invention.
  • In step S1201, the network communication device 1503 of the web server 150 receives a key input sequence of the user on the user device 140 via the network;
  • in step S1202, a matching query is performed in the network corpus based on the key input sequence to obtain one or more matching network term options;
  • in step S1203, the obtained network term options are fed back to the user equipment 140 for user selection.
  • the user can register as a registered user on the web server 150.
  • the registered user can log in to the web server 150 and retain the user web corpus 1501 on the web server 150.
  • the user can also select whether to synchronize the user network corpus 1501 and the local corpus 1403 of the user device.
  • the matching device 1502 can perform different matching operations to provide accurate network term options.
  • Figure 3 shows the processing steps in these cases.
  • FIG. 3 is a flow chart showing the specific steps of performing a matching query in a network corpus in a method for assisting a user to perform text input on a web server according to a first embodiment of the present invention.
  • Step S1301 is performed to determine whether the user has logged in to the network server 150. If the user is not logged in, the network server 150 cannot determine the identity of the user, so no matching query can be performed in the user network corpus 1501 retained for the user on the network server 150. Instead, the process proceeds to step S1503, in which, upon receiving the user's key input sequence on the user device, matching network term options are retrieved only in the public network corpus.
  • There are various ways to log in, such as logging in with a username and password, automatically logging in with the MAC address of the user device, automatically logging in with the fixed IP address of the user device, and so on.
  • Step S1302 determines whether the user's user network corpus 1501 is synchronized with the user device's local corpus 1403. If the user is using another person's computer or a public computer, the user wants neither to synchronize his or her user network corpus 1501 to the local device nor to synchronize the other person's lexicon on the local device into the user network corpus 1501, so the user can choose not to synchronize. However, the user still wants the support of the user network corpus 1501.
  • Therefore, when it is determined in step S1302 that there is no synchronization, matching network term options are retrieved in both the public network corpus 1504 and the user's user network corpus 1501 upon receiving the user's key input sequence on the user device. If it is determined in step S1302 that the user network corpus has been synchronized with the local corpus of the user equipment, the user equipment will first find matching term options in the local corpus, so the web server 150 does not need to look for them again in the identical user network corpus 1501; it retrieves matching network term options only in the public network corpus 1504 upon receiving the user's key input sequence on the user device.
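  • The decision flow of FIG. 3 can be summarized in a few lines. The sketch below assumes that the second case of step S1302 is the one in which the user network corpus has already been synchronized with the local corpus; the function name `corpora_to_search` and the labels "public" and "user" are hypothetical.

```python
from typing import List


def corpora_to_search(logged_in: bool, synced_with_local: bool) -> List[str]:
    """Return which server-side corpora are queried for a key input sequence."""
    if not logged_in:
        # Identity unknown: the user network corpus cannot be used.
        return ["public"]
    if synced_with_local:
        # The user device already searches an identical local copy,
        # so the server skips the user network corpus.
        return ["public"]
    # Logged in but not synchronized (e.g. on a public computer):
    # the server searches both corpora on the user's behalf.
    return ["public", "user"]
```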
  • When performing a matching query in the lexicon according to the user input sequence to obtain a plurality of input term options, the matching device 1402 and the matching device 1502 also obtain the priority of each input term option.
  • The summary device 1405 displays the plurality of matched input term options provided by the matching device 1402 and the matching device 1502 to the user in the entry column in priority order: the higher the priority of an input term option, the earlier it is displayed.
  • The matching device 1402 and the matching device 1502 can determine the priority according to the selection frequency of each term option in the user's history and the semantic relevance between the words within each term option.
  • The matching device 1402 and the matching device 1502 can also determine the priority according to the input preference selected by the user.
  • the matching device 1402 and the matching device 1502 can also determine the region in which the user device is located according to the IP address of the current user equipment, so that the priority of the vocabulary related to the region in the input sequence can be determined, for example, but the user input sequence is
  • The first embodiment of the present invention further provides a system for a user to input text, including a user equipment for inputting text according to the first embodiment of the present invention and a network server for assisting the user equipment in inputting text according to the first embodiment of the present invention.
  • The following describes a second embodiment of the method, device, server and system of the present invention for a user to input text.
  • The second embodiment is based on the first embodiment and adds the concept of a "group corpus" to improve the hit rate of the input method's preferred entries and thereby improve input efficiency:
  • Multiple users of an intranet or local area network, such as enterprise customers, Internet cafes, and translation services, usually have obvious commonalities. Such commonalities may be the same or similar work content, the same or similar hobbies, the same or similar age groups, the same or similar geographic areas, and the like. Because of this commonality, the choices that such users make among candidate entries during text input tend to converge or resemble one another.
  • For example, for the same abbreviated input "shzh", the preferred entry with the highest hit rate may be "Shenzhen" for some users; for residents of Shenzhou City, Hebei Province, it is "Shenzhou"; for yet other users it is likely to be "Shenzhou (N)"; and for ordinary users who share no such commonality, it may be "Shenzhou (the earth)".
  • the auxiliary user can perform text entry on the user terminal, which will undoubtedly greatly improve the accuracy and efficiency of text entry.
  • step S2101 the key input sequence of the user on the keyboard 1401 of the user device is detected.
  • This step is similar to step S1101 in the first embodiment of the present invention.
  • The user can further simplify the input: even if only the first letter of each word is typed, the desired result can be obtained quickly, because the available group entries have already been trained by other users who have something in common with the user.
  • step S2102 after obtaining the user's key input sequence, the input sequence is matched and searched in the local corpus 1403 of the user equipment 140 to obtain one or more matching local entry options.
  • step S2103 the key input sequence is sent to a web server.
  • steps S2102 and S2103 may be performed sequentially or simultaneously.
  • In order to display the obtained entry options quickly, after the local entry options are obtained in step S2102, the process may immediately proceed to step S2105, in which the obtained local entry options are summarized and displayed to the user for selection.
  • The web server receives the key input sequence from the user device 140 and looks for matching group entry options in the group corpus associated with the group to which the user belongs.
  • The user equipment 140 receives the group entry options from the web server and sends them to the summary device 1405. Then, in step S2105, the local entry options from the matching device 1402 and the group entry options from the web server are summarized in the summary device 1405 and provided to the display device 1406 for display and selection by the user. Because of the lag of network transmission and server processing, the summary device 1405 will generally receive the local entry options before the group entry options. While the network server has not yet returned the group entry options, the local entry options can be provided to the display device 1406 immediately for user selection, without waiting to be displayed together with the group entry options. Of course, the entry options obtained after summarizing with the group entry options are more precise.
  • After step S2105, the process returns to step S2101 to detect further key input on the user device.
  • The order of steps S2102 and S2103 may be reversed, so that the detected key input sequence is first sent to the network server.
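  • The behaviour of the summary device, showing local entry options at once and re-summarizing when group entry options arrive, can be sketched as follows. This is a hedged illustration: the function name `summarize`, the (text, priority) tuple format, and the rule of keeping the higher of two priorities for a duplicate entry are assumptions not stated in the description.

```python
from typing import Iterable, List, Optional, Tuple

Option = Tuple[str, float]  # (entry text, priority); higher priority is shown first


def summarize(local_options: Iterable[Option],
              group_options: Optional[Iterable[Option]] = None,
              limit: int = 9) -> List[str]:
    """Merge local and (if already received) group options into one ranked list."""
    best = {}
    for text, priority in list(local_options) + list(group_options or []):
        # Assumption: a duplicate entry keeps its highest priority.
        if text not in best or priority > best[text]:
            best[text] = priority
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [text for text, _ in ranked[:limit]]
```

  • Called with only the local options, the function yields the list that can be displayed before the server responds; called again with the group options, it yields the more precise combined list.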
  • The foregoing method may further include a registration step S2106 on the user equipment side: before text input, the user may be registered via the network communication device 1404 to one or more user groups on the network server, each user group being associated with a group corpus.
  • Such a registration process may employ, for example, a group registration function as is known in the art.
  • The method may further include step S2107, in which the user sends user identity information to the network server, so that the network server determines the user group associated with the user and, from it, the group corpus associated with that user group.
  • However, this step is not essential for the web server to determine the user group. For example, if the web server is itself an intranet server that serves only one or a few user groups, the user can be considered able to use the group corpus on the web server without any authentication procedure.
  • The foregoing method may further include step S2108: the user sends the selected entry to the network server, so that the network server updates the group corpus associated with the user group to which the user belongs.
  • With this function, each member of the user group can contribute to the group corpus, for example new entries, the member's own input habits, etc. These resources can be collected in the corresponding entries of the group corpus.
  • a greater weight is assigned to contributions of members with higher authority or dominance in the user group. For example, new terms provided by department heads have higher priority in feedback.
  • FIG. 9 is a block diagram of a network server for assisting a user in text input in accordance with the present embodiment.
  • The web server 250 includes one or more group corpora 2501 (only one of which is shown for simplicity), a matching device 2502, a network communication device 2503, a group management device 2504, and a group corpus management and update device 2505.
  • The network communication device 2503 is connected to one or more user devices 140 via a network. It receives the user's key input sequence on the user device via the network and feeds the group entry options obtained based on the key input sequence back to the user device for selection by the user.
  • The network communication device 2503 further includes identity information receiving means for receiving optional user identity information and forwarding it to the group management device 2504, which determines the user group to which the user belongs.
  • The network communication device 2503 further includes an entry receiving device for receiving the entry finally selected by the user during input and forwarding it to the group corpus management and update device 2505, which uses these entries to update the group corpus associated with the user group to which the user belongs.
  • The entry receiving device is further configured to receive material related to the user group and forward it to the group corpus management and update device 2505, which uses the material to initially construct the group corpus associated with the user group to which the user belongs.
  • The matching device 2502 is connected to the group corpus 2501, the network communication device 2503, and the group management device 2504. It performs a matching query in the group corpus 2501 based on the key input sequence to obtain one or more matching group entry options, and then sends the group entry options to the network communication device 2503 for return to the user device 140.
  • The matching device 2502 selects one or more group corpora corresponding to the user from the plurality of group corpora 2501 according to the user group information determined by the group management device 2504, and performs the matching query in them.
  • The matching device 2502 further includes priority determining means (not shown) for determining the priority of each matching entry option according to the priority of the user who is the source of the entry, whether the entry was previously selected, the time at which it was previously selected, the number of times it was previously selected, the user-preset input preference options, and/or the number of times the entry has been searched on the network.
  • the group management device 2504 is connected to the matching device 2502, the network communication device 2503, and the group corpus management and update device 2505.
  • The group management device 2504 is responsible for managing user groups. This includes receiving user registration information from the network communication device 2503 and registering the user with one or more user groups; maintaining user group information, such as the group name, the group member IDs, and the number of the group corpus corresponding to the group; and determining the group to which a user belongs according to the user's identity and transmitting the result to the matching device 2502 to help it select the group corpus 2501 in which to perform the matching query.
  • The group management device also assists the group corpus management and update device 2505 in managing and updating the lexicon, for example by sending the priority information of a given user in the user group to the group corpus management and update device 2505, which adjusts the priority attributes of the related entries accordingly.
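  • The bookkeeping performed by the group management device (registration, group information, and resolving a user to group corpora) can be sketched as a small registry. The class name `GroupManager`, its method names, and the corpus identifiers are hypothetical and serve only as an illustration.

```python
from typing import Dict, List, Set


class GroupManager:
    """Keeps user groups, their members, and the corpus number of each group."""

    def __init__(self) -> None:
        self._corpus_of_group: Dict[str, str] = {}
        self._members: Dict[str, Set[str]] = {}

    def create_group(self, group_name: str, corpus_id: str) -> None:
        self._corpus_of_group[group_name] = corpus_id
        self._members.setdefault(group_name, set())

    def register(self, user_id: str, group_name: str) -> None:
        """Register a user with an existing group; a user may join several groups."""
        if group_name not in self._corpus_of_group:
            raise KeyError(f"unknown group: {group_name}")
        self._members[group_name].add(user_id)

    def groups_of(self, user_id: str) -> List[str]:
        return sorted(g for g, members in self._members.items() if user_id in members)

    def corpora_for(self, user_id: str) -> List[str]:
        """Corpus identifiers the matching device should query for this user."""
        return [self._corpus_of_group[g] for g in self.groups_of(user_id)]
```

  • A user who belongs to no group gets an empty list, in which case a server could fall back to whatever default corpus it has; that fallback is not described in the text.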
  • the group corpus management and update device 2505 is coupled to the network communication device 2503 for receiving entries sent by the user, updating them to the terms of the group corpus 2501 or attributes thereof.
  • the group corpus management and update device 2505 can also receive material related to the user group from the network communication device 2503, and initialize or update the group corpus 2501 by learning or training it. This feature is very useful for further simplifying user input and reducing the overhead of building a categorical lexicon. For example, for a group working on translation of patent documents in the semiconductor field, users can upload a file containing common semiconductor domain terms such as "etching", "vapor deposition", “coating", and the like.
  • The group corpus management and update device 2505 uses this material to initialize a group corpus 2501 for the group, sparing members the training effort of entering the relevant entries for the first time.
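  • Initializing a group corpus from uploaded material could be as simple as counting how often each term occurs and using the count as a starting priority. The description does not specify the learning or training procedure, so the sketch below is an assumption throughout; `init_group_corpus` and the attribute names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def init_group_corpus(material_lines: Iterable[str]) -> Dict[str, dict]:
    """Build an initial corpus from uploaded material with one term per line."""
    counts = Counter(line.strip() for line in material_lines if line.strip())
    return {
        term: {"priority": count, "times_selected": 0}
        for term, count in counts.items()
    }
```

  • For the semiconductor translation example, uploading a file with the lines "etching", "vapor deposition", and "coating" would make those terms available to every group member before anyone has typed them.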
  • The group corpus 2501 is an important concept introduced by the present invention. It corresponds directly to a user group; usually each user group corresponds to one group corpus.
  • The group corpus 2501 contains the entries most commonly used by the members of the corresponding user group. Sharing the corpus among group members spares them many time-consuming and laborious input method training steps: a member can directly obtain the desired input result on the basis of other members' input.
  • the composition of the group corpus 2501 and the attributes of the group entry will be described in detail below with reference to FIG.
  • step S2201 the network communication device 2503 of the web server 250 receives a key input sequence of the user on the user device 140 via the network;
  • step S2202 a matching query is performed in the corresponding group corpus based on the key input sequence and the group joined by the user to obtain one or more matching group term options;
  • step S2203 the obtained group term option is fed back to the user device 140 for selection by the user.
  • the foregoing method may further include a registration step S2204 on the network server side, that is, before the text input, the registration information of the user may be received by the network communication device 2503, and the user is registered to one or more user groups.
  • the user group is associated with the group corpus.
  • Such a registration process can employ, for example, a group registration function as known in the art.
  • user groups can be established according to various criteria, i.e., any commonality that the user has, such as engaging in related work, completing the same task, having similar hobbies, or living in the same city. An example of a user group is detailed below in conjunction with FIG.
  • Of course, a user may also be associated with a group without such a registration step.
  • For example, when a group for a business department is created (for example, by a department head), the department members including the user are automatically added to the group.
  • The web server can then immediately determine from the user's identity that the user belongs to the business department group, and searches for the relevant group entries in the corresponding group corpus.
  • The method may further include step S2205 of receiving the user identity information sent by the user, in order to determine the user group associated with the user and thereby the group corpus associated with that user group.
  • However, this step is also not essential for the web server 250 to determine the user group. For example, if the web server 250 is itself an intranet server that serves only one or a few user groups, the user can be considered able to use the group corpus 2501 on the web server 250 without any authentication procedure.
  • The above method may further include step S2206 of receiving the entry returned by the user in order to update the group corpus 2501.
  • With this function, each member of the user group can contribute to the group corpus, for example new entries, the member's own input habits, etc. These resources can be collected in the entries or entry attributes of the group corpus for reference or direct use by other members of the group.
  • a greater weight is assigned to the contributions of members with higher authority or dominance in the user group. For example, new terms provided by department heads have higher priority in feedback.
  • FIG. 7 is a block diagram of a system 230 for assisting a user to perform text input in accordance with the present embodiment.
  • system 230 includes a web server 250 and user equipment 140, which are connected by a network.
  • The network represents a worldwide collection of networks and gateways that communicate with each other using, for example, the TCP/IP suite of protocols. It may be the Internet, at whose core is a backbone of high-speed data communication lines between major nodes or host computers, consisting of thousands of commercial, government, educational, and other computer systems that route data and messages.
  • the network can also be implemented as a number of different types of networks, such as, for example, an intranet, a local area network (LAN), or a wide area network (WAN).
  • Figure 8 is intended as an example and is not a structural limitation of the various exemplary embodiments.
  • FIG. 10 shows a schematic diagram of a group of users registered on a network server according to the present embodiment.
  • the figure shows four user groups, namely "Basketball League” group 2601, “Zhenghua Garden Community” group 2602, “Semiconductor Domain Translation” group 2603, and “Expo Tour” group 2604.
  • These groups represent user groups divided according to the users' common interests, living areas, work, and short-term topics of attention. In fact, there are more grouping criteria, such as players of the same online game, students of the same university, and so on. In real life, users are linked in many ways, so that a certain number of users can form a group that has a positive effect on input efficiency for its members.
  • The user groups 2601 to 2604 each include their group members, who can register to the group using their real names, network IDs, and the like. Different groups may share members or overlap.
  • For example, the "Semiconductor Domain Translation" group 2603 and the "Expo Tour" group 2604 each include the member "Li Meng", and all members of the "Basketball League" group 2601 are also members of the "Zhenghua Garden" group 2602, because the basketball league is an organization of the community's owners.
  • The members of each user group that are highlighted (indicated by a bold font in the figure; for example, Zhang Xiaoliang in group 2601) have a higher priority, and the other members have a lower priority. If needed, more granular priority levels can also be set.
  • This priority distinction is primarily used to help group members prioritize the contribution of terms when contributing to the group corpus.
  • the priority of the term will be assigned or updated according to the priority of the source user, the current priority of the entry, the most recently selected time or number of times.
  • the term priority is an important item in the term attribute and is used to determine the order in which the group terms provided to the user appear in the candidate box.
  • For example, the captain of the "Basketball League" group recently proposed promoting flag football among the group members to improve their physical fitness. The entry "yq - waist flag" provided by this source user will be given a higher priority than the entry "yq - ⁇" provided by an ordinary member.
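  • The effect of the source user's priority on the candidate order can be illustrated with a minimal sketch. It assumes that an entry simply inherits the highest priority among the members who contributed it; the function names `contribute` and `candidates_for`, the numeric priority levels, and the placeholder text "other entry" are hypothetical.

```python
from typing import Dict, List

Corpus = Dict[str, Dict[str, int]]  # key sequence -> {entry text: priority}


def contribute(corpus: Corpus, key: str, text: str, source_user_priority: int) -> None:
    """Record an entry contributed by a member; keep the highest source priority."""
    candidates = corpus.setdefault(key, {})
    candidates[text] = max(candidates.get(text, 0), source_user_priority)


def candidates_for(corpus: Corpus, key: str) -> List[str]:
    """Entries for a key sequence, highest priority first (ties sorted by text)."""
    candidates = corpus.get(key, {})
    return sorted(candidates, key=lambda text: (-candidates[text], text))
```

  • With a captain at priority 2 and an ordinary member at priority 1, the captain's entry for "yq" is listed ahead of the member's entry regardless of which was contributed first.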
  • The group corpus corresponding to the "Zhenghua Garden" group contains names such as "xfxch - Shuangfu car wash line", "syj - Sanyouju hotel", and "lj - Lijia baby", as well as entries familiar to the community owners such as "psy - Pan Shanyi (property manager name)" and "rjq - Ren Jianqiang (business director)".
  • FIG. 1 A schematic diagram of an attribute list 2700 of group entries in a group corpus maintained on a web server in accordance with one embodiment of the present invention is shown in FIG. Multiple entries for the group corpus of a basketball league group are listed in this list 2700.
  • The first column 2701 is the content of the entry, such as "zb - walking (basketball technical terminology)", "lb - rebounding (basketball technical terminology)", "bkl - Barkley (basketball star name)" and "huren - Lakers (American NBA basketball team name)".
  • The group corpus also includes group-specific entries such as "zhhb - Zhenghua Cup (community event name)" and "yq - waist flag (group member activity)".
  • The second to seventh columns 2702 - 2707 of list 2700 list the various attributes of the entry: group identification, priority level, number of times selected, time of most recent selection, source user, and/or target user. Of these, three attributes need to be highlighted: priority level, source user, and target user.
  • The priority level determines the order in which entries are provided to the user after multiple entries are matched; higher-priority entries are ranked first. The source user is the member user who provided the entry to the corpus management and update device; this attribute affects the priority setting of the entry.
  • The target user attribute refers to the range of users to whom the entry can be provided.
  • The target users of some entries may be all members, while the target users of other entries may be only part of the staff, so that ordinary staff with no connection to an entry are not offered it as a choice.
  • For example, the target users of "post" and "ssyq-search engine" can be all members, while the target users of the entries "ZX-writing" and "qq-infringement" can be limited to members of the company's intellectual property management function, further improving the accuracy of group entry feedback and the user's input efficiency.
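  • Restricting an entry to its target users amounts to a filter applied before ranking. The sketch below is illustrative only: the dictionary layout, the sentinel `ALL_MEMBERS`, the function name `entries_for_user`, and the user IDs are assumptions.

```python
from typing import Dict, List

ALL_MEMBERS = None  # sentinel: the entry may be offered to every group member


def entries_for_user(entries: List[Dict], user_id: str) -> List[str]:
    """Return the entry texts this user may be offered, highest priority first."""
    visible = [
        e for e in entries
        if e["target_users"] is ALL_MEMBERS or user_id in e["target_users"]
    ]
    return [e["text"] for e in sorted(visible, key=lambda e: -e["priority"])]
```

  • An entry such as "infringement" limited to intellectual property staff is thus never shown to other staff, while an entry open to all members is shown to everyone.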
  • The entry priority attribute can be adjusted dynamically according to the time of selection, the number of selections, the source user, and so on, so that the order of the returned entries better matches the input habits of the group members and the most accurate entry options are returned to the user.
  • As is well known, many input methods assign the most convenient key, the space key, to the first-ranked candidate, while subsequent candidates must be selected with numeric keys, "+" and "-", or other shortcut keys.
  • Candidates further back even require the page-turning keys. Placing the most accurate entries near the front, or even first, undoubtedly saves the user effort and improves input efficiency.
  • The second embodiment of the present invention further provides a system for a user to input text, including a user equipment for inputting text according to the second embodiment of the present invention and a network server for assisting the user equipment in text input according to the second embodiment of the present invention.
  • FIG. 12 is a schematic diagram of a typical computer network implementing the present embodiment of the present invention, showing various types of network elements, including user equipments 311 and 312, Internet 321 and 331, and network servers 341-343.
  • The user device 311 is also referred to as the computer 311, and the user device 312 is also referred to as the mobile phone 312.
  • For simplicity, the number of network elements of each type shown in FIG. 12 may be smaller than in an actual network; such omission is made on the premise that it does not affect a clear and sufficient disclosure of the present invention.
  • Those skilled in the art will also appreciate that some other types of network elements are omitted from the figure, such as a modem or an access device (such as a DSLAM), which are typically located between the user equipment and the Internet.
  • The network servers 341 to 343 are located in different geographical regions, are attached to different routers, and respectively serve user equipment in different areas. Typically, one such server can be provided for a city, a province, or even a country. Of course, considering that end users are unevenly distributed, the distribution of servers can follow the distribution of end users. In summary, any variation in the network layout is intended to fall within the scope and spirit of the invention.
  • FIGS. 13a to 13d are schematic diagrams of a human-machine interface displayed at the user equipment for setting the input method according to the present embodiment.
  • the present invention provides a solution for providing text input to a user, and in fact provides a solution
  • The input method may be the only input method used by the computer 311, or each object indicated by the user identifier may be one of several buttons provided in the Windows operating system, wherein the button 3201 is the switch button of the input method.
  • As shown in FIG. 13a to FIG. 13c, when the user moves the mouse pointer 3205 onto the button 3201 and clicks the left button, the display state of the button 3201 changes and the state of the input method changes with it.
  • The color of the button 3201' indicating activation in Fig. 13c is different from that of the button 3201 indicating inactivity in Fig. 13a.
  • For convenience of expression, the input method based on the interaction between the user equipment and the web server according to the present embodiment is hereinafter referred to as this input method.
  • a button 3202 is also shown, which can be switched between Chinese and English input by a mouse click.
  • The present invention is not limited to the input of Chinese and/or English (as with button 3202) but applies to the characters of almost any language that can be input through the user equipment, as those skilled in the art will understand from the context.
  • When this input method is used to input Chinese, it is not limited to pinyin input but also covers other Chinese input schemes such as stroke input and Wubi, as well as input schemes formed by combining them.
  • The switching between Chinese and English input in this input method can also be controlled by a specific key on the keyboard, for example the CapsLock (CapsLK) key.
  • When CapsLK is in the lowercase state, this input method provides the user with a selection list of Chinese characters or words, and the user adds the input result to the corresponding input box through a further selection with the keyboard or mouse.
  • Correspondingly, when CapsLK is in the uppercase state, the input method does not provide a selection list of Chinese characters and, without loss of generality, directly adds the English letter sequence corresponding to the user's key sequence to the corresponding input box.
  • A button 3203 is also shown, which can be switched between full-width and half-width by a mouse click. Without loss of generality, the crescent in the button 3203 in Fig. 13a represents half-width, and the circle in the button 3203' in Fig. 13b represents full-width.
  • A button 3204 is also shown, which can be switched between Chinese punctuation and English punctuation by a mouse click. The button 3204 in Fig. 13a represents Chinese punctuation, and the button 3204' in Fig. 13b represents English punctuation.
  • Shown in any of Figures 13a-13c is a control bar that is presented by default in the lower right corner of the user device screen and is preferably draggable by the mouse.
  • Figure 13d shows another human-computer interaction interface with which the user switches the input method on or off. It is part of a web-based browser, in which two options, on and off, are provided after the item "Whether to use the Baidu online input method" (this input method). By clicking, the user can manually control whether this input method is activated.
  • After the user activates the input method in the above manner or an equivalent one, the input method becomes the current default input method even if other input methods are pre-installed on the computer 311.
  • the computer 311 preferably automatically activates other pre-installed input methods such as Google input.
  • Activation of other input methods can also be controlled manually, with the user selecting the input method to switch to.
  • A pop-up menu appears listing the other available input methods for the user to select.
  • the input method can also be the only input method available to the computer 311.
  • The input method can also be activated under other conditions. For example, once the user has turned on the online input method in the personalized settings of FIG. 13d, this input method is activated each time the user uses the browser to access any address in a predetermined set of network addresses.
  • The method for providing text input to the user in the user equipment, exemplified by the computer 311, will be described below with reference to FIG. 14 and FIG. 12.
  • Although the computer 311 is taken as an example, those skilled in the art will understand that the same process can be implemented in the mobile phone 312. The following description refers to some operations of the web server, which are also detailed below.
  • Figure 14 is a flow chart showing a method for a user to input text in a user equipment connected to a network server according to the present embodiment.
  • the user for example, Zhang San, has activated the input method.
  • It should be understood that the manner of connection between the user equipment and the network server is not limited to a high-speed, stable wired connection, but also includes a wireless connection or a hybrid of wired and wireless connections.
  • In step S3301, the computer 311 obtains input information provided by the user. Specifically, Zhang San opens the IE browser, and the home page www.baidu.com is opened automatically, so that content as shown in FIG. 16 is presented. Of course, at this point the search bar 3501 should be empty, and the content shown below the search bar in the figure should be ignored. Zhang San moves the cursor to a position in or near the search bar 3501, clicks the mouse, and can then type in the search bar 3501. Without loss of generality, suppose that Zhang San presses the following keys on the keyboard in turn, each exactly once and each for less than a threshold duration: C, A, O, M, E, I, W, A, N, G.
  • The information provided by each of Zhang San's key presses is regarded as one piece of input information; that is, when Zhang San presses the C key, one piece of input information indicating that the C key was pressed is obtained by the computer 311.
  • Input information is identified by the key or key combination it represents, in the form input information "XX", where the double-quoted portion indicates the corresponding key or key combination.
  • For example, input information "C" indicates that the C key was pressed, and the input information for pressing the keys C, A, O, M, E, I in sequence is referred to as input information "CAOMEI".
  • step S3301 may be performed in such a manner that the browser retrieves input information provided by the user through a piece of script or function therein. That is, the functional modules of the input method distributed on the computer 311 for obtaining input information are implemented by a web-based browser. Therefore, if Zhang San uses the corresponding browser that provides this input method, then when downloading and installing the browser, he downloads and installs the function module that implements the input method on the client, and can start based on the input method. Enter text.
  • Alternatively, the computer 311 may install a browser-independent application, similar to the client software of an input method. The client software is mainly responsible for operations such as the acquisition of the above input information and the transmission of input information mentioned later.
  • In step S3302, the computer 311 transmits the input information "C" to the web server.
  • Which network server the input information is sent to depends on the specific configuration of the network shown, the routing algorithm, and the like. The most direct way is to designate a network server for the user equipment of each IP address segment.
  • Once the computer 311 knows its own IP address, it can determine which network server to send the input information to. Alternatively, the correspondence between IP address segments and network servers is pre-stored in a routing device, which obtains the IP address of the computer 311 from an IP packet sent by the computer 311, queries the correspondence, and determines the destination network server.
  • As an alternative, the input information can be sent to multiple network servers, which then communicate among themselves to determine one network server to perform subsequent operations. As another alternative, the network servers shown in Figure 12 have a division of labor.
  • For example, the input information sent by the computer 311 can carry an identifier of the corresponding language, such as a Chinese identifier, so that a Chinese input server (if any) is responsible for subsequent operations.
  • the present invention does not limit the manner in which the network server is selected and the manner in which the user equipment interacts with the network server. For example, such interaction may be based on the IP protocol or other communication protocols for the Internet.
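  • The "most direct way" mentioned above, designating a network server per IP address segment, might look like the following sketch. The address segments, the server names, and the fallback server are hypothetical values chosen for illustration; the source does not specify any of them.

```python
import ipaddress

# Hypothetical correspondence between IP address segments and network servers.
SEGMENT_TO_SERVER = [
    (ipaddress.ip_network("10.1.0.0/16"), "server-341"),
    (ipaddress.ip_network("10.2.0.0/16"), "server-342"),
]
DEFAULT_SERVER = "server-343"  # assumed fallback when no segment matches


def pick_server(client_ip: str) -> str:
    """Return the server designated for the segment containing client_ip."""
    address = ipaddress.ip_address(client_ip)
    for network, server in SEGMENT_TO_SERVER:
        if address in network:
            return server
    return DEFAULT_SERVER


print(pick_server("10.1.5.9"))
print(pick_server("192.168.0.1"))
```

  • The same table could equally live in the routing device, which would apply the lookup to the source address of each incoming packet.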
  • The input information is received by a network server, such as the network server 341 (hereinafter referred to as the server 341) in Fig. 12, in step S3401.
  • the server 341 performs a matching query in the dictionary database based on the input information "C" to generate an alternative input item set.
  • Depending on the input mode, which includes ordinary English letter input, Chinese pinyin, Chinese strokes, and the like, the server 341 uses different algorithms to interpret the input information "C". Take English input as an example. If association input is not considered, the server 341 generates a set of alternative input items containing a single item, the English letter c. If association input is considered, the set includes at least one word beginning with the letter c. If Chinese pinyin is used, the set of alternative input items includes Chinese characters whose pinyin begins with the letter C.
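  • A minimal sketch of the matching query follows, assuming a toy dictionary keyed by pinyin. The dictionary contents, the `mode` parameter, and the alphabetical ordering of results are illustrative assumptions; a real dictionary database and its ranking would be far richer.

```python
from typing import Dict, List

# Toy dictionary database: pinyin reading -> Chinese characters (illustrative only).
DICTIONARY: Dict[str, List[str]] = {
    "cong": ["从"],   # "from"
    "ci": ["此"],     # "this"
    "cai": ["才"],    # "talent"
    "cao": ["草"],    # "grass"
    "che": ["车"],    # "car"
}


def match(input_info: str, mode: str = "pinyin") -> List[str]:
    """Return the set of alternative input items for the given input information."""
    key = input_info.lower()
    if mode == "english":
        # Without association input, the only alternative is the letters themselves.
        return [key]
    items: List[str] = []
    for reading in sorted(DICTIONARY):
        if reading.startswith(key):
            items.extend(DICTIONARY[reading])
    return items


print(match("C"))
print(match("CA"))
print(match("C", mode="english"))
```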
  • step S3403 the server 341 sends the generated set of alternative entries back to the computer 311.
  • The sending process in step S3403 can be implemented over the Web, in which case the set of alternative input items is encapsulated in a transmission unit of the HTTP protocol for transmission.
  • The set can also be sent in the form of instant messaging (IM), as in the interaction between the Xiao-i robot and its client.
  • Zhang San is performing Chinese input, so this set includes Chinese characters such as "from, this, talent, place, eat, out, into, car, poor", each of which is called a Chinese alternative input item, or input item for short.
  • To make their correspondence clear, the same identifier is used for a piece of input information and for the set of alternative input items obtained from it, so the above set is called the set "C".
  • The set of alternative input items sent back by the server 341 is received by the computer 311 in step S3303.
  • The computer 311 then notifies the user of the set.
  • This step may employ any known technical means for the computer to provide human readable information, such as a screen display, a speaker play, and the like. Without loss of generality, this example uses a screen display as an example.
  • the prompt bar will include a button that allows the user to display alternate entries for the next line with a mouse click.
  • the user can also command to display alternative entries for the next line by pressing a designated key on the keyboard, such as the pagedown key.
  • Zhang San selects one of these alternative input items by mouse or keyboard operation and confirms by pressing the left mouse button or the corresponding numeric key, thereby giving the computer 311 indication information. For example, when the mouse hovers over the Chinese character "from" and the left button is clicked, the computer 311 is given an indication that "from" is the selected input item.
  • In step S3305, the computer 311 receives the indication information provided by Zhang San; accordingly, in step S3306, the Chinese character "from" is taken as the input result and displayed at the location the user specified for input, for example in the browser's search bar.
  • In step S3307, the computer 311 also transmits this indication information to the web server 341.
  • The web server 341 uses the information that Zhang San selected "from" from the set of alternative input items "C" (hereinafter referred to as the set "C") to train and update the dictionary database it stores. Thus, when the input method has a considerable user base, the users' choices can be learned and the thesaurus dynamically updated; for example, a new word that more than a predetermined number of users have entered within a period of time, such as "Brother Sharp", is added to the thesaurus.
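  • The new-word rule described above (a word entered by more than a predetermined number of users within a period of time is added to the thesaurus) can be sketched as follows. The class name `NewWordTracker`, the in-memory storage, and the sliding-window interpretation of "a period of time" are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Set


class NewWordTracker:
    """Adds a word to the thesaurus once enough distinct users entered it recently."""

    def __init__(self, min_users: int, window: float) -> None:
        self.min_users = min_users          # the "predetermined number" of users
        self.window = window                # length of the period, in seconds
        self.last_seen: Dict[str, Dict[str, float]] = defaultdict(dict)
        self.thesaurus: Set[str] = set()

    def record(self, word: str, user: str, now: float) -> bool:
        """Record that `user` entered `word` at time `now`; return True if the word is in the thesaurus."""
        self.last_seen[word][user] = now
        recent_users = [
            u for u, t in self.last_seen[word].items() if now - t <= self.window
        ]
        if len(recent_users) > self.min_users:
            self.thesaurus.add(word)
        return word in self.thesaurus


tracker = NewWordTracker(min_users=2, window=100.0)
print(tracker.record("Brother Sharp", "user1", 0.0))
print(tracker.record("Brother Sharp", "user2", 10.0))
print(tracker.record("Brother Sharp", "user3", 20.0))
```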
  • Suppose instead that Zhang San does not make a selection from the set "C" but presses the A key on the keyboard. As is known to those skilled in the art, the set of alternative input items will then narrow. Specifically, as one of the alternatives:
  • the computer 311 performs step S3301 again to obtain input information "A";
  • the computer 311 also sends the input information "A" to the web server 341;
  • step S3402 is performed, in which the web server 341 integrates the previous input information "C" with the new input information "A" to obtain an integration result "CA", that is, a new piece of input information "CA". Using this as the entry, the corresponding alternative input items are again retrieved from the thesaurus, and the set "CA" formed by these items is returned to the computer 311.
  • The process thereafter is similar to the above and is not repeated here.
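  • The server-side integration of successive input information can be sketched as below. The `InputSession` class, the `lookup` helper, and the three-word toy dictionary are hypothetical; the sketch only shows that each newly received key is appended to what was received before and the combined sequence is used for the query.

```python
from typing import Callable, List

# Toy dictionary: pinyin reading -> Chinese character (illustrative only).
WORDS = {"cai": "才", "cao": "草", "che": "车"}


def lookup(sequence: str) -> List[str]:
    """Return the characters whose reading starts with the given sequence."""
    prefix = sequence.lower()
    return [word for reading, word in sorted(WORDS.items()) if reading.startswith(prefix)]


class InputSession:
    """Per-user state kept by the server between key presses."""

    def __init__(self, query: Callable[[str], List[str]]) -> None:
        self.query = query
        self.pending = ""  # integration result so far, e.g. "CA"

    def receive(self, input_info: str) -> List[str]:
        """Integrate new input information and return the new set of alternatives."""
        self.pending += input_info
        return self.query(self.pending)

    def commit(self) -> None:
        """Reset after the user has selected an input item."""
        self.pending = ""


session = InputSession(lookup)
print(session.receive("C"))
print(session.receive("A"))
```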
  • As another alternative, the computer 311 caches the set "C" when first performing step S3303; preferably, the cached set "C" is cleared when Zhang San provides the indication information in step S3305.
  • The computer 311 then relies on the cached set "C" to respond to Zhang San's further input.
  • For example, Zhang San enters "C", "A", "O" in turn and then selects an input item such as "grass"; or Zhang San enters "C", "A", "T", presses backspace, then enters "O", and so on.
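  • The client-side caching alternative can be sketched as follows. It assumes, beyond what the source states, that the server returns each input item together with its reading so that the client can narrow the cached set locally; the `CandidateCache` class and its method names are hypothetical.

```python
from typing import List, Optional, Tuple


class CandidateCache:
    """Client-side cache of the set returned for the first key."""

    def __init__(self) -> None:
        self.prefix = ""
        self.items: List[Tuple[str, str]] = []  # (reading, input item)

    def store(self, prefix: str, items: List[Tuple[str, str]]) -> None:
        """Cache the set received for `prefix` (done when the set first arrives)."""
        self.prefix = prefix.lower()
        self.items = list(items)

    def narrow(self, sequence: str) -> Optional[List[str]]:
        """Answer a longer key sequence from the cache, or None if the cache cannot."""
        seq = sequence.lower()
        if not self.prefix or not seq.startswith(self.prefix):
            return None
        return [item for reading, item in self.items if reading.startswith(seq)]

    def clear(self) -> None:
        """Drop the cached set once the user has made a selection."""
        self.prefix = ""
        self.items = []


cache = CandidateCache()
cache.store("C", [("cai", "才"), ("cao", "草"), ("che", "车")])
print(cache.narrow("CA"))    # after typing C, A
print(cache.narrow("CAO"))   # after typing C, A, O (or C, A, T, backspace, O)
```

  • Because the cache is keyed on the whole current sequence, a backspace simply results in a shorter sequence being looked up again, without another round trip to the server.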
  • each key press triggers the acquisition and transmission of an input message corresponding to a single key.
  • a variation of this example is described below.
  • In this variation, when the user prompts the computer 311 to cut off the input, the computer 311 takes the whole sequence entered so far as one piece of input information and transmits it to the server 341.
  • For example, the user inputs C, A, O, M, E, I, W, A, N, G in sequence and finally presses the space bar.
  • Pressing the space bar triggers the computer 311 to transmit "C, A, O, M, E, I, W, A, N, G" collectively to the server 341 as one piece of input information.
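  • This variation amounts to buffering key presses on the client until a trigger key arrives. A minimal sketch, in which the `BufferedSender` class and the `send` callback are hypothetical stand-ins for the client software and its network transmission:

```python
from typing import Callable, List


class BufferedSender:
    """Collects key presses and sends them as one piece of input information."""

    def __init__(self, send: Callable[[str], None], trigger: str = " ") -> None:
        self.send = send          # stand-in for transmission to the server
        self.trigger = trigger    # key that cuts off the input (space here)
        self.buffer: List[str] = []

    def key_pressed(self, key: str) -> None:
        if key == self.trigger:
            if self.buffer:
                self.send("".join(self.buffer))
                self.buffer.clear()
        else:
            self.buffer.append(key)


sent: List[str] = []
sender = BufferedSender(sent.append)
for key in "CAOMEIWANG ":
    sender.key_pressed(key)
print(sent)
```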
  • Without loss of generality, the screen display on the computer 311 is as shown in FIG. 16, where the search bar 3501 is the position to which the user has moved the cursor and where the input takes place.
  • It can be seen that "caomeiwang" in the input information bar 3502 is displayed on the screen together with the set of alternative input items returned by the server 341, and adjacent syllables are separated by an apostrophe "'" to give the user a clearer experience.
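  • The syllable separation shown in the input information bar could be produced by a simple longest-match segmentation, sketched below. The source does not say how the separation is computed, so the algorithm, the `SYLLABLES` table (a tiny subset of valid pinyin syllables), and the function name are all assumptions.

```python
from typing import List, Optional, Set

# Tiny illustrative subset of pinyin syllables; a real table has about 400 entries.
SYLLABLES: Set[str] = {"cao", "mei", "wang", "wo", "ai", "wai", "tan"}
MAX_SYLLABLE_LENGTH = 6  # longest pinyin syllables, e.g. "zhuang", have 6 letters


def split_syllables(sequence: str, syllables: Set[str] = SYLLABLES) -> Optional[List[str]]:
    """Split a letter sequence into syllables, longest match first with backtracking."""
    if not sequence:
        return []
    for size in range(min(len(sequence), MAX_SYLLABLE_LENGTH), 0, -1):
        head = sequence[:size]
        if head in syllables:
            rest = split_syllables(sequence[size:], syllables)
            if rest is not None:
                return [head] + rest
    return None  # no valid segmentation


print("'".join(split_syllables("caomeiwang") or []))
```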
  • The various alternative input items in the prompt bar 3505, such as 3503 and 3504, are each identified by a sequential number. If the user chooses "Strawberry Net", the Chinese characters for "Strawberry Net" will eventually appear in the search bar 3501. Preferably, the cursor will be located after the character for "Net".
  • the operation of the present input method can also be associated with the identity information of the user, which will be described below with reference to FIGS. 14 and 15.
  • Step S3308 is executed to provide a login interface, through which the user authenticates to the web server 341 by entering a user name and password; in step S3309, this information is sent to the network server 341.
  • The web server 341 then responds to the user's subsequent text input operations, taking the user's identity information into account.
  • the manner in which the web server 341 obtains the identity information of the user is not limited thereto.
  • For example, the computer 311 may access user identity information held by the operating system or other applications; if such access is allowed, the user's identity information can be obtained there and reported to the server 341. This is more suitable for cases where the computer 311 is used at home or for other private purposes.
  • the identity information of Zhang San is received by the server 341 in step S3406.
  • The input history record of Zhang San can then be retrieved in step S3407. It comprises the text that Zhang San entered through this input method with the help of the server 341 over a past period of time, including the input items that Zhang San selected from sets of alternative input items.
  • The input history record retrieved in step S3407 may be applied to the generation of the set of alternative input items in step S3402. Specifically, in step S3402, the computer 311 first performs a matching query in the dictionary database based on the input information, such as "C", to obtain a preliminary query result, whose content is the same as the alternative input item set "C" in the above example.
  • The computer 311 then processes the preliminary query result based on the input history of Zhang San to generate the alternative input item set "C" of this example.
  • Specifically, the computer 311 compares the input history with the preliminary query result and arranges the content (input items) of the preliminary query result that also appears in the input history ahead of the other content (input items), to obtain the alternative input item set "C" of this example.
  • The input history records may also be stored together with the input information entered at the time. In that case, when the preliminary query result is compared with the input history, only the history records corresponding to the current input information are used to arrange the positions of the input items and thereby generate the set of alternative input items.
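  • The history-based reordering described above can be sketched as follows. The function name, the representation of the history as (input information, input item) pairs, and the sample data are assumptions for illustration; items found in the history are moved ahead of the rest while the original relative order is otherwise preserved.

```python
from typing import List, Optional, Tuple


def reorder(
    preliminary: List[str],
    history: List[Tuple[str, str]],
    input_info: Optional[str] = None,
) -> List[str]:
    """Place items that appear in the input history ahead of the other items.

    history holds (input information, selected input item) pairs. If input_info
    is given, only history records made for that same input information count.
    """
    if input_info is None:
        used = {item for _, item in history}
    else:
        used = {item for info, item in history if info == input_info}
    preferred = [item for item in preliminary if item in used]
    others = [item for item in preliminary if item not in used]
    return preferred + others


preliminary_result = ["从", "此", "才", "草"]
user_history = [("c", "草"), ("ca", "才")]
print(reorder(preliminary_result, user_history))
print(reorder(preliminary_result, user_history, input_info="c"))
```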
  • The server 341 updates the input history of the user according to the indication information received in step S3404, for example by adding the input item indicated by the indication information to the input history record, or by adding the input information to the input history in association with the input item indicated by the indication information.
  • FIG. 17 is a block diagram of a user equipment, connected to a network server, for a user to input text according to the present embodiment, such as the user equipment 311 shown in FIG. 12, which includes:
  • an obtaining device 3111, configured to obtain input information provided by a user;
  • a first sending device 3112 configured to send the input information to the network server, where the network server provides feedback information to the user equipment based on the input information;
  • the first receiving device 3113 is configured to receive feedback information sent back by the network server.
  • the notification device 3114 is configured to notify the user of the feedback information for further human-computer interaction. Further, the user equipment 311 further includes:
  • An identity obtaining device 3115 configured to acquire identity information of the user
  • the first sending device 3112 is further configured to send the identity information of the user to the network server. Further, the user equipment 311 further includes:
  • a second receiving device 3116, configured to receive indication information provided by the user, which represents the input item selected by the user from the set of candidate input items, and to use that input item as the user's input result;
  • the first sending device 3112 is further configured to send the indication information to the network server.
  • The user equipment 311 is for the user to input text in a WEB-based application.
  • The user equipment 311 is for the user to input text in a WEB-based browser program.
  • Figure 18 is a block diagram of a network server for assisting a user of a user equipment for text input according to the present embodiment, such as the server 341 shown in Figure 12, which includes:
  • a third receiving device 3411, configured to receive input information provided by the user and sent by the user equipment; and a generating device 3412, configured to perform a matching query in the dictionary database based on the input information, to generate a set of candidate input items;
  • the second sending device 3413 is configured to send the set of candidate input items to the user equipment.
  • The third receiving device 3411 is further configured to receive new input information sent by the user equipment;
  • the generating device 3412 is further configured to: integrate the new input information with previously received input information to obtain an integration result; and perform a matching query in the dictionary database based on the integration result to generate a new set of candidate input items;
  • the second sending device 3413 is further configured to send the new set of candidate input items to the user equipment. Further, the third receiving device 3411 is further configured to: receive identity information of the user sent by the user equipment;
  • the server 341 further includes: a retrieval device 3414, configured to retrieve the user's input history according to the identity information of the user;
  • the generating device 3412 further includes: a querying device 34121, configured to perform a matching query in the dictionary database based on the input information to obtain a preliminary query result; and a processing device 34122, configured to process the preliminary query result according to the input history record of the user to generate the set of candidate input items.
  • the processing device 34122 is further configured to: compare the input history record with the preliminary query result, and arrange the content that is also included in the input history record in the preliminary query result to be prioritized over other content. a location to generate the set of candidate inputs.
  • The third receiving device 3411 is further configured to receive indication information from the user equipment, which indicates the input item selected by the user from the set of candidate input items;
  • the server 341 further includes an updating device 3415, configured to perform at least one of the following according to the input item indicated by the indication information: - updating the input history record of the user; - training and updating the dictionary database stored by the network server.
  • The third embodiment of the present invention further provides a system for a user to input text, including a user equipment for inputting text according to the third embodiment of the present invention and a network server for assisting the user equipment in text input according to the third embodiment of the present invention.
  • FIG. 19 illustrates a user equipment 41 for providing search-related information associated with the input information while a user performs text input, according to an aspect of the present embodiment. That is, when the user performs text input on the user device 41, the user device 41 provides corresponding input term options according to the user's input sequence and also searches for related information, such as advertisement information, webpage information, travel information, or map information, according to that input sequence.
  • the user device 41 can be any electronic product that can interact with the user through a keyboard, a remote controller, a touch pad, or a voice control device, such as a computer, a smart phone, a PDA, a game console, or an IPTV.
  • The user equipment 41 includes a first obtaining means 411, a querying means 412, a providing means 413, a storage means 414 for storing the local thesaurus (for brevity, hereinafter referred to as the thesaurus 414), and a storage device 414' for storing the keyword advertisement library (for brevity, hereinafter referred to as the local advertisement library 414' or the search information library 414').
  • storage devices 414 and 414' may be separate or identical, or may be implemented by a set of memory arrays, respectively.
  • The first obtaining device 411 acquires, in real time, the input sequence that the user enters through any interactive device capable of human-computer interaction with the user.
  • The interactive device can be a keyboard, a remote control, a touchpad, a voice control device, or the like.
  • the first obtaining means 411 acquires the key sequence of the user's tap in real time (for the sake of brevity, the input sequence is still referred to below).
  • The querying device 412 matches the user input sequence provided by the first obtaining device 411 against the thesaurus 414 to obtain one or more matching input term options.
  • The following description takes Chinese as an example.
  • The present invention allows the user to input Chinese by full pinyin, double pinyin, or Wubi.
  • the querying device 412 also searches in the keyword advertisement library 414' according to the user input sequence to obtain related one or more advertisement information options.
  • For example, the query device 412 queries the thesaurus 414 to obtain term options such as "1 I love the Bund; 2 I love", and at the same time queries the advertisement library 414' to obtain advertisement information related to "Bund", which includes landmark buildings such as "Bund No. 3" and "Bund 18", so the advertisement information options "3 Bund No. 3; 4 Bund 18" are provided.
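  • The combined query can be sketched as below. The two dictionaries stand in for the thesaurus 414 and the advertisement library 414', and the substring test used to relate an advertisement keyword to the input sequence is an assumed, deliberately naive matching rule; the source leaves the search algorithm open.

```python
from typing import Dict, List

# Stand-ins for the thesaurus 414 and the advertisement library 414' (illustrative data).
THESAURUS: Dict[str, List[str]] = {"woaiwaitan": ["I love the Bund", "I love"]}
AD_LIBRARY: Dict[str, List[str]] = {"waitan": ["Bund No. 3", "Bund 18"]}


def build_options(sequence: str) -> List[str]:
    """Return numbered options: input terms first, then related advertisement entries."""
    terms = THESAURUS.get(sequence, [])
    ads = [
        ad
        for keyword, entries in AD_LIBRARY.items()
        if keyword in sequence          # assumed rule: keyword occurs in the input sequence
        for ad in entries
    ]
    return [f"{number} {text}" for number, text in enumerate(terms + ads, start=1)]


print(build_options("woaiwaitan"))
```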
  • The process of querying the advertisement information (or search-related information) related to the input sequence may use any of the currently known intelligent or fuzzy search algorithms, which are not described here.
  • The providing device 413 then provides the user with the one or more matching input term options obtained by the querying device, in a certain order and format, for the user to select as the actual input. For example, in an input window bar on the display, the input sequence can be shown in one row and the multiple entry options in the next row for the user to select.
  • the specific function keys can be, for example, "+" and "-".
  • the advertisement information option may use different display modes in the entry bar, such as different colors or gray scales.
  • The advertisement information option has a web page IP address or a uniform resource locator (URL) associated with the advertisement information.
  • The user can select this advertisement information option by pressing the number key corresponding to the option, or by using the mouse to hover the cursor over the option or click on it.
  • the URL targeting device (not shown) in the user device 41 can be directed to its corresponding webpage URL through the network, for example, in a browser open situation, and connected to the web server corresponding to the webpage via the network. And display its web page to the user in the browser.
  • The first obtaining means 411, the inquiring means 412, and the providing means 413 operate continuously. Specifically, the first obtaining means 411 acquires the input sequence of the user in real time and continuously supplies it to the querying device 412, for example, "w", "wo" ... "wo" ... "woai" ...
  • The querying device 412 likewise performs, in real time, a matching query on the user input sequence continuously provided by the first obtaining means 411, to continuously obtain the term options corresponding to the above input sequences; for example, "w" corresponds to "1 I, 2 ⁇, 3 grip, 4 nest"; "woai" corresponds to "1 I love, 2 ⁇, 3 grip, 4 nest"; and "woaiwaitan" corresponds to "1 I love the Bund, 2 Bund No. 3, 3 Bund 18".
  • Here, "continuously" refers to actions that keep being performed until the user finally selects an entry option. For example, the user may pause for a while, such as 0.5 second, after tapping the key sequence "woai" and then continue to tap the subsequent keys.
  • when the querying device 412 performs the matching query in the thesaurus 414 and the advertisement library 414' based on the user input sequence and obtains a plurality of input term options and advertisement information options, it also obtains their respective priorities.
  • the providing device 413 displays the matched input term options and advertisement information options provided by the querying device 412 to the user in the entry bar in priority order: the higher the priority of an input term option or advertisement information option, the nearer the front it is displayed.
  • the input term option with the highest priority is generally placed in the first position, so that the user can select it simply by pressing "ENTER" or the space bar, whereas the advertisement information options are usually placed at the end of each row of options.
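  • One possible reading of this display rule is sketched below: input term options are sorted by descending priority, and advertisement information options, also sorted by priority, are appended at the end of the row. The `Option` class, its fields and the numeric priorities are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Option:
    text: str
    priority: float      # higher means more relevant (assumed scale)
    is_ad: bool = False  # True for advertisement information options


def order_for_display(options):
    """Term options first (highest priority at the front), ads at the end."""
    terms = sorted((o for o in options if not o.is_ad), key=lambda o: -o.priority)
    ads = sorted((o for o in options if o.is_ad), key=lambda o: -o.priority)
    ordered = terms + ads
    # Number the options so each can be picked with its number key.
    return [f"{i} {o.text}" for i, o in enumerate(ordered, start=1)]
```

  • Keeping the advertisements in a separate group, rather than mixing them into one sorted list, is what guarantees that a high-priority advertisement never displaces the first input term option.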
  • the querying device 412 can query the thesaurus 414 and the advertisement library 414' according to the user characteristics to obtain matching input term options and advertisement information options. After obtaining matching multiple input term options and advertising information options, the priority can also be determined based on user characteristics.
  • the user characteristics include the user's input history, user-set personal preference selection, user attributes, user address, etc.
  • the user attributes include the user's occupation, gender, nationality, birthplace, age, etc., which reflect personal characteristics.
  • the querying device 412 can also determine the priority according to how frequently the vocabulary in each term option, or the term option itself, was selected in the user input history, and according to the semantic relevance between the vocabulary items within each term option.
  • the querying device 412 can also determine the priority according to the personal preference selected by the user.
  • for example, the querying device 412 queries the advertisement library 414' for a plurality of landmark buildings or tourist attractions located on the Bund corresponding to "waitan", such as China Merchants Headquarters, HSBC Building, Citibank, Bund No. 3 and Bund No. 18, and then, according to the personal preference set by the user, judges that architectural attractions mainly devoted to shopping and dining, such as "Bund No. 3" and "Bund No. 18", have the highest priority.
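  • The preference-based judgement can be sketched as a sort over categories. The category labels attached to each landmark below are invented for the example (the patent does not assign them), and `rank_ads_by_preference` is a hypothetical helper name; only the ordering shopping > dining > travel comes from the text.

```python
# User-set preference from the description: shopping > dining > travel.
PREFERENCE_ORDER = ["shopping", "dining", "travel"]


def rank_ads_by_preference(ads, preference_order=PREFERENCE_ORDER):
    """ads is a list of (name, category) pairs; categories are assumed labels.

    Categories missing from the preference list sort after all listed ones.
    """
    def rank(ad):
        _, category = ad
        if category in preference_order:
            return preference_order.index(category)
        return len(preference_order)

    # sorted() is stable, so ads in the same category keep their input order.
    return [name for name, _ in sorted(ads, key=rank)]
```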
  • the querying device 412 can also determine the geographic location of the user equipment from its current IP address, and thereby determine the priority of vocabulary in the input sequence that relates to that region.
  • the user equipment 41 can save the above user characteristics in a local storage device.
  • the user equipment 41 can also update the saved information, such as the user's input history, input preferences and the associations between vocabulary items.
  • the user equipment 41 further includes a second acquisition device 415 and an update device 416.
  • the second acquisition device 415, through further interaction with the user, obtains the user's selection among the multiple input term options provided by the providing device 413.
  • the update device 416 updates the thesaurus, the user's input history and the like according to the user selection provided by the second acquisition device 415.
  • the second acquisition device 415 can also search the Internet on its own for new words and phrases, and use them to update the thesaurus 414 and the like.
  • the advertisement library 414' in the user equipment 41 can be actively updated at any time or periodically; for example, the user equipment 41 is connected to one or more network devices via the network, and the update is carried out at any time or periodically.
  • the advertisement library 414' may also be located outside the user equipment 41, for example at a network device or distributed among a plurality of network devices, and the user device 41 may be connected to the network device via the network in order to query the advertisement information options related to the user input sequence.
  • FIG. 20 illustrates a user device 41 and a network device 42 which, according to another aspect of the present embodiment, simultaneously provide search related information related to the input information when a user performs character input, wherein the user device 41 is connected to the network device 42 via a network.
  • the network can be the Internet, an intranet, etc. That is, when the user performs text input on the user equipment 41, the user equipment 41 sends a query request to the network device 42 via the network, requesting the network device 42 to search, according to the user input sequence, for relevant search related information such as advertisement information, webpage information, travel information or map information. The search related information fed back by the network device is then provided to the user together with the input term options obtained by the network device's query.
  • the user equipment 41 includes a first obtaining device 411, a first transmitting device 417, a first receiving device 418, and a providing device 413.
  • the network device 42 includes a second receiving device 421, a querying device 422, a second transmitting device 423, a storage device 424 for saving the network lexicon (for brevity, hereinafter the network vocabulary 424) and a storage device 424' for storing the keyword advertisement library (for brevity, hereinafter the network advertisement library 424').
  • the first obtaining device 411 acquires, in real time, the input sequence that the user is entering through any interactive device capable of human-computer interaction with the user.
  • the interactive device can be a keyboard, a remote control, a touch pad, a voice control device, or the like. Taking the keyboard as an example, when the user taps keys on the keyboard, the first obtaining device 411 acquires in real time the sequence of keys the user taps (for simplicity, still called the input sequence below).
  • the first transmitting device 417 in the user device 41 transmits the user input sequence provided by the first obtaining device 411 to the network device 42 in real time and continuously.
  • the second receiving device 421 in the network device 42 receives the input sequence and provides it to the querying device 422.
  • the querying device 422 performs a matching query on the user input sequence with the thesaurus 424 to obtain one or more matching input term options.
  • the following description takes Chinese as an example.
  • the present invention allows the user to input Chinese by full pinyin, double pinyin (shuangpin) or Wubi (five-stroke) input.
  • the querying device 422 also searches the keyword advertisement library 424' according to the user input sequence to obtain one or more related advertisement information options.
  • for example, for the input "woaiwaitan", the query device 422 queries the vocabulary 424 and obtains term combinations such as "1 I love the Bund; 2 I love", and at the same time finds in the advertisement library 424' that the advertisement information related to "Bund" includes landmarks such as "Bund No. 3" and "Bund No. 18", so the advertisement information options "3 Bund No. 3; 4 Bund No. 18" are provided.
  • the process of querying the advertisement information (or searching for related information) related to the input sequence may employ various intelligent or fuzzy search algorithms that are currently known, and will not be described herein.
  • the second transmitting device 423 in the network device 42 also transmits the input term options provided by the querying device 422 to the user device 41 in real time and continuously.
  • the first receiving device 419 in the user device 41 receives the input term options and provides them to the providing device 413 in real time and continuously; the providing device 413 then presents the obtained one or more matching input term options to the user in a certain order and format, so that the user can select one for specific input. For example, they are displayed to the user in an input window on the display, where the input sequence and the entry options can be shown in separate bars, with the multiple entry options listed in the lower bar for the user to choose from.
  • only one row of entry options may be displayed in the entry bar; the number of options per row may be a default or set by the user, and the previous or next row of options is displayed when the user presses a specific function key.
  • the specific function keys for example, can be "+" and "-".
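  • The row-by-row display can be sketched as simple paging over the ordered option list. The text names "+" and "-" as possible function keys but does not say which one moves forward; the mapping below ("+" for the next row, "-" for the previous row) and the default of four options per row are assumptions, as are the helper names.

```python
def page_of(options, page, per_page=4):
    """Return the options shown on the given zero-based row."""
    start = page * per_page
    return options[start:start + per_page]


def handle_function_key(key, page, total, per_page=4):
    """Return the new row index after a function key press.

    Assumed mapping: "+" shows the next row, "-" the previous row.
    The index is clamped so it never leaves the valid range.
    """
    last_page = max(0, (total - 1) // per_page)
    if key == "+":
        return min(page + 1, last_page)
    if key == "-":
        return max(page - 1, 0)
    return page
```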
  • the advertisement information option may use different display modes in the entry bar, such as different colors or gray scales.
  • the advertisement information option has built into it the web page IP address or uniform resource locator (URL) associated with the advertisement information.
  • the user can select this advertisement information option by pressing the number key corresponding to the option, or by moving the mouse cursor to hover over or click the option.
  • the user equipment 41 can then be directed through the network to the corresponding webpage URL; for example, with a browser open, it connects via the network to the web server hosting the webpage and displays the webpage to the user in the browser.
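  • The difference between selecting an input term option and selecting an advertisement information option can be sketched as a small dispatcher. It only decides what should happen; actually opening a browser is left out. `DisplayedOption`, `handle_number_key`, the action labels and the example.com address are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DisplayedOption:
    text: str
    url: Optional[str] = None  # set only for advertisement information options


def handle_number_key(key, row):
    """Map a number key press to an action for the currently displayed row.

    Returns ("commit_text", text) for an input term option,
    ("open_url", url) for an advertisement information option,
    or ("ignore", None) when the key does not address any option.
    """
    if not key.isdigit():
        return ("ignore", None)
    index = int(key) - 1  # options are numbered from 1
    if not 0 <= index < len(row):
        return ("ignore", None)
    option = row[index]
    if option.url is not None:
        return ("open_url", option.url)
    return ("commit_text", option.text)
```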
  • the first obtaining means 411, the first transmitting means 417 and the first receiving means, together with the second receiving means 421, the inquiring means 412 and the second transmitting means 423 of the network device 42, work in continuous cooperation with one another.
  • the first obtaining device 411 acquires the user's input sequence in real time and continuously provides it to the query device 412, for example "w", "wo" ... "woai" ... "woaiwaitan", and the first transmitting device 417 likewise sends each of these input sequences to the network device 42 in real time and continuously.
  • after receiving the input sequences sent by the user equipment 41, the second receiving device 421 in the network device 42 likewise provides them to the query device 422 in real time and continuously, and the query device 422 performs, in real time and continuously, a matching query on the user input sequences provided by the second receiving device 421, so as to continuously obtain the entry options corresponding to those input sequences; for example, "w" corresponds to "1 me, 2 ⁇, 3 grips, 4 nests"; "woai" corresponds to "1 I love, 2 ⁇, 3 grips, 4 nests"; and "woaiwaitan" corresponds to "1 I love the Bund, 2 Bund No. 3, 3 Bund No. 18".
  • "continuous" here means that the action is performed the whole time until the user finally selects an entry option. For example, after tapping the key sequence "woai" the user may pause for a while, say 0.5 second, and then continue tapping the subsequent keys.
  • when the querying device 422 performs the matching query in the network vocabulary 424 and the network advertisement library 424' according to the user input sequence and obtains a plurality of input term options and advertisement information options, it also obtains their respective priorities.
  • the providing device 413 in the user equipment 41 displays the matching input term options and advertisement information options provided by the network device 42 to the user in the entry bar in priority order: the higher the priority of an input term option or advertisement information option, the nearer the front it is displayed.
  • the input term option with the highest priority is generally placed in the first position, so that the user can select it simply by pressing "ENTER" or the space bar, whereas the advertisement information options are usually placed at the end of each row of options.
  • the querying device 422 of the network device 42 can acquire the user features according to the ID with which the user logs in, for example the user input history, the user-specific user lexicon, user-set personal preferences, user attribute information, and the like.
  • the user features may be stored in the network device 42 or in other network devices to which the network device 42 is connected.
  • the querying device 422 can query the network vocabulary 424 and the network advertisement library 424' according to the user characteristics to obtain matching input term options and advertisement information options. Specifically, the querying device 422 can determine the priority according to how frequently the vocabulary in each term option, or the term option itself, was selected in the user input history, and according to the semantic relevance between the vocabulary items within each term option. The querying device 422 can also determine the priority according to the personal preference set by the user. For example, when the user sets the input preference to "priority: shopping > dining > travel" and the user input sequence "woaiwaitan" is obtained, the querying device queries the network advertisement library 424' and obtains a plurality of landmark buildings or tourist attractions located on the Bund corresponding to "waitan", such as China Merchants Headquarters, HSBC Building, Citibank, Bund No. 3 and Bund No. 18, and then, according to the personal preference set by the user, judges that architectural attractions mainly devoted to shopping and dining, such as "Bund No. 3" and "Bund No. 18", have the highest priority.
  • the querying device 422 can also determine the region in which the user equipment is located from the current IP address of the user equipment, and thereby determine the priority of vocabulary in the input sequence that relates to that region. For example, suppose the user input sequence is "woxihuanbund", where the translations of "bund" are "1 embankment, 2 dock, 3 league, 4 (Shanghai) Bund". When the querying device 422 learns from the IP address of the user equipment that it is currently located in Shanghai, China, it can determine that the translation of "bund" as "Shanghai Bund" or "the Bund" has the highest priority, so the following input terms can be provided: "1 I like to go to the Bund; 2 I like the Bund; 3 I like the pier; 4 I like the embankment; 5 I like the league."
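  • The region-dependent reordering in the "bund" example can be sketched as follows. Resolving an IP address to a region is outside the sketch, so the region is passed in as a plain string; the `REGION_HINTS` table and the function name are assumptions made for illustration.

```python
# Hypothetical table: region -> translations that should be promoted there.
REGION_HINTS = {"Shanghai": {"(Shanghai) Bund"}}


def prioritise_by_region(translations, region, hints=REGION_HINTS):
    """Move translations associated with the user's region to the front.

    sorted() is stable and False sorts before True, so promoted items come
    first and all other items keep their original relative order.
    """
    preferred = hints.get(region, set())
    return sorted(translations, key=lambda t: t not in preferred)
```

  • With the translations listed as "embankment", "dock", "league", "(Shanghai) Bund", a user located in Shanghai would see "(Shanghai) Bund" first, while a user in a region without hints would see the list unchanged.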
  • user features include, but are not limited to, the above.
  • network device 42 may also update information such as saved user input history, input preferences, and inter-vocabulary associations.
  • the user equipment 41 further includes a second obtaining means 415 and a third transmitting means 418.
  • the network device 42 further includes a second receiving means 425 and an updating means 426.
  • the second obtaining means 415 of the user equipment 41 obtains, through further interaction with the user, the user's selection among the plurality of input term options provided by the providing means 413, and the third transmitting means 418 transmits the selection to the network device.
  • the update device 426 updates the thesaurus and the user input history, the association between the vocabularies, and the like according to the user selection received by the second receiving device 425.
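  • The update step can be illustrated with a small history object that records each selection and later ranks candidates by how often they were chosen. The class `UserHistory` and its methods are hypothetical; the patent does not specify how the history or the thesaurus is stored.

```python
from collections import Counter


class UserHistory:
    """Toy stand-in for the stored user input history (assumed structure)."""

    def __init__(self):
        self.selection_counts = Counter()
        self.history = []

    def record_selection(self, input_sequence, chosen):
        """Store the selection reported by the user equipment."""
        self.history.append((input_sequence, chosen))
        self.selection_counts[chosen] += 1

    def rank(self, candidates):
        """Order candidates by past selection frequency, most frequent first.

        Reading a missing key from a Counter yields 0, so unseen candidates
        simply sort last while keeping their original relative order.
        """
        return sorted(candidates, key=lambda c: -self.selection_counts[c])
```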
  • network device 42 may also include a third acquisition device (not shown) that may also search for new term combinations on the Internet and update network lexicon 424 and the like.
  • the advertisement library 414' may also be located outside the network device 42, for example at another network device or distributed among other network devices, and the network device 42 may be connected to those other network devices via the network in order to query the advertisement information options associated with the user input sequence.
  • FIG. 21 shows another preferred example according to the present embodiment, in which the user equipment 41 itself also includes a query device 412 and a memory 414 for storing a local thesaurus (hereinafter referred to as the local thesaurus 414); the local thesaurus 414 can be synchronized at any time or periodically with the user-specific user vocabulary in the network vocabulary of the network device 42.
  • the first obtaining device 411 may first provide the user input sequence to the query device 412 of the user equipment 41 for matching query.
  • the specific query process is as described above with reference to FIGS. 19-20.
  • the first obtaining device 411 can also send the user input sequence to the network device 42 through the third sending device 418, and the query device 422 performs a matching query to obtain one or more input term options and advertisement information options related to the user input sequence; for details, see the description above with reference to FIG. 20, which is not repeated here.
  • the user device 41 also includes a merging device 420, which merges the one or more input term options from its own query device 412 with the one or more input term options provided by the query device 422 of the network device 42, deletes duplicate options, determines according to certain rules the priority order of the finally merged term options and of the advertisement information options related to the input sequence fed back by the network device 42, and then passes them to the providing means 413 to be presented to the user in that priority order.
  • the input term options provided by the network device 42 should be more accurate, so their priority is higher than that of the input term options obtained by the local query; similarly, so as not to disturb the user's text input, the advertisement information options are usually placed at the end of each row of options.
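  • A minimal sketch of the merging step, under the assumption that options are plain strings and that "higher priority" simply means "listed earlier": network results come first, local results follow, duplicates are dropped, and advertisement options are appended at the end. The function names are hypothetical, and the patent's "certain rules" for the final ordering are not specified.

```python
def merge_options(network_options, local_options):
    """Merge term options, network results first, removing duplicates."""
    merged = []
    seen = set()
    for option in list(network_options) + list(local_options):
        if option not in seen:
            seen.add(option)
            merged.append(option)
    return merged


def build_display_row(network_options, local_options, ad_options):
    """Term options first, advertisement information options at the end."""
    return merge_options(network_options, local_options) + list(ad_options)
```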
  • Fig. 22 is a flowchart of a method, according to an aspect of the embodiment, of simultaneously providing search related information related to the input information when a user performs character input on a user equipment. That is, when the user performs text input on the user device 41, the user device 41 provides corresponding input term options according to the user input sequence and also searches, according to that input sequence, for relevant search related information such as advertisement information, webpage information, travel information or map information.
  • in step S41, the user device 41 acquires, in real time, the input sequence that the user is entering through any interactive device capable of human-computer interaction with the user.
  • the interactive device can be a keyboard, a remote control, a touch pad or a voice control device. Taking the keyboard as an example, when the user taps keys on the keyboard, the user device 41 acquires in real time the sequence of keys the user taps (for simplicity, still referred to as the input sequence below).
  • in step S42, the user equipment 41 performs a matching query against the locally saved thesaurus (hereinafter referred to as the local thesaurus) according to the obtained user input sequence, and obtains one or more matching input term options.
  • the following description takes Chinese as an example. This embodiment allows the user to input Chinese by full pinyin, double pinyin (shuangpin) or Wubi (five-stroke) input.
  • the user equipment also searches in a locally stored keyword advertisement library (hereinafter referred to as a local advertisement library) according to the user input sequence, and obtains one or more related advertisement information options.
  • the user device 41 queries the local vocabulary and obtains term combinations such as "1 I love the Bund; 2 I love", and at the same time finds in the advertisement library 414' that the advertisement information related to "Bund" includes landmark buildings such as "Bund No. 3" and "Bund No. 18", so the advertisement information options "3 Bund No. 3; 4 Bund No. 18" are provided.
  • the process of querying the advertisement information (or searching for related information) associated with the input sequence may employ various intelligent or fuzzy search algorithms as are currently known, and will not be described herein.
  • the user device 41 presents the obtained one or more matching input term options to the user in a certain order and format, so that the user can select one for specific input. For example, they are displayed to the user in an input window on the display of the user device 41, where the input sequence and the entry options can be shown in separate bars, with the multiple entry options listed in the lower bar for the user to choose from.
  • only one row of entry options may be displayed in the entry bar; the number of options per row may be a default or set by the user, and the previous or next row of options is displayed when the user presses a specific function key, which can be, for example, "+" and "-".
  • the advertisement information option can be displayed differently in the entry bar, for example in a different color or gray level, and the web page IP address or uniform resource locator (URL) associated with the advertisement information is built into the advertisement information option.
  • the user and the user device 41 can interact further on the basis of the provided input term options.
  • the user can select the advertisement information option by pressing the option's number key on the keyboard of the user device 41, or by moving the cursor of the user device 41 to hover over or click the option.
  • the user equipment 41 can then be directed through the network to the corresponding webpage URL; for example, with a browser open, it connects via the network to the web server hosting the webpage and displays the webpage to the user in the browser.
  • steps S41 to S43 are repeated in a continuous cycle.
  • the user equipment 41 acquires in real time the input sequence that the user enters continuously and keeps querying locally; for example, the user successively inputs "w", "wo" ... "woai" ...
  • in step S42, the user equipment 41 likewise performs, in real time, a matching query on each continuously acquired user input sequence, so as to continuously obtain the term options corresponding to those input sequences; for example, "w" corresponds to "1 me, 2 ghosts, 3 grips, 4 nests"; "woai" corresponds to "1 I love, 2 ⁇, 3 grips, 4 nests"; and "woaiwaitan" corresponds to "1 I love the Bund, 2 Bund No. 3, 3 Bund No. 18".
  • "continuous" here means that the action is performed the whole time until the user finally selects an entry option. For example, after tapping the key sequence "woai" the user may pause for a while, say 0.5 seconds, and then continue tapping the subsequent keys.
  • when the user equipment 41 performs the matching query in the thesaurus and the advertisement library according to the user input sequence and obtains a plurality of input term options and advertisement information options, it also obtains their respective priorities.
  • the user equipment 41 displays the matched input term options and advertisement information options obtained by the query to the user in the entry bar in priority order: the higher the priority of an input term option or advertisement information option, the nearer the front it is displayed.
  • the input term option with the highest priority is generally placed in the first position, so that the user can select it simply by pressing "ENTER" or the space bar, whereas the advertisement information options are usually placed at the end of each row of options.
  • the user equipment 41 may further query the thesaurus and the advertisement library according to the user characteristics to obtain matching input term options and advertisement information options.
  • the priority can also be determined based on the user characteristics.
  • User characteristics include the user's input history, user-set personal preference selection, user attributes, user address, etc.
  • User attributes include the user's occupation, gender, nationality, place of birth, age, etc., which reflect personal characteristics.
  • the user equipment can determine the priority according to how frequently the vocabulary in each term option, or the term option itself, was selected in the user's input history, and according to the semantic relevance between the vocabulary items within each term option.
  • for example, when the input preference is set to "priority: shopping > dining > travel", then after obtaining the user input sequence "woaiwaitan", the user equipment 41 queries the advertisement library and obtains a plurality of landmark buildings or tourist attractions located on the Bund corresponding to "waitan", such as China Merchants Headquarters, HSBC Building, Citibank, Bund No. 3 and Bund No. 18, and then, according to the personal preference set by the user, judges that architectural attractions mainly devoted to shopping and dining have the highest priority.
  • the user equipment 41 can also determine the region in which it is located from the current IP address of the user equipment, and thereby determine the priority of vocabulary in the input sequence that relates to that region. For example, suppose the user input sequence is "woxihuanbund", where the translations of "bund" are "1 embankment, 2 dock, 3 league, 4 (Shanghai) Bund"; the user equipment 41 then uses the IP address of the user equipment to learn where it is currently located.
  • the address (or the user's home address) and the like are all referred to as user characteristics, and those skilled in the art should understand that the user characteristics include, but are not limited to, the above.
  • the user equipment 41 can save the above user characteristics in a local storage device.
  • the user equipment 41 can also update the saved information, such as the user's input history, input preferences and the associations between vocabulary items.
  • in step S45 (not shown), the user equipment 41 additionally interacts with the user.
  • the advertisement library in the user equipment 41 can be actively updated at any time or periodically; for example, the user equipment 41 is connected to one or more network devices via the network, and the update is carried out at any time or periodically.
  • the advertisement library may also be located outside the user equipment 41, for example at a network device or distributed among a plurality of network devices.
  • the user device 41 may be connected to the network device via a network, thereby querying the advertisement information options associated with the user input sequence.
  • FIG. 23 is a flowchart of a method, according to another aspect of the present embodiment, in which a user equipment cooperates with a network device to simultaneously provide search related information related to the input information when the user performs text input.
  • the user equipment 41 is connected to the network device 42 via a network, and the network may be the Internet, an intranet, or the like. That is, when the user performs text input on the user equipment 41, the user equipment 41 sends a query request to the network device 42 via the network, requesting the network device 42 to search, according to the user input sequence, for relevant search related information such as advertisement information, webpage information, travel information or map information. The search related information fed back by the network device is then provided to the user together with the input term options obtained by the network device's query.
  • the network device 42 maintains a network vocabulary and a keyword advertisement library (for clarity, hereinafter referred to as the network advertisement library or the search related information base).
  • in step S41, the user device 41 acquires, in real time, the input sequence that the user is entering through any interactive device capable of human-computer interaction with the user.
  • the interactive device can be a keyboard, a remote control, a touch pad or a voice control device.
  • taking the keyboard as an example, when the user taps keys on the keyboard, the user device 41 acquires in real time the sequence of keys the user taps (for simplicity, still referred to as the input sequence below).
  • in step S42, the user equipment 41 transmits the acquired user input sequence to the network device 42 in real time and continuously.
  • in step S43, the network device 42 performs a matching query in the network lexicon according to the received user input sequence to obtain one or more matching input term options.
  • the following takes Chinese as an example for description.
  • the present invention allows the user to input Chinese by full pinyin, double pinyin (shuangpin) or Wubi (five-stroke) input.
  • the network device 42 also searches the network advertisement library according to the user input sequence to obtain one or more related advertisement information options. For example, when the user taps the keys to input "woaiwaitan", the network device 42 queries the network vocabulary and obtains term combinations such as "1 I love the Bund; 2 I love", and at the same time finds in the network advertisement library that the advertisement information related to "Bund" includes landmarks such as "Bund No. 3" and "Bund No. 18".
  • the network device 42 also transmits the queried input term options to the user device 41 in real time and continuously.
  • the user equipment 41 provides the input term options received from the network device 42 to the user in real time and continuously; the user device 41 may present the obtained one or more matching input term options to the user in a certain order and format for selection or specific interaction. For example, they are displayed to the user in an input window on the display, where the input sequence and the entry options can be shown in separate bars, with the multiple entry options listed in the lower bar for the user to choose from. Preferably, only one row of entry options is displayed in the entry bar; the number of options per row can be a default or set by the user, and the previous or next row of options is displayed when the user presses a specific function key.
  • the function keys can be, for example, "+" and "-".
  • the advertisement information option may use different display modes in the entry bar, such as different colors or gray scales.
  • The advertisement information option carries the IP address or uniform resource locator (URL) of a web page associated with the advertisement information.
  • In step S48, the user and the user device 41 can interact further based on the provided input term options.
  • The user can select an advertisement information option by pressing the number key corresponding to the option, or by using the mouse to hover the cursor over the option or click it.
  • The user device 41 can then be directed over the network to the corresponding web page URL; for example, with a browser open, it connects via the network to the web server hosting the page and displays the page to the user in the browser.
  • Steps S41 to S47 are repeated continuously.
  • the user equipment 41 acquires the input sequence of the user in real time and continuously transmits it to the network device 42, for example, "w”, "wo” ... “wo” ... “woai” ..
  • the network device 42 also performs the matching query in real time and continuously according to the user input sequence, and continuously sends the queried input term sequence back to the user device 41, for example, "w” corresponds to “1” , 2 ⁇ , 3 grips, 4 nests;; “woai” corresponds to “1 I love, 2 ⁇ , 3 grips, 4 nests”; “woaiwaitan” corresponds to "1 I love the Bund, 2 Bund 3, 3 Bund 18" .
  • "Continuous" refers to actions that keep being performed until the user finally selects a term option. For example, the user may pause briefly (such as 0.5 seconds) after tapping the key sequence "woai" and then continue tapping the subsequent keys.
  • When performing a matching query in the network thesaurus and the network advertisement library according to the user input sequence to obtain a plurality of input term options and advertisement information options, the network device 42 also obtains their respective priorities.
  • The user device 41 displays the matched input term options and advertisement information options provided by the network device 42 to the user in the term bar in priority order: the higher the priority of an input term option or advertisement information option, the nearer the front it is displayed.
  • Preferably, the highest-priority input term option is placed at the foremost position so that the user can select it by pressing "ENTER" or the space bar, and the advertisement information option is usually placed at the last option position of each row.
  • the network device 42 can also acquire the user feature according to the ID of the user login.
  • the user features may be stored in the network device 42 or in other network devices to which the network device 42 is connected.
  • According to the user features, the network device 42 can query the network vocabulary and the network advertisement library to obtain matching input term options and advertisement information options. Specifically, in step S43, the network device 42 may determine the priority of each term option according to how frequently the term option, or the vocabulary in it, has been selected in the user's input history, and according to the semantic relevance between the words in each term option. The network device 42 can also determine the priority according to personal preferences set by the user.
  • The network device 42 queries the network advertisement library to obtain a plurality of landmark buildings or tourist attractions located on the Bund corresponding to "waitan", such as the China Merchants headquarters, the HSBC Building, Citibank, Bund No. 3, and Bund 18, and then, according to the personal preferences set by the user, can give the highest priority to shopping and dining attractions such as "Bund No. 3" and "Bund 18".
  • The network device 42 may further determine the region in which the user device is located according to the current IP address of the user device, and thereby determine the priority of region-related vocabulary in the input sequence. For example, suppose the user input sequence is "woxihuanbund", where "bund" has the candidates "1 embankment; 2 pier; 3 alliance; 4 (Shanghai) Bund". In step S43, when the network device 42 learns from the IP address of the user device that the device is currently located in Shanghai, China, it can determine that the translation "Shanghai Bund" or "Bund" has the highest priority for "bund", and so obtains the following term options: "1 I like the Shanghai Bund; 2 I like the Bund; 3 I like the pier; 4 I like the embankment; 5 I like the alliance".
  • network device 42 may also update information such as saved user input history, input preferences, and inter-vocabulary associations.
  • The user device 41 acquires, through further interaction with the user, the user's selection among the provided input term options and transmits the selection to the network device; in step S410 (not shown), the network device 42 updates the thesaurus, the user input history, the associations between words, and the like according to the received user selection.
  • For example, a new term option and the features of its association with existing term options may be added to the network thesaurus.
  • the network device 42 can also search for a new combination of terms on the Internet and update the network vocabulary and the like.
  • The network advertisement library may be located outside the network device 42, for example at another network device or distributed across several other network devices, and the network device 42 may connect to those devices via the network to query the advertisement information options associated with the user input sequence.
  • Fig. 24 shows another preferred example according to the present embodiment, in which the user device 41 itself also holds a local vocabulary, which can be synchronized at any time or periodically with the user-specific user vocabulary in the network vocabulary of the network device 42.
  • In step S44, the user device 41 performs a matching query in the local lexicon according to the user input sequence; the specific query process is as described for the foregoing step S42 with reference to FIG. 44, which is incorporated here by reference and not repeated.
  • Steps S41 to S43 are the same as steps S41 to S43 described above with reference to FIG. 23; that description is incorporated here by reference and not repeated. It should be understood by those skilled in the art that steps S41 to S43 and step S44 may be performed in parallel, and that their completion times depend mainly on the processing speeds of the user device 41 and the network device 42 and on the network transmission delay between them.
  • In step S46, the user device 41 merges the one or more input term options obtained by the local query with the one or more input term options from the network device 42, deletes duplicate options, and determines, according to certain rules, a priority order for the merged term options and for the advertisement information options related to the input sequence that were fed back from the network device 42. Then, in step S48, the input term options and advertisement information options are provided to the user in that priority order for selection or further human-computer interaction.
  • The input term options provided by the network device 42 should be more accurate, so they have higher priority than the input term options obtained by the local query; similarly, so as not to interfere with the user's text input, the advertisement information options are usually placed at the last option position of each row.
  • The network vocabulary in the network device 42 may also be a user-specific user vocabulary.
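
The merging described for step S46 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name `merge_options` and the concrete rule (network options first, then local options, advertisement options last) are hypothetical choices that mirror the priorities described above.

```python
def merge_options(network_terms, local_terms, ads):
    """Merge term options, drop duplicates, and append advertisement options last.

    Assumption: network-provided options outrank locally queried ones,
    and advertisement options go to the end so they do not disturb text input.
    """
    merged = []
    seen = set()
    for term in list(network_terms) + list(local_terms):
        if term not in seen:          # delete repeated options
            seen.add(term)
            merged.append(term)
    # Advertisement options are placed after all term options.
    return merged + [ad for ad in ads if ad not in seen]
```

A call such as `merge_options(["I love the Bund", "I love"], ["I love", "me"], ["Bund 18"])` keeps a single copy of "I love" and places the advertisement option at the end.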

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Description

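Method, apparatus, and device for determining a ranking result of resource candidates
This application claims priority to Chinese Patent Application No. 201110092452.3, filed on April 13, 2011 and entitled "Method, apparatus, and device for determining a ranking result of resource candidates", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of computers, and in particular to a method, apparatus, and device for determining a ranking result of resource candidates.
Background
In existing retrieval technology, after an input sequence entered by a user is acquired, the retrieval device performs retrieval based on the entire input sequence, ranks the resource candidates obtained by the retrieval, and provides the ranking result to the user.
However, an input sequence often contains both information the user focuses on and information the user does not focus on, so retrieval based on the entire input sequence makes it difficult to distinguish the key content the user wishes to retrieve.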
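Summary of the Invention
An object of the present invention is to provide a method, apparatus, and device for determining a ranking result of resource candidates.
According to one aspect of the present invention, there is provided a method, implemented by a computer device, for determining a ranking result of resource candidates, the method comprising the following steps:
a) acquiring retrieval information and adjustment information from an input sequence from a user;
b) performing retrieval according to the retrieval information to obtain a plurality of resource candidates;
c) determining a ranking result of the plurality of resource candidates according to the adjustment information;
d) generating presentation information according to the ranking result, to be provided to the user.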
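
Steps a) to d) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not part of the original disclosure: the `Candidate` type, the substring matching used for retrieval, the rule that candidates containing more adjustment terms rank first, and the premise that the input sequence has already been segmented into tokens.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    text: str


def split_input(tokens, adjustment_words):
    """Step a: separate retrieval tokens from adjustment tokens."""
    retrieval = [t for t in tokens if t not in adjustment_words]
    adjustment = [t for t in tokens if t in adjustment_words]
    return retrieval, adjustment


def retrieve(corpus, retrieval):
    """Step b: keep candidates whose text contains every retrieval token."""
    return [c for c in corpus if all(r in c.text for r in retrieval)]


def rank(candidates, adjustment):
    """Step c: candidates containing more adjustment tokens come first (stable sort)."""
    return sorted(candidates,
                  key=lambda c: sum(a in c.text for a in adjustment),
                  reverse=True)


def present(ranked):
    """Step d: build the presentation information (here, just the titles)."""
    return [c.title for c in ranked]


def search(tokens, adjustment_words, corpus):
    retrieval, adjustment = split_input(tokens, adjustment_words)
    return present(rank(retrieve(corpus, retrieval), adjustment))
```

Only the retrieval tokens decide which candidates are returned; the adjustment tokens affect nothing but their order.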
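According to another aspect of the present invention, there is further provided a ranking determination apparatus for determining a ranking result of resource candidates, the ranking determination apparatus comprising:
first acquisition means for acquiring retrieval information and adjustment information from an input sequence from a user;
retrieval means for performing retrieval according to the retrieval information to obtain a plurality of resource candidates;
ranking means for determining a ranking result of the plurality of resource candidates according to the adjustment information;
providing means for generating presentation information according to the ranking result, to be provided to the user.
According to yet another aspect of the present invention, there is further provided a computer device comprising the ranking determination apparatus.
Compared with the prior art, the present invention has the following advantages: 1) according to the method of the present invention, retrieval information is selected from the input sequence for retrieval, which prevents information that the user does not focus on from affecting the retrieval results; 2) according to the method of the present invention, the ranking result of the retrieved resource candidates can be obtained according to the adjustment information obtained from the input sequence, further increasing the likelihood that the user obtains the desired resource candidate; 3) the method of the present invention is applicable to any scenario in which retrieval is performed according to a user input sequence, for example providing resource candidates for corresponding goods on a B2B/B2C website according to the input sequence entered by the user, or providing corresponding resource candidates in a search engine according to the input sequence entered by the user.
Brief Description of the Drawings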
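Other features, objects, and advantages of the present invention will become more apparent upon reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
FIG. 1 is a flowchart of a method for determining a ranking result of resource candidates according to one aspect of the present invention;
FIG. 2 is a flowchart of a method for determining a ranking result of resource candidates according to another preferred embodiment of the present invention;
FIG. 3 is a flowchart of a method for determining a ranking result of resource candidates according to yet another preferred embodiment of the present invention;
FIG. 4 is a flowchart of a method for determining a ranking result of resource candidates according to still another preferred embodiment of the present invention;
FIG. 5 is a schematic diagram of a ranking determination apparatus for determining a ranking result of resource candidates according to one aspect of the present invention;
FIG. 6 is a schematic diagram of a ranking determination apparatus for determining a ranking result of resource candidates according to a preferred embodiment of the present invention;
FIG. 7 is a schematic diagram of a ranking determination apparatus for determining a ranking result of resource candidates according to another preferred embodiment of the present invention;
FIG. 8 is a schematic diagram of a ranking determination apparatus for determining a ranking result of resource candidates according to still another preferred embodiment of the present invention;
The same or similar reference numerals in the drawings denote the same or similar components.
Detailed Description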
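The present invention is described in further detail below with reference to the accompanying drawings.
FIG. 1 shows a flowchart of a method for determining a ranking result of resource candidates according to one aspect of the present invention. The user device 2 may be any electronic product capable of human-computer interaction with a user via a keyboard, mouse, remote control, touch pad, or voice control device, including but not limited to a computer, smartphone, PDA, or IPTV. The computer device 3 may be any electronic product capable of communicating with the user device 2, including but not limited to a single network server, a server group consisting of multiple network servers, or a cloud-computing-based cloud consisting of a large number of computers or network servers, where cloud computing is a form of distributed computing: a super virtual computer composed of a group of loosely coupled computers. The method according to the present invention for ranking retrieval results is performed mainly by the operating system of the ranking determination apparatus or by a processing controller installed in it; for brevity, the operating system or processing controller in the ranking determination apparatus is hereinafter referred to collectively as the ranking determination apparatus.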
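In step S1, the user device 2 receives an input sequence through any interactive device capable of human-computer interaction with the user 1; the interactive device may be a keyboard, mouse, remote control, touch pad, voice control device, or the like. Taking the keyboard as an example, the user 1 uses the keyboard to type the information to be retrieved into the information input field of the search page displayed by the user device 2, for example "李弘基主演的电视剧" ("TV series starring Li Hongji").
Next, in step S2, the user device 2 sends the input sequence entered by the user 1, for example "李弘基主演的电视剧", to the computer device 3. The user device 2 may send the input sequence to the computer device 3 over a network, including but not limited to the Internet, a wide area network, a metropolitan area network, a local area network, a VPN, or a wireless ad hoc network. The ways in which the user device 2 sends the input sequence to the computer device 3 include but are not limited to: 1) sending the input sequence to the computer device 3 directly over the network; 2) sending the input sequence to the computer device 3 via one or more devices in the network.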
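Next, in step S3, the ranking determination apparatus acquires retrieval information and adjustment information from the input sequence received by the computer device 3.
For example, the ranking determination apparatus queries a predetermined common-word library according to the input sequence and finds that "电视剧" (TV series) and "主演" (starring) are common words, and analyzes the input sequence to determine that "的" is a particle; the ranking determination apparatus then determines that the retrieval information obtained from the input sequence "李弘基主演的电视剧" ("TV series starring Li Hongji") includes "李弘基" (Li Hongji) and that the adjustment information includes "主演" and "电视剧". The predetermined common-word library contains a plurality of common words.
Next, in step S4, the ranking determination apparatus performs retrieval according to the retrieval information to obtain a plurality of resource candidates. A resource candidate corresponds to one or more links and contains descriptive information about the resources provided by the one or more websites to which those links point; the descriptive information includes but is not limited to the title of the resource, a summary of its content, and its full text.
For example, the ranking determination apparatus performs retrieval according to the retrieval information "李弘基" and "电视剧", and the resource candidates obtained include resource candidate A and resource candidate B.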
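Next, in step S6, the ranking determination apparatus determines the ranking result of the plurality of resource candidates according to the adjustment information.
For example, the ranking determination apparatus finds by analysis that resource candidate A contains the adjustment information "主演" (starring) and that resource candidate B does not, and therefore ranks the two resource candidates as follows:
resource candidate A;
resource candidate B.
Next, in step S7, the ranking determination apparatus generates presentation information according to the ranking result, to be provided to the user 1.
For example, based on the ranking result of resource candidates A and B, the ranking determination apparatus determines that presentation information A, corresponding to resource candidate A, and presentation information B, corresponding to resource candidate B, are ordered as shown below, and provides the ordered presentation information to the user 1 through the user device 2:
presentation information A;
presentation information B.
It should be noted that the ranking determination apparatus may, depending on the actual situation (for example, when the number of items of presentation information requested by the user device 2 is smaller than the number of resource candidates), select only some of the resource candidates to generate the presentation information provided to the user 1.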
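As one preferred scheme of the present invention, the aforementioned step S3 further includes the steps in which the ranking determination apparatus first acquires first type-determination information used to determine the retrieval information and the adjustment information, and then acquires the retrieval information and the adjustment information from the user's input sequence according to the first type-determination information. The first type-determination information includes but is not limited to:
1) Information units, and their types, obtained by querying a predetermined keyword type library according to the input sequence. The predetermined keyword type library contains a plurality of information units, each corresponding to one type.
For example, the ranking determination apparatus queries the predetermined keyword type library according to the input sequence "李弘基主演的电视剧" ("TV series starring Li Hongji") and obtains the information units "李弘基" (Li Hongji), "主演" (starring), and "电视剧" (TV series), where "李弘基" and "电视剧" are both of the retrieval type and "主演" is of the adjustment type; based on the information units and types found in the library, the ranking determination apparatus determines that the retrieval information of the input sequence includes "李弘基" and "电视剧" and that the adjustment information includes "主演". As another example, for the same input sequence, if the information units found in the predetermined keyword type library are "李弘基主演" and "主演电视剧", where "李弘基主演" is of the retrieval type and "主演电视剧" is of the adjustment type, the ranking determination apparatus determines that the retrieval information of the input sequence includes "李弘基主演" and that the adjustment information includes "主演电视剧".
It should be noted that the retrieval information and the adjustment information may partially overlap: in the above example, "主演" in the retrieval information "李弘基主演" also appears in the adjustment information "主演电视剧".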
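2) A semantic analysis result obtained from the input sequence.
The semantic analysis result includes but is not limited to:
a) A semantic analysis result based on part of speech. The parts of speech include but are not limited to nouns, adjectives, adverbs, and verbs. For example, the ranking determination apparatus analyzes the four words "李弘基" (Li Hongji), "主演" (starring), "的" (a particle), and "电视剧" (TV series) obtained by segmenting the input sequence "李弘基主演的电视剧", and obtains the semantic analysis result that "李弘基" and "电视剧" are nouns, "主演" is a verb, and "的" is a particle; based on this result, the ranking determination apparatus takes nouns as retrieval information and verbs as adjustment information, and determines that the retrieval information includes "李弘基" and "电视剧" and that the adjustment information includes "主演".
b) A semantic analysis result based on sentence pattern. For example, for the input sequence "李弘基主演的电视剧", the ranking determination apparatus divides it by sentence pattern into three parts, "李弘基", "主演", and "电视剧", and determines that "李弘基" is the subject because it is at the beginning of the input sequence, that "主演" is the predicate because it is in the middle, and that "电视剧" is the object because it is at the end; according to this semantic analysis result, the ranking determination apparatus takes the subject and the object as retrieval information and the predicate as adjustment information, and determines that the retrieval information includes "李弘基" and "电视剧" and that the adjustment information includes "主演".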
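
The part-of-speech rule in item a) can be sketched as below. This is a hedged illustration only: the function name `classify_by_pos` and the tag strings "noun", "verb", and "particle" are hypothetical, and the input is assumed to have been segmented and tagged already by some external tool that the text does not specify.

```python
def classify_by_pos(tagged_tokens):
    """Map (word, part_of_speech) pairs to retrieval and adjustment information.

    Assumed rule from the example above: nouns become retrieval information,
    verbs become adjustment information, everything else is ignored.
    """
    retrieval, adjustment = [], []
    for word, pos in tagged_tokens:
        if pos == "noun":
            retrieval.append(word)
        elif pos == "verb":
            adjustment.append(word)
    return retrieval, adjustment


tagged = [("李弘基", "noun"), ("主演", "verb"), ("的", "particle"), ("电视剧", "noun")]
retrieval_info, adjustment_info = classify_by_pos(tagged)
```

The sentence-pattern variant in item b) would differ only in the labels: subject and object in place of "noun", predicate in place of "verb".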
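It should be noted that the above examples serve only to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art will understand that any implementation that acquires first type-determination information for determining the retrieval information and the adjustment information, and acquires the retrieval information and the adjustment information from the user's input sequence according to that first type-determination information, falls within the scope of the present invention. For example, the semantic analysis results for part of speech and sentence pattern may be combined (for example, "李弘基" (Li Hongji) is a noun at the beginning of the sentence, "主演" (starring) is a verb following "李弘基", and "电视剧" (TV series) is a noun following "主演") to acquire the retrieval information and the adjustment information; or the sentence-pattern-based semantic analysis result may include not only the subject, predicate, and object but also the positional relationships among the components, for example an attributive preceding the subject, an attributive preceding the object, or an adverbial preceding the predicate; or the ranking determination apparatus may take only the subject as retrieval information and take the predicate and the object as adjustment information.
As one preferred scheme of the present invention, the scheme according to the present invention further includes the steps of, after acquiring the input sequence from the user, first removing invalid information from the input sequence to obtain usable information, and then acquiring the retrieval information and the adjustment information from the usable information. The invalid information includes but is not limited to: 1) particles; 2) spaces; 3) punctuation marks; 4) information units contained in a predetermined invalid-word dictionary.
For example, for the acquired input sequence "李弘基主演的电视剧" ("TV series starring Li Hongji"), the ranking determination apparatus first removes the invalid information in the input sequence, for example the particle "的", to obtain the usable information "李弘基主演电视剧", and then acquires the retrieval information and the adjustment information from the usable information. The manner of acquiring the retrieval information and the adjustment information from the usable information is the same as or similar to the manner of acquiring them from the input sequence in step S3 described above; it is incorporated here by reference and is not repeated.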
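
The removal of invalid information can be sketched as a simple token filter. The sketch below is an assumption-laden illustration rather than the disclosed implementation: the name `remove_invalid`, the particular punctuation set, and the premise that the input is already split into tokens are all hypothetical.

```python
import string

# Assumed punctuation set: ASCII punctuation plus a few common CJK marks.
PUNCTUATION = set(string.punctuation) | set("，。、；：？！（）《》")


def remove_invalid(tokens, particles, invalid_dictionary):
    """Drop particles, spaces, punctuation, and dictionary-listed invalid units."""
    usable = []
    for token in tokens:
        if not token.strip():                        # 2) spaces (or empty tokens)
            continue
        if all(ch in PUNCTUATION for ch in token):   # 3) punctuation marks
            continue
        if token in particles:                       # 1) particles
            continue
        if token in invalid_dictionary:              # 4) predetermined invalid dictionary
            continue
        usable.append(token)
    return usable


usable_info = remove_invalid(["李弘基", "主演", "的", " ", "电视剧", "。"], {"的"}, set())
```

The remaining tokens can then be passed to whichever type-determination method is used to split them into retrieval information and adjustment information.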
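It should be noted that the above examples serve only to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art will understand that any implementation that acquires first type-determination information for determining the retrieval information and the adjustment information, and acquires the retrieval information and the adjustment information from the user's input sequence according to that first type-determination information, falls within the scope of the present invention.
FIG. 2 shows a flowchart of a method for determining a ranking result of resource candidates according to a preferred embodiment of the present invention.
Specifically, steps S1 and S2 have been described in detail in the embodiment shown in FIG. 1; they are incorporated here by reference and are not repeated.
Next, in step S3', the ranking determination apparatus acquires retrieval information and adjustment information from the input sequence received by the computer device 3. The retrieval information includes one or more retrieval units, and the adjustment information includes one or more adjustment units.
For example, for the input sequence "李弘基主演和参演的电视剧" ("TV series that Li Hongji stars in or appears in"), the retrieval information acquired by the ranking determination apparatus from the input sequence includes the retrieval units "李弘基" (Li Hongji) and "电视剧" (TV series), and the adjustment information includes the adjustment units "主演" (starring) and "参演" (appearing in). The manner in which the ranking determination apparatus acquires the retrieval information and the adjustment information from the input sequence has been described in detail with reference to the embodiment shown in FIG. 1; it is incorporated here by reference and is not repeated.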
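Next, in step S4', the ranking determination apparatus performs retrieval according to the retrieval information to obtain a plurality of resource candidates.
For example, the ranking determination apparatus performs retrieval according to the retrieval information "李弘基" (Li Hongji) and "电视剧" (TV series), and the resource candidates obtained include resource candidate C, resource candidate D, resource candidate E, and so on.
Next, in step S5, the ranking determination apparatus acquires first ranking auxiliary information used to assist in determining the ranking result. The first ranking auxiliary information includes but is not limited to at least one of the following:
1) Weight information of each adjustment unit.
For example, the ranking determination apparatus acquires the weight information of the adjustment unit "主演" (starring), for example 5, and the weight information of the adjustment unit "参演" (appearing in), for example 1. Those skilled in the art will understand that expressing the weight information as a numerical value is merely illustrative and is not intended to limit the present invention; the weight information may also be expressed in other ways, for example as a grade.
2) Adjustment unit distribution information of each of the plurality of resource candidates. The adjustment unit distribution information includes but is not limited to at least one of the following:
a) The number of occurrences of each adjustment unit in the resource candidate to which the adjustment unit distribution information corresponds.
For example, the resource candidates obtained by the ranking determination apparatus include resource candidates C, D, and E, and the counted numbers of occurrences of the adjustment units in each resource candidate are:
resource candidate C: adjustment unit "主演" occurs 2 times; adjustment unit "参演" occurs 0 times;
resource candidate D: adjustment unit "主演" occurs 0 times; adjustment unit "参演" occurs 1 time;
resource candidate E: adjustment unit "主演" occurs 0 times; adjustment unit "参演" occurs 0 times.
b) The positions at which each adjustment unit occurs in the resource candidate to which the adjustment unit distribution information corresponds.
The positions include but are not limited to the title, the summary, the body text, and descriptive content of multimedia resources such as UGC; a position can be identified by the tags of the information corresponding to the resource candidate or by text contained in that information, for example <title> or "摘要" (summary).
For example, the resource candidates obtained by the ranking determination apparatus include resource candidates F and G, and the positions of the adjustment units in each resource candidate, obtained from the tags of the corresponding information, are:
the title of resource candidate F contains the adjustment unit "主演";
the summary of resource candidate G contains the adjustment unit "参演".
c) The number of distinct adjustment units in the resource candidate to which the adjustment unit distribution information corresponds.
For example, the resource candidates obtained by the ranking determination apparatus include resource candidates H and I; resource candidate H contains two adjustment units, "主演" and "参演", and resource candidate I contains one adjustment unit, "参演"; the ranking determination apparatus therefore determines the number of distinct adjustment units in each resource candidate to be:
resource candidate H: 2;
resource candidate I: 1.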
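3) Predetermined quality information of each of the plurality of resource candidates. The quality information includes but is not limited to at least one of the following:
a) The authority of the resource candidate to which the quality information corresponds.
The ways in which the ranking determination apparatus obtains the authority of a resource candidate include but are not limited to at least one of the following:
i) obtaining a preset authority corresponding to the resource candidate;
ii) judging the authority of the resource candidate based on pre-stored information about authoritative websites;
iii) judging the authority of the resource candidate based on users' click-through rate.
For example, the resource candidates obtained by the ranking determination apparatus include resource candidates J and K, where resource candidate J corresponds to website J and resource candidate K corresponds to website K; the ranking determination apparatus finds that website J is a predetermined authoritative website and website K is a predetermined ordinary website, and therefore determines that the authority of resource candidate J is at the "authoritative" level and that of resource candidate K is at the "ordinary" level. Those skilled in the art will understand that expressing the quality information as a grade is merely illustrative and does not limit the present invention; the quality information may also be expressed in other ways, for example as a value.
b) The degree of excellence of the resource candidate to which the quality information corresponds.
The ways in which the ranking determination apparatus obtains the degree of excellence of each resource candidate include but are not limited to at least one of the following:
1) obtaining a preset degree of excellence corresponding to the resource candidate;
2) obtaining the degree of excellence by analyzing the content information contained in the website corresponding to the resource candidate. The factors considered when analyzing the content information include at least one of the following: i) whether advertising information is included; ii) the quality of the resources provided by the website, for example image clarity, video clarity, or song audio quality; iii) the quantity of resources provided by the website. For example, consider resource candidates L and M, where resource candidate L corresponds to website L and resource candidate M corresponds to website M. The ranking determination apparatus acquires the content information contained in the website corresponding to resource candidate L and finds by analysis that it contains no advertising information and that the average pixel count of the images provided by the website is above a first predetermined threshold; it therefore judges the degree of excellence of website L to be excellent and determines the grade of resource candidate L to be "excellent". Likewise, the ranking determination apparatus acquires the content information contained in the website corresponding to resource candidate M and finds by analysis that it contains advertising information and that the quantity of music resources provided by the website is above a second predetermined threshold; it therefore judges the degree of excellence of website M to be excellent and determines the grade of resource candidate M to be "excellent". Those skilled in the art will understand that expressing the degree of excellence as a grade is merely illustrative and does not limit the present invention; the degree of excellence may also be expressed in other ways, for example as a value.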
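Next, in step S6', the ranking determination apparatus determines the ranking result of the plurality of resource candidates according to all the adjustment units, in combination with the first ranking auxiliary information.
Specifically, the ways in which the ranking determination apparatus determines the ranking result include but are not limited to:
1) first determining an initial ranking result of the plurality of resource candidates according to one item of the first ranking auxiliary information, and then adjusting the initial ranking result according to at least one item of the first ranking auxiliary information to obtain the ranking result;
2) determining the ranking result of the plurality of resource candidates according to only one item of the first ranking auxiliary information;
3) determining the ranking result of the plurality of resource candidates directly according to multiple items of the first ranking auxiliary information.
For example, the ranking determination apparatus determines the ranking result of the resource candidates according to all the adjustment units in combination with the weight information of the adjustment units in the first ranking auxiliary information. Suppose the ranking determination apparatus obtains a weight of 5 for "主演" (starring) and a weight of 1 for "参演" (appearing in), resource candidate C contains the adjustment unit "主演", and resource candidate D contains the adjustment unit "参演"; the ranking determination apparatus then determines, according to the weight information of the adjustment units, that the ranking result of resource candidates C and D is:
resource candidate C;
resource candidate D.
As another example, the ranking determination apparatus determines the ranking result of the resource candidates according to all the adjustment units in combination with both the weight information of the adjustment units and the adjustment unit distribution information of each resource candidate in the first ranking auxiliary information. Suppose the weight of "主演" is 5 and the weight of "参演" is 1, the title of resource candidate C contains the adjustment unit "主演", the summary of resource candidate D contains the adjustment unit "主演", and resource candidate E contains the adjustment unit "参演"; the ranking determination apparatus first, according to the weight information, ranks resource candidates C and D, which contain "主演", ahead of resource candidate E, which contains "参演", and then, according to the positions at which "主演" occurs, ranks resource candidate C, whose title contains "主演", ahead of resource candidate D, whose summary contains "主演", obtaining the following ranking result:
resource candidate C;
resource candidate D;
resource candidate E.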
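
The second example (weight first, occurrence position second) can be sketched as a composite sort key. The sketch is illustrative and rests on assumptions not found in the text: the dictionary layout of a candidate, the names `POSITION_RANK`, `sort_key`, and `rank_candidates`, the numeric order title < summary < body, and the placement of "参演" in the body of candidate E (the text does not say where it occurs).

```python
# Assumed preference order for occurrence positions: lower is better.
POSITION_RANK = {"title": 0, "summary": 1, "body": 2}


def sort_key(candidate, weights):
    """Build (negated best weight, position rank) so that ascending sort ranks best first."""
    units = candidate["units"]  # maps adjustment unit -> position where it occurs
    if not units:
        return (0, len(POSITION_RANK))
    best = max(units, key=lambda u: weights.get(u, 0))
    return (-weights.get(best, 0),
            POSITION_RANK.get(units[best], len(POSITION_RANK)))


def rank_candidates(candidates, weights):
    return sorted(candidates, key=lambda c: sort_key(c, weights))


weights = {"主演": 5, "参演": 1}
candidates = [
    {"name": "E", "units": {"参演": "body"}},      # position assumed for illustration
    {"name": "D", "units": {"主演": "summary"}},
    {"name": "C", "units": {"主演": "title"}},
]
ranked_names = [c["name"] for c in rank_candidates(candidates, weights)]
```

A tuple key keeps the two criteria separate, so the position only matters among candidates whose best adjustment unit has the same weight.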
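It should be noted that the above examples serve only to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art will understand that any implementation that determines the ranking result of the plurality of resource candidates according to all the adjustment units in combination with the first ranking auxiliary information falls within the scope of the present invention. For example: 1) the ranking determination apparatus ranks by the number of occurrences of the adjustment units, from most to fewest; 2) the ranking determination apparatus ranks by the number of adjustment units, from most to fewest; 3) the ranking determination apparatus ranks by the authority or degree of excellence of the website corresponding to each resource candidate, from high to low; 4) the ranking determination apparatus first ranks the resource candidates by the number of occurrences of the adjustment units, from high to low, and then ranks the resource candidates with the same number of occurrences by degree of excellence, from high to low; 5) when every item of the first ranking auxiliary information is expressed as a value, the ranking determination apparatus obtains an evaluation value for each resource candidate from the values of the items of the first ranking auxiliary information and ranks the resource candidates according to that evaluation value.
Next, in step S7', the ranking determination apparatus generates presentation information according to the ranking result and provides it to the user 1 through the user device 2.
For example, according to the ranking result of resource candidates C, D, and E, the ranking determination apparatus determines that presentation information C corresponding to resource candidate C, presentation information D corresponding to resource candidate D, and presentation information E corresponding to resource candidate E are ordered as shown below, and provides the ordered presentation information to the user 1 through the user device 2:
presentation information C;
presentation information D;
presentation information E.
It should be noted that the ranking determination apparatus may, depending on the actual situation (for example, when the number of items of presentation information requested by the user device 2 is smaller than the number of resource candidates), select only some of the resource candidates to generate the presentation information provided to the user 1.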
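FIG. 3 shows a flowchart of a method for determining a ranking result of resource candidates according to another preferred embodiment of the present invention.
Specifically, steps S1 and S2 have been described in detail with reference to the embodiment shown in FIG. 1; they are incorporated here by reference and are not repeated.
Next, in step S3'', the ranking determination apparatus acquires retrieval information and adjustment information from the input sequence received by the computer device 3.
For example, from the input sequence "李弘基主演的电视剧" ("TV series starring Li Hongji") received by the computer device 3, the ranking determination apparatus acquires the retrieval information "李弘基" (Li Hongji) and "电视剧" (TV series) and the adjustment information "主演" (starring). The manner in which the ranking determination apparatus acquires the retrieval information and the adjustment information from the input sequence has been described in detail with reference to the embodiment shown in FIG. 1; it is incorporated here by reference and is not repeated.
Next, in step S4'', the ranking determination apparatus performs retrieval according to the retrieval information to obtain a plurality of resource candidates.
For example, the ranking determination apparatus performs retrieval according to the retrieval unit "李弘基" and the retrieval unit "电视剧", and the resource candidates obtained include resource candidate A1, resource candidate B1, and resource candidate C1.
Next, in step S6'', the ranking determination apparatus determines the ranking result of the plurality of resource candidates according to the adjustment information and the retrieval information.
For example, in the aforementioned step S4'', the ranking determination apparatus obtains resource candidates A1, B1, and C1, and finds that resource candidate A1 contains the retrieval information "李弘基" and "电视剧" and the adjustment information "主演", that resource candidate B1 contains the retrieval information "李弘基" and "主演", and that resource candidate C1 contains the retrieval information "李弘基"; the ranking determination apparatus then determines that resource candidate A1, which contains both retrieval information and adjustment information, is ranked ahead of resource candidates B1 and C1, and orders resource candidates B1 and C1, which contain only retrieval information, randomly, obtaining the following ranking result:
resource candidate A1;
resource candidate B1;
resource candidate C1.
It should be noted that the above examples serve only to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art will understand that any implementation that determines the ranking result of the plurality of resource candidates according to the adjustment information and the retrieval information falls within the scope of the present invention.
Next, in step S7'', the ranking determination apparatus generates presentation information according to the ranking result, to be provided to the user 1. The manner in which the ranking determination apparatus generates the presentation information according to the ranking result and provides it to the user 1 has been described in detail for step S7 of the embodiment shown in FIG. 1; it is incorporated here by reference and is not repeated.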
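FIG. 4 shows a flowchart of a method for determining a ranking result of resource candidates according to still another preferred embodiment of the present invention.
Specifically, steps S1 and S2 have been described in detail in the embodiment shown in FIG. 1; they are incorporated here by reference and are not repeated.
Next, in step S3''', the ranking determination apparatus acquires retrieval information and adjustment information from the input sequence received by the computer device 3. The adjustment information includes one or more adjustment units, and the retrieval information includes one or more retrieval units.
For example, from the input sequence "李弘基主演和参演的电视剧" ("TV series that Li Hongji stars in or appears in") received by the computer device 3, the ranking determination apparatus acquires retrieval information that includes the retrieval units "李弘基" (Li Hongji) and "电视剧" (TV series), and adjustment information that includes the adjustment units "主演" (starring) and "参演" (appearing in).
The manner in which the ranking determination apparatus acquires the retrieval information and the adjustment information from the input sequence received by the computer device 3 is the same as or similar to that of step S3 shown in FIG. 1; it is incorporated here by reference and is not repeated.
Next, in step S4''', the ranking determination apparatus performs retrieval according to the retrieval information to obtain a plurality of resource candidates.
For example, the ranking determination apparatus performs retrieval according to the retrieval unit "李弘基" and the retrieval unit "电视剧", and the resource candidates obtained include resource candidate C1, resource candidate D1, resource candidate E1, and so on.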
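Next, in step S5', the ranking determination apparatus acquires second ranking auxiliary information used to assist in determining the ranking result. The second ranking auxiliary information includes but is not limited to at least one of the following:
1) Weight information of each adjustment unit. This has been described in detail for step S5 shown in FIG. 2; it is incorporated here by reference and is not repeated.
2) Adjustment unit distribution information of each of the plurality of resource candidates. This has been described in detail for step S5 shown in FIG. 2; it is incorporated here by reference and is not repeated.
3) Weight information of each retrieval unit. For example, the ranking determination apparatus acquires the weight information of the retrieval unit "李弘基" (Li Hongji), for example 5, and the weight of the retrieval unit "电视剧" (TV series), for example 1. Those skilled in the art will understand that expressing the weight information as a numerical value is merely illustrative and does not limit the present invention; the weight information may also be expressed in other ways, for example as a grade.
4) Retrieval unit distribution information of each of the plurality of resource candidates. The retrieval unit distribution information includes but is not limited to at least one of the following:
a) The number of occurrences of each retrieval unit in the resource candidate to which the retrieval unit distribution information corresponds.
For example, the resource candidates obtained by the ranking determination apparatus include resource candidates C1, D1, and E1, and the counted numbers of occurrences of the retrieval units in each resource candidate are:
resource candidate C1: retrieval unit "李弘基" occurs 2 times; retrieval unit "电视剧" occurs 2 times;
resource candidate D1: retrieval unit "李弘基" occurs 1 time; retrieval unit "电视剧" occurs 1 time;
resource candidate E1: retrieval unit "李弘基" occurs 1 time; retrieval unit "电视剧" occurs 1 time.
b) The positions at which each retrieval unit occurs in the resource candidate to which the retrieval unit distribution information corresponds.
The positions include but are not limited to the title, the summary, the body text, and descriptive content of multimedia resources such as UGC; a position can be identified by the tags or text of the information corresponding to the resource candidate, for example <title> or "摘要" (summary).
For example, the resource candidates obtained by the ranking determination apparatus include resource candidates F1 and G1, and the positions of the retrieval units in each resource candidate, obtained from the tags of the corresponding information, are:
the title of resource candidate F1 contains the retrieval units "李弘基" and "电视剧";
the title of resource candidate G1 contains the retrieval unit "李弘基", and its summary contains the retrieval unit "电视剧".
c) The number of distinct retrieval units in the resource candidate to which the retrieval unit distribution information corresponds.
For example, the resource candidates obtained by the ranking determination apparatus include resource candidates H1 and I1; resource candidate H1 contains the retrieval units "李弘基" and "电视剧", and resource candidate I1 contains the retrieval unit "李弘基"; the ranking determination apparatus therefore determines the number of distinct retrieval units in each resource candidate to be:
resource candidate H1: 2;
resource candidate I1: 1.
5) Predetermined quality information of each of the plurality of resource candidates. The predetermined quality information has been described in detail for step S5 of the embodiment shown in FIG. 2; it is incorporated here by reference and is not repeated.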
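Next, in step S6''', the ranking determination apparatus determines the ranking result of the plurality of resource candidates according to all the adjustment units and all the retrieval units, in combination with the second ranking auxiliary information.
Specifically, the ranking determination apparatus determines the ranking result of the plurality of resource candidates according to at least one item of the second ranking auxiliary information.
Specifically, the ways in which the ranking determination apparatus determines the ranking result include but are not limited to:
1) first determining an initial ranking result of the plurality of resource candidates according to one item of the second ranking auxiliary information, and then adjusting the initial ranking result according to at least one item of the second ranking auxiliary information to obtain the ranking result;
2) determining the ranking result of the plurality of resource candidates according to only one item of the second ranking auxiliary information;
3) determining the ranking result of the plurality of resource candidates directly according to multiple items of the second ranking auxiliary information.
For example, suppose that in the aforementioned step S5' the ranking determination apparatus finds that the retrieval unit "李弘基" (Li Hongji) occurs 2 times in resource candidate C1, 1 time in resource candidate D1, and 1 time in resource candidate E1, and that the retrieval unit "电视剧" (TV series) occurs 2 times in resource candidate C1, 1 time in resource candidate D1, and 1 time in resource candidate E1. The ranking determination apparatus then determines that the two retrieval units occur four times in resource candidate C1, twice in resource candidate D1, and twice in resource candidate E1. Given further that resource candidate D1 contains 2 adjustment units and resource candidate E1 contains 1 adjustment unit, the ranking determination apparatus first ranks by the number of occurrences of the retrieval units to obtain an initial ranking result, and then adjusts the initial ranking result according to the number of adjustment units, obtaining the following ranking result for resource candidates C1, D1, and E1:
resource candidate C1;
resource candidate D1;
resource candidate E1.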
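
The two-stage ranking in this example (initial order by retrieval unit occurrences, then adjustment by the number of adjustment units) can be sketched as follows. The field names `retrieval_hits` and `adjustment_units` and the function `rank_two_stage` are hypothetical, and the adjustment unit count of candidate C1, which the text does not state, is set to 0 purely as a placeholder.

```python
from itertools import groupby


def rank_two_stage(candidates):
    """Stage 1: sort by retrieval-unit occurrences. Stage 2: reorder ties by adjustment units."""
    initial = sorted(candidates, key=lambda c: c["retrieval_hits"], reverse=True)
    final = []
    # groupby works here because `initial` is already sorted by the same key.
    for _, tied in groupby(initial, key=lambda c: c["retrieval_hits"]):
        final.extend(sorted(tied, key=lambda c: c["adjustment_units"], reverse=True))
    return final


candidates = [
    {"name": "E1", "retrieval_hits": 2, "adjustment_units": 1},
    {"name": "D1", "retrieval_hits": 2, "adjustment_units": 2},
    {"name": "C1", "retrieval_hits": 4, "adjustment_units": 0},  # count assumed
]
order = [c["name"] for c in rank_two_stage(candidates)]
```

The second stage only reorders candidates that tie in the first stage, which is one possible reading of "adjusting the initial ranking result"; other adjustment rules would also fit the description.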
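It should be noted that the above examples serve only to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art will understand that any implementation that determines the ranking result of the plurality of resource candidates according to all the adjustment units and all the retrieval units, in combination with the second ranking auxiliary information, falls within the scope of the present invention. For example: 1) the ranking determination apparatus ranks by the positions at which the retrieval units occur in each resource candidate, for example ranking resource candidates in which a retrieval unit occurs in the title ahead of those in which it occurs in the summary; 2) the ranking determination apparatus ranks by the number of distinct retrieval units in each resource candidate, for example ranking resource candidates containing more retrieval units ahead of those containing fewer; 3) the ranking determination apparatus ranks by the quality information of each resource candidate, for example ranking resource candidates corresponding to authoritative or high-quality websites first; 4) the ranking determination apparatus ranks by the weight information of both the retrieval units and the adjustment units, for example, where the weight information includes weight values, the ranking determination apparatus multiplies the retrieval unit weight values and the adjustment unit weight values contained in each resource candidate to obtain a total weight value and ranks the resource candidates by that total; 5) the ranking determination apparatus ranks the resource candidates by the distribution information of the retrieval units and, for those with identical distribution information, further ranks by the distribution information of the adjustment units or by the quality information of the resource candidates.
Next, in step S7''', the ranking determination apparatus generates presentation information according to the ranking result and provides it to the user 1 through the user device 2.
The manner in which the ranking determination apparatus generates the presentation information according to the ranking result and provides it to the user 1 through the user device 2 has been described in detail for the aforementioned step S7'; it is incorporated here by reference and is not repeated.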
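As one preferred scheme of the present invention, the method according to the present invention further includes the step in which the ranking determination apparatus acquires keyword units and their types and, according to the keyword units and their types, builds or updates the predetermined keyword type library.
The step in which the ranking determination apparatus acquires keyword units and their types further includes the steps in which the ranking determination apparatus acquires the keyword unit, acquires second type-determination information used to determine the type of the keyword unit, and determines the type of the keyword unit according to the second type-determination information.
In the step of acquiring the keyword unit, the ways in which the ranking determination apparatus acquires keyword units include but are not limited to:
1) Acquisition from user input sequences. For example, the ranking determination apparatus obtains keyword units from the input sequences entered by multiple users: from the input sequence "电视剧 A" ("TV series A") entered by user A and the input sequence "电视剧 B" ("TV series B") entered by user B, both received by the computer device 3, the ranking determination apparatus obtains the part the two input sequences share, "电视剧" (TV series), and takes this shared part as a keyword unit. As another example, the ranking determination apparatus segments the input sequence "演员李弘基" ("actor Li Hongji") into "演员" (actor) and "李弘基" (Li Hongji) and takes "演员" and "李弘基" as keyword units.
2) Acquisition from the lexicon of an input method. For example, the ranking determination apparatus obtains "主演" (starring) and "参演" (appearing in) from an input method lexicon and takes them as keyword units.
3) Acquisition of preset keyword units.
It should be noted that the above examples serve only to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art will understand that any implementation that acquires the keyword unit falls within the scope of the present invention.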
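Next, the ranking determination apparatus acquires second type-determination information used to determine the type of the keyword unit. The second type-determination information includes but is not limited to at least one of the following:
1) The distribution concentration of the keyword unit in a predetermined corpus. The predetermined corpus contains a plurality of corpus entries. The distribution concentration indicates how concentrated the distribution of the keyword unit is among the corpus entries of the predetermined corpus; it is obtained from the occurrence information of the keyword unit in the predetermined corpus and the quantity information of the distinct corpus entries that contain the keyword unit.
The occurrence information includes at least one of the following:
1) the number of occurrences of the keyword unit in the predetermined corpus;
2) the ratio of the number of occurrences of the keyword unit in the predetermined corpus to the number of all keywords in the corpus;
The quantity information includes at least one of the following:
1) the number of distinct corpus entries containing the keyword unit;
2) the ratio of the number of distinct corpus entries containing the keyword unit to the number of all corpus entries.
For example, the ranking determination apparatus finds that the keyword unit "李弘基" (Li Hongji) occurs 1000 times in total across the entries of the predetermined corpus, and that the number of distinct corpus entries containing "李弘基" is 500; the distribution concentration is then 1000/500 = 2.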
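
The worked example (1000 occurrences across 500 distinct entries gives a concentration of 2) corresponds to dividing the total occurrence count by the number of entries containing the keyword. A small sketch, assuming the corpus is simply a list of strings and using the hypothetical name `distribution_concentration`:

```python
def distribution_concentration(keyword, corpus):
    """Total occurrences of `keyword` divided by the number of entries that contain it."""
    total_occurrences = sum(entry.count(keyword) for entry in corpus)
    containing_entries = sum(1 for entry in corpus if keyword in entry)
    if containing_entries == 0:
        return 0.0  # assumption: a keyword absent from the corpus gets concentration 0
    return total_occurrences / containing_entries


corpus = ["李弘基 李弘基 李弘基", "李弘基", "其他内容"]
concentration = distribution_concentration("李弘基", corpus)  # 4 occurrences / 2 entries
```

This uses the first variant of each quantity (raw counts); the ratio-based variants listed above would need the total keyword count and the total number of entries as well.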
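It should be noted that the above examples serve only to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art will understand that any implementation that obtains the distribution concentration from the occurrence information of the keyword unit in the predetermined corpus and the quantity information of the distinct corpus entries containing the keyword unit falls within the scope of the present invention.
2) A semantic analysis result obtained from the keyword unit. The semantic analysis includes but is not limited to the part of speech of the keyword unit, such as noun, verb, or adjective. For example, the ranking determination apparatus performs part-of-speech analysis on the keyword unit "李弘基" (Li Hongji) and obtains the semantic analysis result that it is a noun.
3) The number of historical user input sequences that contain the keyword unit and match the same corpus entry.
Historical user input sequences that match the same corpus entry are those whose retrieval results contain the same corpus entry. For example, if the retrieval results of the three historical user input sequences "iphone4签售" (iphone4 signing sale), "iphone4发售" (iphone4 release), and "iphone4开卖" (iphone4 on sale) all contain the same corpus entry "iphone4的销售额已突破..." ("iphone4 sales have exceeded..."), then "iphone4签售", "iphone4发售", and "iphone4开卖" are historical user input sequences that match the same corpus entry.
Then, for the keyword unit "iphone4", if the historical user input sequences containing it are "iphone4签售", "iphone4发售", "iphone4开卖", "iphone4游戏" (iphone4 games), and "iphone4娱乐" (iphone4 entertainment), where "iphone4签售", "iphone4发售", and "iphone4开卖" match one corpus entry and "iphone4游戏" and "iphone4娱乐" match another, the number of historical user input sequences that contain the keyword unit "iphone4" and match the same corpus entry is 5.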
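Next, the ranking determination apparatus determines the type of the keyword unit according to the second type-determination information. The types include the retrieval type and the adjustment type; preferably, they also include an invalid type, which is to be removed from the input sequence.
For example, the ranking determination apparatus finds that the distribution concentration of the keyword unit "李弘基" (Li Hongji) in the predetermined corpus is 6.5 and judges that 6.5 exceeds the predetermined distribution threshold of 4; the ranking determination apparatus then determines that the type of the keyword unit "李弘基" is the retrieval type.
As another example, the ranking determination apparatus finds that the semantic analysis result of the keyword unit "电视剧" (TV series) is a noun and, based on this result, determines its type to be the retrieval type. As a further example, the ranking determination apparatus finds that the semantic analysis result of the keyword unit "主演" (starring) is a verb and, based on this result, determines it to be of the adjustment type. As yet another example, the ranking determination apparatus finds that the semantic analysis result of the keyword unit "的" is a particle and, based on this result, determines it to be of the invalid type.
As another example, the ranking determination apparatus finds that the number of historical user input sequences that contain the keyword unit "李弘基" and match the same corpus entry is 1000, which is above the predetermined judgment threshold, and therefore determines that the type of the keyword unit "李弘基" is the retrieval type.
As another example, the ranking determination apparatus finds that the semantic analysis result of the keyword unit "李弘基" is a noun and that the number of historical user input sequences matching the same corpus entry is 1000; according to the predetermined rule that a noun is assigned the retrieval type when its number of historical user input sequences matching the same corpus entry exceeds 900, the ranking determination apparatus determines that the type of the keyword unit "李弘基" is the retrieval type.
As another example, the ranking determination apparatus finds that the distribution concentration of the keyword unit "李弘基" is 6.5 and that the number of historical user input sequences matching the same corpus entry is 1000; the ranking determination apparatus first normalizes the distribution concentration and the number of matching historical user input sequences and then adds them to obtain a combined evaluation value of 1.2 for the keyword unit "李弘基", which is above the predetermined combined threshold, and therefore determines that the type of the keyword unit "李弘基" is the retrieval type.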
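
The last example (normalize two signals, add them, compare with a threshold) can be sketched as below. The text does not say how the normalization is done, so the sketch assumes division by hypothetical maximum values capped at 1.0; with these assumed maxima it does not reproduce the figure of 1.2 and is not meant to. The function name, the parameters, and the behavior below the threshold (returning `None` for "undetermined") are likewise assumptions.

```python
def classify_keyword(concentration, matched_sequences,
                     max_concentration, max_sequences, threshold):
    """Return ("retrieval" or None, score) from two normalized signals.

    Assumed normalization: value / assumed maximum, capped at 1.0.
    """
    normalized_concentration = min(concentration / max_concentration, 1.0)
    normalized_sequences = min(matched_sequences / max_sequences, 1.0)
    score = normalized_concentration + normalized_sequences
    keyword_type = "retrieval" if score > threshold else None
    return keyword_type, score


# Hypothetical maxima (10.0 and 2000) and threshold (1.0), chosen only for illustration.
keyword_type, score = classify_keyword(6.5, 1000, 10.0, 2000, 1.0)
```

Normalizing before adding keeps a signal with a large raw range, such as the sequence count, from dominating the combined evaluation value.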
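It should be noted that the above examples serve only to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art will understand that any implementation that determines the type of the keyword unit according to the second type-determination information falls within the scope of the present invention.
FIG. 5 shows a schematic diagram of a ranking determination apparatus for determining a ranking result of resource candidates according to one aspect of the present invention. The ranking determination apparatus includes first acquisition means 31, retrieval means 32, ranking means 33, and providing means 34.
The user device 2 receives an input sequence through any interactive device capable of human-computer interaction with the user 1; the interactive device may be a keyboard, mouse, remote control, touch pad, voice control device, or the like. Taking the keyboard as an example, the user 1 uses the keyboard to type the information to be retrieved into the information input field of the search page displayed by the user device 2, for example "李弘基主演的电视剧" ("TV series starring Li Hongji").
Next, the user device 2 sends the input sequence entered by the user 1, for example "李弘基主演的电视剧", to the computer device 3. The user device 2 may send the input sequence to the computer device 3 over a network, including but not limited to the Internet, a wide area network, a metropolitan area network, a local area network, a VPN, or a wireless ad hoc network. The ways in which the user device 2 sends the input sequence to the computer device 3 include but are not limited to: 1) sending the input sequence to the computer device 3 directly over the network; 2) sending the input sequence to the computer device 3 via one or more devices in the network.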
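Next, the first acquisition means 31 acquires retrieval information and adjustment information from the input sequence received by the computer device.
For example, the first acquisition means 31 queries a predetermined common-word library according to the input sequence and finds that "电视剧" (TV series) and "主演" (starring) are common words, and analyzes the input sequence to determine that "的" is a particle; the first acquisition means 31 then determines that the retrieval information obtained from the input sequence "李弘基主演的电视剧" ("TV series starring Li Hongji") includes "李弘基" (Li Hongji) and that the adjustment information includes "主演" and "电视剧". The predetermined common-word library contains a plurality of common words.
Next, the retrieval means 32 performs retrieval according to the retrieval information to obtain a plurality of resource candidates. A resource candidate corresponds to one or more links and contains descriptive information about the resources provided by the one or more websites to which those links point; the descriptive information includes but is not limited to the title of the resource, a summary of its content, and its full text.
For example, the retrieval means 32 performs retrieval according to the retrieval information "李弘基" and "电视剧", and the resource candidates obtained include resource candidate A and resource candidate B.
Next, the ranking means 33 determines the ranking result of the plurality of resource candidates according to the adjustment information.
For example, the ranking means 33 finds by analysis that resource candidate A contains the adjustment information "主演" and that resource candidate B does not, and therefore ranks the two resource candidates as follows:
resource candidate A;
resource candidate B.
Next, the providing means 34 generates presentation information according to the ranking result, to be provided to the user 1.
For example, based on the ranking result of resource candidates A and B, the providing means 34 determines that presentation information A, corresponding to resource candidate A, and presentation information B, corresponding to resource candidate B, are ordered as shown below, and provides the ordered presentation information to the user 1 through the user device 2:
presentation information A;
presentation information B.
It should be noted that the providing means 34 may, depending on the actual situation (for example, when the number of items of presentation information requested by the user device 2 is smaller than the number of resource candidates), select only some of the resource candidates to generate the presentation information provided to the user 1.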
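As one preferred scheme of the present invention, the ranking determination apparatus further includes fourth acquisition means (not shown), and the first acquisition means 31 further includes first sub-acquisition means (not shown). The fourth acquisition means acquires first type-determination information used to determine the retrieval information and the adjustment information; the first sub-acquisition means acquires the retrieval information and the adjustment information from the user's input sequence according to the first type-determination information. The first type-determination information includes but is not limited to:
1) Information units, and their types, obtained by querying a predetermined keyword type library according to the input sequence. The predetermined keyword type library contains a plurality of information units, each corresponding to one type.
For example, the fourth acquisition means queries the predetermined keyword type library according to the input sequence "李弘基主演的电视剧" ("TV series starring Li Hongji") and obtains the information units "李弘基" (Li Hongji), "主演" (starring), and "电视剧" (TV series), where "李弘基" and "电视剧" are both of the retrieval type and "主演" is of the adjustment type; based on the information units and types found in the library, the first sub-acquisition means determines that the retrieval information of the input sequence includes "李弘基" and "电视剧" and that the adjustment information includes "主演". As another example, for the same input sequence, if the information units found by the fourth acquisition means in the predetermined keyword type library are "李弘基主演" and "主演电视剧", where "李弘基主演" is of the retrieval type and "主演电视剧" is of the adjustment type, the first sub-acquisition means determines that the retrieval information of the input sequence includes "李弘基主演" and that the adjustment information includes "主演电视剧".
It should be noted that the retrieval information and the adjustment information may partially overlap: in the above example, "主演" in the retrieval information "李弘基主演" also appears in the adjustment information "主演电视剧".
2) A semantic analysis result obtained from the input sequence.
The semantic analysis result includes but is not limited to:
a) A semantic analysis result based on part of speech. The parts of speech include but are not limited to nouns, adjectives, adverbs, and verbs. For example, the fourth acquisition means analyzes the four words "李弘基", "主演", "的" (a particle), and "电视剧" obtained by segmenting the input sequence "李弘基主演的电视剧", and obtains the semantic analysis result that "李弘基" and "电视剧" are nouns, "主演" is a verb, and "的" is a particle; based on this result, the first sub-acquisition means takes nouns as retrieval information and verbs as adjustment information, and determines that the retrieval information includes "李弘基" and "电视剧" and that the adjustment information includes "主演".
b) A semantic analysis result based on sentence pattern. For example, for the input sequence "李弘基主演的电视剧", the fourth acquisition means divides it by sentence pattern into three parts, "李弘基", "主演", and "电视剧", and determines that "李弘基" is the subject because it is at the beginning of the input sequence, that "主演" is the predicate because it is in the middle, and that "电视剧" is the object because it is at the end; according to this semantic analysis result, the first sub-acquisition means takes the subject and the object as retrieval information and the predicate as adjustment information, and determines that the retrieval information includes "李弘基" and "电视剧" and that the adjustment information includes "主演".
It should be noted that the above examples serve only to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art will understand that any implementation that acquires first type-determination information for determining the retrieval information and the adjustment information, and acquires the retrieval information and the adjustment information from the user's input sequence according to that first type-determination information, falls within the scope of the present invention. For example, the semantic analysis results for part of speech and sentence pattern may be combined (for example, "李弘基" is a noun at the beginning of the sentence, "主演" is a verb following "李弘基", and "电视剧" is a noun following "主演") to acquire the retrieval information and the adjustment information; or the sentence-pattern-based semantic analysis result may include not only the subject, predicate, and object but also the positional relationships among the components, for example an attributive preceding the subject, an attributive preceding the object, or an adverbial preceding the predicate; or the ranking determination apparatus may take only the subject as retrieval information and take the predicate and the object as adjustment information.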
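As one preferred scheme of the present invention, the first acquisition means further includes input sequence acquisition means (not shown), removal means (not shown), and second sub-acquisition means (not shown). The input sequence acquisition means acquires the input sequence from the user; the removal means then removes invalid information from the input sequence to obtain usable information; the second sub-acquisition means then acquires the retrieval information and the adjustment information from the usable information. The invalid information includes but is not limited to: 1) particles; 2) spaces; 3) punctuation marks; 4) information units contained in a predetermined invalid-word dictionary.
For example, for the input sequence "李弘基主演的电视剧" ("TV series starring Li Hongji") acquired by the input sequence acquisition means, the removal means removes the invalid information in the input sequence, for example the particle "的", to obtain the usable information "李弘基主演电视剧", and the second sub-acquisition means then acquires the retrieval information and the adjustment information from the usable information. The manner of acquiring the retrieval information and the adjustment information from the usable information is the same as or similar to the manner of acquiring them from the input sequence in step S3 described above; it is incorporated here by reference and is not repeated.
It should be noted that the above examples serve only to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art will understand that any implementation that acquires first type-determination information for determining the retrieval information and the adjustment information, and acquires the retrieval information and the adjustment information from the user's input sequence according to that first type-determination information, falls within the scope of the present invention.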
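FIG. 6 shows a schematic diagram of a ranking determination apparatus for determining a ranking result of resource candidates according to a preferred embodiment of the present invention. The ranking determination apparatus includes first acquisition means 31, retrieval means 32, ranking means 33, second acquisition means 35, and providing means 34; the ranking means 33 further includes first sub-ranking means 331.
Specifically, the process by which the user device 2 sends the input sequence entered by the user 1 to the computer device has been described in detail in the embodiment shown in FIG. 5; it is incorporated here by reference and is not repeated.
Next, the first acquisition means 31 acquires retrieval information and adjustment information from the input sequence received by the computer device. The retrieval information includes one or more retrieval units, and the adjustment information includes one or more adjustment units.
For example, for the input sequence "李弘基主演和参演的电视剧" ("TV series that Li Hongji stars in or appears in"), the retrieval information acquired by the first acquisition means 31 from the input sequence includes the retrieval units "李弘基" (Li Hongji) and "电视剧" (TV series), and the adjustment information includes the adjustment units "主演" (starring) and "参演" (appearing in). The manner in which the first acquisition means 31 acquires the retrieval information and the adjustment information from the input sequence has been described in detail with reference to the embodiment shown in FIG. 5; it is incorporated here by reference and is not repeated.
Next, the retrieval means 32 performs retrieval according to the retrieval information to obtain a plurality of resource candidates.
For example, the retrieval means 32 performs retrieval according to the retrieval information "李弘基" and "电视剧", and the resource candidates obtained include resource candidate C, resource candidate D, resource candidate E, and so on.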
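Next, the second acquisition means 35 acquires first ranking auxiliary information used to assist in determining the ranking result. The first ranking auxiliary information includes but is not limited to at least one of the following:
1) Weight information of each adjustment unit.
For example, the second acquisition means 35 acquires the weight information of the adjustment unit "主演" (starring), for example 5, and the weight information of the adjustment unit "参演" (appearing in), for example 1. Those skilled in the art will understand that expressing the weight information as a numerical value is merely illustrative and is not intended to limit the present invention; the weight information may also be expressed in other ways, for example as a grade.
2) Adjustment unit distribution information of each of the plurality of resource candidates. The adjustment unit distribution information includes but is not limited to at least one of the following:
a) The number of occurrences of each adjustment unit in the resource candidate to which the adjustment unit distribution information corresponds.
For example, the resource candidates obtained by the retrieval means 32 include resource candidates C, D, and E, and the numbers of occurrences of the adjustment units in each resource candidate, as counted by the second acquisition means 35, are:
resource candidate C: adjustment unit "主演" occurs 2 times; adjustment unit "参演" occurs 0 times;
resource candidate D: adjustment unit "主演" occurs 0 times; adjustment unit "参演" occurs 1 time;
resource candidate E: adjustment unit "主演" occurs 0 times; adjustment unit "参演" occurs 0 times.
b) The positions at which each adjustment unit occurs in the resource candidate to which the adjustment unit distribution information corresponds.
The positions include but are not limited to the title, the summary, the body text, and descriptive content of multimedia resources such as UGC; a position can be identified by the tags of the information corresponding to the resource candidate or by text contained in that information, for example <title> or "摘要" (summary).
For example, the resource candidates obtained by the retrieval means 32 include resource candidates F and G, and the positions of the adjustment units in each resource candidate, obtained by the second acquisition means 35 from the tags of the corresponding information, are:
the title of resource candidate F contains the adjustment unit "主演";
the summary of resource candidate G contains the adjustment unit "参演".
c) The number of distinct adjustment units in the resource candidate to which the adjustment unit distribution information corresponds.
For example, the resource candidates obtained by the retrieval means 32 include resource candidates H and I; the second acquisition means 35 finds that resource candidate H contains two adjustment units, "主演" and "参演", and that resource candidate I contains one adjustment unit, "参演"; the second acquisition means 35 therefore determines the number of distinct adjustment units in each resource candidate to be:
resource candidate H: 2;
resource candidate I: 1.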
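3) Predetermined quality information of each of the plurality of resource candidates.
The quality information includes but is not limited to at least one of the following:
a) The authority of the resource candidate to which the quality information corresponds.
The ways in which the second acquisition means 35 obtains the authority of a resource candidate include but are not limited to at least one of the following:
i) obtaining a preset authority corresponding to the resource candidate;
ii) judging the authority of the resource candidate based on pre-stored information about authoritative websites;
iii) judging the authority of the resource candidate based on users' click-through rate.
For example, the resource candidates obtained by the retrieval means 32 include resource candidates J and K, where resource candidate J corresponds to website J and resource candidate K corresponds to website K; the second acquisition means 35 finds that website J is a predetermined authoritative website and website K is a predetermined ordinary website, and therefore determines that the authority of resource candidate J is at the "authoritative" level and that of resource candidate K is at the "ordinary" level. Those skilled in the art will understand that expressing the quality information as a grade is merely illustrative and does not limit the present invention; the quality information may also be expressed in other ways, for example as a value.
b) The degree of excellence of the resource candidate to which the quality information corresponds.
The ways in which the second acquisition means 35 obtains the degree of excellence of each resource candidate include but are not limited to at least one of the following:
1) obtaining a preset degree of excellence corresponding to the resource candidate;
2) obtaining the degree of excellence by analyzing the content information contained in the website corresponding to the resource candidate. The factors considered when analyzing the content information include at least one of the following: i) whether advertising information is included; ii) the quality of the resources provided by the website, for example image clarity, video clarity, or song audio quality; iii) the quantity of resources provided by the website. For example, consider resource candidates L and M, where resource candidate L corresponds to website L and resource candidate M corresponds to website M. The second acquisition means 35 acquires the content information contained in the website corresponding to resource candidate L and finds by analysis that it contains no advertising information and that the average pixel count of the images provided by the website is above a first predetermined threshold; it therefore judges the degree of excellence of website L to be excellent and determines the grade of resource candidate L to be "excellent". Likewise, the second acquisition means 35 acquires the content information contained in the website corresponding to resource candidate M and finds by analysis that it contains advertising information and that the quantity of music resources provided by the website is above a second predetermined threshold; it therefore judges the degree of excellence of website M to be excellent and determines the grade of resource candidate M to be "excellent". Those skilled in the art will understand that expressing the degree of excellence as a grade is merely illustrative and does not limit the present invention; the degree of excellence may also be expressed in other ways, for example as a value.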
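Next, the first sub-ranking means 331 determines the ranking result of the plurality of resource candidates according to all the adjustment units, in combination with the first ranking auxiliary information.
Specifically, the ways in which the first sub-ranking means 331 determines the ranking result include but are not limited to:
1) first determining an initial ranking result of the plurality of resource candidates according to one item of the first ranking auxiliary information, and then adjusting the initial ranking result according to at least one item of the first ranking auxiliary information to obtain the ranking result;
2) determining the ranking result of the plurality of resource candidates according to only one item of the first ranking auxiliary information;
3) determining the ranking result of the plurality of resource candidates directly according to multiple items of the first ranking auxiliary information.
For example, the first sub-ranking means 331 determines the ranking result of the resource candidates according to all the adjustment units in combination with the weight information of the adjustment units in the first ranking auxiliary information. Suppose the second acquisition means 35 obtains a weight of 5 for "主演" (starring) and a weight of 1 for "参演" (appearing in); given that resource candidate C contains the adjustment unit "主演" and resource candidate D contains the adjustment unit "参演", the first sub-ranking means 331 determines, according to the weight information of the adjustment units, that the ranking result of resource candidates C and D is:
resource candidate C;
resource candidate D.
As another example, the first sub-ranking means 331 determines the ranking result of the resource candidates according to all the adjustment units in combination with both the weight information of the adjustment units and the adjustment unit distribution information of each resource candidate in the first ranking auxiliary information. Suppose the second acquisition means 35 obtains a weight of 5 for "主演" and a weight of 1 for "参演"; given that the title of resource candidate C contains the adjustment unit "主演", the summary of resource candidate D contains the adjustment unit "主演", and resource candidate E contains the adjustment unit "参演", the first sub-ranking means 331 first, according to the weight information, ranks resource candidates C and D, which contain "主演", ahead of resource candidate E, which contains "参演", and then, according to the positions at which "主演" occurs, ranks resource candidate C, whose title contains "主演", ahead of resource candidate D, whose summary contains "主演", obtaining the following ranking result:
resource candidate C;
resource candidate D;
resource candidate E.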
需要说明的是, 上述举例仅为更好地说明本发明的技术方案, 而非对本发明的 限制, 本领域技术人员应该理解, 任何根据所有调整单元, 并结合所述第一排序辅 助信息, 来确定所述多个资源候选项的排序结果的实现方式, 均应包含在本发明的 范围内。 例如: 1 ) 第一子排序装置 33 1基于调整单元出现次数的由多至少来排序; 2 ) 第一子排序装置 331基于调整单元的数量的由多至少来排序; 3 ) 第一子排序装 置 331根据资源候选项所对应网站的权威性或优质度由高至低进行排序; 4 ) 第一子 排序装置 331先基于调整单元出现的次数的由高至低来对各资源候选项进行排序, 再基于优质度由高至低来对调整单元出现的次数相同的各资源候选项进行排序等 等; 5 ) 当第一排序辅助信息中的每一项均采用值来表示, 第一子排序装置 331根据 第一排序辅助信息中各项的值来获得各个资源候选项的评价值, 并根据该评价值来 对各个资源候选项进行排序等。
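The two-stage ordering in the C/D/E example above (adjustment-unit weight first, occurrence position second) can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Candidate` class, the `POSITION_RANK` table and the assumption that candidate E's unit occurs in the body text are all hypothetical; only the weights 5 and 1 and the expected order come from the text.

```python
from dataclasses import dataclass, field

# Weights taken from the example in the text; everything else is assumed.
ADJUSTMENT_WEIGHTS = {"主演": 5, "参演": 1}
# Hypothetical ranking of occurrence positions: smaller means more prominent.
POSITION_RANK = {"title": 0, "abstract": 1, "body": 2}


@dataclass
class Candidate:
    name: str
    # Maps each adjustment unit found in the candidate to where it occurs.
    unit_positions: dict = field(default_factory=dict)


def sort_key(candidate):
    """Heaviest adjustment unit first, then the most prominent position."""
    if not candidate.unit_positions:
        return (0, len(POSITION_RANK))
    best_unit = max(candidate.unit_positions,
                    key=lambda u: ADJUSTMENT_WEIGHTS.get(u, 0))
    weight = ADJUSTMENT_WEIGHTS.get(best_unit, 0)
    position = POSITION_RANK.get(candidate.unit_positions[best_unit],
                                 len(POSITION_RANK))
    return (-weight, position)


def rank(candidates):
    return sorted(candidates, key=sort_key)


candidates = [
    Candidate("E", {"参演": "body"}),      # position of E's unit is assumed
    Candidate("D", {"主演": "abstract"}),
    Candidate("C", {"主演": "title"}),
]
ranked_names = [c.name for c in rank(candidates)]
```

A tuple key keeps the two criteria separate, so the position only breaks ties between candidates whose heaviest adjustment unit has the same weight.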
接着, 所述提供装置 34根据所述排序结果来生成展现信息, 并通过用户设备 2 提供给所述用户 1。
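For example, according to the sorting result of resource candidates C, D and E, the providing device 34 determines that presentation information C corresponding to resource candidate C, presentation information D corresponding to resource candidate D and presentation information E corresponding to resource candidate E are ordered as shown below, and provides the sorted presentation information to the user 1 through the user equipment 2:
presentation information C;
presentation information D;
presentation information E.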
需要说明的是, 提供装置 34可根据实际情况, 例如, 用户设备 2所请求的展现 信息数量少于资源候选项数量等, 选择部分资源候选项来生成展现信息, 提供给用 户 1。
图 7示出了本发明另一个优选实施例的用于确定资源候选项的排序结果的排序 确定装置示意图。 所述排序确定装置包括: 第一获取装置 31、 检索装置 32、 排序装 置 33及提供装置 34。 所述排序装置 33还包括第二子排序装置 332。
具体的, 用户设备 2将用户 1输入的输入序列发送至计算机设备 3的过程已在参 照图 5所示的实施例中予以详述, 并以引用的方式包含于此, 不再赘述。
接着, 第一获取装置 31由计算机设备所接收的所述输入序列中获取检索信息及 调整信息。
例如,所述第一获取装置 31由计算机设备所接收的输入序列 "李弘基主演的电视 剧"中获取检索信息"李弘基 "与"电视剧 "以及调整信息"主演"。 其中, 所述第一获取 装置 31由输入序列中获取检索信息及调整信息的方式已在参照图 5所示实施例中予 以详述, 并以引用的方式包含于此, 不再赘述。
接着, 检索装置 32根据所述检索信息进行检索, 以获得多个资源候选项。
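For example, the retrieval device 32 performs retrieval according to the retrieval unit "李弘基" and the retrieval unit "电视剧", and the resource candidates obtained include resource candidate A1, resource candidate B1 and resource candidate C1.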
接着, 第二子排序装置 332根据所述调整信息及所述检索信息, 确定所述多个 资源候选项的排序结果。
例如, 检索装置 32获得资源候选项 Al、 资源候选项 B1和资源候选项 C1 , 第二 子排序装置 332获得资源候选项 A1包含检索信息"李弘基"与 "电视剧"及调整信息"主 演 " ' 资源候选项 B 1包含检索信息 "李弘基"与"主演", 资源候选项 C 1包含检索信息 "李弘基",则所述第二子排序装置 332确定同时包含检索信息和调整信息的资源候选 项 A1排序位于资源候选项 B 1和资源候选项 C 1之前, 并对仅包含检索信息的资源候 选项 B 1及资源候选项 C 1随机排序, 获得排序结果如下:
资源候选项 A1 ;
资源候选项 B1 ;
资源候选项 Cl。
需要说明的是, 上述举例仅为更好地说明本发明的技术方案, 而非对本发明的 限制, 本领域技术人员应该理解, 任何根据所述调整信息及所述检索信息, 确定所 述多个资源候选项的排序结果的实现方式, 均应包含在本发明的范围内。
接着, 提供装置 34根据所述排序结果来生成展现信息以提供给所述用户 1。 其 中, 提供装置 34根据所述排序结果来生成展现信息以提供给所述用户 1的方式已在 参照图 5所示实施例的提供装置 34中予以详述, 并以引用的方式包含于此, 不再赘 述。
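FIG. 8 is a schematic diagram of a sorting determination apparatus for determining a sorting result of resource candidates according to yet another preferred embodiment of the present invention. The sorting determination apparatus includes: a first acquisition device 31, a retrieval device 32, a sorting device 33, a third acquisition device 36 and a providing device 34. The sorting device 33 further includes a second sub-sorting device 332, and the second sub-sorting device 332 further includes a third sub-sorting device 333.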
具体的, 用户设备 2将用户 1输入的输入序列发送至计算机设备 3的过程已在图 5 所示的实施例中予以详述, 并以引用的方式包含于此, 不再赘述。
接着, 第一获取装置 31由计算机设备所接收的输入序列中获取检索信息及调整 信息。 其中, 所述调整信息包括一个或多个调整单元; 所述检索信息包括一个或多 个检索单元。
例如,所述第一获取装置 31由计算机设备所接收的输入序列 "李弘基主演和参演 的电视剧"中获取包括检索单元 "李弘基"与检索单元"电视剧"的检索信息以及包括 调整单元"主演"与调整单元"参演"的调整信息。
上述所述第一获取装置 31由计算机设备 3所接收的输入序列中获取检索信息及 调整信息的方式与图 5所示的第一获取装置 3 1由输入序列中获取检索信息及调整信 息的方式相同或相似, 并以引用的方式包含于此, 不再赘述。
接着, 检索装置 32根据所述检索信息进行检索, 以获得多个资源候选项。
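For example, the retrieval device 32 performs retrieval according to the retrieval unit "李弘基" and the retrieval unit "电视剧", and the resource candidates obtained include resource candidate C1, resource candidate D1, resource candidate E1, and so on.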
接着, 第三获取装置 36获取用于辅助确定所述排序结果的第二排序辅助信息。 其中, 所述第二排序辅助信息包括但不限于以下至少一项:
1 )各个调整单元的权重信息。 该内容已在对图 6所示的第二获取装置 35的说明 中予以详述, 并以引用的方式包含于此, 不再赘述。
2 )所述多个资源候选项中每个资源候选项的调整单元分布信息。 该内容已在 对图 6所示的第二获取装置 35的说明中予以详述, 并以引用的方式包含于此, 不再 赘述。
3 )各个检索单元的权重信息。 例如, 所述第三获取装置 36获取检索单元 "李弘 基"的权重信息, 例如为 5 , 获取检索单元 "电视剧"的权重, 例如为 1。 本领域技术 人员应该理解,上述采用数值来表示权重信息仅仅只是列示, 而非对本发明的限定, 事实上, 权重信息也可以以其他方式来表示, 例如, 以等级来表示等等。
4 )所述多个资源候选项中每个资源候选项的检索单元分布信息。 其中, 所述 检索单元分布信息包括但不限于以下至少一项:
a )该检索单元分布信息所对应的资源候选项中各个检索单元的出现次数。
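For example, the resource candidates obtained by the retrieval device 32 include resource candidates C1, D1 and E1, and the third acquisition device 36 counts the occurrences of the retrieval units in each resource candidate as follows:
resource candidate C1: retrieval unit "李弘基" occurs 2 times, retrieval unit "电视剧" occurs 2 times;
resource candidate D1: retrieval unit "李弘基" occurs 1 time, retrieval unit "电视剧" occurs 1 time;
resource candidate E1: retrieval unit "李弘基" occurs 1 time, retrieval unit "电视剧" occurs 1 time.
b) the positions at which each retrieval unit occurs in the resource candidate to which the retrieval unit distribution information corresponds. The positions include but are not limited to: title, abstract, body text, and descriptive content of multimedia resources such as UGC. A position can be identified by a tag or by text information in the information corresponding to the resource candidate, for example, <title> or "摘要".
For example, the resource candidates obtained by the retrieval device 32 include resource candidate F1 and resource candidate G1, and the third acquisition device 36 obtains, from the tags of the information corresponding to the resource candidates, the positions of the retrieval units in each resource candidate as follows:
the title of resource candidate F1 contains the retrieval unit "李弘基" and the retrieval unit "电视剧";
the title of resource candidate G1 contains the retrieval unit "李弘基", and its abstract contains the retrieval unit "电视剧".
c) the number of different retrieval units in the resource candidate to which the retrieval unit distribution information corresponds.
For example, the resource candidates obtained by the retrieval device 32 include resource candidate H1 and resource candidate I1. The third acquisition device 36 finds that resource candidate H1 includes the retrieval units "李弘基" and "电视剧" and that resource candidate I1 includes the retrieval unit "李弘基"; it therefore determines the number of different retrieval units in each resource candidate as:
resource candidate H1: 2;
resource candidate I1: 1.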
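5) predetermined quality information of each of the plurality of resource candidates. The predetermined quality information has been described in detail in the description of the second acquisition device 35 of the embodiment shown in FIG. 6, is incorporated herein by reference, and is not repeated.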
接着, 第三子排序装置 333根据所有调整单元及所有检索单元, 并结合所述第 二排序辅助信息, 来确定所述多个资源候选项的排序结果。
具体地, 第三子排序装置 333根据所述第二排序辅助信息中的至少一项来确定 所述多个资源候选项的排序结果。
具体地, 第三子排序装置 333确定所述排序结果的方式包括但不限于:
1 ) 先根据所述第二排序辅助信息中的一项来确定所述多个资源候选项的初始 排序结果, 然后再根据所述第二排序辅助信息中的至少一项来调整该初始排序结果 以获得所述排序结果;
2 ) 仅根据所述第二排序辅助信息中的一项来确定所述多个资源候选项的排序 结果;
3 )根据所述第二排序辅助信息中的多项来直接确定所述多个资源候选项的排 序结果等。
例如, 所述第三获取装置 36获得检索单元 "李弘基"在资源候选项 C1中出现 2次、 在资源候选项 D 1中出现 1次、 在资源候选项 E1中出现 1次, 且检索单元"电视剧 "在 资源候选项 C 1中出现 2次、在资源候选项 D 1中出现 1次、在资源候选项 E 1中出现 1次, 则所述第三子排序装置 333确定两个检索单元在资源候选项 C 1中出现四次, 两个检 索单元在资源候选项 D1中出现两次, 两个检索单元在资源候选项 E1中出现两次, 且 资源候选项 D1包含 2个调整单元, 资源候选项 E1包含一个调整单元的信息, 则第三 子排序装置 333先根据检索单元的出现次数进行排序以获得初始排序结果, 再根据 调整单元数量来调整该初始排序结果, 获得资源候选项 Cl、 资源候选项 D1与资源候 选项 E1的排序结果如下:
资源候选项 C1 ;
资源候选项 D1 ;
资源候选项 El。
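It should be noted that the above examples are only intended to better illustrate the technical solution of the present invention and do not limit it. Those skilled in the art should understand that any implementation that determines the sorting result of the plurality of resource candidates according to all adjustment units and all retrieval units, in combination with the second sorting auxiliary information, falls within the scope of the present invention. For example: 1) the third sub-sorting device 333 sorts based on the positions at which the retrieval units occur in each resource candidate, for example, placing resource candidates in which a retrieval unit occurs in the title before those in which it occurs in the abstract; 2) the third sub-sorting device 333 sorts based on the number of different retrieval units in each resource candidate, for example, placing candidates that contain more retrieval units before those that contain fewer; 3) the third sub-sorting device 333 sorts based on the quality information of each resource candidate, for example, placing candidates corresponding to authoritative or high-quality websites first; 4) the third sub-sorting device 333 sorts based on the weight information of both the retrieval units and the adjustment units, for example, where the weight information includes weight values, the third sub-sorting device 333 multiplies the retrieval unit weight values and adjustment unit weight values contained in each resource candidate to obtain a total weight value, and then sorts the candidates by total weight value; 5) the third sub-sorting device 333 sorts the resource candidates based on the distribution information of the retrieval units and, for candidates with identical distribution information, further sorts based on the distribution information of the adjustment units or the quality information of the resource candidates, and so on.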
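The C1/D1/E1 example (initial order by total retrieval-unit occurrences, then adjusted by the number of adjustment units) might look like the sketch below. The dictionary layout and the function name are assumptions made for illustration, and the text does not state how many adjustment units C1 contains, so 0 is assumed here.

```python
def rank_by_retrieval_then_adjustment(candidates):
    """Sort by total retrieval-unit occurrences (descending); break ties by
    the number of distinct adjustment units (descending)."""
    return sorted(
        candidates,
        key=lambda c: (-sum(c["retrieval_counts"].values()),
                       -c["adjustment_units"]),
    )


candidates = [
    {"name": "E1", "retrieval_counts": {"李弘基": 1, "电视剧": 1},
     "adjustment_units": 1},
    {"name": "D1", "retrieval_counts": {"李弘基": 1, "电视剧": 1},
     "adjustment_units": 2},
    # The adjustment-unit count of C1 is not given in the text; 0 is assumed.
    {"name": "C1", "retrieval_counts": {"李弘基": 2, "电视剧": 2},
     "adjustment_units": 0},
]
order = [c["name"] for c in rank_by_retrieval_then_adjustment(candidates)]
```

C1 comes first with four occurrences in total; D1 and E1 tie at two occurrences, and D1 is placed ahead because it contains more adjustment units.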
接着, 提供装置 34根据所述排序结果来生成展现信息, 并通过用户设备 2提供 给所述用户 1。
其中, 所述提供装置 34根据所述排序结果来生成展现信息, 并通过用户设备 2 提供给所述用户 1的方式已在对图 7所示的实施例的提供装置 34的说明中予以详述, 并以引用的方式包含于此, 不再赘述。
作为本发明的优选方案之一, 所述排序确定装置还包括第五获取装置 (未予图 示)与更新装置(未予图示)。其中, 所述第五获取装置获取关键词单元及其类型; 接着, 所述更新装置根据所述关键词单元及其类型, 建立或更新所述预定关键词类 型库。
其中, 第五获取装置进一步包括关键词获取装置 (未予图示) 、 第六获取装置 (未予图示) 与类型确定装置 (未予图示) 。 所述关键词获取装置获取所述关键词 单元; 接着, 所述第六获取装置获取用于确定所述关键词单元类型的第二类型确定 信息; 接着, 所述类型确定装置根据所述第二类型确定信息来确定该关键词单元的 类型。
其中, 所述关键词获取装置获取关键词单元的方式包括但不限于:
1 ) 由用户输入序列中获取。 例如, 所述关键词获取装置由多个用户输入的输 入序列中获得关键词单元。 例如, 所述关键词获取装置由计算机设备 3所接收的用 户 A输入的输入序列 "电视剧 A"、 用户 B输入的输入序列 "电视剧 B"中获得两个输入 序列中相同的部分"电视剧", 并将该相同部分"电视剧 "作为关键词单元。 再例如, 所述关键词获取装置将输入序列"演员李弘基 "进行切词, 获得"演员"和"李弘基", 将"演员"和"李弘基 "作为关键词单元。
2 )由输入法的词库中获取。例如,所述关键词获取装置由输入法词库中获取 "主 演"和"参演", 并将"主演"和"参演"作为关键词单元。
3 )获取预设的关键词单元等。
需要说明的是, 上述举例仅为更好地说明本发明的技术方案, 而非对本发明的 限制, 本领域技术人员应该理解, 任何获取所述关键词单元的实现方式, 均应包含 在本发明的范围内。
接着,所述第六获取装置确定所述关键词单元类型的第二类型确定信息。其中, 所述第二类型确定信息包括但不限于以下至少一项:
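1) the distribution concentration of the keyword unit in a predetermined corpus. The predetermined corpus contains a plurality of corpus texts. The distribution concentration indicates how concentrated the keyword unit's occurrences are among the corpus texts of the predetermined corpus, and is obtained from the occurrence information of the keyword unit in the predetermined corpus and the quantity information of the different corpus texts that contain the keyword unit.
The occurrence information includes at least one of the following: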
1 )该关键词单元在预定语料库中的出现次数;
2 )该关键词单元在预定语料库中的出现次数占语料库中所有关键词数量的比 例;
所述数量信息包括以下至少一项:
1 ) 包含该关键词单元的不同语料的数量;
2 ) 包含该关键词单元的不同语料的数量占所有语料的数量的比例。
例如, 所述第六获取装置获得关键词单元"李弘基"在所述预定语料库的各语料 中共出现 1000次,且预定语料库中包含关键词单元"李弘基"的不同语料的数量为 500, 则分布集中度 =1000/500=2等。
需要说明的是, 上述举例仅为更好地说明本发明的技术方案, 而非对本发明的 限制, 本领域技术人员应该理解, 任何根据该关键词单元在预定语料库中的出现信 息及包含该关键词单元的不同语料的数量信息来获得分布集中度的实现方式, 均应 包含在本发明的范围内。
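The ratio used in the example above (total occurrences divided by the number of distinct corpus texts containing the keyword unit) can be written as a short sketch. The function names and the representation of the corpus as a list of strings are assumptions; the text itself allows other definitions of occurrence and quantity information.

```python
def concentration_from_counts(occurrences, containing_texts):
    """Distribution concentration = occurrences / texts containing the unit."""
    if containing_texts == 0:
        return 0.0
    return occurrences / containing_texts


def distribution_concentration(corpus_texts, keyword):
    """Compute the same ratio directly from a list of corpus texts."""
    occurrences = sum(text.count(keyword) for text in corpus_texts)
    containing = sum(1 for text in corpus_texts if keyword in text)
    return concentration_from_counts(occurrences, containing)


# The figures from the text: 1000 occurrences spread over 500 corpus texts.
example = concentration_from_counts(1000, 500)

# A toy corpus (hypothetical contents) with 3 occurrences in 2 texts.
toy_corpus = ["李弘基 李弘基 电视剧", "李弘基", "电影"]
toy = distribution_concentration(toy_corpus, "李弘基")
```

A larger value means the keyword unit tends to be repeated within the texts that mention it rather than appearing once in many texts.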
2 )根据该关键词单元来获得的语义分析结果。 其中, 所述语义分析包括但不 限于关键词单元的词性, 如名词、 动词, 形容词等等。 例如, 所述第六获取装置对 关键词单元"李弘基"进行词性分析获得语义分析结果为名词。
3 ) 包含该关键词单元且匹配同一语料的用户历史输入序列数量。
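Here, user history input sequences that match the same corpus text are user history input sequences whose retrieval results contain the same corpus text. For example, if the retrieval results of the three user history input sequences "iphone4签售", "iphone4发售" and "iphone4开卖" all contain the same corpus text "iphone4的销售额已突破...", then "iphone4签售", "iphone4发售" and "iphone4开卖" are user history input sequences that match the same corpus text.
Then, for the keyword unit "iphone4", suppose the user history input sequences containing it include "iphone4签售", "iphone4发售", "iphone4开卖", "iphone4游戏" and "iphone4娱乐", where "iphone4签售", "iphone4发售" and "iphone4开卖" match one corpus text and "iphone4游戏" and "iphone4娱乐" match another. The sixth acquisition device then determines that the number of user history input sequences that contain the keyword unit "iphone4" and match the same corpus text is 5.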
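Next, the type determination device determines the type of the keyword unit according to the second type determination information. The types include a retrieval type and an adjustment type; preferably, they also include an invalid type for units that need to be removed from the input sequence.
For example, the sixth acquisition device obtains a distribution concentration of 6.5 for the keyword unit "李弘基" in the predetermined corpus; the type determination device judges that 6.5 exceeds the predetermined distribution threshold of 4 and determines that the type of the keyword unit "李弘基" is the retrieval type.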
又例如, 所述第六获取装置获得关键词单元 "电视剧"的语义分析结果为名词, 则类型确定装置基于该语义分析结果将该关键词单元"电视剧"的类型确定为检索类 型。 再例如, 所述第六获取装置获得关键词单元 "主演 "的语义分析结果为动词, 则 类型确定装置基于语义分析结果将该关键词单元"主演"确定为调整类型。 再例如, 所述第六获取装置获得关键词单元 "的 "的语义分析结果为助词, 则类型确定装置基 于语义分析结果将该关键词单元"的"确定为无效类型。
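As another example, the sixth acquisition device finds that the number of user history input sequences that contain the keyword unit "李弘基" and match the same corpus text is 1000, which is above a predetermined judgment threshold, so the type determination device determines that the type of the keyword unit "李弘基" is the retrieval type.
As another example, the sixth acquisition device finds that the semantic analysis result of the keyword unit "李弘基" is a noun and that the number of user history input sequences matching the same corpus text is 1000. According to a predetermined rule that a noun is assigned the retrieval type when its number of user history input sequences matching the same corpus text exceeds 900, the type determination device determines that the type of the keyword unit "李弘基" is the retrieval type.
As another example, the sixth acquisition device finds that the distribution concentration of the keyword unit "李弘基" is 6.5 and that the number of user history input sequences matching the same corpus text is 1000. The sorting determination apparatus first normalizes the distribution concentration and the number of user history input sequences matching the same corpus text and then adds them, obtaining a comprehensive evaluation value of 1.2 for the keyword unit "李弘基". Since this is above the predetermined comprehensive threshold, the type of the keyword unit "李弘基" is determined to be the retrieval type.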
需要说明的是, 上述举例仅为更好地说明本发明的技术方案, 而非对本发明的 限制, 本领域技术人员应该理解, 任何根据所述第二类型确定信息来确定该关键词 单元的类型的实现方式, 均应包含在本发明的范围内。
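Two of the rules illustrated above (a distribution-concentration threshold, and a mapping from part of speech to type) can be combined in a small sketch. The threshold of 4 and the noun/verb/particle mapping come from the examples; the function name, the rule order and the `None` fallback are assumptions, since the text leaves the combination of criteria open.

```python
RETRIEVAL, ADJUSTMENT, INVALID = "retrieval", "adjustment", "invalid"

# "Predetermined distribution threshold 4" from the example in the text.
CONCENTRATION_THRESHOLD = 4.0
# Part-of-speech mapping taken from the examples: noun, verb, particle.
POS_TO_TYPE = {"noun": RETRIEVAL, "verb": ADJUSTMENT, "particle": INVALID}


def determine_type(concentration=None, pos=None):
    """Return the keyword-unit type, or None if no rule applies.

    The concentration rule is checked first; this ordering is an assumption.
    """
    if concentration is not None and concentration > CONCENTRATION_THRESHOLD:
        return RETRIEVAL
    if pos is not None:
        return POS_TO_TYPE.get(pos)
    return None


type_by_concentration = determine_type(concentration=6.5)   # "李弘基"
type_of_verb = determine_type(pos="verb")                   # "主演"
type_of_particle = determine_type(pos="particle")           # "的"
```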
对于本领域技术人员而言, 显然本发明不限于上述示范性实施例的细节, 而且 在不背离本发明的精神或基本特征的情况下, 能够以其他的具体形式实现本发明。 因此, 无论从哪一点来看, 均应将实施例看作是示范性的, 而且是非限制性的, 本 发明的范围由所附权利要求而不是上述说明限定, 因此旨在将落在权利要求的等同 要件的含义和范围内的所有变化涵括在本发明内。 不应将权利要求中的任何附图标 记视为限制所涉及的权利要求。 此外, 显然"包括"一词不排除其他单元或步骤, 单 数不排除复数。 系统权利要求中陈述的多个单元或装置也可以由一个单元或装置通 过软件或者硬件来实现。 第一, 第二等词语用来表示名称, 而并不表示任何特定的 顺序。
具体实施方式
现在, 将参考附图来详细地描述本发明的具体实施方式。
以下对本发明的具体实施例进行了描述。 需要理解的是, 本发明并不局限于下 述特定实施方式, 本领域技术人员可以在所附权利要求的范围内做出各种变形或修 改。
应当理解, 本申请之任一流程图中所示的方法步骤并不要求严格按照图示的顺 序执行。 某些步骤可以在另一些步骤之前执行, 或并入其它步骤, 有些步骤可以同 步执行, 诸如此类。
虽然以下的各具体实施例中, 均以鼠标、 键盘作为人机交互时的输入设备, 并 以用户设备的显示器作为人机交互的输出设备, 应当理解, 本发明并不排除使用其 它输入设备和输出设备的情形, 例如, 用户通过手写板进行输入, 用户设备通过扬 声器作为输出设备等。
当前的用户终端大多能够通过以太网、 WIFI、 3G或 2G随时连接到互联网, 因 此如果能够利用广泛分布于互联网上的服务器的强大处理能力和庞大的语料库来 辅助用户在用户终端上进行文字录入将能够大大地提高文字录入的准确度和效率。 因此本发明提出一种利用网络服务器辅助进行文字录入的方法、 用户设备、 网络服 务器和系统。
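First embodiment of the method, device, server and system of the present invention for text input by a user:
As shown in FIG. 4, a local corpus 1403 is stored on the user equipment 140 of the first embodiment of the present invention. The local corpus 1403 stores a basic vocabulary set, a basic language model, and the vocabulary set generated while the user uses the input method.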
除此之外, 该本地语料库 1403还可以存储一些辅助信息: 例如, 用户对输入法 的各种设置属性, 包括但不限于模糊音、 繁简体、 双拼、 全拼、 简拼等等; 以及用 户的属性信息, 包括但不限于职业、 爱好、 专业领域、 简历、 年龄等等。 这些辅助 的信息有助于对候选词条进行优化排序。
用户设备 140还具有键盘 1401 , 用于由用户输入文字的拼音字母或者笔画序列。 用户设备 140中的匹配装置 1402基于所输入的拼音字母或笔画序列在本地语料库 1403中查找匹配的本地词条选项, 并通过显示装置 1406显示出来, 供用户选择。 该 键盘可以是纯数字键盘或全字母键盘(QWERTY键盘) , 也可以是实体键盘或虚拟 键盘。 为了在输入过程中通过互联网获得来自网络服务器 150的协助, 本发明的用户 设备 140中增加了网络通信装置 1404和汇总装置 1405。 网络通信装置 1404通过互联 网或局域网与网络服务器 150进行通信, 把通过键盘 1401输入的拼音字母和笔画序 列发送到网络服务器 150。 网络服务器 150利用庞大的网络语料库和强大的处理能力 来查找适合的词条选项。 由网络服务器 150获得的网络词条选项被返回到网络通信 装置 1404 , 并且由网络通信装置 1404将接收到的网络词条选项传输给汇总装置 1405。 汇总装置 1405接收来自匹配装置 1402的本地词条选项和来自网络通信装置 1404的 网络词条选项, 经过汇总后在显示装置 1406上显示出来, 供用户选择。
网络服务器 150可以是分布在互联网上的多个网络服务器 1501 ... ... 150η。这些网 络服务器 1501 ... ... 150η协同工作, 构成一个服务器云, 为大量用户提供服务。 网络 服务器 150也可以是位于企业局域网上的一个或多个服务器。
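FIG. 1 is a flowchart of a method for inputting text on a user equipment that communicates with a network server, according to the first embodiment of the present invention. As shown, in step S1101 the key input sequence typed by the user on the keyboard 1401 of the user equipment is detected. The key sequence may be the abbreviated pinyin (initials only) or the full pinyin of one or more phrases or even a whole sentence. For example, to input "我喜欢用百度搜索引擎", the user may type the initial of each character, "wxhybdssyq"; the full pinyin of each character, "woxihuanyongbaidusousuoyinqing"; or a mixture of abbreviated and full pinyin, "woxhuanybaidssyinq". In general, typing full pinyin throughout yields more precise candidates and less paging, but requires more keystrokes. Typing abbreviated pinyin throughout produces many ambiguous matches, so paging through the candidates takes longer and efficiency is low. Mixing full and abbreviated pinyin is therefore usually the most effective. In particular, if the user inputs a new entry, the local corpus 1403 synchronously adds that entry according to the user's word selection.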
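Next, in step S1102, after the user's key input sequence is obtained, a matching query is performed on the input sequence in the local corpus 1403 of the user equipment 140 to obtain one or more matching local entry options.
In step S1103, the key input sequence is sent to the network server 150.
Steps S1102 and S1103 may be performed one after the other or simultaneously.
To display the obtained entry options quickly, after the local entry options are obtained in step S1102 the process may go immediately to step S1105, where the obtained local entry options are aggregated and displayed to the user for selection. Meanwhile, the network server 150 receives the key input sequence from the user equipment 140 and looks up matching network entry options in the network corpus.
In step S1104, the user equipment 140 receives the network entry options from the network server 150 and passes them to the aggregation device 1405. The process then goes to step S1105, where the aggregation device 1405 aggregates the local entry options from the matching device 1402 with the network entry options from the network server 150 and provides them to the display device 1406 for display and selection by the user. The entry options obtained after aggregation with the network entry options are more precise.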
由于输入的过程是动态连续的过程, 词条选项也是随着用户按键输入而不断变 化的, 因此步骤 S 1105之后又转回步骤 S 1101检测用户设备的按键输入。
上述步骤之间的次序是可以调换的, 而不影响本发明的实现。 例如为了更快地 获得网络词条选项, 可以调换步驟 S 1102和 S 1103的次序, 先把检测到的按键输入序 列发送到所述网络服务器。
在汇总装置 1405收到本地词条选项和网络词条选项之后如何进行汇总和显示 是本发明要解决的一个问题。
由于网络传输和服务器处理的滞后, 汇总装置 1405—般会先收到本地词条选项 然后才收到网络词条选项, 当网络服务器还没有反馈回网络词条选项时, 可以立即 将本地词条选项提供给显示装置 1406供用户选择, 不必与网络词条选项一同显示。
根据本发明第一实施方式中的第一实例, 为了快速地显示出候选词条, 汇总装 置 1405在获得所述本地词条选项之后, 在接收到所述网络词条选项之前, 将所获得 的本地词条选项按照其优先级排序在词条栏中显示给所述用户, 其中优先级越高, 该输入词条选项越靠前显示。 具体地, 匹配装置 1402可以根据用户输入历史记录中 对各个词条选项的选择频度、 各词条选项中各个词汇间的文义关联性来确定其优先 级高低。 匹配装置 1402也可根据用户设定的输入偏好选择来确定优先级高低。
在汇总装置 1405接收到所述网络词条选项之后, 用户可能已经在先前显示的本 地词条选项中选定部分词条, 或者已经翻页浏览部分本地词条选项。 这时所接收到 的网络词条选项中可能有部分词条与先前获得的本地词条选项相同。 因此需要从网 络词条选项中剔除这些已经被选定和 /或重复的词条。接着, 将剩余的网络词条选项 按照该词条的优先级插入到当前和后续显示的本地词条选项中, 而不改变本地候选 词条排列的先后次序。 这样处理的优点是, 在加入网络候选词条时, 用户当前浏览 的词条选项栏上的词条位置不会有太大的变化。
假设, 要用简拼" wxhybdssyq"输入"我喜欢用百度搜索引擎", 从本地语料库中 检索出 "无信号,微型化,无限好,玩笑话, ... "等本地词条选项对应于字母组合 "wxh" , 随后收到对应于 "wxh"的网络词条选项"无信号,微型化,我喜欢,无限好,玩笑话 ..."。 网络词条选项比本地词条选项多了一个"我喜欢"。 如果此时用户还没有对字母组合 "wxh"选定词条, 则将网络词条选项中重复的词条剔除后剩下的词条"我喜欢 "插入 到当前显示的选项中, 并不改变本地候选词条排列的先后次序。 如果词条"我喜欢" 的排序优先级比较当前显示的本地词条选项更低, 则将该词条"我喜欢 "按照其优先 级插入到后续显示的词条选项的适当位置。 当用户在词条选项栏上向后翻页时会显 示出已经插入在本地词条选项中的网络词条选项 "我喜欢"。
如果本地词条选项中也有选项"我喜欢" , 并且在收到对应于" wxh"网络词条选 项时, 用户已经在本地词条选项中选择了 "我喜欢", 则与字母组合 "wxh"对应的词 条被确定, 则剔除对应 "wxh"的所有网络词条选项。
由于网络语料库非常庞大, 匹配的结果会更加准确,例如可能直接返回"我喜欢 用百度搜索引擎"这个词条, 而不需逐个词组进行翻页选词, 因此即便网络反馈稍有 滞后仍然会大大加快输入速度。
在上述实例中, 由于网络滞后, 一些本地词条选项已经被显示, 网络词条选项 作为对本地词条选项的补充, 只需要选择本地词条选项中没有的网络词条选项按照 一定次序插入到当前显示的词条中即可。 这时并不改变本地词条选项显示的先后次 序, 只是补充一下本地词条选项中没有的网络词条选项。
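The first example's merging rule (drop network entries that duplicate local ones or were already selected, then insert the rest by priority without reordering the local entries) can be sketched as below. The function name, the `(entry, priority)` tuples and the numeric priorities are hypothetical; the text does not specify how priorities are represented.

```python
def merge_network_entries(local_shown, network, selected=()):
    """Insert new network entries into the displayed local list.

    local_shown: list of (entry, priority) in current display order.
    network:     list of (entry, priority) received from the server.
    selected:    entries the user has already chosen.
    The relative order of the local entries is never changed.
    """
    seen = {entry for entry, _ in local_shown} | set(selected)
    merged = list(local_shown)
    for entry, priority in network:
        if entry in seen:
            continue  # duplicate of a local entry, or already selected
        seen.add(entry)
        # Insert before the first displayed entry with a lower priority.
        index = next((i for i, (_, p) in enumerate(merged) if p < priority),
                     len(merged))
        merged.insert(index, (entry, priority))
    return [entry for entry, _ in merged]


# Priorities below are made-up values for the "wxh" example in the text.
local = [("无信号", 0.9), ("微型化", 0.8), ("无限好", 0.7), ("玩笑话", 0.6)]
network = [("无信号", 0.9), ("微型化", 0.8), ("我喜欢", 0.75),
           ("无限好", 0.7), ("玩笑话", 0.6)]
merged = merge_network_entries(local, network)
```

Because entries are only ever inserted, the positions the user is already looking at shift by at most the number of new network entries, which matches the stated advantage of this example.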
本发明第一实施方式的第二实例与第一实例类似, 为了快速地显示出候选词条, 汇总装置 1405在获得所述本地词条选项之后, 在接收到所述网络词条选项之前, 也 是先将所获得的本地词条选项按照其优先级排序在词条栏中显示给所述用户。
与第一实例不同之处在于, 在接收到所述网络词条选项之后, 将所述网络词条 选项中剔除已经显示的词条后剩余的网络词条选项与当前显示和还未被显示的本 地词条选项一同按照这些词条的优先级重新排列后显示, 供用户选择。 这里所述的 已经显示的词条是指当前显示的词条之前被浏览过的、 被翻页排除的词条选项。
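For example, suppose "我喜欢用百度搜索引擎" is to be input with the abbreviated pinyin "wxhybdssyq". The local entry options retrieved from the local corpus for the letter combination "wxh" are "无信号, 微型化, 无限好, 玩笑话, ..., 我喜欢", in which the correct entry "我喜欢" has a low priority and can only be found after paging several times. The network entry options for "wxh" are then received: "我喜欢, 无信号, 微型化, 无限好, 玩笑话 ...". Because the matching precision is higher, "我喜欢" has a high priority among the network entry options. At this moment the user is browsing the first few local entry options, the irrelevant entries "无信号, 微型化, 无限好, 玩笑话". Entries that have already been displayed were not selected by the user, so they are first removed from the received network entry options; the remaining network entry options are then rearranged by priority together with the local entry options that are currently displayed or not yet displayed. Since the correct entry "我喜欢" has a high priority among the network entry options, it is moved into the currently displayed entry options after rearrangement. The advantage of this example is that the correct entry can quickly be moved to the front or to first place among the entry options. Since the same entry may have different priorities among the local entry options and the network entry options, the new order may be decided by a weighted average of the entry's two priorities.
In the second example, the network entry options received later are rearranged, according to certain rules, together with the local entry options that are currently displayed or not yet displayed. In this case the display order of the local entry options does change.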
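The re-ranking of the second example, where an entry found in both lists is ordered by a weighted average of its two priorities, can be sketched as follows. The function name, the equal default weights and all numeric priorities are assumptions made purely for illustration.

```python
def rerank(local, network, already_shown=(), w_local=0.5, w_network=0.5):
    """Re-rank entries after the network options arrive.

    local, network: dicts mapping entry -> priority (higher is better).
    already_shown:  entries the user has already paged past; they are dropped.
    Entries present in both dicts get a weighted average of both priorities.
    """
    pool = (set(local) | set(network)) - set(already_shown)

    def combined(entry):
        if entry in local and entry in network:
            return w_local * local[entry] + w_network * network[entry]
        return local.get(entry, network.get(entry))

    # Secondary key on the entry text keeps the result deterministic.
    return sorted(pool, key=lambda e: (-combined(e), e))


# Hypothetical priorities for the "wxh" example.
local = {"无信号": 0.9, "微型化": 0.8, "无限好": 0.7, "玩笑话": 0.6,
         "我喜欢": 0.3}
network = {"我喜欢": 0.95, "无信号": 0.5, "微型化": 0.4, "无限好": 0.3,
           "玩笑话": 0.2}
reranked = rerank(local, network)
```

With these values "我喜欢" rises from last place in the local list to second place overall, so it reaches the page the user is currently viewing.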
本发明第一实施方式的第三实例与第二实例类似, 区别仅仅在于重新排列时完 全按照该网络词条选项的优先级进行重排, 在此不再赘述。
在第三实例中, 收到的网络词条选项后不再显示本地词条选项, 而是显示先前 未显示网络词条选项。
本发明第一实施方式的第四实例与第一至第三实例不同, 在获得所述本地词条 选项之后, 并不立即将所获得的本地词条选项按照其优先级显示给用户以供其选择, 而是等待接收到所述网络词条选项之后才按照这些词条的优先級整体排列后显示 给用户进行选择。 该实施例显示的词条选项比较准确, 特别是对于整句话连续输入 再进行词条选择的情况更加有利, 因为整句话连续输入的时间比较长, 网络响应的 延迟不会造成太大的影响, 而整句话的匹配需要更大语料库、 语言匹配模型和处理 能力的支持, 因此等待接收到网络词条选项才显示给用户选择将提供更加准确的结 果。
在第四实例中, 等待接收到网络词条选项后才一同排序和显示本地词条和网络 词条选项 (需要剔除重复选项) , 在网络响应速度非常快的情况以及整句连续输入 的情况下比较有利。
匹配装置 1402可以根据词条先前是否被选择过、 词条先前被选择的时间先后、 词条先前被选择的次数、用户预置的输入偏好选项和 /或词条在网络上的被搜索的次 数来确定本地词条选项的优先级。
在网络服务器 150上的网络语料库可以包括分别对应于每个用户的用户网络语 料库 1501以及公共网络语料库 1504。
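The user network corpus 1501 is a backup, on the network server 150, of each registered user's local corpus 1403. The user equipment 140 further includes a local synchronization device (not shown), which, when a registered user logs in to the network server 150, synchronizes the user network corpus 1501 that the user keeps on the network server 150 with the local corpus 1403. The user equipment 140 further includes a local update device (not shown), which updates the local corpus 1403 according to the user's selection of entries and sends the selection to the network server 150 to update the user network corpus 1501. Since some words are input repeatedly within a context, the local corpus 1403 and the user network corpus 1501 need to be updated promptly, raising the priority of recently input entries to speed up input.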
与本地语料库 1403相同, 用户网络语料库 1501也存储有基础词汇集、 基础语言 模型、 用户使用输入法过程中生成的词汇集。 还可以存储一些辅助信息: 例如, 用 户对输入法的各种设置属性, 包括但不限于模糊音、 繁简体、 双拼、 全拼、 筒拼等 等; 以及用户的属性信息, 包括但不限于职业、 爱好、 专业领域、 简历、年龄等等。
由于网络服务器 150上保存有用户网络语料库, 因此无论用户用哪个终端设备, 只要能够连接到网络服务器 150都可以通过登录后同步本地语料库 1403或在线使用 用户网络语料库 1501来快速地进行录入。
公共网络语料库 1504基于对公开文献、 出版物、 大量用户的输入、 大量用户在 网络搜索引擎上的检索词汇、大量网页的索引关键词和 /或关键词广告信息进行分析 统计而形成, 其反映用户群体的共性或热点。
下面参照图 2、 图 3和图 5描述网络服务器 150的结构和其操作流程。
如图 5所示, 本发明第一实施方式的网络服务器 150包括用户网络语料库 1501、 匹配装置 1502、 网络通信装置 1503、 公共网络语料库 1504、 语料库更新装置 1505、 关键词广告库 1506以及同步装置 1507。
如上文所述, 用户网络语料库 1501和公共网络语料库 1504合称为网络语料库。 网络通信装置 1503通过网络连接到一个或多个用户设备 140, 用于经由网络接收用 户在用户设备上的按键输入序列, 并将基于该按键输入序列所获得的网络词条选项 反馈回所述用户设备, 供用户选择。
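The matching device 1502 is connected to the user network corpus 1501, the public network corpus 1504 and the network communication device 1503, and is used to perform a matching query in the user network corpus 1501 and the public network corpus 1504 based on the key input sequence to obtain one or more matching network entry options. The matching device 1502 further includes a priority determination device (not shown), which determines the priority of each entry among the matched entry options according to whether the entry has been selected before, how recently it was selected, how many times it has been selected, the input preference options preset by the user, and/or the number of times the entry has been searched for on the network.
The synchronization device 1507 is connected to the user network corpus 1501 and the network communication device 1503. When the user logs in to the network server 150 through the user equipment 140, and after a corpus synchronization instruction is received from the user equipment, it synchronizes the user's user network corpus 1501 with the local corpus 1403 in the user equipment.
The corpus update device 1505 is connected to the user network corpus 1501 and the public network corpus 1504. It updates the user network corpus 1501 according to the user's input and word selection, and updates the public network corpus 1504 according to statistical analysis of the input of a large number of users, the search terms of a large number of users on network search engines, the index keywords of a large number of web pages and/or keyword advertisement information.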
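The keyword advertisement library 1506 is used to provide advertisement links related to keywords. A vendor may purchase certain keywords or letter combinations. For example, Baidu may purchase keywords such as "百度" and "搜索引擎", or letter combinations such as "bd", "baidu" and "ssyq". When the matching device 1502, based on the key sequence from the user equipment 140, matches an entry such as "百度" or "搜索引擎" in the network corpus, or receives a combination such as "bd", "baidu" or "ssyq", it finds the corresponding keyword advertisement information "百度网" and its link in the keyword advertisement library 1506. The keyword advertisement information is returned to the user equipment 140 through the network communication device 1503 and displayed among the entry options. By selecting the advertisement information, the user can jump to the corresponding URL. "Selecting" here includes clicking with the mouse as well as pressing the corresponding number key on the keyboard. The priority determination device in the matching device 1502 may assign keyword advertisement information a relatively high priority, to ensure that it is displayed in the first round of entry options shown to the user or among the currently displayed options.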
根据需要, 关键词广告库 1506可以合并到公共网络语料库 1504中, 或者合并到 用户网络语料库 1501中, 当需要更新关键词广告信息时, 只需把新的关键词广告信 息在网络服务器 150加入到用户网络语料库 1501中, 通过同步的方式下载到用户设 备 140的本地语料库 1403。 这样本地语料库 1403中包含关键词广告库, 在本地语料 库中找到关键词广告信息时, 将该关键词广告信息作为本地词条选项显示出来, 所 述关键词广告信息带有链接, 当用户选择所显示的广告信息时可以跳转到相关链接 即使用户设备 140没有连接到互联网, 也可以在输入文字时出现广告信息的词条选 项, 增加广告的曝光率。
图 2为根据本发明第一实施方式在网络服务器上辅助用户进行文字输入的方法 的流程图。
如图 2中所示, 在步骤 S 1201中, 网络服务器 150的网络通信装置 1503经由网络 接收用户在用户设备 140上的按键输入序列;
在步驟 S 1202中, 基于所述按键输入序列在网络语料库中进行匹配查询获得一 个或多个匹配的网络词条选项;
在步骤 S 1203中, 将所获得的所述网络词条选项反馈回所述用户设备 140 , 供用 户选择。 如上文所述, 用户可以在网络服务器 150上注册为注册用户。 注册用户可以登 录到网络服务器 150, 并在网络服务器 150上保留用户网络语料库 1501。 在已登录的 情况下, 用户还可以选择是否同步用户网络语料库 1501和用户设备的本地语料库 1403。 根据用户是否登录以及是否执行语料库的同步, 匹配装置 1502可以执行不同 的匹配操作, 以提供精确的网络词条选项。 图 3示出在这些情况下的处理步骤。
图 3为根据本发明第一实施方式在网络服务器上辅助用户进行文字输入的方法 中在网络语料库中进行匹配查询的具体步骤的流程图。
如图 3所示, 在步驟 S1201之后, 执行步骤 S1301 , 判断用户是否登录网络服务 器 150 , 如果没有登录, 由于网络服务器 150不能判断该用户的身份, 则不能利用该 用户保留在网络服务器 150上的用户网络语料库 1501进行匹配查询, 而是转到步驟 S 1503 , 在接收到用户在用户设备上的按键输入序列时仅仅在所述公共网络语料库 中检索匹配的网络词条选项。登录的方式可以有多种,例如利用用户名和密码登录、 利用用户设备的 MAC地址自动登录、 利用用户设备的固定 IP地址自动登录等等。
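If the user has logged in, the process proceeds to step S1302, which judges whether the user's user network corpus 1501 is synchronized with the local corpus 1403 of the user equipment. If the user is on someone else's computer or a public computer, he may not want his user network corpus 1501 synchronized to the local device, nor the other person's lexicon on the local device synchronized into his own user network corpus 1501, so the user may choose not to synchronize. The user nevertheless still wants the support of his user network corpus 1501. Therefore, when step S1302 judges that no synchronization has taken place, matching network entry options are retrieved from both the public network corpus 1504 and the user's user network corpus 1501 on receiving the key input sequence from the user equipment. When step S1302 judges that the user network corpus has already been synchronized with the local corpus of the user equipment, the user equipment will first look up matching entry options in the local corpus, so there is no need for the network server 150 to repeat the lookup in an identical user network corpus 1501; in this case, on receiving the key input sequence from the user equipment, matching network entry options are retrieved only from the public network corpus 1504.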
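The branching of steps S1301 and S1302 reduces to a small decision function. The sketch below is only an illustration of that decision; the function name and the string labels for the two corpora are hypothetical.

```python
PUBLIC = "public_network_corpus_1504"
USER = "user_network_corpus_1501"


def corpora_to_search(logged_in, synced_with_local):
    """Choose which server-side corpora to query for a key input sequence."""
    if not logged_in:
        # The server cannot identify the user, so only the public corpus applies.
        return [PUBLIC]
    if synced_with_local:
        # The device already searched an identical copy in its local corpus.
        return [PUBLIC]
    # Logged in but not synchronized: the user corpus adds new information.
    return [PUBLIC, USER]


anonymous = corpora_to_search(logged_in=False, synced_with_local=False)
shared_pc = corpora_to_search(logged_in=True, synced_with_local=False)
own_pc = corpora_to_search(logged_in=True, synced_with_local=True)
```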
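In addition, when the matching device 1402 and the matching device 1502 perform a matching query in the lexicon according to the user's input sequence and obtain a plurality of input entry options, they also obtain the priority of each option. The aggregation device 1405 displays the matching input entry options provided by the matching device 1402 and the matching device 1502 to the user in the entry bar in priority order; the higher the priority, the nearer the front the option is displayed. Specifically, the matching device 1402 and the matching device 1502 may determine priority according to how frequently each entry option was selected in the user's history and the semantic relatedness among the words within each option. They may also determine priority according to input preferences set by the user. For example, when the user's input preferences are: 1) priority: computer vocabulary > electronics vocabulary > general vocabulary; 2) priority: Chinese > English, then for the input sequence "woshiyongwindows" it can be judged that "我使用视窗软件" has the highest priority, followed by "我使用视窗" and then "我使用 windows". In addition, the matching device 1402 and the matching device 1502 may judge the region in which the user equipment is located from its current IP address, and thereby determine the priority of words in the input sequence that relate to that region. For example, when the user's input sequence is "woxihuanbund", the translations of "bund" include "1 堤岸, 2 码头, 3 同盟, 4 (上海) 外滩"; the network communication device 1503 learns from the IP address of the user equipment that it is currently located in Shanghai, China, so "上海外滩" or "外滩" can be given the highest priority among the translations of "bund", and the following input entry options can be provided: "1 我喜欢上海外滩; 2 我喜欢外滩; 3 我喜欢码头; 4 我喜欢堤岸; 5 我喜欢同盟".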
本发明的第一实施方式还提供一种用于供用户进行文字输入的系统, 其中包括 本发明第一实施方式的用于输入文字的用户设备和本发明第一实施方式的用于辅 助用户设备输入文字的网络服务器。
本发明用于供用户进行文字输入的方法、设备、服务器及系统的第二实施方式, 该第二实施方式是在第一实施方式的基础上, 增加了 "群组语料库 "的概念, 以更好 的提升输入法首选项的命中率, 提高输入效率:
内部网或局域网的多用户, 如企业客户、 网吧、 翻译服务等类型的用户通常具 有较明显的共性。这种共性可能是相同或相似的工作内容、相同或相似的兴趣爱好、 相同或相似的年龄阶段、 相同或相似的地理区域等。 由于这种共性, 多个用户之间 在进行文字输入时对于备选词项的选择体现出一种趋同或相似。 例如, 同样是键入 "shzh"这样的缩略输入, 对于深圳市的居民, 命中率最高的首选词项是"深圳"; 对 于河北省深州市的居民, 命中率最高的首选词项则 "深州 "; 对于航天产业的从业人 员, 命中率最高的首选词项则很有可能是"神舟 (N号) "; 而对于以上共性皆不具 备的普通用户来说, 命中率最高的首选词项有可能是"神州 (大地) "。 尽管现在大 多数输入法都提供词频调整功能, 即根据输入选择的最新历史记录调整备选词项的 显示顺序, 然而这种调整必须进行学习或训练, 当目标是一个并不常用甚至较为生 僻的词时, 这种繁瑣是不言而喻的。 并且这种现有的词频调整也是不稳定的, 例如 当用户换了另一台计算机, 将需要重新进行学习和训练。 因此, 如果能够通过网络 对具有较大共性的其他用户的先前输入进行利用, 辅助用户在用户终端上进行文字 录入, 无疑将大大地提高文字录入的准确度和效率。
在本实施方式中, 与网络服务器进行通信的供用户进行文字输入的用户设备与 第一实施方式所揭示的相同, 在此可参考说明书第一实施方式中结合图 4的阐述, 在此不再赘述。
图 6为根据本发明第二实施方式在与网络服务器通信的用户设备端输入文字的 方法的流程图。 如图所示, 在步骤 S2101中, 检测用户在用户设备的键盘 1401上 的按键输入序列。 该步驟和本发明第一实施方式中步骤 S 1 101相似, 其具体应用举 例可参本发明第一实施方式步骤 S 1 101中所记载的内容, 在此不再赘述。 但利用根 据本实施方式的文字输入方法, 用户可以更进一步地简化输入, 甚至在只用每个字 的首字母的情况下, 也能很快得到想要的结果, 因为可利用的群组词条, 是已经由 与之具有共性的其他用户训练过的。
接着, 在步骤 S2102中, 获得用户的按键输入序列后, 将所述输入序列在用户 设备 140的本地语料库 1403中进行匹配查询, 以获得一个或多个匹配的本地词条 选项。
在步骤 S2103中 , 将所述按键输入序列发送至网络服务器。
上述步骤 S2102和 S2103可以先后执行, 也可以同时执行。
为了快速地显示所获得的词条选项 , 在步骤 S2102中获得本地词条选项之后, 可以立即转到步骤 S2105 ,将所获得的本地词条选项汇总并显示给用户,供其选择。 与此同时, 网络服务器收到来自用户设备 140的按键输入序列, 并在与之所属群组 相关联的群组语料库中查找匹配的群组词条选项。
在步驟 S2104 , 用户设备 140接收到来自网络服务器的群组词条选项并发送到 汇总装置 1405。 然后转到步骤 S2105 , 在汇总装置 1405中将来自匹配装置 1402的 本地词条选项和来自网络服务器的群组词条选项进行汇总后提供给显示装置 1406 显示, 由用户进行选择。 由于网络传输和服务器处理的滞后, 汇总装置 1405—般 会先收到本地词条选项然后才收到群组词条选项, 当网络服务器还没有反馈回群组 词条选项时, 可以立即将本地词条选项提供给显示装置 1406供用户选择, 不必与 群组词条选项一同显示。 当然, 经过与群组词条选项汇总后的词条选项更加精确。
由于输入的过程是动态连续的过程, 词条选项也是随着用户按键输入而不断变 化的, 因此步骤 S2105之后又转回步骤 S2101检测用户设备的按键输入。
上述步骤之间的次序是可以调换的, 而不影响本发明的实现。 例如为了更快地 获得群组词条选项, 可以调换步驟 S2102和 S2103的次序, 先把检测到的按键输入 序列发送到所述网络服务器。
在本实施方式的第一实例中, 上述方法还可以包括用户设备侧的注册步骤 S2106 , 即在进行文字输入之前, 可以通过网络通信装置 1404将所述用户注册到网 络服务器上的一个或多个用户群组, 所述用户群组与所述群组语料库相关联。 这种 注册过程例如可以采用本领域公知的群组注册功能。 但是, 也可能不需要这样的注 册步骤即可将用户与某个群组相关联。 例如, 由于用户属于某个业务部门, 针对该 业务部门的群组在 (例如由部门主管)创建时, 部门成员包括该用户即被自动添加 到了该群组。 这样, 当用户在进行文字输入时, 网络服务器基于其身份可以立即确 定其属于该业务部门群组, 并从所对应的群组语料库中为其搜索相关群组词条。
在实施方式中, 上述方法还可以包括所述用户向所述网络服务器发送用户身份 信息的步骤 S2107 , 以便所述网络服务器确定与之关联的用户群组, 进而确定与之 关联的群组语料库。 然而该步骤也并非网络服务器用于确定用户群组所必不可少的 步骤。 例如, 当网络服务器本身是仅适用于某个或某些用户群组的内部网服务器, 则用户无需任何认证程序即可被认为可以利用该网络服务器上的群组语料库。
在本实施方式中, 上述方法还可以包括如下步骤 S2108 : 所述用户将自己选定 的词条发送到所述网络服务器, 以便所述网络服务器更新至与所述用户所属用户群 组相关联的群组语料库。 根据这一功能, 用户群组中的每个成员可以向群组语料库 提供自己的贡献, 例如新的词条、 自身累进的输入习惯等, 这些资源可以以适当形 式被收集在群组语料库的词条或词条属性中, 供群组中其他成员参考或直接利用。 优选地, 对于用户群组中具有较高权限或占主导地位的成员的贡献, 赋予较大的权 重。 例如, 部门主管所提供的新词条, 在反馈时具有较高优先级。
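In FIG. 6, the optional steps S2106-S2108 described above are shown with dashed boxes and dashed connecting lines.
The structure and operation flow of the network server in this embodiment are described below with reference to FIG. 7 and FIG. 9.
FIG. 9 is a block diagram of a network server for assisting a user in text input according to this embodiment.
As shown in FIG. 9, the network server 250 includes one or more group corpora 2501 (only one is shown, to simplify the figure), a matching device 2502, a network communication device 2503, a group management device 2504, and a group corpus management and update device 2505.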
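As shown in FIG. 9, the network communication device 2503 is connected to one or more user equipments 140 through the network. It receives, via the network, the key input sequence typed by the user on the user equipment, and feeds the group entry options obtained from that key input sequence back to the user equipment for selection by the user. As described above, the network communication device 2503 further includes an identity information receiving device, which receives the optional user identity information and forwards it to the group management device 2504 so that the latter can determine the user group associated with the user. The network communication device 2503 further includes an entry receiving device, which receives the entries finally selected by the user during input and forwards them to the group corpus management and update device 2505, so that the latter can use these entries to update the group corpus associated with the user group to which the user belongs. Preferably, the entry receiving device is also used to receive material related to the user group and forward it to the group corpus management and update device 2505, so that the latter can use this material to initially build the group corpus associated with the user group to which the user belongs.
The matching device 2502 is connected to the group corpus 2501, the network communication device 2503 and the group management device 2504. It performs a matching query in the group corpus 2501 based on the key input sequence to obtain one or more matching group entry options, and then sends the group entry options to the network communication device 2503 to be returned to the user equipment 140. The matching device 2502 selects, from the plurality of group corpora 2501, the one or more group corpora corresponding to the inputting user according to the user group information determined by the group management device 2504, and performs the matching query in them. In addition, the matching device 2502 further includes a priority determination device (not shown), which determines the priority of each entry among the matched entry options according to the priority of the entry's source user, whether the entry has been selected before, how recently it was selected, how many times it has been selected, the input preference options preset by the user, and/or the number of times the entry has been searched for on the network.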
群组管理装置 2504连接到匹配装置 2502、网络通信装置 2503以及群组语料库 管理和更新装置 2505。 群组管理装置 2504负责管理用户群组, 包括从网络通信装 置 2503接收用户注册信息, 并将用户注册到一个或多个用户群组; 维护用户群组 信息, 例如群组名称、 群组成员 ID、 群组所对应的群组语料库编号等; 根据用户身 份确定其所属群组, 并将确定结果发送至匹配装置 2502帮助其选择进行匹配查询 的群组语料库 2501。 另外, 群组管理装置还辅助群组语料库管理和更新装置 2505 管理和更新词库, 例如将某个用户在用户群组中的优先级别信息发送至群组语料库 管理和更新装置 2505 , 后者根据该信息调整相关词条的优先级属性等。
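The group corpus management and update device 2505 is connected to the network communication device 2503. It receives the entries sent by users and updates them into the entries of the group corpus 2501 or their attributes. Preferably, the group corpus management and update device 2505 may also receive material related to the user group from the network communication device 2503 and, by learning from or training on that material, initialize or update the group corpus 2501. This function is very useful for further simplifying user input and for reducing the system's overhead in building categorized lexicons. For example, for a group engaged in translating patent documents in the semiconductor field, a user may upload a document containing common semiconductor terms such as "蚀刻", "汽相沉积" and "涂覆". The group corpus management and update device 2505 uses this document to initialize a group corpus 2501 for the group, sparing its members the training effort otherwise required when the related entries are input for the first time.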
群组语料库 2501是本发明所引入的重要概念, 其直接对应于用户群组, 通常 每个用户群组对应于一个群组语料库。 群组语料库 2501包含其所对应的用户群组 中的群组成员最常用的词条, 将该语料库在群组成员中共享可以使这些成员用户节 省很多耗时费力的输入法训练步骤, 在其他成员的输入基础上直接获得自己想要的 输入结果。 在下文中将结合图 1 1对群组语料库 2501的构成以及群组词条的属性进 行详细说明。
图 7为根据本实施方式在网络服务器上辅助用户进行文字输入的方法的流程图。 如图 7中所示, 在步驟 S2201中, 网络服务器 250的网络通信装置 2503经由 网络接收用户在用户设备 140上的按键输入序列;
在步驟 S2202中, 基于所述按键输入序列和用户所加入的群组在对应的群组语 料库中进行匹配查询获得一个或多个匹配的群组词条选项;
在步骤 S2203中, 将所获得的群组词条选项反馈回所述用户设备 140 , 供用户 选择。
在本实施方式中, 上述方法还可以包括网络服务器侧的注册步骤 S2204 , 即在 文字输入之前, 可以通过网络通信装置 2503接收用户的注册信息, 并将用户注册 到一个或多个用户群组, 所述用户群组与所述群组语料库相关联。 这种注册过程例 如可以采用本领域公知的群组注册功能。 优选地, 可以根据各种标准即用户所具有 的任何共性来建立用户群组, 例如从事相关的工作、 完成同一项任务、 具有相似的 爱好或者居住于同一个城市。 下文中结合图 10详述了用户群组的例子。
但是, 也可能不需要这样的注册步驟即可将用户与某个群组相关联。 例如, 由 于用户属于某个业务部门, 针对该业务部门的群组在 (例如由部门主管) 创建时, 部门成员包括该用户即被自动添加到了该群组。 这样, 当用户在进行文字输入时, 网络服务器基于其身份可以立即确定其属于该业务部门群组, 并从所对应的群组语 料库中为其搜索相关群组词条。
在本实施方式中, 上述方法还可以包括接收所述用户发送的用户身份信息的步 驟 S2205 , 以便确定与之关联的用户群组, 进而确定与之关联的群组语料库。 然而 该步骤也并非网络服务器 250用于确定用户群组所必不可少的步驟。 例如, 当网络 服务器 250本身是仅适用于某个或某些用户群组的内部网服务器, 则用户无需任何 认证程序即可被认为可以利用该网络服务器 250上的群组语料库 2501。
在本实施方式中, 上述方法还可以包括接收用户返回词条以更新群组语料库 2501的步驟 S2206。 根据这一功能, 用户群组中的每个成员可以向群组语料库提供 自己的贡献, 例如新的词条、 自身累进的输入习惯等, 这些资源可以以适当形式被 收集在群组语料库的词条或词条属性中, 供群组中其他成员参考或直接利用。 优选 地, 对于用户群组中具有较高权限或占主导地位的成员的贡献, 赋予较大的权重。 例如, 部门主管所提供的新词条, 在反馈时具有较高优先级。
在图 7中, 以虛线方框以及虚线连接线来表示上述可选步骤 S2204-S2206。 图 8为根据本实施方式的辅助用户进行文字输入的系统 230的框图。 如图 8所 示, 系统 230包括网络服务器 250和用户设备 140 , 网络服务器 250和用户设备 140 通过网络连接。 网络代表使用例如 TCP/IP协议集来彼此通信的全球范围内的网络 和网关集合, 其可以是以主要节点或主机计算机之间的高速数据通信线路的骨干网 为核心的因特网, 其由成千上万的商业、 政府、 教育和对数据和消息进行路由的其 他计算机系统组成。 网络还可以实现为大量不同类型的网络,诸如,例如, 内联网、 局域网 (LAN )或广域网 (WAN ) 。 图 8意在作为一个示例, 而不是对不同示范性 实施方式的结构性限制。
图 10示出了根据本实施方式在网络服务器上注册的用户群组的示意图。 图中 示出了 4个用户群组,分别为 "篮球同盟"群组 2601、 "正华花园小区"群组 2602、 "半 导体领域翻译 "群组 2603以及"世博旅游"群组 2604。 这些群组分别代表了根据用户 共同的兴趣爱好、 居住区域、 从事工作、 短期关注来划分的用户群组。 实际上, 还 可以有更多的分组标准, 例如同一个网络游戏的玩家、 同一个大学的学生等等。 在 现实生活中, 用户之间可能具有千丝万缕的联系, 导致多个用户之间能够具有某种 在于该分类对于群组成员的输入法效率的促进效果。 对于共同拥有很多为群组之外 的人所不了解的资源的群组来说, 这种群组之间的共享是最有效果的。 例如, 上面 例子中专业性较强的半导体群组以及同玩网络游戏的玩家群组 , 这些群组中的群组 成员之间的共同语言对于一般输入法来说常常是比较生僻的词条,诸如"无卤素树脂 " (半导体术语) 、 "虫族、 神族"(网络游戏用语) 等。
如图所示, 用户群组 2601~2604各自包括其群组成员, 群组成员可以利用自己 的真实名字或网络 ID等来注册到群组。 不同的群组所包括的成员可以有重复或交 叉。例如, "半导体领域翻译"群组 2603和"世博旅游"群组 2604均包括成员"李梦", "篮球同盟"群组 2601中的成员全部是"正华花园"群组 2602的成员, 因为该篮球同 盟就是该小区业主之间的一个组织。
如上所述,各用户群组中被突出显示(图中利用加粗字体表示)的成员(例如, 群组 2601中的张小亮)具有较高优先级, 其他成员具有较低优先级。 如果需要, 还可以设置更细化的优先级级别。 这种优先级区别主要用于群组成员向群组语料库 贡献词条时帮助确定词条的优先级。 网络服务器在接收到群组成员反馈或贡献的词 条时, 将根据其来源用户的优先级、 该词条的当前优先级、 最近被选择时间或次数 来分配或更新该词条的优先级。 词条优先级是词条属性中重要的一项, 用于确定向 用户提供的群组词条在备选框中出现的顺序。 例如, "篮球同盟"群组的队长最近拟 向群组成员推广腰旗橄榄球运动来提高成员身体素质, 其作为来源用户提供的词条 "yq -腰旗"将被赋予比某一成员提供的词条" yq -延庆"具有更高的优先级。
在图 10中的下半部分中, 还给出了这些用户群组所针对的群组语料库所包含 的部分词条, 以补充说明这种用户分组对于输入法的促进意义。 例如, 其中"正华花 园"群组所对应的群组语料库所包含词条有 "xfxch -双福洗车行"、 "syj -三友居饭 店"、 "lj -丽家宝贝"这些小区周边商户的名称,以及" psy -潘山屹(物业经理名字)"、 "rjq -任坚强 (业务会主任) "这样的小区业主熟知常用的词条。
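FIG. 11 is a schematic diagram of an attribute list 2700 of the group entries in a group corpus maintained on the network server according to an embodiment of the present invention. The list 2700 shows several entries of the group corpus for a basketball league group. The first column 2701 is the content of the entry, such as "zb - 走步 (basketball term)", "lb - 篮板 (basketball term)", "bkl - 巴克利 (name of a basketball star)" and "huren - 湖人 (name of an American NBA team)". Besides these common basketball-related words, the group corpus also includes group-specific terms such as "zhhb - 正华杯 (name of a residential-community tournament)" and "yq - 腰旗 (a group activity)".
The second to seventh columns 2702-2707 of the list 2700 show the attributes of each entry: group identifier, priority level, number of times selected, time of most recent selection, source user and/or target user. Three of these attributes deserve particular attention: priority level, source user and target user. The priority level determines the order in which entries are offered to the user when several entries match; entries with a higher priority level are ranked nearer the front. The source user is the member user who provided the entry to the corpus management and update device; this attribute affects how the entry's priority is set. The target user is the range of users to whom the entry may be offered. For example, it may be specified that "xincheng - 新成广场" is offered only to the basketball team members who take part in the weekly activity, while peripheral members who also like basketball do not share this entry, so as not to disturb them unnecessarily. As another example, for a group formed on the basis of work content, the target users of some entries may be all members, while the target users of other entries may be only some of the staff, excluding ordinary staff unrelated to the entry from being offered it. For example, in the group corpus for the "百度" company group, the target users of the entries "bdtb - 百度贴吧" and "ssyq - 搜索引擎" may be all members, while the target users of the entries "zx - 撰写" and "qq - 侵权" may be limited to members of the company's intellectual property management department, further improving the accuracy of the group entries returned and the users' input efficiency.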
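The priority-level and target-user attributes described above can be illustrated with a small lookup sketch. The `GroupEntry` class, the numeric priority values and the member name "outer_member" are hypothetical; only the entries "腰旗", "延庆" and "新成广场" and the member "张小亮" appear in the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class GroupEntry:
    code: str                  # key sequence, e.g. "yq"
    text: str                  # entry offered to the user
    priority: int              # higher values are listed first
    target_users: Optional[FrozenSet[str]] = None  # None = all group members


def lookup(entries, code, user):
    """Return the texts matching `code` that `user` may see, by priority."""
    visible = [
        e for e in entries
        if e.code == code
        and (e.target_users is None or user in e.target_users)
    ]
    return [e.text for e in sorted(visible, key=lambda e: -e.priority)]


entries = [
    GroupEntry("yq", "延庆", priority=5),
    GroupEntry("yq", "腰旗", priority=9),   # provided by a high-priority member
    GroupEntry("xincheng", "新成广场", priority=5,
               target_users=frozenset({"张小亮"})),
]
for_any_member = lookup(entries, "yq", "outer_member")
hidden = lookup(entries, "xincheng", "outer_member")
for_team_member = lookup(entries, "xincheng", "张小亮")
```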
前面提到, 词条优先级属性可以根据被选时间、 次数、 来源用户等来进行动态 调整, 以使得所返回的词条的顺序更符合该群组成员的输入习惯, 将最准确的词条 选项返回给用户。 众所周知, 很多输入法为第一顺位的备选词分配了最方便使用的 空格键, 而下面顺序的备选词则需要通过数字键、 " + "和" - "或者其他快捷键来选 择, 而在重码很多的情况下更后面的词条甚至需要用到翻页键。 如果能够将最准确 的词条排在靠前位置甚至第一位, 无疑将大大节省用户的劳动, 提高用户的输入效 率。
本发明的第二实施方式还提供一种用于供用户进行文字输入的系统, 其中包括 本发明第二实施方式的用于输入文字的用户设备和本发明第二实施方式的用于辅 助用户设备输入文字的网络服务器。
本发明用于供用户进行文字输入的方法、设备、服务器及系统的第三实施方式: 图 12为实现本实施方式以及本发明的一个典型的计算机网络的示意图, 示出 的各类网络元素包括用户设备 311和 312 , 互联网 321 , 路由器 331和网络服务器 341~343。 下文中, 用户设备 3 11也称电脑 31 1 , 用户设备 312也称手机 312。 本领 域技术人员理解, 图 12中仅为简明起见而示出的各类网络元素的数量可能小于一 个实际网络中的数量, 但这种省略无疑地是以不会影响对本发明进行清楚、 充分的 公开为前提的。 本领域技术人员还理解, 图中还省略了一些其它类型的网络元素, 例如, 通常位于用户设备与互联网之间的调制解调器( modem ) 、 接入设备(如 DSLAM ) 等。
图 12中,虽然将三个网络服务器 341〜343表示为由同一路由器 331提供路由, 但这并不排除本发明对包括但不限于以下变形的适用: 各个网络服务器 341 343因 所处地域不同的原因而归属于不同的路由器, 并分别服务于不同区域的用户设备。 典型地, 可以为一个城市、 省份甚至一个国家提供一个这样的服务器, 当然, 考虑 到终端用户的分布的不均匀性, 服务器的分布可以服从于终端用户的分布情况。 总 之, 网络布局的任何变化都落入本发明的范围和精神之内。
图 13a~13d为在用户设备处显示的用于对根据本实施方式的输入法进行设置的 人机界面示意图。
如前所述, 本发明提供了一种为用户提供文字输入的解决方案, 实则提供一种 新的输入法, 这种输入法可以是计算机 31 1使用的唯一的一种输入法, 或者, 用户 由附图标记标示的各个对象可以是 windows操作系统中提供的若千按钮, 其中, 按 钮 3201是输入法的开关按钮, 结合图 13a〜图 13c可以看出, 当用户控制鼠标 3205 移至按钮 3201上后, 点击左键, 即可改变按钮 3201的显示状态并改变了该输入法 的状态, 为了清楚地向用户表明这种改变, 图 13c中所示的表示激活的按钮 320Γ 的颜色区别于图 13a中表示未激活的按钮 3201。 下文中, 将根据本实施方式的基于 用户设备与网络服务器的互动的输入法成为本输入法, 以便于表述。
在图 13a中, 还示出了按钮 3202 , 通过鼠标点击, 可以在中文和英文输入之间 进行切换。 本领域技术人员通过阅读上下文能够理解, 本发明并不限于中文和 /或英 文(如按钮 3202,)的输入, 而及于几乎各种可以通过用户设备输入的语言的文字, 本领域技术人员通过阅读上下文还能理解, 本输入法用于输入中文时, 并不限于拼 音输入, 而及于笔画、 五笔等其它中文输入方式以及它们组合后形成的输入方式。
典型地, 本输入法的中英文输入切换也可以用键盘上的特定按键来控制, 例如, 当键盘上的 CapsLock (CapsLK) 处于小写状态时, 本输入法将为用户提供汉字或词 语的选择列表, 用户通过键盘、 鼠标的进一步选择将输入结果添加至相应的输入框 中; 相对应的, 当 CapsLK处于大写状态时, 本输入法将不提供包括汉字的选择列 表, 不失一般性地, 直接将用户按键顺序所对应的英文字母序列添加在相应的输入 框中。
在图 13a中, 还示出了按钮 3203 , 通过鼠标点击, 可以在全角和半角之间进行 切换, 不失一般性地, 图 13a中的按钮 3203的新月形图形表示半角, 图 13b中的按 钮 3203'中的圆形表示全角。
在图 13a中, 还示出了按钮 3204 , 通过鼠标点击, 可以在中文标点符号与英文 标点符号之间进行切换, 典型地, 图 13a中的按钮 3204表示中文标点符号, 图 13b 中的按钮 3204'表示英文标点符号。
图 13a~13c中任一附图所示的是一个控制栏, 其默认地展现在用户设备屏幕的 右下角, 并优选地可以被鼠标拖动。
图 13d所示的是另一种供用户对本输入法的开 /关进行配置的人机交互界面,其 中具体的是一个基于 web的浏览器的一部分, 其中, 在"是否希望使用百度在线输 入法(本输入法) "一项之后, 提供了表示开启与关闭的两个选项, 通过点选, 即可 人工地控制本输入法是否激活。 本领域技术人员理解, 当用户通过以上方式或其等同替换方式将本输入法激活, 即使计算机 311预先安装了其它的输入法, 本输入法仍成为当前的默认输入方式。 相对应地, 当本输入法进入关闭状态时, 为了保证计算机 3 1 1上有可用的输入法, 计算机 311优选地自动将其它预先安装的输入法例如 Google输入法激活。 当然,其 它输入法的激活也可以由用户来手动控制, 从而选择希望切换至的其它输入法, 典 型地, 参看图 13a, 当鼠标右键点击按钮 3201时, 出现一个弹出菜单, 其中列出了 其它可用的各种输入法供用户选择, 当用户选择了其中之一后, 按钮 3201的位置 将出现一个表示新近选择的其它输入法的标志。 对此不赘。 作为一种替代方式, 本 输入法也可以是计算机 311唯一可用的输入法。
根据本实施方式, 本输入法可以在其它一些条件下被激活, 例如, 当用户在图 13d的个性设置中开启了在线输入法后, 每当用户使用浏览器访问一个预定的网络 地址集合中的任一地址时, 本输入法即被激活。
以下结合图 14并参照图 12对以计算机 3 11为例的用户设备中用于为用户提供 文字输入的方法进行介绍, 虽然以计算机 31 1为例, 但本领域技术人员理解, 同样 的流程也可以在手机 312中实现。 下文的描述会涉及到网络服务器的一些内容, 这 些内容还会在下文中有专门的详述。
图 14所示的是根据本实施方式的在连接到一个网络服务器的用户设备中供用 户进行文字输入的方法流程图,本实施方式中,用户例如张三已经将本输入法激活。 为了更形象地介绍这一过程, 还参照图 16。 应当理解, 用户设备与网络服务器之间 的连接方式不限于高速、 稳定的有线连接, 也包括无线连接或有线与无线混合的方 式。
首先, 在步骤 S3301中, 计算机 311获得用户提供的输入信息。 具体地, 张三 打开 IE浏览器, 首页 www.baidu.com自动被打开, 于是呈现出大致如图 16所示的 内容, 当然其中的搜索栏 3501应为空且图中位于搜索栏下方的相关内容应忽略。 张三将光标移至搜索栏 3501附近或其中的某处,点击鼠标,于是可以在搜索栏 3501 中进行输入。 不失一般性地, 假设张三依次敲击键盘上的以下按键, 其中每个按键 敲击次数为 1且按下时间低于一个阈值: C,A,0,M,E,I,W,A,N,G。
根据本实施方式的第一实例, 张三的上述每一次敲击所提供的信息都视为一个 输入信息, 也即, 当张三按下 C键, 一个表示 C键被按下的输入信息被计算机 31 1 获得。 以下, 用输入信息所表示的按键或按键组合来标识该输入信息, 并采用输入 信息 "XX"的形式, 双引号部分表示相应的按键或按键组合。 例如, 表示按下按键 C 的输入信息, 称为输入信息 "C" , 表示依次按下按键 C,A,0,M,E,I的输入信息, 称为 输入信息" CAOMEI"。
具体地, 步骤 S3301的实现可以采用这样的方式, 由浏览器通过其中的一段脚 本或函数来检索用户提供的输入信息。 也即, 本输入法分布在计算机 311上的用于 获得输入信息的功能模块是由基于 web的浏览器来实现的。 于是, 如果张三使用提 供本输入法的相应浏览器, 那么他在下载和安装该浏览器的同时, 也就下载和安装 了实现该输入法在客户端的功能模块, 即可开始基于本输入法进行文字输入。
上述的浏览器的特定脚本或函数是可以替代的, 例如, 计算机 311可以安装独 立于浏览器的应用程序, 其类似于本输入法的客户端软件, 根据本实施例, 该客户 端软件主要负责上述的输入信息的获取以及下文中还将提到的输入信息发送等后 续操作。
接着, 在步骤 S3302中, 计算机 31 1将输入信息" C"发送给网络服务器。 参看 图 12 , 该输入信息具体到达哪个网络服务器将取决于所示网络的具体配置, 以及路 由算法等等。最直接的一种做法是, 为每一个 IP地址段的用户设备指定一个网络服 务器, 于是, 当计算机 31 1明确了自己的 IP地址以后, 就可以知道该向哪个网络服 务器发送输入信息,或者,预先将 IP地址段与网络服务器的对应关系保存在路由设 备中, 在从来自计算机 311的 IP包中获知计算机 311的 IP地址后, 查询该对应关 系, 即可确定作为目的地的网络服务器; 作为多种替代方式之一, 输入信息可以发 往多个网络服务器, 再由这些网络服务器之间进行通信来确定一个执行后续操作的 网络服务器; 作为多种替代方式中的另一种, 图 12所示的网络服务器互有分工, 当张三通过图 13a~13c的人机界面设置了输入语言时, 计算机 3 11发出的输入信息 可以携带相应的语言例如中文的标识, 于是, 由中文输入服务器 (如有) 来负责后 续操作。 另外, 应当理解, 本发明不限制对网络服务器的选择方式及用户设备与 网络服务器之间的交互方式, 例如, 这种交互可以基于 IP协议, 也可以基于其它用 于互联网的通信协议。
参看图 15, 其中, 计算机 311发出的输入信息在步骤 S3401中由一个网络服务 器例如图 12中的网络服务器 341 (以下简称服务器 341 )接收。
随后的步骤 S3402中, 服务器 341基于输入信息 "C"来在词典数据库中进行匹 配查询, 以生成备选输入项集合。 其中, 根据不同的输入方式, 服务器 341使用不 同的算法来对输入信息" C"进行翻译, 这些输入方式包括一般的英文字母输入、 中 文拼音、 中文笔画等等。 以英文输入为例, 如不考虑联想输入, 那么服务器 341将 生成包含一个备选输入项即英文字母 c的备选输入项集合。 如果考虑联想输入, 那 么这个集合将包括以字母 c开头的至少一个单词。 如果采用中文拼音输入, 则备选 输入项集合将包括拼音时首字母为 C的各个中文字。
在步骤 S3403中, 服务器 341将生成的备选输入项集合发送回计算机 311。 同 样的, 步骤 S3403中的发送过程可以基于 WEB实现, 于是, 备选输入项集合将被 封装在 http协议下的传输单元中进行发送。可选地,这种发送也可以以即时信息( IM ) 的方式发送, 例如小 i机器人与用户端之间的互动方式。 本例中, 不妨假设张三进 行中文输入, 于是这个集合包括"从、 此、 才、 处、 、 吃、 出、 成、 车、 差 ..."等 中文字, 其中的每一个中文字成为一个备选输入项或简称输入项。 在下文中, 为输 入信息和通过输入该输入信息而得到的备选输入项集合使用相同的标识方式, 以简 要地明确它们的对应关系, 于是上述集合称为集合" C"。
参看图 14, 服务器 341发回的备选输入项发回计算机 3 11后, 在步骤 S3303中 由计算机 31 1接收。
随后的步骤 S3304中, 计算机 31 1将该集合通知给张三, 这一步骤可以采用任 何已知的计算机提供人可读信息的技术手段, 典型的例子如屏幕显示、 扬声器播放 等。 不失一般性地本例以屏幕显示为例。
由于集合" C"中包含较多的中文字, 可能难以在一个提示栏(如图 16中附图标 记 3505所指) 中完整显示, 因此, 可以在提示栏中每行显示 5个备选输入项, 并 在之前给予顺序号标记, 以方便用户通过按下键盘上的数字键进行选择。 另外, 提 示栏还将包括一个按钮 , 方便用户通过鼠标点击来显示下一行的备选输入项。 优 选地, 用户也可以通过按下键盘上的指定键来命令显示下一行的备选输入项, 例如 pagedown键。
之后, 张三会通过鼠标、 键盘操作来从这些备选的输入项中选择一个, 并通过 按下鼠标左键或相应数字键的方式确认,从而给予计算机 31 1—个指示信息,例如, 当鼠标悬停在中文字 "从"上时点击左键, 就给予计算机 311—个指示"从"为被选择 的输入项的指示信息。
于是, 在步骤 S3305中, 计算机 31 1接收到张三提供的这一指示信息, 并据此 在步骤 S3306中将中文字"从"作为此次输入的输入结果, 将其显示在用户进行输入 时指定的位置, 例如, 浏览器的搜索栏中。
在步骤 S3307中, 计算机 311还将这一信息发送给网络服务器 341。 根据本发 明的一个具体实施例, 参看图 15 , 网络服务器 341根据张三从备选输入项集合 "C" (以下简称集合 "C" )中选择 "从"这一信息, 来对自身保存的词典数据库进行训练和 更新。 于是, 在本输入法拥有可观的用户群体时, 就能学习全体用户的选择, 动态 地更新词库,例如,将一段时间内有超过预定数量的用户输入的一个新词如"犀利哥" 加入到词库中。
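According to another specific example of the present invention, Zhang San does not make a selection from set "C" but presses the A key on the keyboard. As those skilled in the art know, the candidate set then narrows. Specifically, as one of two mutually alternative approaches:
The computer 311 executes step S3301 again and obtains the input information "A";
In step S3302, the computer 311 also sends the input information "A" to the network server 341;
Step S3402 is then executed, in which the network server 341 combines the earlier input information "C" with the new input information "A" to obtain the combined result "CA", in other words a new input information "CA". Using this as the key, it again looks up the corresponding candidate input items in the lexicon and returns the set "CA" formed by them to the computer 311. The subsequent process is similar to that described above and is not repeated here.
As the other alternative, the computer 311 caches set "C" when it first executes step S3303. Preferably, the cached set "C" is cleared once Zhang San provides indication information in step S3305. Otherwise, the computer 311 relies on the cached set "C" to respond to Zhang San's further input. Scenarios that rely on the cache include, for example: Zhang San enters "C", "A", "O" in sequence and only then selects an input item such as "草"; or Zhang San enters "C", "A", "T" in sequence, then a backspace, then "O"; and so on. This approach can reduce the interaction between the client and the server over the communication link under this input method, and thus the consumption of network resources.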
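
The cache-based alternative can be sketched as a small client-side object that filters the cached set "C" locally on every further key or backspace. The class `CandidateCache` and the assumption that the server returns (pinyin, character) pairs are illustrative only; the application does not describe the cached format.

```python
class CandidateCache:
    """Client-side cache of the set returned for the first key."""

    def __init__(self, first_key, entries):
        # entries: (pinyin, character) pairs received for first_key (assumed format)
        self.entries = list(entries)
        self.typed = first_key.lower()

    def press(self, key):
        """Append a key and return the locally filtered candidates."""
        self.typed += key.lower()
        return self.current()

    def backspace(self):
        """Remove the last key, but never the first one the cache was built for."""
        if len(self.typed) > 1:
            self.typed = self.typed[:-1]
        return self.current()

    def current(self):
        return [word for pinyin, word in self.entries if pinyin.startswith(self.typed)]
```

No request is sent to the server for any of these keystrokes, which is where the saving in network traffic comes from.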
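In the above examples, each keystroke by the user triggers the acquisition and sending of an input information corresponding to that single keystroke. A variation of this example is described below.
Specifically, the computer 311 treats the sequence entered so far as the input information, and sends it to the server 341, only when the user prompts it to capture the input. For example, the user enters C, A, O, M, E, I, W, A, N, G in sequence and finally presses the space bar; pressing the space bar triggers the computer 311 to send "C,A,O,M,E,I,W,A,N,G" as a whole to the server 341 as the input information. In this embodiment, after step S3304 is executed, the screen of the computer 311 appears, without loss of generality, as shown in Figure 16. The search bar 3501 is where the user has placed the cursor and is entering text. As can be seen, "caomeiwang" in the input information bar 3502 is displayed on the screen together with the candidate set returned by the server 341, and the syllables are separated by a superscript apostrophe-like mark to give the user a clearer view. The candidate input items in the prompt bar 3505, such as 3503 and 3504, are each identified by a sequence number. If the user selects "草莓网", these three Chinese characters finally appear in the search bar 3501, preferably with the cursor placed after the character "网".
According to this embodiment, the operation of the input method may also be associated with the user's identity information, as described below, again with reference to Figures 14 and 15.
[...] a human-machine interface for [...] information. For example, when the user wishes to activate the input method, step S3308 is executed to provide a login screen on which the user enters a user name and password to authenticate with the network server 341. After the user's identity information is obtained, it is sent to the network server 341 in step S3309. If authentication succeeds, the network server 341 responds to the user's subsequent text input operations, and in doing so also takes the user's identity information into account, as detailed below. Of course, the ways in which the network server 341 obtains the user's identity information are not limited to this. For example, the computer 311 may access user identity information held by the operating system or by other applications; if such access is permitted, it can obtain the user's identity information from there and report it to the server 341. This approach is better suited to cases where the computer 311 is used at home or for other private purposes.
Zhang San's identity information is received by the server 341 in step S3406. Based on this identity information, Zhang San's input history can be retrieved in step S3407. This is all the text that Zhang San has entered with this input method over a past period, as stored by the server 341, including the input items Zhang San selected from candidate sets on each occasion.
The input history retrieved in step S3407 can be used in generating the candidate set in step S3402. Specifically, in step S3402 the server 341 first performs a matching query in the dictionary database based on the input information, for example "C", and obtains a preliminary query result whose content is the same as the candidate set "C" in the example above. The server 341 then processes the preliminary query result according to Zhang San's input history to generate the candidate set "C" of this example. Typically, the server 341 compares the input history with the preliminary query result and places the items of the preliminary query result that also appear in the input history ahead of the other items, thereby generating the candidate set "C" of this example. Optionally, the input history records may be stored together with the input information that was current when they were made; when comparing the preliminary query result with the input history, the server can then consult only those history records that correspond to the current input information to order the input items and generate the candidate set.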
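
The reordering of the preliminary query result by input history can be sketched with a stable sort. The function name `rank_with_history` is hypothetical, and the sketch only implements the simple "items found in the history come first" rule stated above, not any frequency weighting.

```python
def rank_with_history(preliminary, history):
    """Move items that also appear in the user's history to the front."""
    seen = set(history)
    # sorted() is stable, so the original order is kept inside each group.
    return sorted(preliminary, key=lambda item: item not in seen)
```

To honour the optional refinement, `history` would simply be restricted beforehand to the records stored under the current input information.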
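Another exemplary use of the user's identity information is to help the server 341 update the user's input history based on the indication information received in step S3404, for example by adding the input item identified by the indication information to the input history, or by adding the input information to the input history in association with the input item identified by the indication information.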
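Figure 17 is a block diagram of a user equipment according to this embodiment that is connected to a network server and used by a user for text input, taking the user equipment 311 shown in Figure 12 as an example. It includes:
an obtaining device 3111 for obtaining input information provided by the user;
a first sending device 3112 for sending the input information to the network server, the network server providing feedback information to the user equipment based on the input information;
a first receiving device 3113 for receiving the feedback information sent back by the network server;
a notification device 3114 for notifying the user of the feedback information for use in further human-computer interaction.
Further, the user equipment 311 also includes:
an identity obtaining device 3115 for obtaining the user's identity information;
the first sending device 3112 is also used to send the user's identity information to the network server.
Further, the user equipment 311 also includes:
a second receiving device 3116 for receiving indication information provided by the user, which indicates the input item the user selected from the candidate set, and for taking that input item as the result of the user's input;
the first sending device 3112 is also used to send the indication information to the network server.
The user equipment 311 allows the user to enter text in a WEB-based application.
The user equipment 311 allows the user to enter text in a WEB-based browser program.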
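Figure 18 is a block diagram of a network server according to this embodiment for assisting the user of a user equipment with text input, for example the server 341 shown in Figure 12. It includes:
a third receiving device 3411 for receiving the input information provided by the user and sent by the user equipment;
a generating device 3412 for performing a matching query in the dictionary database based on the input information to generate a candidate set;
a second sending device 3413 for sending the candidate set to the user equipment.
The third receiving device 3411 is also used to:
- receive new input information sent by the user equipment;
the generating device 3412 is also used to: combine the new input information with the previously received input information to obtain a combined result; and perform a matching query in the dictionary database based on the combined result to generate a new candidate set;
the second sending device 3413 is also used to send the new candidate set to the user equipment.
Further, the third receiving device 3411 is also used to receive the user's identity information sent by the user equipment;
the server 341 also includes a retrieving device 3414 for retrieving the user's input history according to the user's identity information;
the generating device 3412 also includes: a query device 34121 for performing a matching query in the dictionary database based on the input information to obtain a preliminary query result; and a processing device 34122 for processing the preliminary query result according to the user's input history to generate the candidate set.
The processing device 34122 is also used to compare the input history with the preliminary query result and to place the content of the preliminary query result that also appears in the input history ahead of the other content, so as to generate the candidate set.
The third receiving device 3411 is also used to receive indication information from the user equipment, which indicates the input item the user selected from the candidate set;
the server 341 also includes an updating device 3415 for performing at least one of the following according to the input item indicated by the indication information: - updating the user's input history; - training and updating the dictionary database stored by the network server.
The third embodiment of the present invention also provides a system for text input by a user, comprising the user equipment for entering text according to the third embodiment and the network server for assisting the user equipment in entering text according to the third embodiment.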
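Fourth embodiment of the method, device, server and system of the present invention for text input by a user:
Figure 19 shows, according to one aspect of this embodiment, a user equipment 41 that provides search-related information relevant to the input information while the user is entering text. That is, when the user enters text on the user equipment 41, the user equipment 41 provides the corresponding input entry options according to the user's input sequence and also searches, according to that input sequence, for relevant search-related information such as advertising information, web page information, travel information or map information. Advertising information is used as the example below:
The user equipment 41 may be any electronic product capable of human-computer interaction with the user through a keyboard, remote control, touchpad or voice-control device, such as a computer, smartphone, PDA, game console or IPTV.
The user equipment 41 includes a first obtaining device 411, a query device 412, a providing device 413, a storage device 414 for storing a local lexicon (for brevity, hereinafter lexicon 414), and a storage device 414' for storing a keyword advertisement library (for brevity, hereinafter local advertisement library 414', also called search information library 414'). Those skilled in the art will understand that the storage devices 414 and 414' may be separate memories or the same memory, and each may also be implemented by a memory array.
Specifically, the first obtaining device 411 obtains, in real time, the input sequence the user is entering through any interaction device capable of human-computer interaction with the user. The interaction device may be a keyboard, remote control, touchpad, voice-control device, and so on. Taking a keyboard as an example, when the user types on the keys, the first obtaining device 411 obtains in real time the sequence of keys struck (for brevity, still called the input sequence below).
The query device 412 matches the user input sequence provided by the first obtaining device 411 against the lexicon 414 and obtains one or more matching input entry options. Chinese is used as the example below; the present invention allows the user to enter Chinese using full pinyin, double pinyin (shuangpin), Wubi and other methods. At the same time, the query device 412 searches the keyword advertisement library 414' according to the user input sequence and obtains one or more related advertising information options. For example, when the user types "woaiwaitan", the query device 412 obtains entry combinations such as "1 我爱外滩; 2 我爱" from the lexicon 414, and finds in the advertisement library 414' that the advertising information related to "外滩" includes landmark buildings such as "外滩三号" and "外滩18号"; it therefore provides the advertising information options "3 外滩三号; 4 外滩18号". Those skilled in the art will understand that the search for advertising information (or search-related information) related to the input sequence may use any of the well-known intelligent or fuzzy search algorithms, which are not described further here.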
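The providing device 413 then provides the one or more matching input entry options obtained by the query device to the user in a certain order and format, for the user to select for actual input. For example, when they are shown to the user in an input window bar on the display, the entry options and the input sequence can be shown in separate bars, with all entry options listed in the lower bar for the user to choose from. Preferably, only one row of entry options is shown in the entry bar; the number of options per row may be a default or set by the user, and the user displays the previous or next row by pressing specific function keys, for example "+" and "-".
Preferably, to draw the user's attention, advertising information options may be displayed differently in the entry bar, for example in a different color or gray level. Each advertising information option also has embedded in it the IP address or Uniform Resource Locator (URL) of the web page related to that advertisement. The user can select the advertising information option by pressing the corresponding number key or by moving the cursor with the mouse to the option and hovering or clicking. When the user selects the advertising information option, a URL redirecting device (not shown) in the user equipment 41 can navigate over the network to the corresponding web address; for example, with a browser open, it connects over the network to the web server for that address and displays the web page to the user in the browser.
In one example of this embodiment, the first obtaining device 411, the query device 412 and the providing device 413 work continuously. Specifically, the first obtaining device 411 obtains the user's input sequence in real time and continuously provides it to the query device 412, for example "w", "wo" ... "woai" ... "woaiwaitan". The query device 412 likewise performs matching queries in real time on the input sequences continuously provided by the first obtaining device 411, so as to keep obtaining the entry options corresponding to each input sequence; for example, "w" corresponds to "1 我, 2 喔, 3 握, 4 窝"; "woai" corresponds to "1 我爱, 2 喔, 3 握, 4 窝"; "woaiwaitan" corresponds to "1 我爱外滩, 2 外滩三号, 3 外滩18号". Here, those skilled in the art will understand that "continuously" refers to activity that goes on until the user finally selects an entry option; for example, after typing the key sequence "woai" the user may pause briefly, say for 0.5 seconds, before continuing to type the following keys.
In another example, when the query device 412 performs matching queries in the lexicon 414 and the advertisement library 414' according to the user input sequence and obtains multiple input entry options and advertising information options, it also obtains the priority of each. The providing device 413 displays the matching input entry options and advertising information options provided by the query device 412 to the user in the entry bar in order of priority: the higher the priority, the earlier the option is displayed. Preferably, to make text input convenient, the input entry option with the highest priority is generally placed first so that the user can select it simply by pressing "ENTER" or the space bar, while advertising information options are usually placed toward the end of each row.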
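
The row layout described above (entries by descending priority, advertisements at the end of the row) can be sketched as below. The function `layout_row`, the numeric priorities and the `row_size` and `max_ads` parameters are assumptions for illustration; the application does not define how priorities are scored.

```python
def _by_priority(pair):
    # Higher priority first; pair is (text, priority).
    return -pair[1]


def layout_row(entries, ads, row_size=4, max_ads=2):
    """Build one row: best entry options first, ad options at the end."""
    ranked_ads = [text for text, _ in sorted(ads, key=_by_priority)][:max_ads]
    slots = max(row_size - len(ranked_ads), 0)
    ranked_entries = [text for text, _ in sorted(entries, key=_by_priority)][:slots]
    return ranked_entries + ranked_ads[:row_size]
```

With the "woaiwaitan" example from the text, the highest-priority entry lands in position 1, where ENTER or the space bar would select it.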
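Specifically, the query device 412 may query the lexicon 414 and the advertisement library 414' according to user characteristics to obtain matching input entry options and advertising information options. After obtaining them, it may also determine their priority according to the user characteristics. User characteristics include the user's input history, personal preferences set by the user, user attributes, user address, and so on; user attributes include the user's occupation, gender, nationality, place of birth, age and other information reflecting personal traits. The query device 412 may also determine priority from the frequency with which the user's input history shows each entry option, or the words in it, being selected, and from the semantic relatedness among the words in each entry option. The query device 412 may also determine priority from the personal preferences set by the user. For example, if the user has set the input preference priority as shopping > dining > travel, then after obtaining the user input sequence "woaiwaitan" the query device finds in the advertisement library 414' several landmark buildings or tourist attractions on the Bund that correspond to "waitan", such as 招商局总部, 汇丰大厦, 花旗银行, 外滩三号 and 外滩18号, and then, based on the user's preferences, judges that buildings mainly devoted to shopping and dining, such as "外滩三号" and "外滩18号", have the highest priority. In addition, the query device 412 may determine the region in which the user equipment is located from its current IP address and thereby set the priority of words in the input sequence related to that region. For example, when the user's input sequence is "woxihuanbund", where the translations of "bund" are "1 堤岸 2 码头 3 同盟 4 (上海) 外滩", and the query device 412 learns from the IP address that the user equipment is currently in Shanghai, China, it can determine that "上海外滩" or "外滩" has the highest priority among the translations of "bund" and can therefore provide the following input entry options: "1 我喜欢上海外滩; 2 我喜欢外滩; 3 我喜欢码头; 4 我喜欢堤岸; 5 我喜欢同盟". For brevity, the user's input history, the personal preferences set by the user, the computer's IP address (or user address) and the like are collectively called user characteristics, and those skilled in the art will understand that user characteristics include but are not limited to the above.
Those skilled in the art will understand that the user equipment 41 may store the above user input history, user-set input preferences and the various associations between words in local memory. Preferably, the user equipment 41 may also update the stored user input history, input preferences, word associations and similar information. As shown in Figure 20, the user equipment 41 further includes a second obtaining device 415 and an updating device 416. The second obtaining device 415 obtains, through further interaction with the user, the user's selection among the input entry options provided by the providing device 413. The updating device 416 updates the lexicon, the user input history, the associations between words and so on according to the user selection provided by the second obtaining device 415; for example, it may add new entry options to the lexicon 414 and adjust the priorities of existing entry options and the user characteristics. More preferably, if the user equipment can access the Internet, the second obtaining device 415 may also search the Internet on its own for new entry combinations and use them to update the lexicon 414 and so on.
In a preferred example, the advertisement library 414' in the user equipment 41 may be actively updated at any time or periodically; for example, the user equipment 41 is connected via a network to one or more network devices and, at any time or periodically, [...] (the remainder of this sentence is present in the source only as an image, Figure imgf000060_0001).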
在另一优选实例中, 广告库 414'可以是位于用户设备 41的外部, 例如位于一 个网络设备处或分布于多个网络设备处, 用户设备 41可经由网络与网络设备相连 接, 从而查询与用户输入序列相关的广告信息选项。
图 20示出根据本实施方式另一方面的用于当用户进行文字输入时同时提供与 输入信息相关的搜索相关信息的用户设备 41和网络设备 42 ,其中用户设备 41经由 网络与网络设备 42相连接, 该网络可以为互联网、 内部网等。 也即, 当用户在用 户设备 41上进行文字输入时,用户设备 41经由网络向网络设备 42发送查询请求, 请求网络设备 42根据用户输入序列搜索相关的搜索相关信息, 如广告信息、 网页 信息、 旅游信息或地图信息, 然后将网络设备反馈的搜索相关信息与网络设备查询 获得的输入词条选项一起提供给用户。 以下以广告信息为例进行说明:
在一个实例中, 用户设备 41中包括第一获取装置 411、 第一发送装置 417、 第 一接收装置 418、提供装置 413。网络设备 42包括第二接收装置 421、查询装置 422、 第二发送装置 423、 用于保存网络词库的存储装置 424 (为简明起见, 以下筒称网 络词库 424 )和用于保存关键词广告库的存储装置 424, (为简明起见, 以下简称网 络广告库 424' ) 。
具体地, 第一获取装置 41 1通过任何一种可与用户进行人机交互的交互设备来 实时地获取用户正在输入的输入序列。 该交互设备可以是键盘、 遥控器、 触摸板或 声控设备等。 以键盘为例, 但用户敲击键盘中按键进行输入时, 第一获取装置 411 实时地获取用户敲击的按键序列 (为简明起见, 以下仍称输入序列) 。
用户设备 41中的第一发送装置 417实时并持续不断地将第一获取装置 41 1提 供的用户输入序列发送至网络设备 42。 网络设备 42中的第二接收装置 421接收到 该输入序列并提供给查询装置 422。 查询装置 422将用户输入序列与词库 424进行 匹配查询, 获得一个或多个匹配的输入词条选项。 以下以中文为例进行说明, 本 发明允许用户在按全拼、 双拼、 五笔等方法输入中文。 同时, 查询装置 422还根据 用户输入序列在关键词广告库 424中进行搜索, 获得相关的一个或多个广告信息选 项。 与例如, 当用户敲击按键输入 "woaiwaitan" , 查询装置 422在词库 424中查询 获得 "1 我爱外滩; 2 我爱"等词条组合, 同时在广告库 424中查询获得与 "外滩 "有 关的广告信息有"外滩三号"、 "外滩 18号"等地标性建筑, 因此提供广告信息选项" 3 外滩三号; 4 外滩 18号"。 本领域技术人员应理解, 查询与输入序列相关的广告信 息 (或搜索相关信息) 的过程可以采用目前公知的各种智能或模糊搜索算法, 在此 不作赘述。
网络设备 42中的第二发送装置 423也实时和持续不断地将查询装置 422提供 的输入词条选项发送至用户设备 41。 用户设备 41中的第一接收装置 419接收到所 述输入词条选项并实时和持续地提供给提供装置 413 , 提供装置 413随后将获得的 一个或多个匹配的输入词条选项按一定顺序和格式提供给所述用户, 供其选择以作 具体输入。 例如, 通过在显示器的一个输入窗口栏中显示给用户是, 可将多个词条 选项与输入序列分栏显示, 多个词条选项可全部列入下一栏中供用户选择。优选地, 可以在词条栏中仅显示一行词条选项, 该行词条选项数目可以是缺省的也可由用户 设定, 通过由用户按动特定功能键显示上一行或下一行词条选项, 该特定功能键, 例如可以是" +"和" -"。
优选地, 为便于用户注意, 广告信息选项在词条栏中可采用不同显示方式, 例 如不同颜色或灰度。而且广告信息选项中内置有与该广告信息相关的网页 IP地址或 统一资源标示符 (URL ) 。 用户可通过按该选项相应的数字键或通过鼠标移动光标 至该选项处悬停或点击来选择该广告信息选项。 当用户选择该广告信息选项, 用户 设备 1可通过网络定向到其对应的网页网址, 例如在浏览器打开情形, 经由网络连 接到该网址对应的网页服务器, 并在浏览器中显示其网页给用户。
优选地, 用户设备 41中的第一获取装置 411、 第一发送装置 417、 第一接收装 置、 和网络设备 42中的第二接收装置 421、 查询装置 412和第二发送装置 423之间 是持续不断地配合工作。 具体地, 第一获取装置 411实时地获取用户的输入序列并 持续不断地提供给查询装置 412,例如" w"、"wo" ... "wo,,... "woai,,... "woaiwaitan" , 第 一发送装置 417也实时和持续不断地将各种输入序列发送给网络设备 42。 网络设备 42中的第二接收装置 421接收到用户设备 41所发送的各种输入序列后也实时和持 续不断地提供给查询装置 422 , 查询装置 422随即实时地对第一接收装置 421持续 不断地提供的用户输入序列进行匹配查询, 以持续获取与上述各输入序列相对应的 词条选项, 例如" w"对应" 1我、 2喔、 3握、 4窝"; "woai"对应" 1我爱、 2喔、 3握、 4窝"; "woaiwaitan"对应" 1我爱外滩、 2外滩三号、 3 外滩 18号"。 在此, 本领域 技术人员应理解"持续"是指在用户最终选择一个词条选项前一直进行的动作方式, 例如用户在敲击 按键序列" woai"后可能稍停片刻, 如 0.5秒, 再继续敲击随后的按 键。
在本实施方式一个优选实例中, 查询装置 422在根据用户输入序列在网络词库 424和网络广告库 424中进行匹配查询获得多个输入词条选项和广告信息选项时还 获得其各自的优先级。 用户设备 41中的提供装置 413将网络设备 42提供的多个匹 配的输入词条选项和广告信息选项按优先级顺序在词条栏中显示给所述用户, 其中 优先级越高, 该输入词条选项或广告信息选项越靠前显示。 优选地, 为便于用户进 行文字输入, 优先级最好的输入词条选项一般置于最前位置, 使得用户可通过简单 地按' 'ENTER"或空格键来选择, 而广告信息选项通常置于每行中较末尾选项位置。
优选地,当用户通过用户设备 41登录网络设备时,网络设备 42的查询装置 422 可根据用户登录的 ID来获取用户特征。 例如用户输入历史记录, 用户特定的用户 词库、 用户设定的个人偏好、 用户属性信息等。 所述用户特征可以保存在网络设备 42中, 也可保存在于网络设备 42相连接的其他网络设备中。
随后, 查询装置 422可以根据用户特征来在网络词库 424和网絡广告库 424中 进行查询, 获得匹配的输入词条选项和广告信息选项。 具体地, 查询装置 422可根 据用户输入历史记录中对各个词条选项或词条选项中的词汇的选择频度、 各词条选 项中各个词汇间的文义关联性来确定其优先級高低。 查询装置 422也可根据用户设 定的个人偏好选择来确定优先级高低,例如, 当用户设定输入偏好为:优先级高低: 购物 >饮食 >旅游, 则获取用户输入序列 "woaiwaitan"后, 查询装置在网络广告库 424 中查询获得与 "waitan"对应的多个位于外滩的地标性建筑或旅游景点, 如招商局总 部、 汇丰大厦、 花旗银行、 外滩三号、 外滩 18号等, 随后根据用户设定的个人偏 好可判断"外滩三号"、 "外滩 18号"等以购物、 餐饮为主的建筑景点的优先级最高。 另外, 查询装置 422还可根据目前用户设备的 IP地址来判断其所处的地域,从而可 以确定输入序列中与该地域相关的词汇的优先级, 例如, 但用户输入序列为 "woxihuanbund" , 其中" bund"的译文有" 1堤岸 2码头 3同盟 4 (上海) 外滩", 当 查询装置 412根据用户设备 IP地址获知目前位于中国上海市, 从而可确定" bund" 对应译文中 "上海外滩"或"外滩"优先级最高, 因而可提供如下输入词条选项 "1我喜 欢上海外滩; 2我喜欢外滩; 3我喜欢码头; 4我喜欢堤岸; 5我喜欢同盟"。 为简明 起见, 我们可将用户输入历史记录, 用户设定个人偏好、 计算机 IP地址(或用户地 址)等统称为用户特征, 且本领域技术人员应能理解, 用户特征包括但不限于上述 内容。
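
The preference rule "shopping > dining > travel" can be sketched as a ranking over categorized places. The category labels attached to each place below are assumptions made for this illustration (the application only says that 外滩三号 and 外滩18号 are mainly shopping and dining venues), and `rank_by_preference` is a hypothetical name.

```python
def rank_by_preference(places, preference):
    """Order places by the best-ranked preference category they belong to."""
    rank = {category: i for i, category in enumerate(preference)}

    def best(place):
        _, categories = place
        # Places with no preferred category sort after all preferred ones.
        return min((rank[c] for c in categories if c in rank), default=len(rank))

    # sorted() is stable, so ties keep their original order.
    return [name for name, _ in sorted(places, key=best)]


# Hypothetical category data for the landmarks named in the text.
BUND_PLACES = [
    ("招商局总部", {"office"}),
    ("汇丰大厦", {"bank"}),
    ("花旗银行", {"bank"}),
    ("外滩三号", {"shopping", "dining"}),
    ("外滩18号", {"shopping", "dining"}),
]
```

Calling `rank_by_preference(BUND_PLACES, ["shopping", "dining", "travel"])` puts the two shopping and dining venues first, matching the outcome described in the text.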
优选地, 网络设备 42还可对所保存的用户输入历史记录、 输入偏好及词汇间 关联性等信息进行更新。 如图 21所示, 用户设备 41还包括第二获取装置 415、 第 三发送装置 418 ; 网络设备 42还包括第二接收装置 425和更新装置 426。 其中用户 设备 41中的第二获取装置 415通过与用户的进一步交互来获取该用户对提供装置 413所提供多个输入词条选项的选择, 并由第三发送装置 418发送至网络设备。 更 新装置 426根据第二接收装置 425所接收的用户选择来更新词库和用户输入历史记 录、 词汇间的关联性等, 例如可在网络词库 424中增加新词条选项和已有词条选项 的优先級, 用户特征。 更优选地, 网络设备 42还可包括第三获取装置(未示出), 其还可自行在互联网中搜寻新的词条组合, 并用以更新网络词库 424等。
在本实施方式另一优选实例中, 广告库 414'可以是位于网络设备 42以外, 例 如位于另一个网络设备处或分布于其他多个网络设备处, 网络设备 42可经由网络 与所述其他网络设备相连接, 从而查询与用户输入序列相关的广告信息选项。
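Figure 21 shows another preferred example according to this embodiment, in which the user equipment 41 itself also includes a query device 412 and a memory 414 for storing a local lexicon (hereinafter, local lexicon 414). The local lexicon 414 can be synchronized at any time or periodically with the user-specific user lexicon in the network lexicon of the network device 42.
As shown in Figure 21, after obtaining the user input sequence, the first obtaining device 411 may first provide it to the query device 412 of the user equipment 41 for a matching query; the query process is as described above with reference to Figures 19 and 20 and is not repeated here. The first obtaining device 411 may also send the user input sequence through the third sending device 418 to the network device 42, whose query device 422 performs a matching query and obtains one or more input entry options and advertising information options related to the user input sequence; the query process is as described above with reference to Figure 20 and is not repeated here. The user equipment 41 further includes a merging device 420, which merges the one or more input entry options provided by its own query device 412 with the one or more input entry options provided by the query device 422 of the network device 42, removes duplicate options, and determines according to certain rules the priority order of the merged entry options and of the advertising information options related to the input sequence returned by the network device 42. It then passes them to the providing device 413, which presents them to the user in that priority order. In general, the input entry options provided by the network device 42 should be more accurate and therefore have higher priority than those obtained by the local query; likewise, so as not to interfere with the user's text input, advertising information options are usually placed toward the end of each row.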
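
The merging step can be sketched as follows, taking "network options first, then local options, duplicates removed, advertisements last" as the rule. That ordering is one reading of the "certain rules" mentioned above, and `merge_candidates` is a hypothetical name.

```python
def merge_candidates(local, network, ads):
    """Merge entry options from both sources and append ad options."""
    merged = []
    seen = set()
    # Network results are treated as more accurate, so they come first.
    for option in list(network) + list(local):
        if option not in seen:
            seen.add(option)
            merged.append(option)
    # Ads go to the end so they do not get in the way of text input.
    return merged + [ad for ad in ads if ad not in seen]
```

Because the local query usually returns sooner, a client could show the local options immediately and re-run the merge when the network reply arrives; the application leaves that timing open.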
图 22为根据本实施方式一个方面的在用户设备中当用户进行文字输入时同时 提供与输入信息相关的搜索相关信息的方法流程图。 也即, 当用户在用户设备 41 上进行文字输入时, 用户设备 41在根据用户输入序列提供相应输入词条选项, 还 根据用户设备的输入序列搜索相关的搜索相关信息, 如广告信息、 网页信息、 旅游 信息或地图信息。 以下以广告信息为例进行说明:
在步骤 S41中, 用户设备 41通过任何一种可与用户进行人机交互的交互设备 来实时地获取用户正在输入的输入序列。 该交互设备可以是键盘、 遥控器、 触摸板 或声控设备等。 以键盘为例, 但用户敲击键盘中按键进行输入时, 用户设备 41实 时地获取用户敲击的按键序列 (为简明起见, 以下仍称输入序列) 。
在步骤 S42中,用户设备 41根据所获得的用户输入序列与本地保存的词库(以 下简称本地词库)进行匹配查询, 获得一个或多个匹配的输入词条选项。 以下以中 文为例进行说明, 本实施方式允许用户在按全拼、 双拼、 五笔等方法输入中文。 同 时, 用户设备还根据用户输入序列在本地保存的关键词广告库 (以下筒称本地广告 库) 中进行搜索, 获得相关的一个或多个广告信息选项。 与例如, 当用户敲击按键 输入" woaiwaitan" , 用户设备 41在本地词库中查询获得 " 1 我爱外滩; 2 我爱"等词 条组合, 同时在广告库 414'中查询获得与 "外滩 "有关的广告信息有 "外滩三号"、 "外 滩 18号"等地标性建筑, 因此提供广告信息选项" 3 外滩三号; 4 外滩 18号"。 本领 域技术人员应理解, 查询与输入序列相关的广告信息 (或搜索相关信息) 的过程可 以采用目前公知的各种智能或模糊搜索算法, 在此不作赘述。
在步骤 S43中, 用户设备 41将所获得的一个或多个匹配的输入词条选项按一 定顺序和格式提供给所述用户, 供其选择以作具体输入。 例如, 通过在用户设备 41 的显示器中一个输入窗口栏中显示给用户是, 可将多个词条选项与输入序列分栏显 示, 多个词条选项可全部列入下一栏中供用户选择。 优选地, 可以在词条栏中仅显 示一行词条选项, 该行词条选项数^]可以是缺省的也可由用户设定, 通过由用户按 动特定功能键显示上一行或下一行词条选项, 该特定功能键例如可以是" +"和" -,,。 优选地, 为便于用户注意, 广告信息选项在词条栏中可采用不同显示方式, 例 如不同颜色或灰度。而且广告信息选项中内置有与该广告信息相关的网页 IP地址或 统一资源标示符 (URL ) 。
在步骤 S44中, 用户和用户设备 41可根据所提供的输入词条选项做进一步人 机交互。 用户可通过在用户设备 41的键盘上按该选项相应的数字键或通过用户设 备 41的鼠标移动光标至该选项处悬停或点击来选择该广告信息选项。 而当用户选 择该广告信息选项, 用户设备 41可通过网络定向到其对应的网页网址, 例如在浏 览器打开情形, 经由网络连接到该网址对应的网页服务器, 并在浏览器中显示其网 页给用户。
优选地, 步骤 S41至 S43之间是是持续不断地循环。 具体地, 在步骤 S41中, 用户设备 41实时地获取用户持续输入的输入序列并持续不断地在本地进行查询, 例如, 用户持续地输入" w"、 "wo" ... "wo" ... "woai" ... "woaiwaitan" , 在步骤 S42中, 用户设备 41也实时地对根据持续获取的用户输入序列进行匹配查询, 以持续获取 与上述各输入序列相对应的词条选项,例如" w"对应" 1我、 2鬼、 3握、 4窝"; "woai" 对应" 1我爱、 2喔、 3握、 4窝"; "woaiwaitan"对应" 1我爱外滩、 2外滩三号、 3 外 滩 18号"。 在此, 本领域技术人员应理解"持续"是指在用户最终选择一个词条选项 前一直进行的动作方式, 例如用户在敲击按键序列" woai"后可能稍停片刻, 如 0.5 秒, 再继续敲击随后的按键。
优选地, 在步骤 S42中, 用户设备 41在根据用户输入序列在词库和广告库中 进行匹配查询获得多个输入词条选项和广告信息选项时还获得其各自的优先级。 在 步驟 S43中, 用户设备 41将查询获取的多个匹配的输入词条选项和广告信息选项 按优先級顺序在词条栏中显示给所述用户, 其中优先级越高, 该输入词条选项或广 告信息选项越靠前显示。 优选地, 为便于用户进行文字输入, 优先级最好的输入词 条选项一般置于最前位置, 使得用户可通过简单地按" ENTER"或空格键来选择, 而 广告信息选项通常置于每行中较末尾选项位置。
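Preferably, in step S42, the user equipment 41 may also query the lexicon and the advertisement library according to user characteristics to obtain matching input entry options and advertising information options. After obtaining them, it may also determine their priority according to the user characteristics. User characteristics include the user's input history, personal preferences set by the user, user attributes, user address, and so on; user attributes include the user's occupation, gender, nationality, place of birth, age and other information reflecting personal traits. Specifically, the user equipment 41 may determine priority from the frequency with which the user's input history shows each entry option, or the words in it, being selected, and from the semantic relatedness among the words in each entry option. The user equipment 41 may also determine priority from the personal preferences set by the user. For example, if the user has set the input preference priority as shopping > dining > travel, then after obtaining the user input sequence "woaiwaitan" the user equipment 41 finds in the advertisement library several landmark buildings or tourist attractions on the Bund that correspond to "waitan", such as 招商局总部, 汇丰大厦, 花旗银行, 外滩三号 and 外滩18号, and then, based on the user's preferences, judges that buildings mainly devoted to shopping and dining, such as "外滩三号" and "外滩18号", have the highest priority. In addition, in step S42 the user equipment 41 may determine the region in which it is located from its current IP address and thereby set the priority of words in the input sequence related to that region. For example, when the user's input sequence is "woxihuanbund", where the translations of "bund" are "1 堤岸 2 码头 3 同盟 4 (上海) 外滩", and the user equipment 41 learns from its IP address that it is currently in Shanghai, China, it can determine that "上海外滩" or "外滩" has the highest priority among the translations of "bund" and can therefore provide the following input entry options: "1 我喜欢上海外滩; 2 我喜欢外滩; 3 我喜欢码头; 4 我喜欢堤岸; 5 我喜欢同盟". For brevity, the user's input history, the personal preferences set by the user, the computer's IP address (or user address) and the like are collectively called user characteristics, and those skilled in the art will understand that user characteristics include but are not limited to the above.
Those skilled in the art will understand that the user equipment 41 may store the above user input history, user-set input preferences and the various associations between words in local memory. Preferably, the user equipment 41 may also update the stored user input history, input preferences, word associations and similar information. In step S45 (not shown), the user equipment 41 also obtains, through further interaction with the user, the user's selection among the input entry options provided, and then updates the lexicon, the user input history, the associations between words and so on according to the obtained selection; for example, it may add new entry options to the lexicon and adjust the priorities of existing entry options and the user characteristics. More preferably, if the user equipment can access the Internet, in step S45 the user equipment 41 may also search the Internet on its own for new entry combinations and use them to update the lexicon and so on.
In a preferred example, the advertisement library in the user equipment 41 may be actively updated at any time or periodically; for example, the user equipment 41 is connected via a network to one or more network devices and, at any time or periodically, [...] (the remainder of this sentence is present in the source only as an image, Figure imgf000066_0001).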
在另一优选实例中, 广告库可以是位于用户设备 41的外部, 例如位于一个网 络设备处或分布于多个网络设备处, 在步骤 S42中, 用户设备 41可经由网络与网 络设备相连接, 从而查询与用户输入序列相关的广告信息选项。
图 23为根据本实施方式另一方面的用户设备与网络设备相配合来当用户进行 文字输入时同时提供与输入信息相关的搜索相关信息的方法流程图。
其中用户设备 41经由网络与网络设备 42相连接, 该网络可以为互联网、 内部 网等。 也即, 当用户在用户设备 41上进行文字输入时, 用户设备 41经由网络向网 络设备 42发送查询请求, 请求网络设备 42根据用户输入序列搜索相关的搜索相关 信息, 如广告信息、 网页信息、 旅游信息或地图信息, 然后将网络设备反馈的搜索 相关信息与网络设备查询获得的输入词条选项一起提供给用户。 以下以广告信息为 例进行说明:
在一个实例中, 网络设备 42保存网络词库和关键词广告库 (为筒明起见, 以 下简称网络广告库或搜索相关信息库) 。
具体地, 如图 23所示, 在步驟 S41中, 用户设备 41通过任何一种可与用户进 行人机交互的交互设备来实时地获取用户正在输入的输入序列。 该交互设备可以是 键盘、 遥控器、 触摸板或声控设备等。 以键盘为例, 但用户敲击键盘中按键进行输 入时, 用户设备 1实时地获取用户敲击的按键序列 (为简明起见, 以下仍称输入序 列) 。
在步骤 S42中, 用户设备 41实时并持续不断地将获取的用户输入序列发送至 网络设备 42。
在步骤 S43中, 网络设备 42根据接收的用户输入序列在网络词库中进行匹配 查询, 获得一个或多个匹配的输入词条选项。 以下以中文为例进行说明, 本发明允 许用户在按全拼、 双拼、 五笔等方法输入中文。 同时, 网络设备 42还根据用户输 入序列在网络广告库中进行搜索, 获得相关的一个或多个广告信息选项。 与例如, 当用户敲击按键输入 "woaiwaitan" , 网络设备 42在网络词库中查询获得 "1 我爱外 滩; 2 我爱"等词条组合, 同时在网络广告库中查询获得与"外滩"有关的广告信息有 "外滩三号"、 "外碓 18号"等地标性建筑, 因此提供广告信息选项" 3 外滩三号; 4 外 滩 18号"。 本领域技术人员应理解, 查询与输入序列相关的广告信息 (或搜索相关 信息) 的过程可以采用目前公知的各种智能或模糊搜索算法, 在此不作赘述。
在步驟 S45中, 网络设备 42也实时和持续不断地将所查询的输入词条选项发 送至用户设备 41。在步骤 S47中, 用户设备 41将接收到的来自网络设备 42实时和 持续地所述输入词条选项并提供给用户, 用户设备 41可将获得的一个或多个匹配 的输入词条选项按一定顺序和格式提供给所述用户, 供其选择以作具体输入或进一 步交互。 例如, 通过在显示器的一个输入窗口栏中显示给用户是, 可将多个词条选 项与输入序列分栏显示, 多个词条选项可全部列入下一栏中供用户选择。 优选地, 可以在词条栏中仅显示一行词条选项, 该行词条选项数目可以是缺省的也可由用户 设定, 通过由用户按动特定功能键显示上一行或下一行词条选项, 该特定功能键例 如可以是" +"和" -"。
优选地, 为便于用户注意, 广告信息选项在词条栏中可采用不同显示方式, 例 如不同颜色或灰度。而且广告信息选项中内置有与该广告信息相关的网页 IP地址或 统一资源标示符 (URL ) 。
在步骤 S48中, 用户和用户设备 41可根据所提供的输入词条选项做进一步人 机交互。 用户可通过按该选项相应的数字键或通过鼠标移动光标至该选项处悬停或 点击来选择该广告信息选项。 而, 当用户选择该广告信息选项, 用户设备 41可通 过网络定向到其对应的网页网址, 例如在浏览器打开情形, 经由网络连接到该网址 对应的网页服务器, 并在浏览器中显示其网页给用户。
优选地,步骤 S41至 S47之间是持续不断地循环工作。具体地,在步骤 S41中, 用户设备 41实时地获取用户的输入序列并持续不断地发送给网络设备 42,例如" w"、 "wo" ... "wo" ... "woai" ... "woaiwaitan" , 网络设备 42也才艮据用户输入序列后实时和持 续不断进行匹配查询, 并将查询到的输入词条序列持续不断地发送回用户设备 41 , 例如" w"对应 " 1我、 2喔、 3握、 4窝"; "woai"对应" 1我爱、 2喔、 3握、 4窝"; "woaiwaitan" 对应" 1我爱外滩、 2外滩三号、 3 外滩 18号"。 在此, 本领域技术人员应理解"持 续"是指在用户最终选择一个词条选项前一直进行的动作方式, 例如用户在敲击 按 键序列 "woai"后可能稍停片刻, 如 0.5秒, 再继续敲击随后的按键。
优选地, 在步驟 S43 , 网络设备 42在根据用户输入序列在网络词库和网络广告 库中进行匹配查询获得多个输入词条选项和广告信息选项时还获得其各自的优先 级。 在步骤 S47中, 用户设备 41将网络设备 42提供的多个匹配的输入词条选项和 广告信息选项按优先级顺序在词条栏中显示给所述用户, 其中优先级越高, 该输入 词条选项或广告信息选项越靠前显示。 优选地, 为便于用户进行文字输入, 优先級 最好的输入词条选项一般置于最前位置, 使得用户可通过筒单地按 "ENTER"或空格 键来选择, 而广告信息选项通常置于每行中较末尾选项位置。
优选地, 当用户通过用户设备 41登录网络设备时, 在步驟 S43中, 网络设备 42还可根据用户登录的 ID来获取用户特征。 例如用户输入历史记录, 用户特定的 用户词库、 用户设定的个人偏好、 用户属性信息等。 所述用户特征可以保存在网络 设备 42中, 也可保存在于网络设备 42相连接的其他网络设备中。
随后, 在步骤 S43中, 网络设备 42可以根据用户特征来在网络词库和网絡广 告库中进行查询, 获得匹配的输入词条选项和广告信息选项。 具体地, 在步骤 S43 中, 网络设备 42可根据用户输入历史记录中对各个词条选项或词条选项中的词汇 的选择频度、 各词条选项中各个词汇间的文义关联性来确定其优先级高低。 网络设 备 42也可根据用户设定的个人偏好选择来确定优先级高低, 例如, 当用户设定输 入偏好为: 优先级高低: 购物 >饮食 >旅游, 则获取用户输入序列 "woaiwaitan"后, 网络设备 42在网络广告库中查询获得与 "waitan"对应的多个位于外滩的地标性建筑 或旅游景点, 如招商局总部、 汇丰大厦、 花旗银行、 外滩三号、 外滩 18号等, 随 后根据用户设定的个人偏好可判断 "外滩三号"、 "外滩 18号"等以购物、餐饮为主的 建筑景点的优先级最高。 另外, 在步驟 S43中, 网络设备 42还可根据目前用户设 备的 IP地址来判断其所处的地域,从而可以确定输入序列中与该地域相关的词汇的 优先级, 例如, 但用户输入序列为" woxihuanbund" , 其中" bund"的译文有" 1堤岸 2 码头 3同盟 4 (上海)外滩", 在步骤 S43中, 当网络设备 42根据用户设备 IP地址 获知目前位于中国上海市, 从而可确定" bund"对应译文中"上海外滩"或"外滩"优先 级最高, 因而可提供如下输入词条选项" 1我喜欢上海外碓; 2我喜欢外滩; 3我喜 欢码头; 4我喜欢堤岸; 5我喜欢同盟"。为简明起见,我们可将用户输入历史记录, 用户设定个人偏好、 计算机 IP地址(或用户地址)等统称为用户特征, 且本领域技 术人员应能理解, 用户特征包括但不限于上述内容。
优选地, 网络设备 42还可对所保存的用户输入历史记录、 输入偏好及词汇间 关联性等信息进行更新。 在步骤 S49 (未示出)中, 用户设备 41通过与用户的进一 步交互来获取该用户对所提供多个输入词条选项的选择, 并发送至网络设备; 在步 骤 S410 (未示出) 中, 网络设备 42根据所接收的用户选择来更新词库和用户输入 历史记录、 词汇间的关联性等, 例如可在网络词库中增加新词条选项和已有词条选 项的优先级, 用户特征。 更优选地, 网络设备 42还可自行在互联网中搜寻新的词 条组合, 并用以更新网络词库等。
优选地, 网络广告库可以是位于网络设备 42以外, 例如位于另一个网络设备 处或分布于其他多个网络设备处, 网络设备 42可经由网络与所述其他网络设备相 连接, 从而查询与用户输入序列相关的广告信息选项。
图 24示出根据本实施方式的另一个优选实例, 其中用户设备 41本身也保存有 本地词库, 并可随时或定期地与网络设备 42的网络词库中该用户特定的用户词库 进行同步。
如图 24所示, 在步骤 S44中, 用户设备 41根据所述用户输入序列在本地词库 中进行匹配查询, 具体查询过程如前面参照图 44所描述的步骤 S42的内容, 该内 容引用在此不作赞述。 步骤 S41至 S43如前面参照图 23所描述的步骤 S41 S43的 内容, 该内容引用在此不作赘述。 本领域技术人员应能理解, 步骤 S41至 S43和步 骤 S44可以是同步进行, 其完成时间主要取决于用户设备 41和网络设备 42的处理 速度以及用户设备 41与网络设备 42之间的网络传输延时。 在步骤 S46中, 用户设 备 41将本地查询到的一个或多个输入词条选项和来自网络设备 42的一个或多个输 入词条选项进行合并处理, 删除其中的重复选项, 并根据一定规则来确定最终合并 得到的多个词条选项和来自网络设备 42反馈的与输入序列有关的广告信息选项的 优先级顺序, 随后, 在步骤 S48中, 将所述输入词条选项和广告信息选项按相应的 优先级顺序提供给用户, 供其选择或作进一步人机交互。 通常, 网络设备 42提供 的输入词条选项应该更为准确, 因此优先级较本地查询获得输入词条选项为高, 而 同样地,为不影响用户的文字输入,广告信息选项通常置于每行中较末尾选项位置。
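Those skilled in the art will understand that, in the embodiments described above with reference to Figures 20, 21, 23 and 24, when the user logs in to the network device 42 via the user equipment 41, the network lexicon in the network device 42 may also be the user-specific user lexicon.
The embodiments of the present invention are described above using Chinese as an example. Those skilled in the art will understand that the invention is also applicable to input in another language, such as Korean, Japanese, French, German or Italian; all that needs to be changed is to replace the Chinese input rules with the input rules of the other language and to swap in the corresponding lexicon, user-set input preferences and so on.
It is evident to those skilled in the art that the present invention is not limited to the details of the above exemplary embodiments and that it can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in every respect as exemplary and non-limiting; the scope of the invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalency of the claims are therefore intended to be embraced in the invention. No reference sign in a claim should be regarded as limiting the claim concerned. Moreover, the word "comprising" clearly does not exclude other units or steps, and the singular does not exclude the plural. A plurality of units or devices recited in a system claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not indicate any particular order.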

Claims

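1. A method, implemented by a computer device, for determining a ranking result of resource candidates, wherein the method comprises the following steps:
a. obtaining retrieval information and adjustment information from an input sequence from a user;
b. performing retrieval according to the retrieval information to obtain a plurality of resource candidates;
c. determining a ranking result of the plurality of resource candidates according to the adjustment information;
d. generating presentation information according to the ranking result, to be provided to the user.
2. The method according to claim 1, wherein the adjustment information comprises one or more adjustment units, and wherein the method further comprises:
- obtaining first ranking auxiliary information for assisting in determining the ranking result;
wherein step c further comprises the following step:
- determining the ranking result of the plurality of resource candidates according to all the adjustment units in combination with the first ranking auxiliary information.
3. The method according to claim 2, wherein the first ranking auxiliary information comprises at least one of the following:
- weight information of each adjustment unit;
- adjustment unit distribution information of each of the plurality of resource candidates;
- predetermined quality information of each of the plurality of resource candidates.
4. The method according to claim 1, wherein step c further comprises the following step:
- determining the ranking result of the plurality of resource candidates according to the adjustment information and the retrieval information.
5. The method according to claim 4, wherein the adjustment information comprises one or more adjustment units and the retrieval information comprises one or more retrieval units, and wherein the method further comprises:
- obtaining second ranking auxiliary information for assisting in determining the ranking result;
wherein step c further comprises the following step:
- determining the ranking result of the plurality of resource candidates according to all the adjustment units and all the retrieval units in combination with the second ranking auxiliary information.
6. The method according to claim 5, wherein the second ranking auxiliary information comprises at least one of the following:
- weight information of each adjustment unit;
- adjustment unit distribution information of each of the plurality of resource candidates;
- weight information of each retrieval unit;
- retrieval unit distribution information of each of the plurality of resource candidates;
- quality information of each of the plurality of resource candidates.
7. The method according to claim 3 or 6, wherein the adjustment unit distribution information comprises at least one of the following:
- the number of occurrences of each adjustment unit in the resource candidate to which the adjustment unit distribution information corresponds;
- the positions at which each adjustment unit occurs in the resource candidate to which the adjustment unit distribution information corresponds;
- the number of distinct adjustment units in the resource candidate to which the adjustment unit distribution information corresponds.
8. The method according to claim 6, wherein the retrieval unit distribution information comprises at least one of the following:
- the number of occurrences of each retrieval unit in the resource candidate to which the retrieval unit distribution information corresponds;
- the positions at which each retrieval unit occurs in the resource candidate to which the retrieval unit distribution information corresponds;
- the number of distinct retrieval units in the resource candidate to which the retrieval unit distribution information corresponds.
9. The method according to any one of claims 3, 6, 7 or 8, wherein the quality information comprises at least one of the following:
- the authority of the resource candidate to which the quality information corresponds;
- the quality level of the resource candidate to which the quality information corresponds.
10. The method according to any one of claims 1 to 9, wherein the method further comprises the following step:
- obtaining first type determination information for determining the retrieval information and the adjustment information;
wherein step a further comprises the following step:
- obtaining the retrieval information and the adjustment information from the input sequence from the user according to the first type determination information;
wherein the first type determination information comprises at least one of the following:
- information units and their types obtained by querying a predetermined keyword type library according to the input sequence;
- a semantic analysis result obtained according to the input sequence.
11. The method according to claim 10, wherein the method further comprises the following steps:
x. obtaining keyword units and their types;
y. establishing or updating the predetermined keyword type library according to the keyword units and their types.
12. The method according to claim 11, wherein step x further comprises the following steps:
- obtaining the keyword unit;
- obtaining second type determination information for determining the type of the keyword unit;
- determining the type of the keyword unit according to the second type determination information;
wherein the second type determination information comprises at least one of the following:
- the distribution concentration of the keyword unit in a predetermined corpus;
- a semantic analysis result obtained according to the keyword unit;
- the number of historical user input sequences that contain the keyword unit and match the same corpus item.
13. The method according to any one of claims 1 to 12, wherein step a further comprises the following steps:
- obtaining the input sequence from the user;
- removing invalid information from the input sequence to obtain usable information;
- obtaining the retrieval information and the adjustment information from the usable information.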
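14. A ranking determination apparatus for determining a ranking result of resource candidates, wherein the ranking determination apparatus comprises:
a first obtaining device for obtaining retrieval information and adjustment information from an input sequence from a user;
a retrieval device for performing retrieval according to the retrieval information to obtain a plurality of resource candidates;
a ranking device for determining a ranking result of the plurality of resource candidates according to the adjustment information;
a providing device for generating presentation information according to the ranking result, to be provided to the user.
15. The ranking determination apparatus according to claim 14, wherein the adjustment information comprises one or more adjustment units, and wherein the ranking determination apparatus further comprises:
a second obtaining device for obtaining first ranking auxiliary information for assisting in determining the ranking result;
wherein the ranking device further comprises:
a first sub-ranking device for determining the ranking result of the plurality of resource candidates according to all the adjustment units in combination with the first ranking auxiliary information.
16. The ranking determination apparatus according to claim 15, wherein the first ranking auxiliary information comprises at least one of the following:
- weight information of each adjustment unit;
- adjustment unit distribution information of each of the plurality of resource candidates;
- predetermined quality information of each of the plurality of resource candidates.
17. The ranking determination apparatus according to claim 14, wherein the ranking device further comprises:
a second sub-ranking device for determining the ranking result of the plurality of resource candidates according to the adjustment information and the retrieval information.
18. The ranking determination apparatus according to claim 17, wherein the adjustment information comprises one or more adjustment units and the retrieval information comprises one or more retrieval units, and wherein the ranking determination apparatus further comprises:
a third obtaining device for obtaining second ranking auxiliary information for assisting in determining the ranking result;
wherein the second sub-ranking device further comprises:
a third sub-ranking device for determining the ranking result of the plurality of resource candidates according to all the adjustment units and all the retrieval units in combination with the second ranking auxiliary information.
19. The ranking determination apparatus according to claim 18, wherein the second ranking auxiliary information comprises at least one of the following:
- weight information of each adjustment unit;
- adjustment unit distribution information of each of the plurality of resource candidates;
- weight information of each retrieval unit;
- retrieval unit distribution information of each of the plurality of resource candidates;
- quality information of each of the plurality of resource candidates.
20. The ranking determination apparatus according to claim 16 or 19, wherein the adjustment unit distribution information comprises at least one of the following:
- the number of occurrences of each adjustment unit in the resource candidate to which the adjustment unit distribution information corresponds;
- the positions at which each adjustment unit occurs in the resource candidate to which the adjustment unit distribution information corresponds;
- the number of distinct adjustment units in the resource candidate to which the adjustment unit distribution information corresponds.
21. The ranking determination apparatus according to claim 19, wherein the retrieval unit distribution information comprises at least one of the following:
- the number of occurrences of each retrieval unit in the resource candidate to which the retrieval unit distribution information corresponds;
- the positions at which each retrieval unit occurs in the resource candidate to which the retrieval unit distribution information corresponds;
- the number of distinct retrieval units in the resource candidate to which the retrieval unit distribution information corresponds.
22. The ranking determination apparatus according to any one of claims 16, 19, 20 or 21, wherein the quality information comprises at least one of the following:
- the authority of the resource candidate to which the quality information corresponds;
- the quality level of the resource candidate to which the quality information corresponds.
23. The ranking determination apparatus according to any one of claims 14 to 22, wherein the ranking determination apparatus further comprises:
a fourth obtaining device for obtaining first type determination information for determining the retrieval information and the adjustment information;
wherein the first obtaining device further comprises:
a first sub-obtaining device for obtaining the retrieval information and the adjustment information from the input sequence from the user according to the first type determination information;
wherein the first type determination information comprises at least one of the following:
- information units and their types obtained by querying a predetermined keyword type library according to the input sequence;
- a semantic analysis result obtained according to the input sequence.
24. The ranking determination apparatus according to claim 23, wherein the ranking determination apparatus further comprises:
a fifth obtaining device for obtaining keyword units and their types;
an updating device for establishing or updating the predetermined keyword type library according to the keyword units and their types.
25. The ranking determination apparatus according to claim 24, wherein the fifth obtaining device comprises:
a keyword obtaining device for obtaining the keyword unit;
a sixth obtaining device for obtaining second type determination information for determining the type of the keyword unit;
a type determination device for determining the type of the keyword unit according to the second type determination information;
wherein the second type determination information comprises at least one of the following:
- the distribution concentration of the keyword unit in a predetermined corpus;
- a semantic analysis result obtained according to the keyword unit;
- the number of historical user input sequences that contain the keyword unit and match the same corpus item.
26. The ranking determination apparatus according to any one of claims 14 to 25, wherein the first obtaining device further comprises:
an input sequence obtaining device for obtaining the input sequence from the user;
a removing device for removing invalid information from the input sequence to obtain usable information;
a second sub-obtaining device for obtaining the retrieval information and the adjustment information from the usable information.
27. A computer device for ranking resource candidates, wherein the computer device comprises the ranking determination apparatus according to any one of claims 14 to 26.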
PCT/CN2011/083406 2011-04-13 2011-12-02 用于确定资源候选项的排序结果的方法、装置及设备 WO2012139394A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201110092452.3 2011-04-13
CN201110092452.3A CN102163228B (zh) 2011-04-13 2011-04-13 用于确定资源候选项的排序结果的方法、装置及设备

Publications (1)

Publication Number Publication Date
WO2012139394A1 true WO2012139394A1 (zh) 2012-10-18

Family

ID=44464455

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/083406 WO2012139394A1 (zh) 2011-04-13 2011-12-02 用于确定资源候选项的排序结果的方法、装置及设备

Country Status (2)

Country Link
CN (1) CN102163228B (zh)
WO (1) WO2012139394A1 (zh)

Also Published As

Publication number Publication date
CN102163228A (zh) 2011-08-24
CN102163228B (zh) 2014-10-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11863411

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11863411

Country of ref document: EP

Kind code of ref document: A1