WO2013143252A1 - Method and system for prompting input candidate words based on context scenario - Google Patents

Info

Publication number
WO2013143252A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
candidate word
candidate
word set
input
Prior art date
Application number
PCT/CN2012/079960
Other languages
French (fr)
Chinese (zh)
Inventor
李静
Original Assignee
百度在线网络技术(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百度在线网络技术(北京)有限公司
Publication of WO2013143252A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/274: Converting codes to words; Guess-ahead of partial word inputs
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02: Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023: Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233: Character input methods
    • G06F3/0237: Character input methods using prediction or retrieval techniques

Definitions

  • The present invention relates to the field of input methods, and in particular to a method and system for prompting input candidate words based on a context scenario.
  • Existing input-method candidate prompting ignores differences between users: for the same already-entered characters, the candidates take no account of the user's personalized information, so the user cannot find the desired candidate quickly and conveniently.
  • Existing input methods rely mainly on large-scale word-frequency statistics, combined with the local context, to compute candidate-word probabilities and prompt candidates.
  • Current mainstream input methods can keep statistics on the user's most recent and most frequent input and, by weighting, display the user's most recently and most frequently used words first.
  • Input methods for smartphones can draw on the phone's features to offer names and numbers from the phone book as candidate prompts, or let the user specify fixed candidate words.
  • Candidate prompting combined with device attributes is typically used in Internet search: candidates are prompted with reference to information such as the model of the device hosting the input method and its corresponding functions.
  • Commonly used input methods, however, consider only the context of what the user has typed. They ignore the fact that the user's device is also a recipient of information, and that the information received changes the behavior of the device's user. Taking a mobile phone as an example: when the user receives short messages, the user may reply with different content to different messages; when the user browses web pages, the user may use input functions such as replying to posts or searching on different pages. The vocabulary the user uses therefore varies with the situation at the time, and in such cases existing input methods do not provide candidate words well.
  • The present invention provides a method for prompting input candidate words based on a context scenario. By gathering statistics on the history of the user's short-message input while also taking into account the context scenario input by a non-local user, it makes up for the shortcomings of existing input methods, raises the "first-word hit rate" and the "candidate hit rate", and makes input truly "personalized".
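  • A method for prompting input candidate words based on a context scenario is provided, comprising the following steps: receiving an entry input by a user; generating a first candidate word set based on the user's input history for the entry and any non-local context scenario related to the entry; and providing the first candidate word set to the user.
  • An input candidate word prompting system based on a context scenario is provided, which includes:
  • a receiving device configured to receive an entry input by a user;
  • a generating device configured to generate a first candidate word set based on the user's input history for the entry and any non-local context scenario related to the entry; and
  • a providing device configured to provide the first candidate word set to the user.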
  • The present invention applies to any input-capable platform that has a context scenario, where the context scenario includes any context scenario input by a non-local user.
  • Based on the entry input by the user, combined with analysis of the user's input history and of the context scenario, a candidate word set is generated and provided to the user.
  • The present invention makes full use of non-local user information, typified by mobile communication devices. For example, when a mobile phone is used for short-message chat, generating the candidate word set with the context scenario taken into account greatly helps the user's input and thereby raises the first-word hit rate of candidates for mobile phone input.
  • FIG. 1 is a schematic flowchart of a specific embodiment of a method for prompting input candidate words based on a context scenario according to the present invention;
  • FIG. 2 is a schematic structural diagram of a specific embodiment of a system for prompting input candidate words based on a context scenario according to the present invention;
  • FIG. 3 is a schematic structural diagram of a specific embodiment of the generating device in the system for prompting input candidate words based on a context scenario according to the present invention.
  • As shown in FIG. 1, the method includes steps S101 to S103; it is described below with reference to specific embodiments.
  • Step S101: receive an entry input by a user.
  • The method of the present invention can be applied to any device that can load an input method, including but not limited to terminals such as a PC, a notebook computer, a PDA (personal digital assistant), a mobile phone or a tablet computer, preferably a mobile phone capable of loading an input method. The following description therefore takes a mobile phone as its example.
  • The entry input by the user may consist of characters of any language, pinyin, or a combination of the two, for example "百度" (Baidu), "woxihuan" and "百度ditu".
  • Step S102: generate a first candidate word set based on the user's input history for the entry and any non-local context scenario related to the entry.
  • Preferably, a second candidate word set is first generated based on the history of the user's input for the entry.
  • After an input entry is received, it is analyzed semantically, for example by its part of speech and input history, to determine candidate words. For example, if the user inputs the entry "开" (kāi, "open"), analysis shows that it usually appears as a verb and is followed by a noun, such as 会 (开会, "hold a meeting"), 电脑 (开电脑, "turn on the computer") or 机 (开机, "power on"). According to the input history, entries such as 始 (开始, "begin"), 了 and 导 (开导, "counsel") also frequently follow "开".
  • First, a list of the user's commonly used words is generated from each of the user's inputs, and the probability a of words beginning with "上" (shàng) is calculated. Second, the text input by the non-local user (user B) is received and segmented into at least one word-like unit. That is, the original short message is segmented by reverse maximum matching, giving "今天\我\去\上\地\软件园\了". The word-like units are then stored in a pre-stored vocabulary: each pair of consecutive single characters in the segmentation result is joined into one word and stored, e.g. "我去", "去上" and "上地".
  • The probability of such consecutive occurrences can be calculated with an n-gram model; assume that the probability of "上地" is b.
  • Then, according to the entry input by the user, a third candidate word set is generated from the pre-stored vocabulary. Because user A's reply contains "上", the values of α*a and β*b are compared, where α and β are parameters obtained by training, so that the candidate prompt box gives priority to "地" rather than to the conventional "上班", "上车", "上网" and so on. The third candidate word set may thus be 地, 班, 车, 网, etc.
  • Preferably, new entries that appear in the context are given a higher weight, so that they appear preferentially in the candidate word set offered for the user's input.
  • Preferably, the second candidate word set and the third candidate word set are weighted to generate the first candidate word set.
  • The weights of the second and third candidate word sets may be set by the user as required.
  • Preferably, the weight of the third candidate word set is higher than that of the second; usually, the top candidate of the third candidate word set is also the top candidate of the first candidate word set.
  • Step S103: provide the first candidate word set to the user.
  • After the above steps, the first candidate word set most relevant to the entry input by the user is obtained and provided to the user to choose from.
  • Usually, the top candidate is displayed differently from the other candidates, for example in reverse video or in a different color.
  • FIG. 2 is a schematic structural diagram of a specific embodiment of a context-scenario-based input candidate word prompting system 10 according to the present invention.
  • The system 10 includes a receiving device 11, a generating device 12 and a providing device 13.
  • The receiving device 11 is configured to receive an entry input by a user.
  • The system of the present invention can be applied to any device that can load an input method, including but not limited to terminals such as a PC, a notebook computer, a PDA (personal digital assistant), a mobile phone or a tablet computer, preferably a mobile phone capable of loading an input method. The following description therefore takes a mobile phone as its example.
  • The entry input by the user may consist of characters of any language, pinyin, or a combination of the two, for example "百度" (Baidu), "woxihuan" and "百度ditu".
  • The generating device 12 is configured to generate a first candidate word set based on the user's input history for the entry and any non-local context scenario related to the entry.
  • The generating device 12 may further be configured to first generate a second candidate word set based on the history of the user's input for the entry.
  • After an input entry is received, it is analyzed semantically, for example by its part of speech and input history, to determine candidate words. For example, if the user inputs the entry "开" (kāi, "open"), analysis shows that it usually appears as a verb and is followed by a noun, such as 会 (开会, "hold a meeting"), 电脑 (开电脑, "turn on the computer") or 机 (开机, "power on"). According to the input history, entries such as 始 (开始, "begin"), 了 and 导 (开导, "counsel") also frequently follow "开".
  • Next, a third candidate word set is generated based on any non-local context scenario related to the entry.
  • For example, when the user inputs the entry "科" (kē), the candidates are normally 技, 学, 目, 长, 室 and so on (科技, 科学, 科目, 科长, 科室); but when the user is browsing a web page about an NBA player and inputs "科" in order to reply or search, the top candidate is 比 (科比, Kobe).
  • Similarly, when the user is browsing a football website and inputs the entry "贝" (bèi), the candidates may be 利 (贝利, Pelé), 肯鲍尔 (贝肯鲍尔, Beckenbauer), 隆 (贝隆), etc.
  • The candidate word set above is the third candidate word set.
  • The system 10 generates a list of the user's commonly used words from each of the user's inputs; for example, it calculates the probability a of words beginning with "上" (shàng).
  • The generating device 12 further includes a word generating module 121, a storage module 122 and a generating module 123.
  • The word generating module 121 is configured to receive the text input by a non-local user (user B) and segment it into at least one word-like unit. That is, the original short message is segmented by reverse maximum matching, giving "今天\我\去\上\地\软件园\了".
  • The storage module 122 is configured to store the word-like units in the pre-stored vocabulary: each pair of consecutive single characters in the segmentation result is joined into one word and stored, e.g. "我去", "去上" and "上地".
  • The probability of such consecutive occurrences can be calculated with an n-gram model; assume that the probability of "上地" is b.
  • The generating module 123 is configured to generate, according to the entry input by the user, a third candidate word set from the pre-stored vocabulary. Because user A's reply contains "上", the values of α*a and β*b are compared, where α and β are parameters obtained by training, so that the candidate prompt box gives priority to "地" rather than to the conventional "上班", "上车", "上网" and so on.
  • The third candidate word set may thus be 地, 班, 车, 网, etc.
  • Preferably, new entries that appear in the context are given a higher weight, so that they appear preferentially in the candidate word set offered for the user's input.
  • The generating device 12 is configured to generate the first candidate word set from the second candidate word set and the third candidate word set.
  • Preferably, the second candidate word set and the third candidate word set are weighted to generate the first candidate word set.
  • The weights of the second and third candidate word sets may be set by the user as required.
  • Preferably, the weight of the third candidate word set is higher than that of the second candidate word set.
  • Usually, the top candidate of the third candidate word set is also the top candidate of the first candidate word set.
  • Non-local context scenarios can thus be fully used for candidate word recommendation, effectively raising the candidate hit rate during input.

Abstract

Provided is a method for prompting input candidate words based on a context scenario, comprising: receiving an entry input by a user; generating a first candidate word set based on the user's input history for the entry and any non-local context scenario related to the entry; and providing the first candidate word set to the user. Also provided is a system for carrying out the method. The present invention makes full use of the context scenario to recommend candidate words, which effectively raises the first-word hit rate of candidates during input.

Description

Method and system for prompting input candidate words based on a context scenario
[0001] This application claims priority to Chinese patent application No. 201210086810.4, filed on March 28, 2012 and entitled "Method and system for prompting input candidate words based on a context scenario", the entire contents of which are incorporated herein by reference.
Technical Field
[0002] The present invention relates to the field of input methods, and in particular to a method and system for prompting input candidate words based on a context scenario.
Background
[0003] Different people have different interests, hobbies and habits, so the content they commonly input also differs. Existing input-method candidate prompting ignores these differences between users: for the same already-entered characters, the candidates take no account of the user's personalized information, so the user cannot find the desired candidate quickly and conveniently. Existing input methods rely mainly on large-scale word-frequency statistics, combined with the local context, to compute candidate-word probabilities and prompt candidates. Current mainstream input methods can keep statistics on the user's most recent and most frequent input and, by weighting, display the user's most recently and most frequently used words first.
[0004] Commonly used input methods fall mainly into the following categories:
1. Input methods for smartphones, which can draw on the phone's features to offer names and numbers from the phone book as candidate prompts, or let the user specify fixed candidate words.
2. Input methods organized by vocabulary category, which can provide the specialized terms of a particular field, such as a rapid-input device for "stock codes".
3. Candidate prompting combined with device attributes, typically used in Internet search, in which candidates are prompted with reference to information such as the model of the device hosting the input method and its corresponding functions.
4. Candidate prompting that mines user characteristics from the personalized information of multiple users. For example, by gathering statistics on the word lists of users' clients, users with matching interests are identified and a similarity relationship is established, so that the word lists of users with similar interests are recommended to other users.
[0005] Commonly used input methods, however, consider only the context of what the user has typed. They ignore the fact that the user's device is also a recipient of information, and that the information received changes the behavior of the device's user. Taking a mobile phone as an example: when the user receives short messages, the user may reply with different content to different messages; when the user browses web pages, the user may use input functions such as replying to posts or searching on different pages. The vocabulary the user uses therefore varies with the situation at the time, and in such cases existing input methods do not provide candidate words well.
Summary of the Invention
[0006] The present invention provides a method for prompting input candidate words based on a context scenario. By gathering statistics on the history of the user's short-message input while also taking into account the context scenario input by a non-local user, it makes up for the shortcomings of existing input methods, raises the "first-word hit rate" and the "candidate hit rate", and makes input truly "personalized".
[0007] According to one aspect of the present invention, a method for prompting input candidate words based on a context scenario is provided, comprising the following steps:
a) receiving an entry input by a user;
b) generating a first candidate word set based on the user's input history for the entry and any non-local context scenario related to the entry; and
c) providing the first candidate word set to the user.
[0008] According to another aspect of the present invention, a system for prompting input candidate words based on a context scenario is provided, comprising:
[0009] a receiving device for receiving an entry input by a user;
[0010] a generating device for generating a first candidate word set based on the user's input history for the entry and any non-local context scenario related to the entry; and
[0011] a providing device for providing the first candidate word set to the user.
[0012] The method and system provided by the present invention apply to any input-capable platform that has a context scenario, where the context scenario includes any context scenario input by a non-local user. Based on the entry input by the user, combined with analysis of the user's input history and of the context scenario, a candidate word set is generated and provided to the user. The present invention makes full use of non-local user information, typified by mobile communication devices. For example, when a mobile phone is used for short-message chat, generating the candidate word set with the context scenario taken into account greatly helps the user's input and thereby raises the first-word hit rate of candidates for mobile phone input.
Brief Description of the Drawings
[0013] Other features, objects and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments, given with reference to the accompanying drawings:
[0014] FIG. 1 is a schematic flowchart of a specific embodiment of a method for prompting input candidate words based on a context scenario according to the present invention;
[0015] FIG. 2 is a schematic structural diagram of a specific embodiment of a system for prompting input candidate words based on a context scenario according to the present invention;
[0016] FIG. 3 is a schematic structural diagram of a specific embodiment of the generating device in the system for prompting input candidate words based on a context scenario according to the present invention.
[0017] The same or similar reference numerals in the drawings denote the same or similar components.
Detailed Description
[0018] To make the objects, technical solutions and advantages of the present invention clearer, embodiments of the invention are described in detail below with reference to the accompanying drawings.
[0019] Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements, or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it.
[0020] The following disclosure provides many different embodiments or examples for implementing different structures of the present invention. To simplify the disclosure, the components and arrangements of specific examples are described below. They are, of course, merely examples and are not intended to limit the invention. In addition, the invention may repeat reference numerals and/or letters in different examples. This repetition is for simplicity and clarity and does not in itself indicate a relationship between the various embodiments and/or arrangements discussed. It should be noted that the components illustrated in the drawings are not necessarily drawn to scale. Descriptions of well-known components, processing techniques and processes are omitted so as not to limit the invention unnecessarily.
[0021] FIG. 1 is a schematic flowchart of a specific embodiment of the method for prompting input candidate words based on a context scenario provided by the present invention. The method includes steps S101 to S103 and is described below with reference to specific embodiments.
[0022] Step S101: receive an entry input by a user. The method of the present invention can be applied to any device that can load an input method, including but not limited to terminals such as a PC, a notebook computer, a PDA (personal digital assistant), a mobile phone or a tablet computer, preferably a mobile phone capable of loading an input method. The following description therefore takes a mobile phone as its example.
[0023] The entry input by the user may consist of characters of any language, pinyin, or a combination of the two, for example "百度" (Baidu), "woxihuan" and "百度ditu".
[0024] Step S102: generate a first candidate word set based on the user's input history for the entry and any non-local context scenario related to the entry.
[0025] Preferably, a second candidate word set is first generated based on the history of the user's input for the entry. After an input entry is received, it is analyzed semantically, for example by its part of speech and input history, to determine candidate words. For example, if the user inputs the entry "开" (kāi, "open"), analysis shows that it usually appears as a verb and is followed by a noun, such as 会 (开会, "hold a meeting"), 电脑 (开电脑, "turn on the computer") or 机 (开机, "power on"). According to the input history, entries such as 始 (开始, "begin"), 了 and 导 (开导, "counsel") also frequently follow "开".
[0026] For the input-history analysis of an entry, besides analyzing massive data on the input histories of most users, the usage of the local input method must also be taken into account to adjust the order of the candidate entries, so that individual users' needs can be matched more flexibly. For example, if the user is a psychological counselor, then after the entry "开" is input the top candidate may be "导", followed by 电脑, 机, 始, 了, 会 and so on; if the user is someone who attends meetings often, the top candidate after "开" is "会", followed by 电脑, 机, 始, 了, 导 and so on. The usage history of the local input method can be analyzed by common means such as the local user's account information and cookies. The candidate word set generated by this analysis of the history of the user's entry input is the second candidate word set.
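The history-based ordering in [0026] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `second_candidates`, the `local_boost` factor and all counts are assumptions made up for the example, and the local history is scanned only for the single character that follows the entry.

```python
from collections import Counter

def second_candidates(entry, global_counts, local_history, local_boost=5.0, top_k=6):
    """Rank what follows `entry`, mixing global counts with local usage."""
    scores = Counter()
    # Start from counts aggregated over many users (illustrative numbers).
    for follower, count in global_counts.get(entry, {}).items():
        scores[follower] += count
    # Add a boost for every time this device's user typed a character after `entry`.
    for text in local_history:
        start = text.find(entry)
        while start != -1:
            nxt = start + len(entry)
            if nxt < len(text):
                scores[text[nxt]] += local_boost
            start = text.find(entry, start + 1)
    return [word for word, _ in scores.most_common(top_k)]

# Hypothetical data for a psychological counselor who often types "开导".
global_counts = {"开": {"会": 30, "电脑": 25, "机": 20, "始": 18, "了": 15, "导": 5}}
counselor_history = ["我来开导你", "多开导他", "开导一下", "开导开导",
                     "需要开导", "开导她", "请开导我"]
print(second_candidates("开", global_counts, counselor_history))
```

With these made-up numbers the local usage lifts "导" to the top of the list, while a user with no such history would see the global order unchanged.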
[0027] In addition, to bring the candidates closer to what the user needs, analysis of the context scenario is also important. With the development of the Internet and of wireless communication, information exchange has become increasingly important, so analyzing context scenarios that are non-local and related to the entry is essential. Next, a third candidate word set is generated based on any non-local context scenario related to the entry. For example, when the user inputs the entry "科" (kē), the candidates are normally 技, 学, 目, 长, 室 and so on (科技, 科学, 科目, 科长, 科室); but when the user is browsing a web page about an NBA player and inputs "科" in order to reply or search, the top candidate is 比 (科比, Kobe). Similarly, when the user is browsing a football website and inputs the entry "贝" (bèi), the candidates may be 利 (贝利, Pelé), 肯鲍尔 (贝肯鲍尔, Beckenbauer), 隆 (贝隆), etc. The candidate word set above is the third candidate word set.
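One simple way to picture the context-driven candidates of [0027] is to keep only those completions of the entry that actually occur in the page being browsed. The sketch below makes that assumption explicit; `third_candidates`, the `lexicon` mapping and the sample page text are hypothetical and are not taken from the patent.

```python
def third_candidates(entry, page_text, lexicon, top_k=5):
    """Return followers of `entry` whose completed word occurs in `page_text`."""
    scored = []
    for follower in lexicon.get(entry, []):
        hits = page_text.count(entry + follower)
        if hits:
            scored.append((hits, follower))
    # Most frequent completion first; the stable sort keeps lexicon order on ties.
    scored.sort(key=lambda pair: -pair[0])
    return [follower for _, follower in scored[:top_k]]

# Hypothetical lexicon of known completions and a made-up NBA page snippet.
lexicon = {"科": ["技", "学", "目", "长", "室", "比"], "贝": ["利", "肯鲍尔", "隆"]}
nba_page = "科比今晚得到40分,赛后科比接受了采访。"
print(third_candidates("科", nba_page, lexicon))
```

On a page with no matching completions the function returns an empty list, in which case only the history-based candidates would remain.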
[0028] The following description takes the editing of a short message on a smartphone as an example.
[0029] User A receives a short message from user B: "今天我去上地软件园了,那里很不错!" ("Today I went to Shangdi Software Park; it is very nice there!"). User A does not know where "上地软件园" (Shangdi Software Park) is and therefore wants to reply to user B to ask. However, because "上地" (Shangdi) is not a common word, that is, it is an out-of-vocabulary word, when user A inputs "上" existing candidate prompting methods cannot offer "地" as a candidate at all, and user A has to input the characters "上" and "地" separately. The method of the present invention, by contrast, can prompt candidates based on a non-local context scenario related to the entry, so the input method of the invention can offer "地" as a candidate.
[0030] First, a list of the user's commonly used words is generated from each of the user's inputs, and the probability a of words beginning with "上" is calculated. Second, the text input by the non-local user (user B) is received and segmented into at least one word-like unit. That is, the original short message is segmented by reverse maximum matching, giving "今天\我\去\上\地\软件园\了". The word-like units are then stored in a pre-stored vocabulary: each pair of consecutive single characters in the segmentation result is joined into one word and stored, e.g. "我去", "去上" and "上地". The probability of such consecutive occurrences can be calculated with an n-gram model; assume that the probability of "上地" is b. Third, according to the entry input by the user, a third candidate word set is generated from the pre-stored vocabulary. Because user A's reply contains "上", the values of α*a and β*b are compared, where α and β are parameters obtained by training, so that the candidate prompt box gives priority to "地" rather than to the conventional "上班", "上车", "上网" and so on. The third candidate word set may thus be 地, 班, 车, 网, etc. Preferably, new entries that appear in the context are given a higher weight, so that they appear preferentially in the candidate word set offered for the user's input.
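The three stages of [0030] (reverse maximum matching, joining consecutive single characters, and comparing α*a with β*b) can be sketched as below. This is an illustration under stated assumptions rather than the patented implementation: the tiny dictionary, the per-follower history probabilities, and the values of α and β are invented for the example (the patent says only that α and β are trained), and b is estimated with a plain bigram ratio over the stored pairs.

```python
from collections import Counter

def reverse_max_match(text, dictionary, max_len=4):
    """Segment text right to left, taking the longest dictionary word each time."""
    words, end = [], len(text)
    while end > 0:
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            # A single character is always accepted, so the loop terminates.
            if size == 1 or piece in dictionary:
                words.append(piece)
                end -= size
                break
    return words[::-1]

def join_single_chars(words):
    """Join each pair of consecutive single-character words into one word."""
    return [a + b for a, b in zip(words, words[1:]) if len(a) == 1 and len(b) == 1]

def context_prob(entry, pairs):
    """Bigram estimate b: share of stored pairs starting with `entry`, per follower."""
    followers = Counter(p[1] for p in pairs if p[0] == entry)
    total = sum(followers.values())
    return {f: n / total for f, n in followers.items()}

def third_candidates(entry, history_prob, pairs, alpha, beta):
    """Rank followers of `entry` by alpha*a (history) versus beta*b (context)."""
    scores = {f: alpha * a for f, a in history_prob.items()}
    for f, b in context_prob(entry, pairs).items():
        scores[f] = max(scores.get(f, 0.0), beta * b)
    return sorted(scores, key=scores.get, reverse=True)

# Illustrative data: the dictionary, probabilities, alpha and beta are made up.
words = reverse_max_match("今天我去上地软件园了", {"今天", "软件园"})
pairs = join_single_chars(words)
history = {"班": 0.4, "车": 0.3, "网": 0.2}
print(words)
print(pairs)
print(third_candidates("上", history, pairs, alpha=0.3, beta=0.7))
```

Because "上地" is absent from the dictionary, the segmenter leaves "上" and "地" as single characters, which is exactly what lets the pairing step recover "上地" as a new word from user B's message.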
[0031] A first candidate word set is generated from the second candidate word set and the third candidate word set. Preferably, the second candidate word set and the third candidate word set are weighted to generate the first candidate word set. The weights of the second and third candidate word sets may be set by the user as required. Preferably, the weight of the third candidate word set is higher than that of the second candidate word set; in general, the first-ranked candidate word of the third candidate word set is then the first-ranked candidate word of the first candidate word set.
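One possible way to weight the two sets is sketched below in Python. The patent does not specify a scoring formula, so the reciprocal-rank score, the default weights and the name `merge_candidate_sets` are all assumptions made for illustration.

```python
def merge_candidate_sets(second, third, w_second=1.0, w_third=2.0):
    """Merge two ranked candidate lists into one ranked list.

    Each candidate scores weight / (rank + 1) in every list it appears in
    (an assumed formula); the scores are summed and sorted in descending order.
    """
    scores = {}
    for weight, ranked in ((w_second, second), (w_third, third)):
        for rank, word in enumerate(ranked):
            scores[word] = scores.get(word, 0.0) + weight / (rank + 1)
    return sorted(scores, key=lambda word: -scores[word])


if __name__ == "__main__":
    second = ["技", "学", "目", "长", "室"]  # history-based candidates for "科"
    third = ["比"]                           # context-based candidate
    print(merge_candidate_sets(second, third))
```

Because the third set carries the larger weight in this sketch, its first-ranked candidate ends up first in the merged list, which matches the preference stated above.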
[0032] Step S103: the first candidate word set is provided to the user. After the above steps, the first candidate word set most relevant to the entry entered by the user is obtained and provided to the user for selection. In general, the first-ranked candidate word is displayed differently from the other candidate words, for example in reverse video or in a different color.
[0033] Referring to FIG. 2, FIG. 2 is a schematic structural diagram of a specific embodiment of a context-scenario-based input candidate word prompting system 10 according to the present invention. The system 10 includes a receiving device 11, a generating device 12 and a providing device 13.
[0034] The receiving device 11 is configured to receive an entry entered by a user. The system of the present invention can be applied to any device on which an input method can be installed, including but not limited to terminals such as PCs, notebook computers, PDAs (personal handheld computers), mobile phones and tablet computers, and preferably a mobile phone on which an input method can be installed. A mobile phone is therefore used as the example below.
[0035] The entry entered by the user may be characters of any language, pinyin, or a combination thereof, for example "百度", "woxihuan", "百度 ditu", and so on.
[0036] The generating device 12 is configured to generate a first candidate word set based on the user's input history for the entry and any non-local context scenario related to the entry.
[0037] Preferably, the generating device 12 may be further configured to first generate a second candidate word set based on the history of the user's input of the entry. After an entry is received, it is analyzed semantically, for example by analyzing its part of speech and its input history, to determine candidate words. For example, if the user enters the entry "开", analysis shows that this entry usually appears as a verb and is followed by a noun, such as 会, 电脑 or 机; based on the input history, entries that frequently follow "开", such as 始, 了 and 导, will also appear.
[0038] For the input-history analysis of an entry, in addition to large-scale data analysis of the input history of most users, the order of the candidate entries also needs to be adjusted according to how the input method is used on the local device, so as to match individual users' needs more flexibly. For example, if the user is a psychological counselor, then after the entry "开" is entered the first candidate may be "导", followed by 电脑, 机, 始, 了, 会, and so on; whereas if the user is someone who frequently attends meetings, then after "开" is entered the first candidate is "会", followed by 电脑, 机, 始, 了, 导, and so on. The usage history of the local input method can be analyzed by common means such as the local user's account information and cookies. The candidate word set generated by the above analysis of the history of the user's entry input is the second candidate word set.
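The reordering described above can be sketched in Python as follows. The baseline order, the local usage counts and the function name `personalize` are hypothetical; the patent only states that the order is adjusted according to local usage.

```python
def personalize(baseline, local_counts):
    """Reorder a baseline candidate list by local usage counts.

    Candidates the local user has chosen more often move to the front;
    the rest keep their baseline (population-wide) order.
    """
    position = {word: i for i, word in enumerate(baseline)}
    return sorted(baseline,
                  key=lambda w: (-local_counts.get(w, 0), position[w]))


if __name__ == "__main__":
    # Assumed population-wide order of candidates following "开".
    baseline = ["电脑", "机", "始", "了", "会", "导"]
    print(personalize(baseline, {"导": 5}))  # psychological counselor
    print(personalize(baseline, {"会": 5}))  # frequent meeting attendee
```

With these assumed counts, the counselor sees "导" first and the meeting attendee sees "会" first, while the remaining candidates stay in their baseline order.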
[0039] In addition, to bring the candidate words closer to the user's needs, analysis of the context is also important. With the development of the Internet and wireless communication, information exchange has become increasingly important, so analyzing non-local context scenarios related to the entry is essential. The generating device 12 is next configured to generate a third candidate word set based on any non-local context scenario related to the entry. For example, when a user enters the entry "科", the candidate words are normally 技, 学, 目, 长, 室, and so on; but when the user is browsing a web page about NBA players and enters "科" while replying to a post or searching, the first candidate word is 比. Similarly, when the user is browsing a football website and enters the entry "贝", the candidate words may be 利, 肯鲍尔, 隆, and so on. Such a candidate word set is the third candidate word set.
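A minimal Python sketch of deriving context-based candidates from terms found on the page being browsed is given below. It assumes the page has already been reduced to a list of terms; the function name `context_candidates` and the frequency-based ordering are illustrative assumptions, not details from the patent.

```python
def context_candidates(prefix, context_terms):
    """Return the continuations of `prefix` found among the context terms.

    Continuations are ordered by how often they occur in the context;
    ties keep the order of first appearance.
    """
    counts = {}
    for term in context_terms:
        if term.startswith(prefix) and len(term) > len(prefix):
            rest = term[len(prefix):]
            counts[rest] = counts.get(rest, 0) + 1
    # sorted() is stable, so equal counts keep insertion order.
    return sorted(counts, key=lambda rest: -counts[rest])


if __name__ == "__main__":
    nba_page = ["科比", "湖人", "科比"]          # assumed terms from an NBA page
    football_page = ["贝利", "贝肯鲍尔", "贝隆"]  # assumed terms from a football site
    print(context_candidates("科", nba_page))
    print(context_candidates("贝", football_page))
```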
[0040] An example in which a short message is edited on a smartphone is described below.
[0041] User A receives a text message from user B: "今天我去上地软件园了,那里很不错!" ("I went to Shangdi Software Park today; it is really nice!"). Because user A does not know where "上地软件园" (Shangdi Software Park) is, user A wants to reply to user B to ask. However, "上地" (Shangdi) is not a common word, i.e. it is an out-of-vocabulary word, so when user A types "上", existing candidate word prompting methods cannot offer "地" as a candidate at all, and user A has to enter the two characters "上" and "地" separately. The method of the present invention, by contrast, can prompt candidate words based on non-local context scenarios related to the entry, so the input method of the present invention can offer "地" as a candidate word.
[0042] The system 10 generates a list of the user's frequently used words from the user's input each time the user enters information, for example by calculating the probability a of words beginning with "上". Referring to FIG. 3, the generating device 12 further includes a class word generation module 121, a storage module 122 and a generation module 123. The class word generation module 121 is configured to receive text information entered by a non-local user (user B) and to segment the text information to form at least one class word. That is, the content of the original message is segmented using the reverse maximum matching method, giving "今天\我\去\上\地\软件园\了". The storage module 122 is configured to store the class words in a pre-stored vocabulary: every two consecutive single characters in the segmentation result are combined into a word and stored in the pre-stored vocabulary, e.g. "我去", "去上", "上地". The probability of their consecutive occurrence can be calculated using an n-gram model; suppose the probability of "上地" is b. The generation module 123 is configured to generate a third candidate word set from the entry entered by the user, based on the pre-stored vocabulary. Because user A's reply input contains "上", the values of α·a and β·b are compared, where α and β are parameters obtained through training, so that "地" is given priority in the candidate prompt box over conventional candidates such as "上班", "上车" and "上网". That is, the third candidate word set may be 地, 班, 车, 网, and so on. Preferably, new entries appearing in the context are given a higher weight, so that entries appearing in the context appear preferentially in the candidate word set for the user's input.
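The comparison of α·a with β·b can be sketched in Python as below. All probabilities and the values of α and β are made-up illustrative numbers (the patent says only that α and β are obtained through training), and the function name `rank_candidates` is hypothetical.

```python
def rank_candidates(prefix, user_probs, context_probs, alpha, beta):
    """Rank continuations of `prefix` by weighted probability.

    user_probs:    word -> probability a from the user's frequent-word list
    context_probs: word -> probability b from the pre-stored vocabulary
    A continuation scores alpha * a or beta * b, whichever is larger.
    """
    scores = {}
    for weight, probs in ((alpha, user_probs), (beta, context_probs)):
        for word, p in probs.items():
            if word.startswith(prefix) and len(word) > len(prefix):
                rest = word[len(prefix):]
                scores[rest] = max(scores.get(rest, 0.0), weight * p)
    return sorted(scores, key=lambda rest: (-scores[rest], rest))


if __name__ == "__main__":
    user_probs = {"上班": 0.4, "上车": 0.3, "上网": 0.2}  # assumed values of a
    context_probs = {"上地": 0.5}                          # assumed value of b
    print(rank_candidates("上", user_probs, context_probs, alpha=1.0, beta=2.0))
```

With β larger than α in this sketch, the word taken from user B's message outranks the user's habitual words, so "地" is listed before 班, 车 and 网.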
[0043] Further, the generating device 12 is configured to generate the first candidate word set from the second candidate word set and the third candidate word set. Preferably, the second candidate word set and the third candidate word set are weighted to generate the first candidate word set. The weights of the second and third candidate word sets may be set by the user as required. Preferably, the weight of the third candidate word set is higher than that of the second candidate word set; in general, the first-ranked candidate word of the third candidate word set is then the first-ranked candidate word of the first candidate word set.
[0044] The providing device 13 is configured to provide the first candidate word set to the user. After the above steps, the first candidate word set most relevant to the entry entered by the user is obtained and provided to the user for selection. In general, the first-ranked candidate word is displayed differently from the other candidate words, for example in reverse video or in a different color.
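How the three devices might fit together is sketched below in Python. The class and method names are hypothetical; placing context candidates ahead of history candidates is a simplification of the weighting described earlier, and square brackets stand in for the reverse-video highlight of the first-ranked candidate.

```python
class ReceivingDevice:
    def receive(self, raw):
        """Return the entry typed by the user, without surrounding spaces."""
        return raw.strip()


class GeneratingDevice:
    def __init__(self, history, context):
        self.history = history  # entry -> ranked history-based candidates
        self.context = context  # entry -> ranked context-based candidates

    def generate(self, entry):
        """Context candidates first, then history candidates not yet listed."""
        merged = list(self.context.get(entry, []))
        merged += [w for w in self.history.get(entry, []) if w not in merged]
        return merged


class ProvidingDevice:
    def provide(self, candidates):
        """Render the candidate bar, marking the first candidate with brackets."""
        return " ".join(f"[{w}]" if i == 0 else w
                        for i, w in enumerate(candidates))


def prompt(raw, receiver, generator, provider):
    return provider.provide(generator.generate(receiver.receive(raw)))


if __name__ == "__main__":
    generator = GeneratingDevice(history={"上": ["班", "车", "网"]},
                                 context={"上": ["地"]})
    print(prompt(" 上 ", ReceivingDevice(), generator, ProvidingDevice()))
```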
[0045] With the method and system of the present invention, non-local context scenarios can be fully exploited for candidate word recommendation, effectively improving the candidate word hit rate during input.
[0046] It will be apparent to those skilled in the art that the present invention is not limited to the details of the above exemplary embodiments and can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as exemplary and non-restrictive; the scope of the invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced by the invention. No reference sign in the claims should be construed as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other modules or steps, and the singular does not exclude the plural.

Claims

1. A context-scenario-based input candidate word prompting method, comprising the following steps:
a) receiving an entry entered by a user;
b) generating a first candidate word set based on the user's input history for the entry and any non-local context scenario related to the entry; and
c) providing the first candidate word set to the user.
2. The method according to claim 1, wherein step b) further comprises the steps of:
generating a second candidate word set based on the history of the user's input of the entry;
generating a third candidate word set based on any non-local context scenario related to the entry;
generating the first candidate word set from the second candidate word set and the third candidate word set.
3. The method according to claim 2, wherein step b) further comprises the steps of:
receiving text information entered by a non-local user, and segmenting the text information to form at least one class word;
storing the class word in a pre-stored vocabulary;
generating the third candidate word set from the entry entered by the user, based on the pre-stored vocabulary.
4. The method according to claim 2 or 3, wherein the second candidate word set and the third candidate word set are weighted to generate the first candidate word set.
5. The method according to any one of claims 1 to 4, wherein the entry is characters of any language, pinyin, or a combination thereof.
6. The method according to claim 1 or 2, wherein the context scenario is context information of a text message received by the user or of a web page browsed by the user.
7. The method according to claim 6, wherein entries appearing in the context information appear preferentially in the first candidate word set for the user's input.
8. A context-scenario-based input candidate word prompting system, comprising:
a receiving device, configured to receive an entry entered by a user;
a generating device, configured to generate a first candidate word set based on the user's input history for the entry and any non-local context scenario related to the entry; and
a providing device, configured to provide the first candidate word set to the user.
9. The system according to claim 8, wherein the generating device is further configured to:
generate a second candidate word set based on the history of the user's input of the entry;
generate a third candidate word set based on any non-local context scenario related to the entry;
generate the first candidate word set from the second candidate word set and the third candidate word set.
10. The system according to claim 9, wherein the generating device further comprises:
a class word generation module, configured to receive text information entered by a non-local user and to segment the text information to form at least one class word;
a storage module, configured to store the class word in a pre-stored vocabulary;
a generation module, configured to generate the third candidate word set from the entry entered by the user, based on the pre-stored vocabulary.
11. The system according to claim 9 or 10, wherein the generating device is configured to weight the second candidate word set and the third candidate word set to generate the first candidate word set.
12. The system according to any one of claims 8 to 11, wherein the entry is characters of any language, pinyin, or a combination thereof.
13. The system according to claim 8 or 9, wherein the context scenario is context information of a text message received by the user or of a web page browsed by the user.
14. The system according to claim 8, wherein entries appearing in the context information appear preferentially in the first candidate word set for the user's input.
PCT/CN2012/079960 2012-03-28 2012-08-10 Method and system for prompting input candidate words based on context scenario WO2013143252A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210086810.4 2012-03-28
CN201210086810.4A CN103365833B (en) 2012-03-28 2012-03-28 A kind of input candidate word reminding method based on context and system

Publications (1)

Publication Number Publication Date
WO2013143252A1 true WO2013143252A1 (en) 2013-10-03

Family

ID=49258150

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/079960 WO2013143252A1 (en) 2012-03-28 2012-08-10 Method and system for prompting input candidate words based on context scenario

Country Status (2)

Country Link
CN (1) CN103365833B (en)
WO (1) WO2013143252A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107992210A (en) * 2017-10-11 2018-05-04 捷开通讯(深圳)有限公司 Input method vocabulary recommends method, intelligent terminal and the device with store function

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103399906B (en) * 2013-07-29 2015-07-29 百度在线网络技术(北京)有限公司 The method and apparatus of candidate word is provided based on social relationships when inputting
CN104133855B (en) * 2014-07-11 2017-12-19 中安消技术有限公司 A kind of method and device of input method intelligent association
CN104281274A (en) * 2014-09-03 2015-01-14 深圳市金立通信设备有限公司 Input method
CN104268166B (en) 2014-09-09 2017-04-19 北京搜狗科技发展有限公司 Input method, device and electronic device
CN104699265A (en) * 2015-03-20 2015-06-10 上海触乐信息科技有限公司 Text input method and text input device
CN104834633A (en) * 2015-05-29 2015-08-12 厦门大学 Cloud translation input method and system
CN105867647A (en) * 2016-03-24 2016-08-17 珠海市魅族科技有限公司 A candidate character display method and candidate character display device
CN105938401B (en) * 2016-05-24 2019-04-12 珠海市魅族科技有限公司 Entry recommended method and device
CN108153755A (en) * 2016-12-05 2018-06-12 北京搜狗科技发展有限公司 Method, apparatus and electronic equipment are recommended in a kind of input
CN109144284B (en) * 2017-06-15 2022-07-15 百度在线网络技术(北京)有限公司 Information display method and device
CN107831915A (en) * 2017-10-17 2018-03-23 北京三快在线科技有限公司 One kind input complementing method, device, electronic equipment and readable storage medium storing program for executing
CN107943317B (en) * 2017-11-01 2021-08-06 北京小米移动软件有限公司 Input method and device
CN108170293A (en) * 2017-12-29 2018-06-15 北京奇虎科技有限公司 Input the personalized recommendation method and device of association
CN108227955A (en) * 2017-12-29 2018-06-29 北京奇虎科技有限公司 It is a kind of that the method and device for recommending input association is searched for based on user's history
CN112740148A (en) * 2018-09-28 2021-04-30 华为技术有限公司 Method for inputting information into input box and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1707409A (en) * 2003-09-19 2005-12-14 美国在线服务公司 Contextual prediction of user words and user actions
CN101587471A (en) * 2008-05-19 2009-11-25 黄晓凤 Multi-language hybrid input method
CN101779200A (en) * 2007-06-14 2010-07-14 谷歌股份有限公司 Dictionary word and phrase determination
CN102314461A (en) * 2010-06-30 2012-01-11 北京搜狗科技发展有限公司 Navigation prompt method and system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101334774B (en) * 2007-06-29 2013-08-14 北京搜狗科技发展有限公司 Character input method and input method system
CN101290632B (en) * 2008-05-30 2011-09-14 北京搜狗科技发展有限公司 Input method for user words participating in intelligent word-making and input method system
CN102063452A (en) * 2010-05-31 2011-05-18 百度在线网络技术(北京)有限公司 Method, equipment, server and system for inputting characters by user

Also Published As

Publication number Publication date
CN103365833A (en) 2013-10-23
CN103365833B (en) 2016-06-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12872761

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12872761

Country of ref document: EP

Kind code of ref document: A1