CN111665956A - Processing method and device of candidate character string, electronic equipment and storage medium - Google Patents

Processing method and device of candidate character string, electronic equipment and storage medium

Info

Publication number
CN111665956A
Authority
CN
China
Prior art keywords
character
error correction
character string
candidate
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010304923.1A
Other languages
Chinese (zh)
Other versions
CN111665956B (en
Inventor
王鑫
孙明明
李平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010304923.1A priority Critical patent/CN111665956B/en
Publication of CN111665956A publication Critical patent/CN111665956A/en
Application granted granted Critical
Publication of CN111665956B publication Critical patent/CN111665956B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233Character input methods
    • G06F3/0236Character input methods using selection techniques to select from displayed items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/232Orthographic correction, e.g. spell checking or vowelisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Probability & Statistics with Applications (AREA)
  • Document Processing Apparatus (AREA)
  • User Interface Of Digital Computer (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a processing method and device of candidate character strings, electronic equipment and a storage medium, and relates to the field of information recommendation. The specific implementation scheme is as follows: acquiring an input original character string based on coordinates of a plurality of input points input by a user; and performing character error correction according to a preset character error correction strategy based on the original character string to obtain a plurality of candidate character strings. According to the technology of the application, the character granularity error correction can be carried out on the character strings, so that the candidate character strings after error correction are more in line with the expectation of a user, the accuracy of the obtained candidate character strings can be effectively improved, and the input accuracy and the input efficiency of the input method are further improved.

Description

Processing method and device of candidate character string, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for processing a candidate string, an electronic device, and a storage medium.
Background
Mobile devices (e.g., smartphones and tablets) play a very important role in everyday life, and more and more internet activity takes place on them. In many of these activities, the main means of communication is entering text through the input method of the mobile device. Because mobile devices are small, their display screens are small, and the character areas on the on-screen soft keyboard are small as well. A user can therefore easily touch the wrong character area while typing, producing an input error that has to be deleted and re-entered.
For example, to improve input efficiency, a conventional input method may take the user's input and look up words with a similar spelling or a similar meaning, then recommend those words to the user as candidates.
However, candidate words obtained in this way rarely capture the user's true intention, so the predicted candidates are not very accurate.
Disclosure of Invention
In order to solve the technical problem, the application provides a method and a device for processing a candidate character string, an electronic device and a storage medium.
According to a first aspect, a method for processing candidate character strings in an input method is provided, which includes:
acquiring an input original character string based on coordinates of a plurality of input points input by a user;
and performing character error correction according to a preset character error correction strategy based on the original character string to obtain a plurality of candidate character strings.
According to a second aspect, there is provided an apparatus for processing a candidate character string in an input method, comprising:
the character string acquisition module is used for acquiring an input original character string based on the coordinates of a plurality of input points input by a user;
and the character error correction module is used for carrying out character error correction according to a preset character error correction strategy based on the original character string to obtain a plurality of candidate character strings.
According to a third aspect, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described above.
According to a fourth aspect, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method as described above.
The technology of the present application addresses the poor accuracy of predicted candidate words in the prior art. Error correction can be performed on character strings at character granularity, so that the corrected candidate character strings better match the user's expectation, the accuracy of the candidates is effectively improved, and the input accuracy and input efficiency of the input method improve in turn. The technical solution can therefore improve the user's experience of the input method and increase user stickiness.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present application;
FIG. 2 is a schematic diagram according to a second embodiment of the present application;
FIG. 3 is a schematic view of a third embodiment of the present application;
FIG. 4 is a schematic view of a fourth embodiment of the present application;
FIG. 5 is a schematic view of a fifth embodiment of the present application;
FIG. 6 is a schematic view of a sixth embodiment of the present application;
FIG. 7 is a block diagram of an electronic device for implementing a method for processing a candidate character string in an input method according to an embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
FIG. 1 is a schematic diagram according to a first embodiment of the present application. As shown in FIG. 1, the present application provides a method for processing a candidate character string in an input method, which may specifically include the following steps:
S101, acquiring an input original character string based on coordinates of a plurality of input points input by a user;
S102, performing character error correction on the original character string according to a preset character error correction strategy to obtain a plurality of candidate character strings.
The method of this embodiment is executed by a device for processing candidate character strings in an input method. The device may be provided in the input method and recommends a plurality of candidate character strings based on the original character string input through it.
Specifically, to input an original character string with the input method, the user opens the soft keyboard of the input method and taps the position of each character shown on it. On the user side, the user simply taps a position. On the input method side, a coordinate system is established, and the character corresponding to the tapped coordinate in the soft keyboard is detected, which determines the character the user intends to input. When the user stops typing and the pause is detected to exceed a preset duration threshold, the input is considered finished, and the original character string is assembled from the characters in the order in which they were input. Alternatively, the user may input an end symbol when finished, which likewise marks the original character string as complete. In practical applications, the original character string input by the user may include two or more characters.
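The two end-of-input conditions described above (a pause longer than a preset threshold, or an explicit end symbol) can be illustrated with a minimal offline sketch. The function name `split_inputs`, the event format of `(timestamp, character)` pairs, the 1.0 second threshold, and the newline end symbol are all assumptions made for illustration; a real input method would more likely use a timer than post-process a recorded event list.

```python
def split_inputs(events, pause_threshold=1.0, end_symbol="\n"):
    """Split (timestamp_seconds, char) events into original character strings.

    A string ends when the gap to the next event exceeds pause_threshold
    or when end_symbol is received. Both values are illustrative assumptions.
    """
    strings = []
    current = []
    last_time = None
    for timestamp, char in events:
        # A long pause before this event closes the string collected so far.
        if current and last_time is not None and timestamp - last_time > pause_threshold:
            strings.append("".join(current))
            current = []
        if char == end_symbol:
            # An explicit end symbol also closes the current string.
            if current:
                strings.append("".join(current))
                current = []
        else:
            current.append(char)
        last_time = timestamp
    if current:
        strings.append("".join(current))
    return strings


events = [(0.0, "h"), (0.2, "i"), (2.0, "y"), (2.1, "o"), (2.2, "\n"), (2.3, "a")]
print(split_inputs(events))
```

The characters are kept in arrival order, which matches the statement that the original character string is obtained according to the sequence input by the user.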
In this embodiment, a plurality of candidate character strings may be obtained according to a preset character error correction policy based on an original character string. For example, the character error correction may be directly performed on the original character string to obtain a part of candidate character strings, or the character error correction may be performed again based on the candidate character strings that have undergone the character error correction processing to obtain corresponding candidate character strings. In practical application, one, two or more character error correction strategies can be adopted according to actual requirements to perform character error correction processing on an original character string or a candidate character string subjected to character error correction processing, so as to obtain a plurality of candidate character strings.
The candidate character strings of this embodiment are obtained by error correction at character granularity on the original character string, which can effectively improve their accuracy. For example, character-granularity error correction can correct characters that the user enters incorrectly because of mispronunciation or misreading. It can also correct errors caused by an off-target tap, such as hitting a character adjacent to the one intended.
In the method of this embodiment, the input original character string is obtained based on the coordinates of a plurality of input points input by the user, and character error correction is performed on it according to a preset character error correction strategy to obtain a plurality of candidate character strings. This addresses the poor accuracy of predicted candidate words in the prior art: because error correction is performed at character granularity, the corrected candidate character strings better match the user's expectation, the accuracy of the candidates is effectively improved, and the input accuracy and input efficiency of the input method improve in turn. The technical solution of this embodiment can therefore also improve the user's experience of the input method and increase user stickiness.
FIG. 2 is a schematic view of a second embodiment of the present application. This embodiment describes the technical solution of the present application using, as an example, a character error correction strategy that applies three error correction modes in sequence: character replacement, character order adjustment, and character completion. As shown in FIG. 2, the method for processing candidate character strings in the input method of this embodiment may specifically include the following steps:
S201, collecting a coordinate sequence input by a user, wherein the coordinate sequence comprises coordinates of a plurality of input points input in sequence;
S202, acquiring an input original character string according to a mapping relation between the coordinate area and characters in the soft keyboard and coordinates of each input point;
Each character in the soft keyboard of the mobile device corresponds to a visible area on the screen. To input a character of the original character string, the user taps the visible area corresponding to that character. Correspondingly, the device for processing candidate character strings holds a mapping between each character and the coordinate area that the character's visible area occupies in the screen coordinate system. When the coordinates of an input point are detected, the device determines which character's coordinate area the coordinates fall into and treats that character as the one the user is inputting.
When the user inputs an original character string, the processing device of the candidate word of the input method may first detect a coordinate sequence input by the user, where the coordinate sequence includes coordinates of a plurality of input points input in sequence. And then, acquiring an input original character string according to the mapping relation between the coordinate area and the characters in the soft keyboard and the coordinates of each input point.
It should be noted that a tap may land inside a key of the soft keyboard, that is, inside the visible area of a character, or outside every visible area. In the latter case a policy is applied, for example assigning the tap to the character whose visible area is closest to the tap point. In this way a character is obtained for every input point, and the original character string follows.
Steps S201 and S202 are an implementation of step S101 in the embodiment shown in fig. 1. Optionally, in practical application, when the coordinates of each input point are collected, the corresponding characters are obtained in real time, and after the end character is collected, the collected characters are arranged according to the collecting sequence to obtain the original character string. Or the input original character string may also be obtained in other manners, which is not described in detail herein.
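Steps S201 and S202, including the fallback for taps that land outside every key, can be sketched as follows. This is a minimal illustration, not the patented implementation: the `KeyArea` class, the rectangular key shape, the three-key layout in `KEYS`, and the use of Euclidean distance as the "closest key" policy are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyArea:
    """Rectangular visible area of one soft-keyboard key (hypothetical layout)."""
    char: str
    left: float
    top: float
    right: float
    bottom: float

    def distance_to(self, x, y):
        """Euclidean distance from (x, y) to the rectangle; 0.0 if inside it."""
        dx = max(self.left - x, 0.0, x - self.right)
        dy = max(self.top - y, 0.0, y - self.bottom)
        return (dx * dx + dy * dy) ** 0.5


def char_for_point(keys, x, y):
    """Return the character whose area contains the point, else the nearest one."""
    return min(keys, key=lambda k: k.distance_to(x, y)).char


def original_string(keys, points):
    """Map a coordinate sequence (S201) to the original character string (S202)."""
    return "".join(char_for_point(keys, x, y) for x, y in points)


# Toy layout: three keys in a row with a 2-unit gap between them (assumed values).
KEYS = [
    KeyArea("q", 0, 0, 10, 10),
    KeyArea("w", 12, 0, 22, 10),
    KeyArea("e", 24, 0, 34, 10),
]

# The second tap lands in the gap between "w" and "e", slightly closer to "e".
print(original_string(KEYS, [(5, 5), (23.5, 5), (13, 4)]))
```

A tap inside a key has distance zero to that key, so the same `min` call handles both the inside and the outside case.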
S203, replacing characters in the original character string according to a character replacement table compiled in advance from statistics, to obtain corresponding candidate character strings;
S204, according to a character order adjustment table compiled in advance from statistics, adjusting the order of characters in the original character string and in the candidate character strings obtained by character replacement, to obtain corresponding candidate character strings;
S205, according to a character completion table compiled in advance from statistics, completing the original character string, the candidate character strings obtained by character replacement, and the candidate character strings obtained by character order adjustment, to obtain corresponding candidate character strings.
Steps S203 to S205 of the present embodiment are an implementation manner of step S102 of the embodiment shown in fig. 1.
Alternatively, in practical applications, step S204 may adjust the order of characters only in the candidate character strings obtained by character replacement, and step S205 may complete only the candidate character strings obtained by character order adjustment, the results of which serve as the final candidate character strings.
Further optionally, in step S102 of the embodiment shown in fig. 1, based on the original character string, performing character error correction according to a preset character error correction policy to obtain a plurality of candidate character strings, which may specifically include: and based on the original character string, performing character error correction according to at least one of character replacement, character sequence adjustment and character completion to obtain a plurality of candidate character strings.
That is, the character error correction strategy of the present application may consist of any one of character replacement, character order adjustment, and character completion; of any pairwise combination, namely character replacement with character order adjustment, character replacement with character completion, or character order adjustment with character completion; or of all three together. In this embodiment, when two or more error correction modes are included, it is preferable that character replacement, if present, comes first, character order adjustment comes next, and character completion always comes last.
The following describes the three character error correction modes in detail.
First error correction mode, character replacement: performing character error correction by character replacement based on the original character string may include replacing characters in the original character string according to a character replacement table compiled in advance from statistics, to obtain corresponding candidate character strings.
For example, the character replacement table may be compiled from the historical input of all users of the input method. The table may record entries such as i->o, i->u, g->h, and g->f, where the character before the symbol "->" is the character being replaced and the character after it is the replacement. In these entries, the replaced character and its replacement are adjacent on the full keyboard, where a slip of the finger easily causes an input error. The table can also include characters that are easily misspelled for one another, such as i->e, e->i, f->h, and h->f. In short, the character replacement table may include every character mapping that might be needed, so that characters in the original character string can be replaced according to the table to obtain candidate character strings. It should be noted that, in this embodiment, if the original character string contains N replaceable characters with N greater than 1, a candidate character string may be obtained by replacing any one of them, or by replacing any 2 to N of them. Likewise, if a character has M possible replacements, it gives rise to M different candidate character strings. In short, every possible character replacement can be applied to the original character string according to the character replacement table, yielding all possible candidate character strings.
For example, if the original character string is "hallo" and the character replacement table contains a->e, a->i, o->e, and o->u, the resulting candidate character strings may include hello, hillo, halle, hallu, helle, hellu, hille, and hillu.
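The exhaustive replacement described above, where any subset of replaceable characters may be replaced and each character may have several replacements, can be sketched with a Cartesian product. The function name `replacement_candidates`, the dictionary form of the table, and the sample input "hallo" with the entries a->e, a->i, o->e, o->u are assumptions chosen for illustration.

```python
from itertools import product


def replacement_candidates(original, table):
    """Return every string reachable by replacing any subset of characters.

    table maps a replaced character to the list of its replacement characters.
    Each position offers the original character plus its replacements, and the
    Cartesian product enumerates all combinations. The unchanged original is
    excluded from the result.
    """
    options = [[ch] + list(table.get(ch, [])) for ch in original]
    candidates = dict.fromkeys("".join(combo) for combo in product(*options))
    candidates.pop(original, None)
    return list(candidates)


# Hypothetical table with the entries a->e, a->i, o->e, o->u.
TABLE = {"a": ["e", "i"], "o": ["e", "u"]}
print(replacement_candidates("hallo", TABLE))
```

With two replaceable characters that each have two replacements, the sketch yields 3 x 3 - 1 = 8 candidates. The number of candidates grows multiplicatively with the number of replaceable characters, so a practical system would probably prune by probability; that pruning is not shown here.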
Second error correction mode, character order adjustment: performing character error correction by character order adjustment based on the original character string may include adjusting, according to a character order adjustment table compiled in advance from statistics, the order of characters in the original character string and/or in the candidate character strings obtained by character replacement, to obtain corresponding candidate character strings.
For example, when a user types on a touch screen with both hands, characters are often entered in the wrong order. Suppose the user types "people" with both hands, but the right hand is faster than the left, so that o is entered immediately after p and the final result is "poeple". To correct this kind of error, the user's input needs to be reordered to expand the set of possible inputs.
Similarly, the character order adjustment table can be compiled from the historical input of all users of the input method. For example, the table may include gi, go, gu, ig, og, ug, ig, ih, if, fi, hi, and so on. If the character error correction strategy of this embodiment does not include character replacement before character order adjustment, order adjustment checks, according to the table, whether the original character string contains a pair of consecutive characters whose order can be adjusted. If there is one such pair, its order is adjusted directly to obtain a candidate character string. If there are several such pairs, any one, two, several, or all of them can be adjusted, yielding all possible candidate character strings after order adjustment. If the strategy also includes character replacement, then in addition to adjusting the original character string as described above, the same order adjustment must be applied to every candidate character string obtained by character replacement, so as to obtain all possible candidate character strings.
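One way to read the order adjustment step is sketched below: scan the string for consecutive character pairs that appear in the table, and enumerate every combination of non-overlapping swaps. The function name `reorder_candidates`, the representation of the table as a set of two-character strings, and the rule that a pair is swapped exactly when the pair as typed appears in the table are assumptions; the source does not spell out the table semantics.

```python
def reorder_candidates(original, pairs):
    """Return all strings obtained by swapping any non-overlapping subset of
    adjacent character pairs that appear in `pairs` (a set of 2-char strings)."""
    results = {}

    def walk(index, prefix):
        if index >= len(original):
            # Keep only strings that actually differ from the input.
            if prefix != original:
                results.setdefault(prefix, None)
            return
        # Option 1: keep the character at this position unchanged.
        walk(index + 1, prefix + original[index])
        # Option 2: swap this character with the next one if the pair is listed.
        if index + 1 < len(original) and original[index:index + 2] in pairs:
            walk(index + 2, prefix + original[index + 1] + original[index])

    walk(0, "")
    return list(results)


print(reorder_candidates("poeple", {"oe"}))
print(reorder_candidates("thier", {"th", "ie"}))
```

When the strategy also includes character replacement, the same function would be called on each replaced candidate as well as on the original string.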
Third error correction mode, character completion: performing character error correction by character completion based on the original character string may include completing, according to a character completion table compiled in advance from statistics, the original character string, the candidate character strings obtained by character replacement, and/or the candidate character strings obtained by character order adjustment, to obtain corresponding candidate character strings.
For example, when a user inputs a word, the user often depends on an automatic completion function of an input method to complete the word according to a word prefix. For example, if the user inputs "sa", the input method is expected to list possible candidates "same", "sad", "satisfactor", etc. The input method therefore requires completion of the word based on the word prefix. In consideration of the requirement, a character completion table may be obtained by statistics according to the historical input conditions of all users using the input method, and each character string that can be completed and all possible character strings after completion are recorded in the character completion table.
If the character error correction strategy of this embodiment does not include the error correction modes of character replacement and character order adjustment before completing the characters, at this time, the original character string is completed directly according to the character completion table, and the candidate character string is obtained.
If the character error correction strategy of this embodiment further includes an error correction mode of character replacement before character completion, at this time, in addition to performing character completion processing on the original character string according to the above-mentioned mode to obtain a plurality of candidate character strings, the same character completion processing needs to be performed on all candidate character strings after character replacement processing according to the same mode, so as to obtain all possible candidate character strings.
If the character error correction strategy of this embodiment includes character order adjustment before character completion, then in addition to completing the original character string as described above to obtain a plurality of candidate character strings, the same completion must be applied to every candidate character string obtained by character order adjustment, so as to obtain all possible candidate character strings.
If the character error correction strategy of this embodiment includes both character replacement and character order adjustment before character completion, then in addition to completing the original character string as described above, the same completion must be applied to every candidate character string obtained by character replacement and to every candidate character string obtained by character order adjustment, so as to obtain all possible candidate character strings. This is the implementation of steps S203 to S205 of this embodiment.
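The chaining of the three modes, where each later mode is applied to the original string and to every candidate produced by the earlier modes, can be sketched as a small pipeline. Everything here is illustrative: the function `run_pipeline`, the three toy tables, and the simplified stage functions (single replacement, single swap, exact-match completion) are assumptions, not the tables or procedures of the actual input method.

```python
# Toy tables (assumed values for illustration only).
REPLACE_TABLE = {"q": ["a"]}                  # q -> a
SWAP_TABLE = {"as"}                           # "as" may be reordered to "sa"
COMPLETION_TABLE = {"sa": ["same", "sad"]}    # completions of the string "sa"


def replace_stage(s):
    """Replace one character at a time according to REPLACE_TABLE."""
    return [s[:i] + r + s[i + 1:]
            for i, ch in enumerate(s)
            for r in REPLACE_TABLE.get(ch, [])]


def reorder_stage(s):
    """Swap one adjacent pair at a time if the pair is in SWAP_TABLE."""
    return [s[:i] + s[i + 1] + s[i] + s[i + 2:]
            for i in range(len(s) - 1)
            if s[i:i + 2] in SWAP_TABLE]


def complete_stage(s):
    """Look up completions of the whole string in COMPLETION_TABLE."""
    return list(COMPLETION_TABLE.get(s, []))


def run_pipeline(original, stages):
    """Apply each stage to the original and to all candidates found so far."""
    pool = [original]
    for stage in stages:
        new = []
        for s in pool:
            for candidate in stage(s):
                if candidate not in pool and candidate not in new:
                    new.append(candidate)
        pool.extend(new)
    return pool[1:]  # drop the original string itself


print(run_pipeline("qs", [replace_stage, reorder_stage, complete_stage]))
```

Passing a shorter list of stages gives the strategies that use only one or two modes, and the list order reflects the preferred sequence of replacement, then order adjustment, then completion.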
Based on the description of the above embodiment, based on the original character string, the character error correction is performed according to the preset character error correction strategy of this embodiment, so as to obtain a plurality of candidate character strings.
In the processing method of the candidate character string in the input method of the embodiment, a coordinate sequence input by a user is collected, and the coordinate sequence comprises coordinates of a plurality of input points input in sequence; and the input original character string is obtained according to the mapping relation between the coordinate area and the characters in the soft keyboard and the coordinates of each input point, so that the accuracy of the acquired original character string can be ensured.
Further, in this embodiment, the character error correction may be implemented based on the character replacement, the character sequence adjustment and/or the character completion mode, a plurality of candidate character strings are expanded, and the candidate character strings are further enriched.
FIG. 3 is a schematic view of a third embodiment of the present application. As shown in FIG. 3, this embodiment builds on the technical solution of the embodiment shown in FIG. 1 or FIG. 2. After step S102 of the embodiment shown in FIG. 1 obtains a plurality of candidate character strings by performing character error correction on the original character string according to a preset character error correction strategy, the method for processing candidate character strings in the input method of this embodiment may further include the following steps:
S301, obtaining the occurrence probability of each candidate character string;
S302, sorting the candidate character strings in descending order of occurrence probability;
S303, recommending the candidate character strings to the user in the sorted order.
The technical solution of this embodiment recommends the candidate character strings obtained by the embodiments shown in FIG. 1 and FIG. 2. To improve recommendation efficiency, the recommendation is based on the occurrence probability of each candidate character string. The higher the occurrence probability, the better the candidate character string matches the user's expectation, so it can be recommended preferentially, that is, placed nearer the front of the list.
In this embodiment, the occurrence probability of each candidate character string may be derived from the error correction probability of the correction that produced it. The higher the error correction probability, the more often the corrected string was the one actually used in the history of the input method, and correspondingly the higher the occurrence probability of the candidate. For each candidate character string, the occurrence probability can therefore be generated from the error correction probability of the error correction mode used to produce it, and the error correction probability of each mode can be obtained from statistics over historical data.
For example, optionally, obtaining the occurrence probability of each candidate character string in step S301 may specifically include the following steps:
(1) acquiring the error correction probability corresponding to the character error correction processing performed on each candidate character string;
For example, when character replacement error correction processing is performed, the error correction probability corresponding to the character replacement error correction processing performed on each candidate character string may be acquired based on the replacement probabilities of the replacement characters counted in advance.
Specifically, when the character replacement table is compiled from the historical input data of all users using the input method, the probability of each character replacement can be counted at the same time. For example, suppose that in the statistics the character i was input 10000 times in total, of which it was replaced by o 800 times and by u 400 times. Then the error correction probability corresponding to i->o is 0.08, and the error correction probability corresponding to i->u is 0.04. If i was not replaced by any other character in the historical statistics, the remaining 0.88 is the probability that i is kept as i. When character replacement error correction replaces several characters of a candidate character string at the same time, the replacement probabilities of the replaced characters need to be multiplied together to give the error correction probability of the candidate character string.
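The counting described above can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionaries `TOTAL_INPUTS` and `REPLACED_COUNTS` and both function names are hypothetical, and the counts simply mirror the i->o and i->u example in the text.

```python
from math import prod

# Hypothetical historical counts mirroring the example in the text:
# "i" was input 10000 times, replaced by "o" 800 times and by "u" 400 times.
TOTAL_INPUTS = {"i": 10000}
REPLACED_COUNTS = {("i", "o"): 800, ("i", "u"): 400}


def replacement_probability(typed: str, intended: str) -> float:
    """Relative frequency with which `typed` was replaced by `intended`."""
    total = TOTAL_INPUTS[typed]
    if typed == intended:
        # Whatever was never replaced is the probability of keeping the character.
        replaced = sum(c for (src, _), c in REPLACED_COUNTS.items() if src == typed)
        return (total - replaced) / total
    return REPLACED_COUNTS.get((typed, intended), 0) / total


def string_replacement_probability(pairs) -> float:
    """Multiply the replacement probabilities of all replaced characters."""
    return prod(replacement_probability(typed, intended) for typed, intended in pairs)
```

Calling `string_replacement_probability([("i", "o"), ("i", "u")])` multiplies 0.08 by 0.04, which corresponds to a candidate in which two characters were replaced at the same time.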
For another example, when performing the character sequence adjustment error correction processing, the error correction probability corresponding to the character sequence adjustment error correction processing performed on each candidate character string may be obtained according to the sequence adjustment probability of the sequence-adjusted character counted in advance.
Similarly, when the character sequence adjustment table is compiled from the historical input data of all users using the input method, the probability of each character sequence adjustment can be counted at the same time. For example, suppose that in the statistics based on the historical data, "gi" was input 100 times in total, of which it was adjusted to "ig" 60 times; then the probability that "gi" is adjusted to "ig" can be taken as 0.6, and the probability that "gi" is kept as "gi" as 0.4. Similarly, if several groups of characters are adjusted at the same time when character sequence adjustment error correction is performed on a candidate character string, the sequence adjustment probabilities of the groups need to be multiplied together to give the error correction probability of the candidate character string.
For yet another example, when character completion error correction processing is performed, the error correction probability corresponding to the character completion error correction processing performed on each candidate character string can be obtained based on the character completion probabilities counted in advance.
Similarly, when the character completion table is counted according to the historical input data of all users using the input method, the probability of character completion can be counted. For example, when counting up based on historical data, the user has historically entered "sa" 1000 times in total, wherein "same" is selected 400 times in total, "sad" is selected 300 times in total, "satisfactory" is selected 100 times in total, and so on. Correspondingly, the completion probability of "same" is 0.4, the completion probability of "sad" is 0.3, and the completion probability of "satisfactory" is 0.1. For each candidate character string, the completion probability is the corresponding error correction probability.
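As a minimal sketch of how such completion probabilities might be derived, the snippet below divides each selection count by the number of times the prefix was input. The tables `PREFIX_TOTAL` and `SELECTED` and the function name are hypothetical; the counts mirror the "sa" example in the text.

```python
# Hypothetical historical counts for the prefix "sa", mirroring the text.
PREFIX_TOTAL = {"sa": 1000}
SELECTED = {"sa": {"same": 400, "sad": 300, "satisfactory": 100}}


def completion_probabilities(prefix: str) -> dict:
    """Map each completed word to (times selected) / (times the prefix was input)."""
    total = PREFIX_TOTAL[prefix]
    return {word: count / total for word, count in SELECTED[prefix].items()}
```

The probabilities need not sum to 1, since some inputs of the prefix may have ended in completions that are not listed.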
(2) For each candidate character string, the probability of occurrence of the corresponding candidate character string is generated according to the error correction probability of the character error correction processing.
In practical applications, if a certain candidate character string has undergone two or more kinds of error correction processing, the occurrence probability of the candidate character string needs to be generated from the error correction probabilities of all the kinds of error correction processing applied.
For example, for each candidate character string, the error correction probabilities of the various error correction processes performed on the candidate character string may be multiplied to obtain the final occurrence probability of the candidate character string. Alternatively, the logarithms of the error correction probabilities of the various error correction processes may be taken and then summed with weights, giving the logarithm of the final occurrence probability of the candidate character string. In the weighted summation, the weight of the error correction probability of each error correction process can be set according to actual requirements.
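The two ways of combining error correction probabilities can be sketched as follows. The function names are hypothetical, and the sketch assumes every probability is strictly positive so that the logarithm is defined.

```python
import math


def product_score(probabilities) -> float:
    """Occurrence probability as the plain product of the error correction probabilities."""
    return math.prod(probabilities)


def log_linear_score(probabilities, weights) -> float:
    """Weighted sum of log-probabilities; higher means a more likely candidate."""
    if len(probabilities) != len(weights):
        raise ValueError("one weight is needed per error correction probability")
    return sum(w * math.log(p) for p, w in zip(probabilities, weights))
```

With all weights equal to 1, exponentiating the log-linear score gives the same value as the product, so the two forms rank candidates identically in that case.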
Finally, the plurality of candidate character strings are sorted in descending order of occurrence probability, and recommended to the user according to the sorted order. In practical applications, the top N candidate character strings may be recommended preferentially.
In the processing method of the candidate character strings in the input method of this embodiment, the occurrence probability of each candidate character string is obtained, and the candidate character strings are sorted in descending order of occurrence probability. Candidate character strings with a higher occurrence probability can thus be recommended to the user preferentially. Because such candidate character strings better match the expectation of the user, the recommended candidate character strings better match the expectation of the user, and the input accuracy and input efficiency of the input method are improved.
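The sorting and top-N recommendation can be sketched as below. The function name `recommend` and the dictionary-based interface are assumptions made for illustration only.

```python
def recommend(occurrence_probability: dict, top_n=None) -> list:
    """Return candidate strings in descending order of occurrence probability.

    If `top_n` is given, only the first `top_n` candidates are returned.
    """
    ranked = sorted(occurrence_probability.items(), key=lambda item: item[1], reverse=True)
    words = [word for word, _ in ranked]
    return words if top_n is None else words[:top_n]
```

For example, `recommend({"hi": 0.3, "hu": 0.6, "ho": 0.1}, top_n=2)` keeps only the two most probable candidates.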
FIG. 4 is a schematic view of a fourth embodiment of the present application; the processing method of candidate character strings in the input method of the present embodiment further describes the technical solution of the present application in more detail on the basis of the technical solutions of the embodiments shown in fig. 1 to fig. 3. As shown in fig. 4, the method for processing candidate character strings in the input method of this embodiment may specifically include the following steps:
S401, collecting a coordinate sequence input by a user, wherein the coordinate sequence comprises coordinates of a plurality of input points input in sequence;
S402, acquiring an input original character string according to a mapping relation between the coordinate areas and the characters in the soft keyboard and the coordinates of each input point;
In this embodiment, the original character string may be represented as $G = \langle g_{1:t} \rangle$, that is, the original character string G includes the 1st to t-th characters.
For details, reference may be made to step S201 and step S202 in the above embodiment shown in fig. 2, which are not described herein again.
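Steps S401 and S402 can be illustrated with a minimal sketch that maps each input point to the key whose coordinate area contains it. The `KeyRegion` type, the three-key `LAYOUT`, and the choice to skip points that fall outside every key are all assumptions for illustration and are not specified in the text.

```python
from typing import NamedTuple, Optional


class KeyRegion(NamedTuple):
    char: str
    left: float
    top: float
    right: float
    bottom: float


# Hypothetical soft keyboard with three keys, each 10 units wide and 10 units high.
LAYOUT = [
    KeyRegion("h", 0, 0, 10, 10),
    KeyRegion("u", 10, 0, 20, 10),
    KeyRegion("i", 20, 0, 30, 10),
]


def char_at(x: float, y: float) -> Optional[str]:
    """Return the character whose coordinate area contains the point, if any."""
    for key in LAYOUT:
        if key.left <= x < key.right and key.top <= y < key.bottom:
            return key.char
    return None


def original_string(points) -> str:
    """Map a sequence of input points, in input order, to the original character string."""
    chars = (char_at(x, y) for x, y in points)
    return "".join(c for c in chars if c is not None)
```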
S403, replacing characters in the original character string according to a character replacement table which is counted in advance to obtain corresponding candidate character strings, and calculating error correction probability when each candidate character string is subjected to character replacement;
In this embodiment, the character replacement table counted in advance records not only the character pairs that can be replaced, but also, for each character pair, the probability that the original character is replaced by the replacement character.
For example, for the original character string G, an expected character $m_k$ ($1 \le k \le t$) replaces the character $g_k$ in the original character string G to obtain the candidate character string $M = \langle m_{1:t} \rangle$. For example, the user inputs "hu", whose last character is u; since i is a character adjacent to u, the replacement of u by i is recorded in the character replacement table, and based on this, "hi" is generated as a candidate character string. In this embodiment, the character replacement table may further record the probability that u is replaced by i, and likewise, for every character replacement pair, the probability that the original character is replaced by the replacement character.
In addition, the error correction probability when performing character replacement for each candidate character string can be represented as P(M|G), and is specifically represented by the following formula:
Figure BDA0002455396290000111
The above formula represents the error correction probability of the candidate character string M with respect to the original character string G, where n indexes the characters replaced in going from G to M, and N represents the total number of replaced characters. That is, P(M|G) is equal to the product of the character replacement probabilities of the characters replaced in going from G to M.
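The formula figure is not reproduced in this text. One reading that is consistent with the description above, assuming $g_n$ denotes the n-th replaced character of G and $m_n$ the character that replaces it in M (this notation is an assumption), would be:

```latex
P(M \mid G) = \prod_{n=1}^{N} P(m_n \mid g_n)
```

Under this reading, a candidate obtained through two replacements with probabilities 0.08 and 0.04 would have an error correction probability of 0.08 × 0.04 = 0.0032.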
In the character replacement scenario, it is assumed that misreading and homophonic misuse of words are independent of contextual input.
In practical applications, this step can be implemented in a separate unit, such as a character replacer.
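A character replacer of this kind might be sketched as follows. The table contents and probabilities are hypothetical (the text does not give a probability for u->i), and the sketch only generates candidates with a single replaced character.

```python
# Hypothetical character replacement table:
# typed character -> {replacement character: replacement probability}.
REPLACEMENT_TABLE = {"u": {"i": 0.1, "y": 0.05}}


def single_replacement_candidates(original: str) -> dict:
    """Generate candidates that differ from `original` in exactly one replaced character.

    Returns a mapping from candidate string to its error correction probability.
    """
    candidates = {}
    for k, typed in enumerate(original):
        for replacement, probability in REPLACEMENT_TABLE.get(typed, {}).items():
            candidate = original[:k] + replacement + original[k + 1:]
            candidates[candidate] = probability
    return candidates
```

For the input "hu", the sketch yields "hi" and "hy" as candidates, each carrying the probability recorded for the corresponding replacement pair.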
S404, according to a character sequence adjustment table counted in advance, adjusting the order of characters in the candidate character strings obtained after character replacement to obtain corresponding candidate character strings, and calculating the error correction probability of each candidate character string during character sequence adjustment;
Similarly, in this embodiment, the character sequence adjustment table counted in advance records not only the character pairs whose order can be adjusted, but also the probability that each character pair is adjusted.
For example, each pair of adjacent characters $m_k$ and $m_{k+1}$ ($1 \le k \le t-1$) in the candidate character string M is subjected to sequence adjustment to obtain the reordered candidate character string R. Under the assumption of independence, P(R|M) can be expressed as:
Figure BDA0002455396290000121
The above formula represents the error correction probability of the candidate character string R after sequence adjustment relative to the candidate character string M before sequence adjustment, where $m_n$ and $m_{n+1}$ represent adjacent characters in the candidate character string before sequence adjustment, $r_n$ and $r_{n+1}$ represent the corresponding adjacent characters in the candidate character string after sequence adjustment, n indexes the character pairs, and N-1 indicates that there are N-1 character pairs requiring sequence adjustment in the candidate character string M before sequence adjustment. Based on the above formula, P(R|M) is equal to the product of the sequence adjustment probabilities of all the character pairs adjusted in going from the candidate character string M to the candidate character string R.
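The formula figure is not reproduced in this text. One form that is consistent with the description above, under the stated independence assumption and with n running over the N-1 character pairs (the exact indexing is an assumption), would be:

```latex
P(R \mid M) = \prod_{n=1}^{N-1} P(r_n r_{n+1} \mid m_n m_{n+1})
```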
According to the requirements of practical applications, this step may also perform sequence adjustment directly on the original character string to obtain further candidate character strings.
In practical applications, this step can be implemented in a separate unit, such as a character sequencer.
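A character sequencer might be sketched as below. The table `REORDER_TABLE` is hypothetical and reuses the gi->ig probability of 0.6 from the earlier example; the sketch adjusts only one adjacent pair per candidate.

```python
# Hypothetical character sequence adjustment table:
# adjacent pair as typed -> (adjusted pair, sequence adjustment probability).
REORDER_TABLE = {"gi": ("ig", 0.6)}


def reorder_candidates(text: str) -> dict:
    """Generate candidates by adjusting one adjacent character pair found in the table.

    Returns a mapping from candidate string to its error correction probability.
    """
    candidates = {}
    for k in range(len(text) - 1):
        pair = text[k:k + 2]
        if pair in REORDER_TABLE:
            adjusted_pair, probability = REORDER_TABLE[pair]
            candidates[text[:k] + adjusted_pair + text[k + 2:]] = probability
    return candidates
```

If several pairs were adjusted in the same candidate, their probabilities would be multiplied, as described above; that extension is left out to keep the sketch short.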
S405, according to a pre-counted character completion table, completing the original character string, the candidate character string after character replacement and the characters in the candidate character string after character sequence adjustment to obtain a corresponding candidate character string; calculating the error correction probability of each candidate character string during character completion;
similarly, in the present embodiment, in the pre-counted character completion table, not only the character string before completion and the character string after completion are recorded, but also the completion probability of each character string after completion is recorded.
In this embodiment, during completion, high-frequency words that have the candidate character string R as a prefix may be searched to obtain additional candidates, and the finally obtained candidate character string set
Figure BDA0002455396290000131
can be expressed as:
Figure BDA0002455396290000132
Based on a word frequency dictionary, the error correction probability P(F|R) at the time of character completion can be calculated by the maximum likelihood estimation method and recorded in the character completion table, where F represents the candidate character string after character completion, and R represents the candidate character string before character completion, namely the candidate character string after character sequence adjustment.
Similarly, according to the requirement of practical application, the step can also complement the original character string and/or the candidate character string after character replacement to obtain some candidate character strings.
In practical applications, this step can be implemented in a separate unit, such as a character completer.
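A character completer based on a word frequency dictionary might be sketched as follows. The dictionary `WORD_FREQ` and its counts are hypothetical, and the estimate used here, each word's frequency divided by the total frequency of all words sharing the prefix, is one simple relative-frequency (maximum likelihood) estimate, not necessarily the exact one used in the embodiment.

```python
# Hypothetical word frequency dictionary.
WORD_FREQ = {"same": 400, "sad": 300, "satisfactory": 100, "sun": 200}


def complete(prefix: str, top_k: int = 3) -> list:
    """Return up to `top_k` (word, P(word | prefix)) pairs, most frequent first."""
    matches = {w: c for w, c in WORD_FREQ.items() if w.startswith(prefix)}
    total = sum(matches.values())
    ranked = sorted(matches.items(), key=lambda item: item[1], reverse=True)[:top_k]
    return [(word, count / total) for word, count in ranked]
```

Limiting the result to `top_k` entries reflects the idea of keeping only high-frequency completions; the value 3 is an arbitrary choice.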
S406, generating the occurrence probability of each candidate character string in all the obtained candidate character strings according to the error correction probability of the candidate character string during character replacement, the error correction probability of character sequence adjustment and the error correction probability of character completion;
In this embodiment, the candidates may be scored using a probability product model or a log-linear model, i.e., the occurrence probability of each candidate character string is calculated. For example, when generating the occurrence probability of each candidate character string, the error correction probability at the time of character replacement obtained in step S403, the error correction probability at the time of character sequence adjustment obtained in step S404, and the error correction probability at the time of character completion obtained in step S405 may be directly multiplied together to give the occurrence probability of the candidate character string. In this case, the weight coefficient corresponding to each sub-term is 1. That is, the probability of each candidate character string in the finally generated candidate character string set
Figure BDA0002455396290000133
may be expressed as:
Figure BDA0002455396290000134
where P(F) represents the occurrence probability of a candidate character string.
Or, a weighted sum of logarithms of the sub-terms may be taken, and specifically expressed by the following formula:
log(P(F))=a*log(P(F|R))+b*log(P(R|M))+c*log(P(M|G))
wherein a, b and c are the weights of the respective terms, respectively.
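As a short check of how the two scoring forms relate, setting a = b = c = 1 in the weighted sum and applying the identity log(xy) = log x + log y gives back the direct product of the three error correction probabilities:

```latex
\log P(F) = \log P(F \mid R) + \log P(R \mid M) + \log P(M \mid G)
\quad\Longleftrightarrow\quad
P(F) = P(F \mid R)\, P(R \mid M)\, P(M \mid G)
```

Choosing weights other than 1 raises each factor to the corresponding power, which lets one error correction mode count more or less than the others.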
In practical applications, the occurrence probability of each candidate character string may also be calculated from the error correction probabilities of the three sub-terms by other mathematical methods, which are not described in detail herein.
In the present embodiment, the error correction processing through the above-described character replacement, character reordering, and character completion is taken as an example, so that error correction probabilities of three error correction methods exist at the same time. In practical application, only one or two error correction modes may exist according to different preset character error correction strategies, and at this time, error correction probabilities corresponding to error correction modes that are not adopted in the formula are removed, and the calculation modes are similar and are not described herein again.
In practical applications, this step may be implemented in a separate unit, such as a candidate scorer.
S407, sorting the plurality of candidate character strings in descending order of occurrence probability;
Optionally, in practical applications, this step may be implemented in a separate unit, such as a sorter.
And S408, recommending a plurality of candidate character strings to the user according to the sorting sequence.
In specific implementation, when a plurality of candidate character strings are recommended to the user according to the sorting order, if the number of the candidate character strings is large, only the N candidate character strings in the top of the sorting order may be recommended to the user.
By adopting the above technical solution, the method for processing candidate character strings in the input method of this embodiment obtains a plurality of candidate character strings by sequentially performing character replacement, character sequence adjustment and character completion on the original character string, so that the candidate character strings the user may want to input are predicted comprehensively. The occurrence probability of each candidate character string is calculated, and the candidate character strings are recommended to the user in descending order of occurrence probability, which effectively ensures that the candidate character strings at the front of the sequence are the ones most likely to match the expectation of the user and the ones the user most wants to see. Therefore, with the technical solution of this embodiment, the candidate character strings can be effectively screened, and candidate character strings matching the expectation of the user can be effectively recommended, so that the input accuracy and input efficiency of the input method are effectively improved.
FIG. 5 is a schematic view of a fifth embodiment of the present application; as shown in fig. 5, the processing apparatus 500 for candidate character strings in the input method of the present embodiment includes:
a character string obtaining module 501, configured to obtain an input original character string based on coordinates of a plurality of input points input by a user;
the character error correction module 502 is configured to perform character error correction according to a preset character error correction strategy based on the original character string to obtain a plurality of candidate character strings.
The implementation principle and technical effect of implementing the processing of the candidate character string in the input method by using the modules in the apparatus 500 for processing the candidate character string in the input method in this embodiment are the same as those of the related method embodiment, and reference may be made to the description of the related method embodiment in detail, which is not repeated herein.
FIG. 6 is a schematic view of a sixth embodiment of the present application; as shown in fig. 6, the processing apparatus 500 for candidate character strings in the input method according to the present embodiment further introduces the technical solution of the present invention in more detail on the basis of the technical solution of the embodiment shown in fig. 5.
Specifically, the character error correction module 502 is specifically configured to perform character error correction according to at least one of character replacement, character order adjustment, and character completion based on the original character string, so as to obtain a plurality of candidate character strings.
For example, in this embodiment, the character error correction module 502 may include at least one of the following:
the character replacer 5021 is used for replacing characters in the original character string according to a character replacement table counted in advance to obtain a corresponding candidate character string;
the character sequencer 5022 is used for adjusting the order of characters in the original character string and/or the candidate character strings after character replacement according to a character sequence adjustment table counted in advance to obtain corresponding candidate character strings; and/or
the character completer 5023 is used for completing the original character string, the candidate character strings after character replacement and/or the candidate character strings after character sequence adjustment according to a character completion table counted in advance to obtain corresponding candidate character strings.
Further optionally, as shown in fig. 6, the apparatus 500 for processing candidate character strings in the input method according to the present embodiment further includes:
a probability obtaining module 503, configured to obtain occurrence probabilities of the candidate character strings;
a sorting module 504, configured to sort the multiple candidate character strings according to a descending order of the occurrence probability;
and a recommending module 505, configured to recommend the multiple candidate character strings to the user according to the sorting order.
Further optionally, the probability obtaining module 503 includes:
an error correction probability obtaining unit 5031, configured to obtain a corresponding error correction probability when the character error correction processing is performed on each candidate character string;
an occurrence probability generating unit 5032 configured to generate, for each candidate character string, an occurrence probability of the corresponding candidate character string according to the error correction probability of the character error correction processing.
Further optionally, the error correction probability obtaining unit 5031 is configured to:
when character replacement error correction processing is carried out, acquiring error correction probability corresponding to character replacement error correction processing carried out on each candidate character string according to the replacement probability of the replacement character counted in advance;
when character sequence-adjusting error-correcting processing is carried out, acquiring error-correcting probability corresponding to the character sequence-adjusting error-correcting processing of each candidate character string according to the sequence-adjusting probability of the sequence-adjusting character counted in advance; and/or
And when the character completion error correction processing is carried out, acquiring the error correction probability corresponding to the character completion error correction processing carried out on each candidate character string according to the character completion probability counted in advance.
Further optionally, the character string obtaining module 501 includes:
the acquisition unit 5011 is configured to acquire a coordinate sequence input by a user, where the coordinate sequence includes coordinates of a plurality of input points input in sequence;
the character generator 5012 is configured to obtain an input original character string according to a mapping relationship between the coordinate area and characters in the soft keyboard and coordinates of each input point.
The implementation principle and technical effect of implementing the processing of the candidate character string in the input method by using the modules in the apparatus 500 for processing the candidate character string in the input method in this embodiment are the same as those of the related method embodiment, and reference may be made to the description of the related method embodiment in detail, which is not repeated herein.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
FIG. 7 is a block diagram of an electronic device that implements the processing method of a candidate character string in an input method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 7, the electronic device includes: one or more processors 701, a memory 702, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 7, one processor 701 is taken as an example.
The memory 702 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to execute the processing method of the candidate character string in the input method provided by the application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute a method of processing a candidate character string in an input method provided by the present application.
The memory 702, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules (e.g., the related modules shown in fig. 5 and 6) corresponding to the processing method of the candidate character string in the input method in the embodiment of the present application. By executing the non-transitory software programs, instructions, and modules stored in the memory 702, the processor 701 performs the various functional applications and data processing of the server, that is, implements the processing method of candidate character strings in the input method in the above-described method embodiment.
The memory 702 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by use of an electronic device that implements a processing method of a candidate character string in an input method, and the like. Further, the memory 702 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 702 may optionally include a memory remotely disposed from the processor 701, and these remote memories may be connected via a network to an electronic device that implements a processing method of a candidate character string in an input method. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device implementing the method for processing the candidate character string in the input method may further include: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703 and the output device 704 may be connected by a bus or other means, and fig. 7 illustrates an example of a connection by a bus.
The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of an electronic apparatus implementing a processing method of a candidate character string in an input method, such as an input device of a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, or the like. The output devices 704 may include a display device, auxiliary lighting devices (e.g., LEDs), and tactile feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiments of the present application, an input original character string is obtained from the coordinates of a plurality of input points entered by a user, and character error correction is then performed on the original character string according to a preset character error correction strategy to obtain a plurality of candidate character strings. This addresses the poor accuracy of predicted candidate words in the prior art: because error correction is performed at character granularity, the corrected candidate character strings better match the user's expectations, the accuracy of the obtained candidate character strings is improved, and the input accuracy and input efficiency of the input method improve in turn. The technical solution can therefore improve the user's experience of the input method and strengthen the user's stickiness (retention) toward the input method.
According to the technical solution of the embodiments of the present application, a coordinate sequence input by a user is collected, the coordinate sequence comprising the coordinates of a plurality of input points entered in sequence; the input original character string is then obtained from the coordinates of each input point and the mapping between coordinate areas and characters in the soft keyboard, which helps ensure the accuracy of the acquired original character string.
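The mapping from input points to characters described above can be illustrated with a minimal Python sketch. The key layout `KEY_REGIONS`, the rectangle representation, and the function names `char_at` and `original_string` are assumptions made for illustration only; the application does not prescribe a particular data structure, and skipping points that fall outside every key is likewise an assumed policy.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical rectangle type: (left, top, right, bottom) in pixels.
Rect = Tuple[float, float, float, float]

# Hypothetical soft-keyboard layout: character -> coordinate area.
KEY_REGIONS: Dict[str, Rect] = {
    "q": (0, 0, 40, 60),
    "w": (40, 0, 80, 60),
    "e": (80, 0, 120, 60),
}


def char_at(x: float, y: float,
            regions: Dict[str, Rect] = KEY_REGIONS) -> Optional[str]:
    """Return the character whose coordinate area contains (x, y), or None."""
    for ch, (left, top, right, bottom) in regions.items():
        if left <= x < right and top <= y < bottom:
            return ch
    return None


def original_string(points: List[Tuple[float, float]]) -> str:
    """Map the input points, in input order, to the original character string.

    Points that fall outside every key are skipped (an assumed policy).
    """
    chars = (char_at(x, y) for x, y in points)
    return "".join(c for c in chars if c is not None)
```

For example, under this assumed layout the coordinate sequence `[(10, 10), (50, 30), (90, 59)]` yields the original character string `"qwe"`.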
According to the technical solution of the embodiments of the present application, character error correction can be performed by character replacement, character order adjustment and/or character completion, which expands the original character string into a plurality of candidate character strings and enriches the candidate set.
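The three error correction modes can be sketched as follows. This is an illustrative sketch only: the table contents, the table shapes (a per-character replacement map, a set of adjacent pairs to swap, a whole-string completion map) and all function names are hypothetical, since the application only states that the tables are compiled in advance from statistics. The chaining order (order adjustment may act on replaced strings, completion may act on replaced or reordered strings) follows claim 3.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical pre-computed tables; real tables would come from statistics.
REPLACE_TABLE: Dict[str, List[str]] = {"w": ["e"]}          # typed char -> likely intended chars
REORDER_TABLE: Set[Tuple[str, str]] = {("e", "h")}          # adjacent pairs often typed swapped
COMPLETE_TABLE: Dict[str, List[str]] = {"helo": ["hello"]}  # incomplete string -> completions


def replace_candidates(s: str) -> List[str]:
    """Replace one character at a time using the replacement table."""
    out = []
    for i, ch in enumerate(s):
        for alt in REPLACE_TABLE.get(ch, []):
            out.append(s[:i] + alt + s[i + 1:])
    return out


def reorder_candidates(s: str) -> List[str]:
    """Swap adjacent character pairs that appear in the reorder table."""
    out = []
    for i in range(len(s) - 1):
        if (s[i], s[i + 1]) in REORDER_TABLE:
            out.append(s[:i] + s[i + 1] + s[i] + s[i + 2:])
    return out


def complete_candidates(s: str) -> List[str]:
    """Look up completions of the whole string in the completion table."""
    return list(COMPLETE_TABLE.get(s, []))


def expand(original: str) -> List[str]:
    """Expand the original string into candidate strings.

    Order adjustment also acts on replaced strings, and completion also
    acts on replaced and reordered strings.
    """
    replaced = replace_candidates(original)
    reordered = [c for s in [original, *replaced] for c in reorder_candidates(s)]
    completed = [c for s in [original, *replaced, *reordered]
                 for c in complete_candidates(s)]
    # dict.fromkeys removes duplicates while preserving first-seen order.
    return list(dict.fromkeys([*replaced, *reordered, *completed]))
```

With these assumed tables, `expand("whlo")` produces `"ehlo"` by replacement, `"helo"` by order adjustment of the replaced string, and `"hello"` by completion of the reordered string.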
According to the technical solution of the embodiments of the present application, the occurrence probability of each candidate character string can be obtained, the candidate character strings can be sorted in descending order of occurrence probability, and the candidate character strings with a higher occurrence probability can be recommended to the user first. Because candidates with a higher occurrence probability are more likely to match the user's expectations, the recommended candidate character strings better fit what the user intends, which improves the input accuracy and input efficiency of the input method.
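The ranking step can be sketched in Python as below. The function names and the example probabilities are hypothetical. In particular, multiplying the per-step error correction probabilities to obtain the occurrence probability is an assumption made for illustration; this passage only states that the occurrence probability is generated from the error correction probabilities, not how they are combined.

```python
from math import prod
from typing import Dict, List, Tuple


def occurrence_probability(step_probs: List[float]) -> float:
    """Combine per-step error correction probabilities.

    Assumed rule: the product of the probabilities of each correction step.
    """
    return prod(step_probs)


def rank(candidates: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Sort candidate strings in descending order of occurrence probability."""
    scored = [(s, occurrence_probability(p)) for s, p in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical candidates, each with the probabilities of its correction steps.
ranked = rank({"hello": [0.6, 0.5], "help": [0.9], "hero": [0.2, 0.5]})
recommended = [s for s, _ in ranked]
```

Here `recommended` lists `"help"` first, then `"hello"`, then `"hero"`, so the candidate with the highest assumed occurrence probability is offered to the user first.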
It should be understood that the flows shown above may be used in various forms, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (16)

1. A method for processing candidate character strings in an input method, characterized by comprising:
acquiring an input original character string based on coordinates of a plurality of input points input by a user; and
performing character error correction on the original character string according to a preset character error correction strategy to obtain a plurality of candidate character strings.
2. The method of claim 1, wherein performing character error correction according to a preset character error correction strategy based on the original character string to obtain a plurality of candidate character strings comprises:
performing character error correction on the original character string by at least one of character replacement, character order adjustment and character completion to obtain a plurality of candidate character strings.
3. The method of claim 2, wherein performing character error correction by character replacement based on the original character string comprises:
replacing characters in the original character string according to a character replacement table compiled in advance from statistics to obtain corresponding candidate character strings;
performing character error correction by character order adjustment based on the original character string comprises:
adjusting, according to a character order adjustment table compiled in advance from statistics, the order of characters in the original character string and/or in the candidate character strings obtained by character replacement to obtain corresponding candidate character strings; and/or
performing character error correction by character completion based on the original character string comprises:
completing, according to a character completion table compiled in advance from statistics, characters in the original character string, in the candidate character strings obtained by character replacement and/or in the candidate character strings obtained by character order adjustment to obtain corresponding candidate character strings.
4. The method according to any one of claims 1 to 3, wherein after performing character error correction according to a preset character error correction strategy based on the original character string to obtain a plurality of candidate character strings, the method further comprises:
acquiring the occurrence probability of each candidate character string;
sorting the candidate character strings in descending order of occurrence probability; and
recommending the candidate character strings to the user according to the sorted order.
5. The method of claim 4, wherein obtaining the probability of occurrence of each candidate string comprises:
acquiring corresponding error correction probability when each candidate character string is subjected to character error correction processing;
and for each candidate character string, generating the occurrence probability of the corresponding candidate character string according to the error correction probability of character error correction processing.
6. The method according to claim 5, wherein obtaining the error correction probability corresponding to each candidate character string when performing character error correction processing comprises:
when character replacement error correction is performed, acquiring the error correction probability of the character replacement performed on each candidate character string according to replacement probabilities of the replaced characters compiled in advance from statistics;
when character order adjustment error correction is performed, acquiring the error correction probability of the character order adjustment performed on each candidate character string according to order adjustment probabilities of the reordered characters compiled in advance from statistics; and/or
when character completion error correction is performed, acquiring the error correction probability of the character completion performed on each candidate character string according to character completion probabilities compiled in advance from statistics.
7. The method of any one of claims 1-3 and 5-6, wherein obtaining the input original character string based on coordinates of a plurality of input points input by a user comprises:
acquiring a coordinate sequence input by a user, wherein the coordinate sequence comprises coordinates of the plurality of input points input in sequence;
and acquiring the input original character string according to the mapping relation between the coordinate area and the characters in the soft keyboard and the coordinates of each input point.
8. An apparatus for processing a candidate character string in an input method, comprising:
a character string acquisition module, configured to acquire an input original character string based on coordinates of a plurality of input points input by a user; and
a character error correction module, configured to perform character error correction on the original character string according to a preset character error correction strategy to obtain a plurality of candidate character strings.
9. The apparatus of claim 8, wherein the character error correction module is configured to:
perform character error correction on the original character string by at least one of character replacement, character order adjustment and character completion to obtain a plurality of candidate character strings.
10. The apparatus of claim 9, wherein the character error correction module comprises at least one of:
a character replacer, configured to replace characters in the original character string according to a character replacement table compiled in advance from statistics to obtain corresponding candidate character strings;
a character order adjuster, configured to adjust, according to a character order adjustment table compiled in advance from statistics, the order of characters in the original character string and/or in the candidate character strings obtained by character replacement to obtain corresponding candidate character strings; and/or
a character completer, configured to complete, according to a character completion table compiled in advance from statistics, characters in the original character string, in the candidate character strings obtained by character replacement and/or in the candidate character strings obtained by character order adjustment to obtain corresponding candidate character strings.
11. The apparatus of any of claims 8-10, further comprising:
a probability obtaining module, configured to obtain the occurrence probability of each candidate character string;
a sorting module, configured to sort the candidate character strings in descending order of occurrence probability; and
a recommending module, configured to recommend the candidate character strings to the user according to the sorted order.
12. The apparatus of claim 11, wherein the probability obtaining module comprises:
an error correction probability obtaining unit, configured to obtain a corresponding error correction probability when each candidate character string is subjected to character error correction processing;
and the occurrence probability generating unit is used for generating the occurrence probability of the corresponding candidate character string according to the error correction probability of character error correction processing for each candidate character string.
13. The apparatus according to claim 12, wherein the error correction probability obtaining unit is configured to:
when character replacement error correction is performed, acquire the error correction probability of the character replacement performed on each candidate character string according to replacement probabilities of the replaced characters compiled in advance from statistics;
when character order adjustment error correction is performed, acquire the error correction probability of the character order adjustment performed on each candidate character string according to order adjustment probabilities of the reordered characters compiled in advance from statistics; and/or
when character completion error correction is performed, acquire the error correction probability of the character completion performed on each candidate character string according to character completion probabilities compiled in advance from statistics.
14. The apparatus according to any one of claims 8-10 and 12-13, wherein the character string obtaining module comprises:
an acquisition unit, configured to acquire a coordinate sequence input by a user, the coordinate sequence comprising coordinates of the plurality of input points input in sequence; and
a character generator, configured to acquire the input original character string according to the mapping relation between coordinate areas and characters in the soft keyboard and the coordinates of each input point.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-7.
CN202010304923.1A 2020-04-17 2020-04-17 Candidate character string processing method and device, electronic equipment and storage medium Active CN111665956B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010304923.1A CN111665956B (en) 2020-04-17 2020-04-17 Candidate character string processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010304923.1A CN111665956B (en) 2020-04-17 2020-04-17 Candidate character string processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111665956A (en) 2020-09-15
CN111665956B (en) 2023-07-25

Family

ID=72382809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010304923.1A Active CN111665956B (en) 2020-04-17 2020-04-17 Candidate character string processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111665956B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0359761A (en) * 1989-07-27 1991-03-14 Matsushita Electric Ind Co Ltd Device for correcting spelling error of english word
JPH11328317A (en) * 1998-05-11 1999-11-30 Nippon Telegr & Teleph Corp <Ntt> Method and device for correcting japanese character recognition error and recording medium with error correcting program recorded
CN101727271A (en) * 2008-10-22 2010-06-09 北京搜狗科技发展有限公司 Method and device for providing error correcting prompt and input method system
US20110066421A1 (en) * 2009-09-11 2011-03-17 Electronics And Telecommunications Research Institute User-interactive automatic translation device and method for mobile device
CN103365573A (en) * 2012-03-27 2013-10-23 北京搜狗科技发展有限公司 Method and device for identifying multi-key input characters
US20160132471A1 (en) * 2013-11-13 2016-05-12 Keukey Inc. Method for revising errors by means of correlation decisions between character strings
CN106774970A (en) * 2015-11-24 2017-05-31 北京搜狗科技发展有限公司 The method and apparatus being ranked up to the candidate item of input method
CN107102746A (en) * 2016-02-19 2017-08-29 北京搜狗科技发展有限公司 Candidate word generation method, device and the device generated for candidate word
CN110083819A (en) * 2018-01-26 2019-08-02 北京京东尚科信息技术有限公司 Spell error correction method, device, medium and electronic equipment
CN110488990A (en) * 2019-08-12 2019-11-22 腾讯科技(深圳)有限公司 Input error correction method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
H. KANEKO: "A class of m-ary asymmetric symbol error correcting codes constructed by graph coloring", 《PROCEEDINGS. 2001 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY》, pages 225 *
李德银: "一种字符串纠错方法及其应用研究", 《计算机应用与软件》, pages 41 - 45 *
段良涛: "中文文本校对技术研究", pages 4602 - 4604 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112684910A (en) * 2020-12-29 2021-04-20 维沃移动通信有限公司 Input method candidate word display method and device and electronic equipment
WO2022143341A1 (en) * 2020-12-29 2022-07-07 维沃移动通信有限公司 Input method candidate word display method and apparatus, and electronic device
CN113190160A (en) * 2021-02-26 2021-07-30 清华大学 Input error correction method, computing device and medium for analyzing hand tremor false touch
CN113849071A (en) * 2021-09-10 2021-12-28 维沃移动通信有限公司 Character string processing method and device

Also Published As

Publication number Publication date
CN111665956B (en) 2023-07-25

Similar Documents

Publication Publication Date Title
EP3862892A1 (en) Session recommendation method and apparatus, and electronic device
CN111665956A (en) Processing method and device of candidate character string, electronic equipment and storage medium
CN112001169B (en) Text error correction method and device, electronic equipment and readable storage medium
US20210200963A1 (en) Machine translation model training method, apparatus, electronic device and storage medium
CN111079945B (en) End-to-end model training method and device
CN112580324B (en) Text error correction method, device, electronic equipment and storage medium
CN111859997A (en) Model training method and device in machine translation, electronic equipment and storage medium
CN110717340B (en) Recommendation method, recommendation device, electronic equipment and storage medium
EP3896595A1 (en) Text key information extracting method, apparatus, electronic device, storage medium, and computer program product
EP3923177A1 (en) Method and apparatus for correcting character errors, electronic device and stroage medium
CN110427436B (en) Method and device for calculating entity similarity
EP3896580A1 (en) Method and apparatus for generating conversation, electronic device, storage medium and computer program product
CN112329453B (en) Method, device, equipment and storage medium for generating sample chapter
CN112328798A (en) Text classification method and device
CN111708477B (en) Key identification method, device, equipment and storage medium
CN111563198B (en) Material recall method, device, equipment and storage medium
CN111640511A (en) Medical fact verification method and device, electronic equipment and storage medium
US20210216710A1 (en) Method and apparatus for performing word segmentation on text, device, and medium
CN111753147A (en) Similarity processing method, device, server and storage medium
CN111666417A (en) Method and device for generating synonyms, electronic equipment and readable storage medium
CN111665955A (en) Processing method and device of candidate character string, electronic equipment and storage medium
CN111090341A (en) Input method candidate result display method, related equipment and readable storage medium
CN112990127B (en) Target identification method and device, electronic equipment and storage medium
CN111125445B (en) Community theme generation method and device, electronic equipment and storage medium
CN111931524A (en) Method, apparatus, device and storage medium for outputting information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant