WO2014012485A1 - Word recognition method, apparatus and system based on continuous input of multiple words - Google Patents

Word recognition method, apparatus and system based on continuous input of multiple words

Info

Publication number
WO2014012485A1
WO2014012485A1 PCT/CN2013/079492 CN2013079492W WO2014012485A1 WO 2014012485 A1 WO2014012485 A1 WO 2014012485A1 CN 2013079492 W CN2013079492 W CN 2013079492W WO 2014012485 A1 WO2014012485 A1 WO 2014012485A1
Authority
WO
WIPO (PCT)
Prior art keywords
candidate
trajectory
track
continuous input
current
Prior art date
Application number
PCT/CN2013/079492
Other languages
English (en)
Chinese (zh)
Inventor
刘炳林
王玲
Original Assignee
重庆优腾信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 重庆优腾信息技术有限公司
Publication of WO2014012485A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G06F3/04886 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures by partitioning the display area of the touch-screen or the surface of the digitising tablet into independently controllable areas, e.g. virtual keyboards or menus
    • G06F3/04883 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text

Definitions

  • Word recognition method, device and system based on continuous input of multiple words
  • The present invention relates to the field of continuous input technology, and more particularly to a word recognition method, apparatus and system based on continuous input of multiple words.
  • The touch screen has been applied in various fields as a simple, convenient and fast means of human-computer interaction. Techniques for word input based on continuous sliding on a touch screen are also becoming more widespread.
  • U.S. Patent No. 7,250,938 and U.S. Patent Application Serial No. 10/752,405 disclose that, as the user performs a continuous sliding operation, the electronic device can detect the continuous input trajectory of the contact. Words can then be recognized by combining the characters passed by the continuous input trajectory with a lexicon query.
  • In these techniques, a gesture input by a user is recognized as a word existing in the lexicon, that is, the recognition result string is a word existing in the lexicon.
  • When the user continuously inputs a phrase consisting of multiple words, the electronic device cannot recognize the words that the user needs to input, so that either no recognition result is produced or the recognized words do not match the words that the user needs to input.
  • The present invention provides a word recognition method, apparatus and system based on continuous input of multiple words, which can perform word segmentation on a continuous input trajectory input by a user and obtain a word combination corresponding to the continuous input trajectory, so that the user can directly input multiple words or a whole sentence through a single continuous gesture. This kind of input does not require the lexicon to contain a single entry corresponding to the whole of the input content, which significantly increases the input rate.
  • a word recognition method based on continuous input of multiple words comprising:
  • the trajectory feature data of the continuous input trajectory includes: a trajectory start point of the continuous input trajectory, a path of the stroke, and an end point of the trajectory;
  • the preset vocabulary matches at least one candidate matching the trajectory feature data of different track segments in the continuous input trajectory, including:
  • acquiring the trajectory feature data of the continuous input trajectory includes:
  • determining, according to the keyboard layout, the trajectory feature data of the continuous input trajectory, wherein the trajectory feature data comprises: a trajectory start point of the continuous input trajectory, the path stroked, the characters stroked, and a trajectory end point.
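As an illustration of the trajectory feature data listed above (start point, stroked path, stroked characters, end point), the following minimal Python sketch derives these fields from sampled contact points. The key coordinates, the `radius` threshold and the names `KEY_CENTRES`, `TrajectoryFeatures`, `nearest_key` and `extract_features` are assumptions introduced for this example only; the source does not specify how stroked characters are detected.

```python
from dataclasses import dataclass
from math import hypot

# Hypothetical key centres (x, y) for a handful of keys; a real keyboard
# layout would list every key.
KEY_CENTRES = {
    "w": (1.5, 0.0), "e": (2.5, 0.0), "r": (3.5, 0.0),
    "o": (8.5, 0.0), "n": (6.0, 2.0),
}


@dataclass
class TrajectoryFeatures:
    start: tuple[float, float]        # trajectory start point
    path: list[tuple[float, float]]   # sampled path of the stroke
    stroked_chars: list[str]          # characters the stroke passed over
    end: tuple[float, float]          # trajectory end point


def nearest_key(point, radius=0.5):
    """Return the key whose centre lies within `radius` of `point`, else None."""
    def dist(key):
        kx, ky = KEY_CENTRES[key]
        return hypot(point[0] - kx, point[1] - ky)

    best = min(KEY_CENTRES, key=dist)
    return best if dist(best) <= radius else None


def extract_features(points):
    """Build the feature record from a list of sampled contact points."""
    stroked = []
    for p in points:
        key = nearest_key(p)
        # Record a key once per visit, skipping points that hit no key.
        if key is not None and (not stroked or stroked[-1] != key):
            stroked.append(key)
    return TrajectoryFeatures(points[0], list(points), stroked, points[-1])
```

Calling `extract_features` on a list of `(x, y)` samples returns one record that later matching steps could consume; a production system would likely also record timing and curvature, which this sketch omits.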
  • the at least one candidate matching the trajectory feature data of the different trajectory segments in the continuous input trajectory from the preset vocabulary includes:
  • the operation of matching a candidate from the lexicon is performed according to the trajectory feature data of the unrecognized track segment until recognition of the continuous input trajectory ends, and obtaining the candidates that match the trajectory feature data of each track segment in the continuous input trajectory includes:
  • the trajectory feature data further includes at least one word segmentation mark drawn by the continuous input trajectory
  • step A the method further includes:
  • the step B includes:
  • B1: according to the trajectory feature data of the continuous input trajectory between the current recognition starting point and the word segmentation identifier, matching from the lexicon at least one candidate corresponding to the trajectory starting from the current recognition starting point, using the matched candidate as the current candidate, and proceeding to step E3;
  • B2: according to the trajectory feature data of the track segment located after the current recognition starting point in the continuous input trajectory, matching from the lexicon at least one candidate corresponding to the trajectory starting from the current recognition starting point, using the matched candidate as the current candidate, and proceeding to step C;
  • the method further includes:
  • in step C, determining whether recognition of the current track segment has ended comprises: determining whether recognition of the track segment located after the current recognition starting point in the continuous input trajectory has ended;
  • in step D, taking the starting point of the current track segment as the new current recognition starting point and returning to the matching step includes:
  • taking the starting point of the current track segment as the new current recognition starting point, and returning to step B2 to obtain the next-level candidate of the current candidate.
  • matching at least one candidate corresponding to the trajectory segment starting from the current recognition starting point from the vocabulary includes:
  • using the pending encoded character string candidate and/or the word converted from the pending encoded character string candidate as a candidate corresponding to the track segment starting from the current recognition starting point;
  • determining the track segment matched by the current candidate comprises: using the to-be-matched track segment as the track segment whose trajectory feature data matches the current candidate.
  • At least one candidate matching the trajectory feature data of the different trajectory segments in the continuous input trajectory is matched from the preset vocabulary, including:
  • the matching degree of the to-be-coded character string candidate and the trajectory feature data of the to-be-matched track segment is calculated by using any one or more of the following parameters:
  • the method further includes:
  • when step B is performed, if no candidate can be matched for the track segment within a preset length starting from the current recognition starting point, the current recognition starting point is moved further along the trajectory, the track segment of the continuous input trajectory located after the new current recognition starting point is taken as the new current track segment, and the operation of matching from the lexicon a candidate that matches the trajectory feature data of the track segment after the current recognition starting point continues.
  • Optionally, the method further includes:
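The matching loop described above (match a candidate at the current recognition starting point, continue from the end of the matched segment, and move the starting point along when nothing matches within a preset length) can be sketched as follows. This is a simplified illustration, not the claimed method: it treats the trajectory as the plain string of stroked characters and uses exact lexicon lookup instead of trajectory matching. The function name `segment`, the parameter `max_len` and the sample lexicon are hypothetical.

```python
def segment(stroked, lexicon, max_len=6):
    """Return every way to cover `stroked` with lexicon entries.

    `stroked` is the string of characters passed by the trajectory.
    If no entry starts at the current recognition starting point within
    `max_len` characters, the starting point moves one character along
    the trajectory, treating that character as stray input.
    """
    results = []

    def walk(start, words):
        if start == len(stroked):          # whole trajectory recognized
            results.append(words)
            return
        matched = False
        last = min(len(stroked), start + max_len)
        for end in range(start + 1, last + 1):
            piece = stroked[start:end]
            if piece in lexicon:
                matched = True
                # The end of the matched segment becomes the new start point.
                walk(end, words + [piece])
        if not matched:
            walk(start + 1, words)         # skip an unmatched character

    walk(0, [])
    return results
```

For example, with the hypothetical lexicon `{"wo", "men", "women"}`, the input `"women"` yields both the two-word reading and the single-word reading, and `"woxmen"` still yields `["wo", "men"]` because the stray `x` is skipped.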
  • the candidate is displayed based on the connection order relationship between the track segments matched with the respective candidates, and the evaluation score of the candidate.
  • the method further comprises:
  • determining the evaluation score of the candidate including:
  • the evaluation factor comprises any one or more of the following: the word frequency of the candidate, the lexical rule matching degree of the candidate, the path matching degree of the candidate with the matched track segment, the number of characters in the encoded string corresponding to the candidate, the length of the track segment matched with the candidate, the number of characters stroked by the track segment matched with the candidate, and/or the number of characters stroked by the continuous input track segment.
  • the evaluation factor further includes any one or more of the following: whether the candidate has a corresponding next-level candidate, and the evaluation score of the next-level candidate of the candidate, wherein the next-level candidate of a candidate is the candidate matched by taking the end point of the track segment matched by that candidate as the recognition starting point.
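One way to picture how several evaluation factors could be combined is a weighted sum, sketched below. The source lists the factors but does not state how they are combined, so the weights in `WEIGHTS`, the normalisation to `[0, 1]`, and the names `Candidate` and `evaluation_score` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    text: str
    word_frequency: float   # assumed normalised to [0, 1]
    path_match: float       # assumed normalised to [0, 1]
    code_length: int        # characters in the encoded string
    next_level: list = field(default_factory=list)  # next-level candidates


# Hypothetical weights; the source does not give any values.
WEIGHTS = {"frequency": 0.4, "path": 0.4, "length": 0.1, "next": 0.1}


def evaluation_score(c):
    """Weighted sum of a few of the evaluation factors named in the text."""
    length_term = min(c.code_length, 10) / 10
    # Best score among next-level candidates; 0.0 when there are none.
    next_term = max((evaluation_score(n) for n in c.next_level), default=0.0)
    return (WEIGHTS["frequency"] * c.word_frequency
            + WEIGHTS["path"] * c.path_match
            + WEIGHTS["length"] * length_term
            + WEIGHTS["next"] * next_term)
```

Under this sketch a candidate that has a well-scoring next-level candidate ranks above an otherwise identical candidate that leads nowhere, which matches the intent of including next-level factors.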
  • calculating the evaluation score of the candidate by using the path matching degree includes:
  • the path matching degree is determined according to the area value of the closed area.
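A path matching degree based on the area of a closed region can be sketched with the shoelace formula. The sketch below closes the region by joining the input track segment to the reversed key-to-key path of the candidate; it assumes the two paths share (approximately) the same end points and do not cross each other. The mapping from area to a score in `(0, 1]`, the `scale` parameter and the function names are assumptions, since the source only says that the degree is determined from the area value.

```python
def polygon_area(vertices):
    """Shoelace formula: absolute area of a simple polygon."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def path_matching_degree(track, ideal, scale=1.0):
    """Map the area enclosed between two paths to a score in (0, 1].

    `track` is the sampled input segment and `ideal` the key-centre path
    of the candidate; 1.0 means the enclosed area is zero.
    """
    closed = list(track) + list(reversed(ideal))
    return 1.0 / (1.0 + polygon_area(closed) / scale)
```

A smaller enclosed area gives a score closer to 1.0, so a track that hugs the key-to-key line of a candidate scores higher than one that bulges away from it.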
  • displaying the candidates according to the connection order relationship between the track segments matched with the respective candidates, and the evaluation scores of the candidates, includes:
  • the distinguishing display comprises: using color to distinguish each candidate included in the combined candidate; or setting a space or identifier between each candidate of the combined candidate;
  • the method further includes:
  • the candidate to be processed is determined according to the sliding trajectory of the contact, and the preset instruction is executed when the contact lifting operation is detected;
  • the preset instruction comprises:
  • deleting the candidate to be processed from the candidate presentation area, and receiving the user's continuous sliding operation on the keyboard area, acquiring the continuous input trajectory input at the current time, and replacing the candidate to be processed with the candidates that match the trajectory feature data of the continuous input trajectory input at the current time.
  • the present invention also provides a word recognition method based on continuous input of multiple words, comprising:
  • the electronic device detects a continuous input trajectory on the keyboard area, and acquires trajectory feature data of the continuous input trajectory, wherein the continuous input trajectory is related to a position of the coded character sequence to be input by the user on the keyboard layout;
  • the electronic device sends information including at least the trajectory feature data to a specified server, where the trajectory feature data includes at least
  • the server, according to the information, matches from the lexicon at least one candidate that matches the trajectory feature data of different track segments of the continuous input trajectory, and returns the matched candidate to the electronic device;
  • the electronic device receives the at least one candidate, returned by the server, that matches the trajectory feature data of different track segments in the continuous input trajectory.
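The device/server exchange described above can be pictured as a request carrying the trajectory feature data and a response carrying the matched candidates. The JSON field names, the in-process `handle_request` stand-in for the server, and the substring lookup against a hypothetical `LEXICON` are all assumptions; the source does not define a wire format or a matching procedure at this level.

```python
import json

# Hypothetical server-side lexicon.
LEXICON = {"wo", "men", "women"}


def build_request(points, stroked_chars):
    """Device side: serialise the trajectory feature data."""
    return json.dumps({
        "start": points[0],
        "path": points,
        "stroked": stroked_chars,
        "end": points[-1],
    })


def handle_request(body):
    """Server side: match candidates for different segments and reply."""
    data = json.loads(body)
    stroked = "".join(data["stroked"])
    candidates = []
    for i in range(len(stroked)):
        for j in range(i + 1, len(stroked) + 1):
            if stroked[i:j] in LEXICON:
                # "from"/"to" record which part of the input was matched.
                candidates.append({"text": stroked[i:j], "from": i, "to": j})
    return json.dumps({"candidates": candidates})
```

In a real deployment the two functions would sit on either side of a network call; here they are plain functions so that the round trip can be followed end to end.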
  • The present invention also provides a word recognition device based on continuous input of multiple words, comprising: a lexicon;
  • a trajectory retrieval unit for detecting a continuous input trajectory on a keyboard area
  • a trajectory data information acquiring unit configured to acquire trajectory feature data of the continuous input trajectory, wherein the continuous input trajectory is related to a position of a coded character to be input by a user on a layout;
  • a word matching unit configured to match, from the preset vocabulary, at least one candidate that matches trajectory feature data of different trajectory segments in the continuous input trajectory.
  • the trajectory data information acquiring unit is configured to determine the trajectory feature data of the continuous input trajectory according to the keyboard layout, where the trajectory feature data includes: the trajectory starting point of the continuous input trajectory, the path stroked, the characters stroked, and the trajectory end point.
  • the word matching unit comprises:
  • a candidate retrieval unit configured to match a candidate from the lexicon according to the trajectory feature data of the unidentified track segment in the continuous input trajectory and, when a candidate is matched, to determine the track segment whose trajectory feature data matches the matched candidate;
  • a loop judging unit configured to determine, when the candidate retrieval unit has matched a candidate and determined the track segment that matches it, whether the part of the continuous input trajectory after that track segment has been fully recognized, and, when an unidentified track segment still remains after that track segment, to return to the operation of the candidate retrieval unit until no unidentified track segment remains in the continuous input trajectory, thereby obtaining the candidates that match the trajectory feature data of the different track segments.
  • the candidate retrieval unit includes:
  • an initial starting point setting unit configured to use the starting point of the continuous input trajectory as the current recognition starting point, to use the track segment of the continuous input trajectory after the current recognition starting point as the current track segment, and to execute the operation of the candidate loop matching unit;
  • the candidate loop matching unit is configured to match from the lexicon, according to the trajectory feature data of the current track segment, at least one candidate corresponding to the trajectory starting from the current recognition starting point, to use the matched candidate as the current candidate, and to perform the operation of the loop judging unit;
  • the loop judgment unit includes:
  • a loop judging subunit configured to determine whether recognition of the current track segment has ended; if so, at least one candidate matching the trajectory feature data of the different track segments is obtained; if not, to determine the track segment whose trajectory feature data matches the current candidate, and to perform the operation of the loop start point setting unit;
  • the loop start point setting unit is configured to use the track segment of the continuous input trajectory after the track segment matched by the current candidate as the current track segment, to use the starting point of the current track segment as the new current recognition starting point, and to trigger the operation of the candidate loop matching unit to obtain the next-level candidate of the current candidate.
  • the trajectory feature data acquired by the trajectory data information acquiring unit further includes: at least one word segmentation identifier drawn by the continuous input trajectory;
  • the candidate retrieval unit further includes:
  • a word segmentation determining unit configured to determine, after the initial starting point setting unit sets the current recognition starting point, whether the continuous input trajectory after the current recognition starting point includes a word segmentation identifier; if yes, to execute the operation of the nearest word segmentation identifier determining unit; if not, to perform the operation of the second candidate loop matching unit;
  • the nearest word segmentation identifier determining unit is configured to determine the word segmentation identifier of the continuous input trajectory that is located after the current recognition starting point and is closest to it, and to execute the operation of the first candidate loop matching unit;
  • the loop matching unit includes:
  • a first loop matching unit configured to match from the lexicon, according to the trajectory feature data of the continuous input trajectory between the current recognition starting point and the word segmentation identifier, at least one candidate corresponding to the trajectory starting from the current recognition starting point, to use the matched candidate as the current candidate, and to execute the operation of the word segmentation loop determining unit;
  • a second loop matching unit configured to match from the lexicon, according to the trajectory feature data of the track segment located after the current recognition starting point in the continuous input trajectory, at least one candidate corresponding to the trajectory starting from the current recognition starting point, to use the matched candidate as the current candidate, and to execute the operation of the loop judging subunit;
  • the device further includes: a word segmentation loop determining unit configured to determine whether recognition of the continuous input trajectory between the current recognition starting point and the nearest word segmentation identifier has ended; if yes, to set a new current recognition starting point in the continuous input trajectory after the nearest word segmentation identifier and return to the operation of the word segmentation determining unit; if not, to determine the track segment whose trajectory feature data matches the current candidate, to use the part of the continuous input trajectory between the end point of the track segment matched by the current candidate and the nearest word segmentation identifier as the current track segment, to use the starting point of the current track segment as the new current recognition starting point, and to return to the first loop matching unit to obtain the next-level candidate of the current candidate;
  • the loop judging subunit is configured to determine whether recognition of the track segment located after the current recognition starting point in the continuous input trajectory has ended; if yes, at least one candidate that matches the trajectory feature data of different track segments is obtained; if not, to determine the track segment whose trajectory feature data matches the current candidate, and to perform the operation of the loop start point setting unit;
  • the loop start point setting unit is configured to take the starting point of the current track segment as the new current recognition starting point, and to return to the operation of the second candidate loop matching unit to obtain the next-level candidate of the current candidate.
  • the candidate loop matching unit includes:
  • a first retrieval unit configured to retrieve a pending encoded character string candidate from the lexicon;
  • a to-be-matched track segment determining unit configured to determine the recognition end point corresponding to the pending encoded character string candidate on the current track segment, and to use the track between the current recognition starting point and that recognition end point as the to-be-matched track segment;
  • a key data determining unit configured to determine the key data of each character of the pending encoded character string candidate on the keyboard layout;
  • a candidate determining unit configured to, when the matching degree between the key data of each character in the pending encoded character string candidate and the trajectory feature data of the to-be-matched track segment meets a preset condition, use the pending encoded character string candidate and/or the word converted from the pending encoded character string candidate as a candidate corresponding to the trajectory starting at the current recognition starting point;
  • the loop judging subunit is configured to determine whether recognition of the current track segment has ended; if yes, at least one candidate that matches the trajectory feature data of the different track segments is obtained; if not, the to-be-matched track segment is used as the track segment whose trajectory feature data matches the current candidate, and the operation of the loop start point setting unit is executed.
  • the word matching unit includes:
  • a stroked string determining unit configured to determine the sequence of characters stroked by the continuous input trajectory;
  • a second retrieval unit configured to retrieve encoded character string candidates from the lexicon according to the sequence of characters stroked by the continuous input trajectory;
  • a candidate matching unit configured to determine, according to the correspondence between the characters included in the encoded character string candidate and the sequence of characters stroked by the continuous input trajectory, the to-be-matched track segment corresponding to the encoded character string candidate and, when the matching degree between the encoded character string candidate and the trajectory feature data of the to-be-matched track segment reaches a preset value, to use the encoded character string candidate and/or the word candidate converted from it as a candidate matching the trajectory feature data of the to-be-matched track segment, and to use that candidate as the current candidate;
  • a string recognition determining unit configured to determine whether, in the sequence of characters stroked by the continuous input trajectory, characters still remain after the last letter of the encoded string corresponding to the current candidate; if not, at least one candidate matching the trajectory feature data of the different track segments is obtained; if yes, to continue matching encoded character string candidates from the lexicon according to the characters that follow the last letter of the encoded string corresponding to the current candidate, and to perform the operation of the candidate matching unit until the last character stroked by the continuous input trajectory is recognized.
  • Optionally, the device further includes:
  • a score determining unit configured to determine the evaluation score of the candidate according to a preset evaluation rule;
  • a presentation unit configured to display the candidates according to the connection order relationship between the track segments matched with the respective candidates, and the evaluation scores of the candidates;
  • a word input unit configured to perform word input according to the user's selection result and/or confirmation operation on the presented candidates.
  • the score determining unit is configured to calculate the evaluation score of the candidate according to a preset evaluation factor;
  • the evaluation factor comprises any one or more of the following: the word frequency of the candidate, the lexical rule matching degree of the candidate, the path matching degree of the candidate with the matched track segment, the number of characters in the encoded string corresponding to the candidate, the length of the track segment that matches the candidate, the number of characters stroked by the track segment that matches the candidate, and/or the number of characters stroked by the continuous input track segment.
  • the score determining unit calculates the evaluation score of the candidate according to the preset evaluation factor, wherein the evaluation factor further includes any one or more of the following: whether the candidate has a corresponding next-level candidate, and the evaluation score of the next-level candidate of the candidate, wherein the next-level candidate of a candidate is the candidate matched by taking the end point of the track segment matched by that candidate as the recognition starting point.
  • The present invention also provides a word recognition system based on continuous input of multiple words, comprising: an electronic device and a server;
  • the electronic device is configured to detect a continuous input trajectory on the keyboard area, acquire trajectory feature data of the continuous input trajectory, send information including at least the trajectory feature data to a designated server, and receive at least one candidate, returned by the server, that matches the trajectory feature data of different track segments in the continuous input trajectory; wherein the continuous input trajectory is related to the position, on the keyboard layout, of the sequence of coded characters to be input by the user;
  • the server receives the information that is sent by the electronic device and includes at least the trajectory feature data, matches from the lexicon, according to the information, at least one candidate that matches the trajectory feature data of different track segments of the continuous input trajectory, and returns the matched candidate to the electronic device.
  • Compared with the prior art, the present invention provides a word recognition method, device and system based on continuous input of multiple words in which, when a continuous input trajectory is detected on the keyboard area, encoded character candidates matching the trajectory feature data of the different track segments of the continuous input trajectory are matched from the preset lexicon.
  • Because at least one candidate matching the trajectory feature data of each of the different track segments of a continuous input trajectory can be matched from the lexicon, the user can input the encoded character strings corresponding to a plurality of words in one continuous operation.
  • The electronic device processes the continuous input trajectory data to obtain at least one candidate matching the trajectory feature data of the different track segments, so multiple words can be input with one continuous sliding operation, which improves input efficiency and also reduces the amount of data processed by the electronic device.
  • FIG. 1 is a flow chart showing an embodiment of a word recognition method based on multi-word continuous input according to the present invention
  • Figure 2 is a diagram showing a continuous input trajectory detected on a region when a word input is performed in an example of the present invention
  • FIG. 3 is a flow chart showing another embodiment of a word recognition method based on continuous input of multiple words
  • FIG. 4 is a flow chart showing an implementation manner of matching candidates from a thesaurus according to trajectory feature data of a track segment
  • Figure 5 is a diagram showing another embodiment of a word recognition method based on multi-word continuous input of the present invention.
  • FIG. 6a is a schematic diagram showing the key-position connection trajectory, in the keyboard layout, of the characters of the pending encoded character string candidate "wo" in the present invention
  • FIG. 6b is a schematic diagram showing the key-position connection trajectory, in the keyboard layout, of the characters of the pending encoded character string candidate "en" in the present invention
  • Figure 7a is a schematic diagram showing the continuous input trajectory on the keyboard layout that corresponds to the input string "jiuhenhao" in the present invention;
  • Figure 7b shows a tree structure diagram of each word candidate obtained by identifying the continuous input trajectory in Figure 7a;
  • Figure 8 is a diagram showing another embodiment of a word recognition method based on multi-word continuous input of the present invention.
  • FIG. 9 is a schematic diagram showing a continuous input track of position information including a word segmentation identifier of the present invention.
  • Figure 10 shows a schematic representation of a continuous input trajectory containing an erroneous trajectory input in the present invention
  • FIG. 11 is a schematic diagram showing adjustment of candidates according to a user's selection result in the process of presenting candidates in the present invention.
  • FIG. 12a and FIG. 12b are schematic diagrams showing an operation gesture for triggering generation of a preset instruction for a candidate for presentation in the present invention
  • Figure 13 is a block diagram showing an embodiment of a word recognition apparatus based on continuous input of multiple words
  • Figure 14 is a block diagram showing another embodiment of a word recognition apparatus based on continuous input of multiple words
  • FIG. 15 is a block diagram showing a structure of a word matching unit of a word recognition device based on multi-word continuous input of the present invention
  • Figure 16 is a block diagram showing an embodiment of a word recognition system based on multi-word continuous input of the present invention.
  • An embodiment of the invention discloses a word recognition method based on continuous input of multiple words. When a continuous input trajectory is detected on a keyboard area, the method matches, from the lexicon, candidates for the different trajectory segments of the continuous input trajectory.
  • Word segmentation is decided during the recognition process itself, which simplifies the user's operation and improves input speed.
  • Referring to FIG. 1, a flow of an embodiment of a word recognition method based on continuous input of multiple words is shown.
  • The method of this embodiment can be applied to any electronic device having an input processing function.
  • The method in this embodiment includes:
  • Step 101: Detect a continuous input trajectory on the keyboard area and obtain trajectory feature data of the continuous input trajectory, wherein the continuous input trajectory is related to the positions, on the keyboard layout, of the encoded characters the user intends to input.
  • The keyboard area is an area, containing the keyboard layout, in which a continuous sliding operation can be detected.
  • The keyboard area is not limited to the keyboard layout itself: it includes both the regions where keys are provided and the regions where no key is provided.
  • The keyboard area can be a virtual keyboard displayed on a touch screen, a virtual keyboard generated by the electronic device through projection, or a physical sensing keyboard (with contact or non-contact sensing).
  • Letter keys are displayed on the keyboard area, and other character keys can be displayed as well.
  • The user can slide on the keyboard area with a finger, a stylus, or the like. During the slide, the contact point passes over a plurality of keys on the keyboard area in sequence, so that the sliding operation forms a continuous sliding trajectory.
  • Detecting the continuous input trajectory on the keyboard area does not necessarily require physical contact with the keyboard area; the user's continuous sliding trajectory on the keyboard area may also be obtained through interactive recognition technologies of the electronic device such as a camera, electric induction, thermal sensing, light sensing, or a pointing device.
  • The continuous sliding trajectory includes the sequence of characters on the keyboard area that the contact point passes over in turn, and the path travelled by the contact point.
  • By tracking the change of the contact point on the keyboard area, a series of position change data is obtained, from which the continuous input trajectory and its trajectory feature data are obtained.
  • The trajectory feature data should include at least the coordinate position of each point on the continuous input trajectory, that is, the trajectory start point and the path travelled by the continuous input trajectory.
  • The continuous input trajectory is related to the positions, on the keyboard layout, of the key sequence corresponding to the encoded character string the user intends to input.
  • When the keyboard layout of the keyboard area differs, the continuous input trajectory that needs to be input differs as well.
  • Step 102: Match, from the preset lexicon, at least one candidate for the trajectory feature data of each of the different trajectory segments in the continuous input trajectory.
  • That is, candidates matching the trajectory feature data of the different trajectory segments of the continuous input trajectory are looked up in the preset lexicon.
  • The recognition result is a plurality of candidates, connected one after another, that correspond to the different segments of the continuous input trajectory; the whole expression formed by joining these candidates does not need to exist in the lexicon.
  • The word recognition method based on continuous input of multiple words of the present invention can be applied not only to the input of words composed of letters, such as English words, but also to the input of words based on an encoding.
  • When the method of the present invention is applied to encoding-based word input, or to input in which an encoded character string can be converted into a word, the method further includes, after identifying the encoded character string candidates that match the different trajectory segments of the continuous input trajectory: obtaining the term candidates corresponding to the encoded character string candidates. Taking Pinyin input of Chinese characters as an example, the recognized encoded character string candidates are the Pinyin codes of Chinese characters, and the corresponding Chinese characters or words can be obtained from those Pinyin codes.
  • An encoded character string candidate may be converted into multiple possible term candidates.
  • The term candidates may be scored according to factors such as their word frequency, lexical matching, and the relationship between the term candidates of different trajectory segments; the term candidates that satisfy the requirements, and their display order, are then determined from the scores.
  • The type of candidate that is matched differs with the scenario in which the method is applied; in other words, the matched candidates depend on the preset lexicon.
  • The matched candidate may be an encoded character string candidate composed of one or more characters (letters).
  • When the present invention is applied to Chinese input, it is generally necessary to first match an encoded character string candidate, that is, a Pinyin code, and then convert the matched Pinyin code to obtain term candidates composed of Chinese characters.
  • Accordingly, the candidates matching the trajectory feature data of the different trajectory segments may include encoded character string candidates and/or term candidates converted from them.
  • The present invention does not match the continuous input trajectory as a whole, as a single trajectory segment corresponding to one word. Instead, while matching strings against the entire continuous input trajectory, it recognizes the trajectory segment by segment to determine the combination of segment recognition results that best matches the entire input trajectory.
  • A trajectory segment of the continuous input trajectory may be a part of the continuous input trajectory, that is, a sub-segment of it, or may be the complete continuous input trajectory.
  • Segment recognition does not mean that the trajectory is artificially cut into several pieces; rather, the whole trajectory is recognized as multiple recognition results (that is, candidates) connected one after another, and these connected recognition results correspond to different segments of the trajectory.
  • The recognition process does not need to wait until the entire trajectory has been input.
  • The trajectory input so far can be recognized while the user is still sliding, and recognition continues as further sliding input arrives.
  • Later recognition builds on the existing recognition result and the newly input trajectory data, which improves recognition efficiency.
  • Recognizing in step with the input makes full use of the device's computing power, so the recognition result can be given soon after the user's input action ends.
  • When matching, in the lexicon, a candidate for the trajectory feature data of a given trajectory segment, the method may consider not only how the order of the characters making up the candidate agrees with the sequence of characters the trajectory passes over, but also the candidate's word frequency, its word length, and the degree of path matching between the key positions, in the keyboard layout, of the candidate's encoded character string and the continuous input trajectory.
  • A continuous input trajectory on the keyboard area is detected, and after the trajectory feature data of the continuous input trajectory is acquired, candidates matching the trajectory feature data of the different trajectory segments of the continuous input trajectory are matched from the preset lexicon.
  • Although the candidate corresponding to the trajectory feature data of each individual trajectory segment is stored in the lexicon, the words or phrases obtained by combining the candidates of several trajectory segments need not be stored in the lexicon in advance. Therefore, even if the word combination the user intends to input is not stored as a single entry in the lexicon, the present invention can still match candidates for the different trajectory segments separately from the lexicon.
  • The user can thus enter the encoded character strings of several words with a single continuous input trajectory, and the electronic device processes the continuous input trajectory data to obtain the encoded character string candidates and term candidates corresponding to the different trajectory segments, which are then entered according to the user's choice.
  • A single continuous sliding operation can therefore input multiple words at once.
  • When a continuous input trajectory is detected on the keyboard area, the coordinate positions of the points of the continuous input trajectory can be acquired, thereby obtaining part of the feature data of the continuous input trajectory.
  • That is, at least the trajectory start point, the path travelled, and the trajectory end point of the continuous input trajectory can be acquired.
  • The path travelled reflects how the coordinate position of each point changes. It may include the position information of each point and each inflection point of the continuous input trajectory, and may also include information such as the angles at which the trajectory enters and leaves the relevant key areas.
  • In addition, the trajectory feature data of the continuous input trajectory may further include the characters the continuous input trajectory passes over. Acquiring the trajectory feature data of the continuous input trajectory on the keyboard area may then be: determining the trajectory feature data of the continuous input trajectory according to the keyboard layout, wherein the trajectory feature data includes the trajectory start point, the path travelled, the characters passed over, and the trajectory end point of the continuous input trajectory.
  • Determining the trajectory feature data of the continuous input trajectory according to the keyboard layout may be done in either of two ways: when the continuous input trajectory is acquired, determine trajectory feature data that already includes the sequence of characters passed over, according to the keyboard layout; or, when the continuous input trajectory is acquired, acquire only trajectory feature data that does not include the sequence of characters passed over, and later determine that sequence from the keyboard layout as needed.
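  • As an illustration of the trajectory feature data described above, the following minimal Python sketch derives the start point, end point, path, and sequence of characters passed over from sampled contact points. The three-key `LAYOUT`, the rectangle representation of keys, and the names `TrackFeatures`, `key_at`, and `extract_features` are assumptions made for this example only and are not taken from the described method.

```python
from dataclasses import dataclass

# Hypothetical layout: key label -> (x_min, y_min, x_max, y_max) on the keyboard area.
LAYOUT = {
    "q": (0, 0, 10, 10),
    "w": (10, 0, 20, 10),
    "e": (20, 0, 30, 10),
}


@dataclass
class TrackFeatures:
    start: tuple   # trajectory start point
    end: tuple     # trajectory end point
    path: list     # every sampled point, in input order
    passed: list   # (character, index of the first path point inside that key)


def key_at(point, layout):
    """Return the key whose rectangle contains the point, or None."""
    x, y = point
    for ch, (x0, y0, x1, y1) in layout.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return ch
    return None


def extract_features(points, layout):
    """Build the feature data, recording a key each time the path enters it."""
    passed = []
    previous = None
    for i, p in enumerate(points):
        ch = key_at(p, layout)
        if ch is not None and ch != previous:
            passed.append((ch, i))
        previous = ch
    return TrackFeatures(points[0], points[-1], list(points), passed)
```

  • Keeping the path index next to each character preserves the correspondence between characters and positions on the trajectory, which later steps rely on when they locate a recognition end point.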
  • There may likewise be more than one way of matching, from the preset lexicon, at least one candidate for the trajectory feature data of the different trajectory segments of the continuous input trajectory.
  • When the trajectory feature data of the continuous input trajectory is determined according to the keyboard layout, the trajectory feature data includes the sequence of characters the continuous input trajectory passes over, and the process of matching at least one candidate for the trajectory feature data of the different trajectory segments is as follows.
  • Because the continuous input trajectory is acquired on a keyboard provided with a plurality of character keys, what is first obtained when matching candidates from the lexicon according to the trajectory feature data of the continuous input trajectory is an encoded character string candidate. When an encoded character string candidate is obtained, how well it matches is determined from the correspondence between the characters it contains and the characters the trajectory segment passes over.
  • The end point position on the continuous input trajectory corresponding to the encoded character string candidate also needs to be identified, and the trajectory segment between the recognition start point corresponding to the encoded character string candidate and that end point is taken as the trajectory segment whose trajectory feature data matches the encoded character string candidate. If the encoded character string candidate needs to be converted into a term candidate, the trajectory segment matched by the term candidate is the same as the trajectory segment matched by the encoded character string candidate.
  • For the trajectory segment that has not yet been recognized, the operation of matching encoded character string candidates from the lexicon according to its trajectory feature data is repeated until recognition of the continuous input trajectory ends, yielding the encoded character string candidates corresponding to each trajectory segment of the continuous input trajectory.
  • Referring to FIG. 3, a schematic flowchart of another embodiment of a word recognition method based on continuous input of multiple words according to the present invention is shown. This embodiment is an implementation manner of the embodiment shown in FIG. The method of this embodiment includes:
  • Step 301: Detect a continuous input trajectory on the keyboard area, the continuous input trajectory being related to the positions, on the keyboard layout, of the sequence of encoded characters the user intends to input.
  • Step 302: Determine trajectory feature data of the continuous input trajectory according to the keyboard layout, wherein the trajectory feature data includes the trajectory start point, the path travelled, the characters passed over, and the trajectory end point of the continuous input trajectory.
  • This step may acquire the above trajectory feature data in real time while the continuous input trajectory is being detected, or may acquire the trajectory feature data when input of the continuous input trajectory ends.
  • Step 303: Take the trajectory start point of the continuous input trajectory as the current recognition start point, and take the trajectory segment after the current recognition start point as the current trajectory segment.
  • Step 304: According to the trajectory feature data of the current trajectory segment, match from the lexicon at least one candidate corresponding to the trajectory that starts at the current recognition start point, and take the matched candidate as the current candidate.
  • The trajectory segment of the continuous input trajectory located after the current recognition start point is taken as the current trajectory segment to be recognized, in order to distinguish the trajectory segment before the current recognition start point from the trajectory segment after it.
  • In step 304, the trajectory segment after the current recognition start point is taken as the current trajectory segment to be recognized and is recognized. When an encoded character string candidate is matched in the trajectory segment that runs onward from the recognition start point, the trajectory segment matched by that candidate, that is, the trajectory segment whose trajectory feature data matches the candidate, is determined as well. After that, the judgment of step 305 still needs to be performed.
  • Several candidates may be recognized starting from the same recognition start point, and each candidate may have a different recognition end point in the current trajectory segment.
  • The recognition start points of these candidates are the same but their recognition end points may differ, so the trajectory segments matched by the candidates recognized from the same recognition start point differ as well.
  • Step 305: Determine whether recognition of the current trajectory segment has ended. If so, obtain the at least one candidate matching the trajectory feature data of the different trajectory segments, and end; if not, determine the trajectory segment whose trajectory feature data matches the current candidate, and execute step 306.
  • Although this embodiment is described with the trajectory segment as the subject of the recognition loop, the sequence of characters the continuous input trajectory passes over can in fact be used as the loop subject instead; the steps are similar, that is, the character sequence passed over is segmented and recognized as different words.
  • That recognition process also uses the path feature data to calculate the matching degree of its candidates and is not limited to combining strings, as described later.
  • Any method that recognizes a continuous gesture, related to the positions in the keyboard layout of the encoded character sequence of the recognition result, as a plurality of successively connected candidates is an application of the core idea of the present invention.
  • When the last letter of the current encoded character string candidate is the last character the continuous input trajectory passes over, and the last letter of the current encoded character string candidate is at the trajectory end point of the continuous input trajectory, recognition of the current trajectory segment ends.
  • Step 306 is likewise executed again for each of the other matched encoded character string candidates.
  • Step 306: Take the trajectory segment of the continuous input trajectory that follows the trajectory segment matched by the current candidate as the current trajectory segment, take the start point of that current trajectory segment as the new current recognition start point, and return to step 304 to obtain the next-level candidates of the current candidate.
  • The trajectory segment whose trajectory feature data matches the current candidate may also be regarded as the trajectory segment between the recognition start point set when the encoded character string was recognized and the newly set next recognition start point.
  • The recognition start point of the current trajectory segment can be set as required: generally, the start point of the current trajectory segment is used as its recognition start point; alternatively, the first character the current trajectory segment passes over is used as its recognition start point.
  • When matching candidates, every candidate that satisfies the matching condition may be matched, or only those candidates whose matching degree reaches a preset value may be selected.
  • From the standpoint of performance and usability, it is usually enough to match only candidates whose matching degree reaches the preset value.
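  • The loop of steps 303 to 306 can be sketched as a short recursive search. The Python sketch below uses the sequence of characters passed over as the loop subject and makes a strong simplifying assumption: a candidate matches only if it is an exact prefix of the remaining character sequence, whereas the described method scores candidates by path matching degree. The function names `prefix_matches` and `recognise` and the sample lexicon are hypothetical.

```python
def prefix_matches(passed, start, lexicon):
    """Stand-in for step 304: yield (candidate, end) pairs for the segment
    beginning at `start`; `end` is the index just past the matched segment."""
    for word in lexicon:
        if passed.startswith(word, start):
            yield word, start + len(word)


def recognise(passed, lexicon, start=0):
    """Return every chain of successively connected candidates that covers
    passed[start:] completely."""
    if start == len(passed):
        # Step 305: the whole trajectory has been recognised.
        return [[]]
    chains = []
    for word, end in sorted(prefix_matches(passed, start, lexicon)):
        # Step 306: the rest of the trajectory becomes the current segment
        # and `end` becomes the new recognition start point.
        for tail in recognise(passed, lexicon, end):
            chains.append([word] + tail)
    return chains
```

  • With the lexicon `{"ji", "jiu", "he", "hen", "hao"}`, calling `recognise("jiuhenhao", ...)` explores "ji" and "jiu" from the same recognition start point, discards the branches that cannot cover the remaining characters, and keeps the chain jiu, hen, hao, even though "jiuhenhao" itself is not an entry in the lexicon.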
  • To improve the accuracy of the recognized encoded character string candidates, so that the recognized encoded character strings match the trajectory feature data of the continuous input trajectory and accurately reflect the user's actual input intention, the present invention also proposes several more reliable, alternative methods for recognizing the continuous input trajectory.
  • Referring to FIG. 4, an implementation of matching candidates from the lexicon according to the trajectory feature data of a trajectory segment is illustrated, which describes step 304 of FIG. 3 in more detail. It includes:
  • Step 401: Retrieve a pending encoded character string candidate from the lexicon, determine the recognition end point of the pending encoded character string candidate on the current trajectory segment, and take the trajectory between the current recognition start point and the recognition end point as the trajectory segment to be matched.
  • One optional way of retrieving is to search the lexicon according to the sequence of characters the continuous input trajectory passes over; alternatively, the encoded character string candidates stored in the lexicon can be retrieved directly. Steps 402 and 403 are then performed in turn to determine whether the retrieved encoded character string candidate matches the trajectory feature data of the current trajectory segment.
  • The path can be matched character by character in the forward direction along the sequence of characters the trajectory passes over: each character corresponds to a specific position on the trajectory, which is set as the current recognition end point. Alternatively, the current recognition end point can be moved onward point by point within the trajectory segment being recognized. In both cases the trajectory between the current recognition start point and the current recognition end point is set as the current trajectory segment to be matched.
  • The end position of the trajectory segment needed to cover every character of the pending encoded character string candidate is used as the recognition end point.
  • The characters the continuous input trajectory passes over correspond to positions on the trajectory segment. Therefore, from the correspondence between the sequence of characters the trajectory segment passes over and the characters of the pending encoded character string candidate, it can be determined how much of the trajectory, starting from the current recognition start point, is needed to reach the last letter of the pending encoded character string, which gives the recognition end point.
  • The order of the characters in a retrieved encoded character string candidate may not agree completely with the order of the characters the continuous input trajectory passes over, or the retrieved candidate may contain letters that the continuous input trajectory did not pass over. In this case, the point reached when, starting from the current recognition start point, every character of the encoded character string candidate has been matched is used as the recognition end point.
  • If the encoded character string is not retrieved according to the sequence of characters the continuous input trajectory passes over, but encoded character string candidates are retrieved directly from the lexicon, the recognition end point is determined from the characters common to the encoded character string candidate and the sequence of characters the continuous input trajectory passes over.
  • For example, suppose the path entered by the user passes over jiuhgfre and the pending code given in the lexicon is liyu. Starting from the first character the input path passes over, the first key in common with the pending code, i, is found; the search then continues until the last common key, u, is found. The position of this last common key u determines the recognition end point of the pending encoded character string in the current trajectory segment.
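  • The jiuhgfre / liyu example can be reproduced with a few lines of Python. This is only a sketch of one possible reading of the example: the pending code is scanned left to right, each of its characters is looked for in the not yet consumed part of the characters passed over, and the position of the last character found is returned. The function name `recognition_end` is hypothetical.

```python
def recognition_end(passed, pending):
    """Return the index in `passed` of the last key shared, in order, with
    `pending`, or None when the two strings share no key."""
    pos = 0       # first index of `passed` that has not been consumed yet
    last = None   # index of the most recent common key
    for ch in pending:
        found = passed.find(ch, pos)
        if found != -1:
            last = found
            pos = found + 1
    return last
```

  • For `recognition_end("jiuhgfre", "liyu")` the letters l and y are not found, i is found at index 1 and u at index 2, so the recognition end point corresponds to the position of u on the trajectory.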
  • Step 402: Determine the key position data, on the keyboard layout, of each character of the pending encoded character string candidate.
  • The key position data of each character on the keyboard layout can be determined.
  • The key position data may include the coordinate range the character covers on the keyboard area, and may also include the coordinates of a preset marker point of the letter in the keyboard layout.
  • the coordinates of the marker points are the same, and the coordinates of the marker points can be assigned to the letters.
  • Key borders may not be drawn as they are on a physical keyboard; that is, the keyboard layout may contain only letters and symbols, with no clear boundary between characters.
  • Character marker points can usually be set as needed, for example near the center of the character.
  • The coordinate range is generally used for displaying the key boundary and for interaction judgments on user input. The coordinate range may include a display boundary coordinate range and a logical coordinate range: the former determines the displayed boundary of the key, and the latter is used for related processing, such as judging whether the key is pressed or whether the trajectory passes over it.
  • Usually the logical coordinate range of a key is set to be the same as or close to the key boundary, or the display boundary coordinate range is used as the logical coordinate range.
  • The marker point can usually be used in calculations on the key position, for example when judging the user's contact point.
  • The logical coordinate range of a key position may differ from the displayed key boundary range.
  • Existing keyboard error-correction methods can set the logical coordinate range of a key to be larger than its display boundary coordinate range and centered on it.
  • The user is then regarded as having clicked the corresponding key without having to click precisely inside its displayed boundary.
  • However, when the logical coordinate range is enlarged, the number of candidates to process for the same interaction increases greatly, and the computing burden on the device becomes heavy; for mobile devices in particular, this means higher power consumption.
  • The present invention therefore also proposes a fault-tolerant keyboard mode: different logical coordinate ranges are set for letters at different positions. In particular, letters at the edge of the keyboard are given a larger logical coordinate range.
  • This is beneficial for fault tolerance without greatly increasing the amount of recognition computation or reducing recognition efficiency.
  • Determining the characters the continuous input trajectory passes over according to a keyboard layout set up in this way can effectively improve fault tolerance.
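  • A position-dependent logical coordinate range might be sketched as follows. The source only states that edge letters get a larger logical range; the choice made here to widen the range outward on the side that touches the left or right keyboard edge, the margin values, and the names `logical_range` and `hits` are all assumptions for illustration.

```python
def logical_range(display, keyboard_width, margin=2.0, edge_margin=6.0):
    """Derive a logical rectangle from a display rectangle (x0, y0, x1, y1).

    Keys touching the left or right keyboard edge get the larger
    `edge_margin` on that side; every other side gets `margin`.
    """
    x0, y0, x1, y1 = display
    left = edge_margin if x0 <= 0 else margin
    right = edge_margin if x1 >= keyboard_width else margin
    return (x0 - left, y0 - margin, x1 + right, y1 + margin)


def hits(point, rect):
    """True when the point lies inside the rectangle (borders included)."""
    x, y = point
    x0, y0, x1, y1 = rect
    return x0 <= x <= x1 and y0 <= y <= y1
```

  • Because only edge keys receive the larger margin, interior keys overlap their neighbours only slightly, so the number of extra candidates per contact point stays small.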
  • Step 403: If the path matching degree between the key positions of the characters of the pending encoded character string candidate and the trajectory segment to be matched satisfies a preset condition, take the pending encoded character string candidate and/or the term candidate converted from the pending encoded character string candidate as a candidate corresponding to the trajectory segment starting from the current recognition start point.
  • Correspondingly, determining the trajectory segment matched by the current candidate comprises: taking the trajectory segment to be matched as the trajectory segment whose trajectory feature data matches the current candidate.
  • 1) The relationship between the length of the current trajectory segment to be matched and the total length of the line that connects, in order, the current recognition start point, the marker points in the keyboard layout of the characters of the pending encoded character string, and the end point of the trajectory segment to be matched.
  • This connecting line may be referred to as the standard key connection trajectory, and its total length is the total length of the standard key connection trajectory.
  • The matching degree can also be calculated from the difference between the two lengths, or from the ratio of that difference to the length of the current trajectory segment or to the length of the connection trajectory.
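  • The length comparison can be sketched in Python as below. The mapping of the length difference to a score in the range 0 to 1 (one minus the capped ratio of the difference to the trajectory length) is an assumption of this example; the source names the difference and the ratio but gives no formula. `polyline_length` and `length_match` are hypothetical names.

```python
import math


def polyline_length(points):
    """Total length of the polyline through the given points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def length_match(track, key_points):
    """Score in [0, 1]: 1.0 when the standard key connection line has the
    same length as the trajectory segment, lower as the lengths diverge."""
    # Standard key connection line: start point, marker points, end point.
    standard = [track[0]] + list(key_points) + [track[-1]]
    track_len = polyline_length(track)
    if track_len == 0:
        return 0.0
    diff = abs(polyline_length(standard) - track_len)
    return 1.0 - min(1.0, diff / track_len)
```

  • For the trajectory (0, 0), (3, 4), (6, 0), a candidate whose only marker point is (3, 4) scores 1.0, while a candidate with no intermediate marker point yields a straight connection line of length 6 against a trajectory of length 10 and scores about 0.6.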
  • The distance between the marker point, in the keyboard layout, of each character of the pending encoded character string and the trajectory segment to be matched. This may include the maximum distance between the key position of each character of the pending encoded character string and the current trajectory segment to be matched, and the mean and maximum of the closest distances between the key position of each character and the current trajectory segment to be matched. The smaller the distance values, the higher the path matching degree.
  • A parameter algorithm that uses distance to reflect the matching degree includes dividing the relevant distance by the key size (such as the height or width of the key) to obtain a ratio reflecting the path matching degree. In general, the smaller the mean and the maximum of these distances, the higher the degree of matching between the pending encoded character string and the current trajectory segment to be matched.
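  • A sketch of the distance factor follows. It computes, for each marker point, the closest distance to the polyline of the trajectory segment, divides by a key size, and returns the mean and the maximum. The function names are hypothetical, and treating the trajectory as a polyline of straight pieces between sampled points (at least two points) is an assumption.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the straight piece from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.dist(p, a)
    # Project p onto the line through a and b, clamped to the piece itself.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.dist(p, (ax + t * dx, ay + t * dy))


def nearest_distance(p, track):
    """Closest distance from p to any piece of the trajectory polyline."""
    return min(point_segment_distance(p, a, b) for a, b in zip(track, track[1:]))


def distance_scores(key_points, track, key_size):
    """Return (mean, maximum) of the closest distances, in units of key size."""
    ratios = [nearest_distance(k, track) / key_size for k in key_points]
    return sum(ratios) / len(ratios), max(ratios)
```

  • Smaller values of both numbers indicate a better path match; a caller could compare them against thresholds or feed them into a weighted score.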
  • The case where characters the trajectory segment to be matched passes over do not belong to the pending encoded character string candidate, measured by the distance between the key positions of those characters and the standard key connection trajectory of the pending encoded character string in the keyboard layout.
  • After determining which characters passed over by the trajectory segment to be matched do not belong to the pending encoded character string candidate, calculate the nearest distance from the key position of each such character to the standard key connection trajectory, together with the mean and maximum of these distances.
  • The mean and maximum of these distances are used to calculate the matching degree. Generally, the larger the mean and maximum, the more likely it is that the pending encoded character string is missing characters at those key positions, and the matching degree should be lowered.
  • The area of the closed region enclosed by the standard key connection line and the trajectory segment to be matched may be calculated, and it is determined whether the area is within a preset threshold.
  • If so, the path matching degree between the standard key connection line and the trajectory segment to be matched reaches the preset value, and the preset condition is satisfied.
  • The higher the matching degree between the standard key connection line and the trajectory segment to be matched, the smaller the area of the closed region they enclose. Therefore, when the path matching degree is calculated in this manner, the smaller the area of the closed region, the higher the path matching degree of the pending encoded character string candidate generally is.
  • The simulated standard key connection trajectory is the line connecting the current recognition start point, the key position of each character of the pending encoded character string (for example, the center of the key), and the determined recognition end point of the pending encoded character string candidate on the current trajectory segment. This line and the current trajectory segment together enclose a closed region.
  • When calculating, the correspondence between the input path and the key positions on the standard key connection trajectory can be taken into account, so that the area of the closed region is calculated only between corresponding key positions. The calculation proceeds onward in order from the start point of the input trajectory, and portions of the input trajectory and of the standard key connection line that have already taken part in the calculation are not counted again.
  • The simulated standard key connection line and the trajectory segment to be matched may enclose more than one closed region; the area used should therefore be the sum of the areas of all the closed regions.
  • The standard key connection line may also coincide with the trajectory segment to be matched, in which case the area of the closed region is zero.
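  • For the simplest case, where the trajectory segment and the standard key connection line share their end points and do not cross in between, the enclosed area can be computed with the shoelace formula, as in the Python sketch below. This is an assumption-laden illustration: when the two lines cross, the signed contributions of the separate closed regions partly cancel, so the lines would first have to be split at their crossing points and the areas of the pieces summed. The name `enclosed_area` is hypothetical.

```python
def enclosed_area(track, key_line):
    """Area of the polygon formed by walking along `track` and back along
    `key_line`; valid when the two polylines do not cross each other."""
    polygon = list(track) + list(reversed(key_line))
    twice_area = 0.0
    # Shoelace formula over consecutive vertex pairs, closing the polygon.
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0
```

  • A trajectory that bulges to (5, 5) above a straight connection line from (0, 0) to (10, 0) encloses a triangle of area 25, and a trajectory that coincides with the connection line gives 0.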
  • Any of the above factors for calculating the matching degree between the pending encoded character string and the trajectory segment can be used alone, or several can be used in any combination; the factors can also be weighted to determine the matching degree between the encoded character string and its corresponding trajectory segment.
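  • Weighting the factors could be as simple as a normalized weighted average, sketched below. The factor names, the weights, and the convention that every per-factor score already lies in the range 0 to 1 with higher meaning better are assumptions of this example, as is the function name `combined_match`.

```python
def combined_match(scores, weights):
    """Weighted average of the per-factor scores that are present.

    `scores` maps a factor name to a value in [0, 1]; `weights` maps the
    same names to non-negative weights. Only factors found in `scores`
    take part, so a single factor can also be used on its own.
    """
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total
```

  • For instance, `combined_match({"length": 1.0, "distance": 0.5}, {"length": 1, "distance": 3})` evaluates to 0.625, and passing only one factor returns that factor's score unchanged.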
  • The matching degree between a retrieved pending encoded string and the character sequence traversed by a given track segment may also be considered in deciding whether the pending encoded string matches that track segment, for example by comparing the order of the characters in the pending encoded string candidate with the order of the characters traversed by the track segment.
  • This method can also serve as one way of matching pending encoded string candidates from the lexicon according to the trajectory feature data of the continuous input trajectory.
  • That is, pending encoded strings whose matching degree with the character sequence traversed by the current track segment reaches a preset value are retrieved from the lexicon and used as the retrieved encoded string candidates.
  • Encoded string candidates retrieved in this way may deviate from the encoded string the user actually intends to input. In practice, therefore, the path matching degree is still combined with this measure to determine how well a pending encoded string candidate matches the trajectory feature data of the continuous input trajectory.
  • In step 401, pending encoded string candidates are retrieved from the lexicon on the basis of the sequence of characters traversed by the current track segment.
  • The characters traversed by the current track segment and their order are generally used for this retrieval. For fault tolerance, encoded string candidates whose characters are not fully consistent with the character sequence of the track segment may also be retrieved. When the matching degree is calculated, it is reduced if characters appear in the encoded string candidate in an order inconsistent with the order of the traversed characters, or if the candidate has redundant or missing characters.
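  • One possible way to express such a fault-tolerant string matching degree is sketched below in Python. The text does not specify a formula; the use of the longest common subsequence (LCS) and the ratio returned by the hypothetical function `string_matching_degree` are assumptions made purely for illustration. The traversed character sequence is the one given for FIG. 2 in this description.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def string_matching_degree(candidate, traversed):
    """Fraction of the candidate's characters found, in order, on the track.

    1.0 means every character of the candidate is traversed in order;
    missing or out-of-order characters lower the value.
    """
    if not candidate:
        return 0.0
    return lcs_length(candidate, traversed) / len(candidate)


TRAVERSED = "wertyuiokmjjhgfedfbn"  # characters traversed in the FIG. 2 example
full = string_matching_degree("women", TRAVERSED)      # all characters in order
partial = string_matching_degree("womxn", TRAVERSED)   # one character missing
```

  • A candidate could then be retrieved when this value reaches a preset value, with the path matching degree applied afterwards as described above.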
  • In addition, the word frequency of the matching candidate (usually the frequency with which the candidate word corresponding to the pending encoded string appears within a given body of language material), the lexical-rule matching degree between the candidate and other candidates, the number of characters in the encoded string corresponding to the candidate, the number of characters included in the track segment matched by the candidate, and the like may be used to further determine the matching degree between the candidate and the track segment to be matched.
  • The lexical-rule matching degree between a candidate and another pending candidate refers to whether the candidates corresponding to connected track segments satisfy the grammatical conditions required to combine into a phrase. For example, if the previous candidate is an adjective and the current candidate is a noun, which conforms to the Chinese lexical rule "adjective + noun", the matching degree of the current candidate is increased. Similarly, the matching degree of candidates that do not conform to the lexical matching rules can be reduced, and lexical matching can be checked both forward and backward. The lexical-rule matching degree is usually better suited to the matching calculation of words.
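  • The adjustment described above can be pictured with a small Python sketch. Everything in it is hypothetical: the rule table `ALLOWED_PAIRS`, the part-of-speech labels, and the size of the bonus and penalty are placeholders chosen for illustration, since the text gives only the "adjective + noun" example.

```python
# Hypothetical rule table: (previous part of speech, current part of speech)
ALLOWED_PAIRS = {("adjective", "noun")}


def adjust_for_lexical_rule(score, prev_pos, cur_pos, bonus=5.0, penalty=5.0):
    """Raise or lower a candidate's matching degree by a lexical rule check.

    prev_pos is None when there is no previous candidate to compare with.
    """
    if prev_pos is None:
        return score
    if (prev_pos, cur_pos) in ALLOWED_PAIRS:
        return score + bonus   # conforms to a rule such as "adjective + noun"
    return score - penalty     # does not conform to any listed rule
```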
  • When the score of an encoded string candidate is calculated from the path matching degree in the present invention, the procedure is as follows: after a pending encoded string candidate is retrieved from the lexicon, the path matching degree between the line connecting the keys of the characters of the pending encoded string in the keyboard area and the current track segment is calculated. When the path matching degree reaches a preset value, the pending encoded string candidate is determined to be a matched encoded string candidate, and its path matching degree is used as its score. The higher the path matching degree, the higher the score of the encoded string candidate.
  • The score of an encoded string is generally evaluated comprehensively from the path matching degree, the word frequency of the encoded string candidate, and the number of characters in the pending encoded string.
  • In the candidate matching process, after multiple candidates have been matched in the track segment from the same recognition starting point, it may be determined for each candidate whether recognition of the current track segment has ended, that is, whether an unrecognized track segment still remains in the continuous input track after the track segment whose trajectory feature data matches the candidate. This continues until recognition of the continuous input track ends.
  • Referring to FIG. 5, a schematic flowchart of another embodiment of a word recognition method based on continuous input of multiple words according to the present invention is shown. This embodiment is one implementation of the embodiment shown in FIG. 3 and includes:
  • Step 501: Detect a continuous input trajectory on the keyboard area, wherein the continuous input trajectory is related to the key positions, on the keyboard layout, of the character sequence of the encoded string the user intends to input.
  • Step 502: Determine trajectory feature data of the continuous input trajectory according to the keyboard layout, wherein the trajectory feature data includes the starting point of the continuous input track, the path traversed, the characters traversed, and the track end point.
  • Step 503: Take the starting point of the continuous input track as the current recognition starting point, and take the track segment after the current recognition starting point as the current track segment.
  • Step 504: According to the trajectory feature data of the current trajectory segment, match from the lexicon at least one candidate starting from the current recognition starting point.
  • The following takes as an example the continuous input trajectory shown in FIG. 2, which the user draws to enter the encoded string "women" for the word "we". In this example, pending encoded string candidates are retrieved from the lexicon according to the characters traversed by the continuous input trajectory, and the path matching degree decides whether each pending encoded string candidate is accepted as a matched encoded string candidate.
  • The sequence of characters traversed by the continuous input track is "wertyuiokmjjhgfedfbn"; each character in the sequence also corresponds to a specific position and range on the input track.
  • The entire continuous input trajectory is taken as the current trajectory segment, and the starting point of the current trajectory segment is the starting point of the continuous input trajectory.
  • The process of matching pending encoded strings from the lexicon according to the trajectory feature data of the current trajectory segment is similar to existing approaches. From the trajectory feature data of the current trajectory segment it can be determined that the starting point of the continuous input trajectory lies on the letter key "w".
  • The "w" corresponding to the recognition starting point is used as the string matching starting point, and it is judged whether the letter "w" is an encoded string candidate existing in the lexicon. Since no such candidate exists in the lexicon, "w" is combined with the next character traversed by the track segment to give "we". Since "we" does not exist in the lexicon as an encoded string candidate either, the point under consideration is moved onward along the current track segment and the next traversed character is determined. If that character is "r", then "w" and "r", and "we" and "r", are combined respectively, and it is checked whether the two resulting strings exist in the lexicon; if not, the process continues onward along the current track segment.
  • The first encoded string identified in this way is "wei", which is a pending encoded string candidate existing in the lexicon.
  • FIG. 6a and FIG. 6b are schematic diagrams of the key-line trajectories, in the keyboard layout, of the characters of the pending encoded string candidates "wo" and "en" respectively. The thicker line represents the line connecting the keys of the characters in the pending encoded string.
  • The shaded closed region in each figure is the closed region enclosed by that line and the current track segment (in this case, the detected continuous input track).
  • One way of forming the closed region is as follows: the starting point of the current track segment is connected to the center of the key of the first character "w" of the candidate "wo"; the center of the "w" key is connected to the center of the key of the next character "o"; and the critical point at which the current track segment passes through the "o" key is connected to the center of the "o" key. (Generally, the track segment between the starting point of the track and the critical point is used as the track segment corresponding to the encoded string candidate.) The shaded closed region shown in FIG. 6 is thereby obtained.
  • The matching degree can be determined by calculating the area of the closed region in FIG. 6a and FIG. 6b. The area of the closed region in FIG. 6a is smaller and that in FIG. 6b is larger; the larger the area, the lower the matching degree.
  • After the path matching degree has been calculated from the area of the closed region, it can be judged whether the path matching degrees of "wo" and "en" each reach the preset value. Assume that the path matching degree of "wo" is 85, that of "en" is 30, and the preset value requires a matching degree greater than 60. The path matching degree of "wo" then satisfies the requirement, and "wo" can be used as a matched encoded string candidate; the path matching degree of "en" is below 60 and does not reach the preset value, so "en" cannot be used as a matched encoded string candidate.
  • Whether each of the other pending encoded strings is a matched encoded string candidate is determined by the path matching degree in the same way.
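  • The selection step in this example can be sketched in Python as follows. The preset value of 60 and the degrees 85 and 30 come from the example above; the mapping from closed-region area to a matching degree is not specified in the text, so the formula in the hypothetical function `path_matching_degree` (and its `scale` parameter) is only one assumed way of making a smaller area give a higher degree.

```python
def path_matching_degree(area, scale=1.0):
    """Map a closed-region area to a degree in (0, 100]; smaller area, higher degree.

    The formula is an assumption for illustration, not taken from the text.
    """
    return 100.0 / (1.0 + area / scale)


def select_matched(degrees, preset=60):
    """Keep the pending candidates whose path matching degree exceeds the preset value."""
    return {code: d for code, d in degrees.items() if d > preset}


# Degrees from the example in the text: "wo" is accepted, "en" is rejected.
matched = select_matched({"wo": 85, "en": 30})
```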
  • When a candidate is determined, its corresponding path segment is determined as well; that is, the recognition starting point and the recognition end point of the corresponding path segment should also be determined. For "wo", for example, the track segment to be matched is the track segment that traverses the letter sequence "wertyuio".
  • This track segment is the one whose trajectory feature data matches "wo" (which may also include the term candidate "I").
  • Searching for all possible candidates in one pass on the basis of the traversed character string and the lexicon is computationally intensive.
  • The path matching degree is usually calculated between the pending encoded string being retrieved and the path from the recognition starting point up to the track position corresponding to the last character of that string.
  • Retrieval from a given recognition starting point may be aborted to improve retrieval efficiency. Aborting the search from that recognition starting point does not mean aborting all of the search work.
  • While the matching degree meets the requirement, the search continues. For example, when the calculated matching degree of "wei" meets the requirement, the search continues by extending the string onward. "weih" is part of "weihe" in the lexicon, but if the matching degree of "weih" does not meet the requirement, extension from "weih" to longer strings such as "weihe" stops; "weihe" then has no chance of becoming a candidate, and its matching degree need not be calculated.
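  • The early-abort idea in the "weih" example can be sketched in Python as follows. This is a simplified illustration working on the traversed character string only. The function `retrieve_with_pruning` and the callback `prefix_ok`, which stands in for "the matching degree of this partial string still meets the requirement", are hypothetical, as is the small lexicon.

```python
def retrieve_with_pruning(traversed, lexicon, prefix_ok):
    """Extend strings from the first traversed character, aborting weak branches.

    prefix_ok(prefix, index) is a caller-supplied check on the partial string
    ending at position `index` of the traversed sequence.
    """
    prefixes = {w[:i] for w in lexicon for i in range(1, len(w) + 1)}
    matched, seen = set(), set()

    def extend(prefix, last):
        if (prefix, last) in seen:
            return
        seen.add((prefix, last))
        if not prefix_ok(prefix, last):
            return  # abort: longer strings built on this prefix are never tried
        if prefix in lexicon:
            matched.add(prefix)
        for k in range(last + 1, len(traversed)):
            longer = prefix + traversed[k]
            if longer in prefixes:  # only follow strings that can become lexicon entries
                extend(longer, k)

    if traversed and traversed[0] in prefixes:
        extend(traversed[0], 0)
    return matched


TRAVERSED = "wertyuiokmjjhgfedfbn"
LEXICON = {"wo", "wei", "weihe"}
pruned = retrieve_with_pruning(TRAVERSED, LEXICON, lambda p, k: p != "weih")
unpruned = retrieve_with_pruning(TRAVERSED, LEXICON, lambda p, k: True)
```

  • In this sketch, rejecting the partial string "weih" means "weihe" is never examined, while "wei" and "wo" are still found.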
  • The matched candidates may also include term candidates obtained by converting the encoded strings.
  • For example, the encoded string candidate "weihe" may be converted into Chinese-character term candidates such as "why" and "dimensional". Which term candidate a string candidate is converted into is determined by combining factors such as the word frequency of the term candidate and its lexical matching rules with previously selected term candidates.
  • The matching degree of each pending encoded string candidate may first be recorded so that it can be used in calculating the evaluation scores of the different candidates.
  • An evaluation score can also be used to reflect a comprehensive evaluation that takes into account factors such as the path matching degree, the word frequency, the lexical-rule matching with other options, and the scores of subsequent candidates.
  • Step 505: Take each matched candidate in turn as the current candidate, and determine whether recognition of the current track segment has ended for the current candidate. If yes, execute step 506; if no, execute step 507.
  • With the starting point of the continuous input track used as the recognition starting point, once an encoded string candidate has been matched from the lexicon, the process of matching that encoded string from the lexicon according to the track features of the continuous input track is over.
  • The input of "women" shown in FIG. 2 is again taken as an example, with the matched candidates being encoded string candidates.
  • the identified encoded character string candidates include: "wo”, “wei”, “women”, "er”,
  • The track segment after "o" is used as a new track segment on which the subsequent operations are performed, until recognition of the continuous input track ends.
  • For "women" or "rumen", if there are no other track segments after the "n" key in the current track, then recognition of the current track segment has ended for "women".
  • For the other candidates, the process of determining whether recognition of the current track segment has ended is similar.
  • When the matched candidate is a term candidate composed of Chinese characters, the process of judging whether recognition of the current track segment has ended is the same, except that after an encoded string is matched, the encoded string candidate is converted into a term candidate.
  • Step 506: Determine whether every identified candidate has been processed as the current candidate in step 505. If so, the encoded string candidates matching the trajectory feature data of the different track segments, with at least one encoded string candidate for each track segment, have been obtained, and the process ends; if not, return to step 505.
  • Several groups of recognition results matching the continuous input trajectory are obtained, such as "wo-men", "wei-o-men", "tui-o-men", "yu-o-me" and "women". The encoded candidates separated by "-" in each group correspond in sequence to different track segments of the continuous input track.
  • Step 507: Take the track segment that follows the track segment corresponding to the current candidate in the continuous input track as the current track segment, take the starting point of this current track segment as the new current recognition starting point, and return to step 504 to obtain the next-level candidates of the current candidate.
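  • The loop formed by steps 503 to 507 can be sketched in Python as follows. This is a deliberately simplified illustration: it works only on the traversed character string and ignores the path matching degree, the first candidate is required to begin at the first traversed character, later candidates may skip leading characters of their segment, and the lexicon is hypothetical. The function names `match_end` and `segmentations` are not from the text.

```python
def match_end(word, traversed, start):
    """Index of the last matched character if `word` occurs, in order, in
    traversed[start:] (earliest possible match); otherwise None."""
    pos = start
    for ch in word:
        pos = traversed.find(ch, pos)
        if pos < 0:
            return None
        pos += 1
    return pos - 1


def segmentations(traversed, lexicon, start=0, anchored=True):
    """Enumerate chains of candidates covering successive track segments."""
    results = []
    for word in sorted(lexicon):
        if anchored and not traversed.startswith(word[0], start):
            continue  # the first candidate must begin at the recognition starting point
        end = match_end(word, traversed, start)
        if end is None:
            continue
        # Step 507: the rest of the track becomes the new current segment.
        tails = segmentations(traversed, lexicon, end + 1, anchored=False)
        if tails:
            results.extend([word] + tail for tail in tails)
        else:
            results.append([word])  # recognition ends for this candidate
    return results


TRAVERSED = "wertyuiokmjjhgfedfbn"
groups = {"-".join(g) for g in segmentations(TRAVERSED, {"wo", "wei", "o", "men", "women"})}
```

  • With this small lexicon the sketch yields groups such as "wo-men", "wei-o-men" and "women"; in the method described here, the path matching degree and the other scoring factors would then rank or discard the groups.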
  • Because encoded string candidates are retrieved on the basis of the trajectory feature data of the continuous input trajectory, the track segment corresponding to an encoded string candidate can be determined when the candidate is retrieved: the track segment from the starting point of the current track segment to the point on the current track segment corresponding to the candidate's last character is used as the track segment corresponding to the candidate. In the above example, when "wo" is retrieved, the track segment from the starting point of the continuous input track to the point at "o" is determined to be the track segment matching the encoded string "wo".
  • The term "track segment" is used here mainly to describe and explain the correspondence between recognition results and different portions of the trajectory; it does not mean that the trajectory must be explicitly segmented during recognition.
  • The process includes: determining that the first character of the new current track segment is "k"; using "k" as the string matching starting point and combining it, in order, with the characters traversed after "k" on the current track segment (that is, "mjjhgfedfbn") to form strings such as "ke" and "ken" for matching; then using the next character "m" as the string matching starting point and matching from "m", until every pending encoded string candidate in the current track segment has been matched, obtaining for example "me" or "men". The path matching degree is calculated during matching, and the encoded string candidates in the current track segment whose matching degree meets the requirement are determined.
  • The matching degree can be calculated in step with the string-building process.
  • String building and matching-degree calculation from the current recognition starting point can be ended early to reduce the amount of calculation.
  • Suppose the encoded string candidates whose matching degree meets the requirement include "ke", "ken", "me" and "men". For the encoded string "ke", recognition of the current track segment has not ended, so the track segment after the "e" key of "ke" in the continuous input track is taken as the new current track segment. That is, the string candidate matching process described above continues on the track segment that follows the key of the character "e" and traverses, in order, the string "dfbn", until recognition of the current track segment ends. The process is similar for "me".
  • At least one encoded string candidate matching the trajectory feature data of each of the different track segments of the continuous input trajectory is matched from the lexicon. Thus, even if the multiple words the user intends to input cannot be combined into a single entry, the continuous input trajectory of the plurality of words can still be matched by the method of the present invention, and the encoded string candidates of the different track segments are obtained.
  • For example, the present invention continues the matching operation on the unrecognized track segment after "o" and then identifies the encoded string candidate "men" (or term candidates such as "men" and "door"). Encoded string candidates matching the trajectory feature data of the two track segments of the continuous input trajectory are thereby obtained, and the encoded string candidates the user intends to input, together with the corresponding term candidates, are then presented through subsequent display output. The input of "we" can thus be completed with one continuous input.
  • The present invention can identify multiple groups of recognition results corresponding to different portions of the track, such as "wo-men", "wei-o-men", "tui-o-men", "yu-o-me" and "women". The final input content can be determined from the user's selection among the recognition results; a preferred input result can also be recommended to the user by ranking the candidates according to their matching degree with the path, with the user's modifications then accepted.
  • Another optimization is to select automatically among the groups of recognition results by a comprehensive score that includes the path matching degree, so as to recommend to the user the result that best matches the different portions of the trajectory.
  • Figure 7a shows the continuous sliding path of the contact in the process of sequentially drawing the string "jiuhenhao" on the keyboard area by the continuous input method.
  • The matching process of the continuous input track can be represented as a tree. See FIG. 7b, which shows in a tree structure diagram the candidate encoded strings obtained in the matching process of the continuous input track of FIG. 7a.
  • The recognition result of the segment immediately following a track segment is referred to as the next-level encoded string of the current encoded string.
  • The recognition results of adjacent track segments are presented as parent and child nodes, and multiple encoded strings recognized from the same recognition starting point (whose path recognition end points may differ) are presented as sibling nodes at the same level.
  • Encoded string candidates are matched from the lexicon on the basis of the trajectory feature data of the continuous input trajectory. The sequence of characters traversed by the continuous input trajectory is "jiuhgfderghnbhgfdsasdfgio". The entire continuous input trajectory is used as the current trajectory segment, and the starting point of the continuous input trajectory is used as the current recognition starting point. The first-level encoded string candidates shown in FIG. 7b, namely the three encoded strings "jiu", "ji" and "jiuge", are matched from the lexicon.
  • Of course, the number of matched encoded string candidates depends on the trajectory characteristics of the continuous input trajectory and on the matching degree threshold that is set. This example is only illustrative; the number of first-level encoded string candidates actually matched may be large.
  • The track segment that matches "jiu" runs from the starting point of the continuous input track to the point at which the continuous input track passes "u". The part of the continuous input track after "u" is taken as the new current track segment. From the sliding path of the continuous input track, the next character after "u" is determined to be "h", and the starting point of the new current track segment is used as the current recognition starting point (it may, of course, be a point on the "h" key).
  • Starting from the current recognition starting point, encoded string candidates are matched for the current track segment: "h" is first used as the string matching starting point and combined with the subsequently traversed characters; the string matching starting point is then moved from "h" to the next character "g" and string matching continues, and so on. In this way the encoded string candidates matched for the track segment after the character "u" are obtained as the second-level encoded string candidates, that is, the next-level candidates of "jiu": "henhao", "hen", "gen" and "he".
  • As shown in the figure, the next-level (second-level) encoded string candidate corresponding to "jiuge" is "hao". The next-level encoded string candidates corresponding to "ji" can be obtained likewise and are not shown in detail in FIG. 7b.
  • the coded string candidates at the third level corresponding to "hen” in the figure are: “hao” and "ha”.
  • The matching degree of each candidate is calculated so that, when candidates are presented, the options can be displayed in order of matching degree. The matching degree between a candidate and the trajectory feature data of the continuous input trajectory is calculated as a score from one or more factors such as the path matching degree, the string matching degree of the candidate, the area of the closed region, and the word frequency.
  • The score reflects how well the candidate matches the trajectory feature data of the continuous input trajectory.
  • The number shown above each encoded string candidate in FIG. 7b is the matching score of that candidate; for example, the matching score of "jiu" is 92 and the matching score of "jiuge" is 95. If the best input result were judged directly by the path matching degree, "jiuge" would be selected at the first level, which is obviously inconsistent with the input the user expects.
  • The detected continuous input trajectory passes in sequence through several encoded strings that are related to one another. If, when selecting the best recognition result, only the matching degree between each candidate and the trajectory feature data of its individual track segment is considered, an input result close to the user's expectation may not be obtained.
  • In this example, the best encoded string at the first level is "jiuge", and the next level of "jiuge" is the second-level candidate "hao".
  • Selecting results on this principle may give a very high score to the first part of the continuous input track while the subsequent track segment has no corresponding recognition result. Such an overall recognition result is obviously not close to the user's true intention. The relationship between each encoded string candidate and the matching degree of the recognition result for the trajectory feature data of the entire continuous input trajectory must therefore also be considered. In the present invention, when a candidate's score is calculated, whether the candidate has a corresponding next-level candidate, and the score of that next-level candidate, are also used as evaluation factors.
  • For convenience, the candidate under consideration is referred to as the current candidate. Whether a subsequent track segment still exists after the track segment matched by the current candidate in the continuous input trajectory, and whether a new candidate (that is, a next-level candidate of the current candidate) is matched in that subsequent track segment, are used as scoring factors for the current candidate. A rule may be set under which the score of the current candidate is adjusted according to whether a next-level candidate exists, and the highest score among the next-level candidates of the current candidate is also used to adjust the score of the current candidate.
  • One optional scoring method is to accumulate the score of the next-level candidate onto the score of the upper-level candidate according to a specific algorithm, for example by adding the product of the next-level candidate's score and coefficients based on the path length of the track segment matched by the next-level candidate, the number of characters crossed by that path, the number of characters in the encoded string corresponding to the candidate, and the word frequency coefficient of the candidate. Other algorithms that achieve a similar effect may of course be used. This design principle helps to avoid recognizing the entire path as a single character or short word, and so comes closer to actual needs.
  • The path matching degree may be calculated in any one or more of the ways described for the embodiment shown in FIG. 3.
  • The following takes as an example a score calculated from the path matching degree and other scoring factors, with weighting based on word frequency and word length:
  • Word length adjustment: the word length is the number of characters contained in the encoded string. In this example, the number of characters in the encoded string is added as a bonus score.
  • The score is further adjusted by factors such as the number of characters covered by the track segment corresponding to the encoded string candidate, the score of its next-level encoded string candidate, and the number of characters crossed by the track segment corresponding to that next-level candidate.
  • The track segment corresponding to "ha" has 7 characters.
  • Once the scores of "hao" and "ha" are known, the score of their upper-level encoded string candidate "hen" can be calculated, together with the scores of the encoded string candidates at the same level that have the same recognition starting point as "hen"; that is, the scores of "henhao", "hen", "gen", "he" and "ge" are calculated. The adjustment proceeds level by level until the score of "jiu" is adjusted.
  • The score adjustment for the encoded string candidates "ji" and "jiuge" and their corresponding lower-level encoded string candidates is similar. In the end, at the second level the score of "henhao" is higher than those of the other sibling options under the same parent node, and the score of "jiu" is higher than those of the other sibling options under its parent node, so that "jiu henhao", the best combination of candidates corresponding to the different portions of the continuous input track, is obtained.
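  • The bottom-up adjustment described above can be illustrated with the following Python sketch. Only the scores 92 for "jiu" and 95 for "jiuge" come from the text; all other scores, the `weight` coefficient, and the simple rule "own score plus a weighted share of the best next-level score" are assumptions chosen to illustrate the idea. The sketch also omits the word frequency, word length and character-count coefficients that the description mentions.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    code: str
    score: float                      # matching score of this candidate alone
    children: list = field(default_factory=list)  # next-level candidates


def adjusted_score(node, weight=0.5):
    """Own score plus a weighted share of the best adjusted next-level score."""
    if not node.children:
        return node.score
    best_child = max(adjusted_score(child, weight) for child in node.children)
    return node.score + weight * best_child


# Tree of FIG. 7b; scores other than 92 and 95 are hypothetical.
hen = Candidate("hen", 60, [Candidate("hao", 50), Candidate("ha", 40)])
jiu = Candidate("jiu", 92, [Candidate("henhao", 90), hen,
                            Candidate("gen", 50), Candidate("he", 45)])
jiuge = Candidate("jiuge", 95, [Candidate("hao", 30)])

best_first_level = max([jiu, jiuge], key=adjusted_score)
```

  • Under these assumed numbers "jiuge" has the higher score when considered alone, but after the adjustment "jiu" ranks first because its next-level candidate "henhao" scores well, which is the behaviour the description aims for.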
  • The above is a method of calculating scores from the bottom up during the candidate search. In implementation, other calculation orders may be used, such as a top-down calculation after all candidates have been determined.
  • The score can be used directly to select the preferred recognition result: among candidates at the same level, the higher the score, the better the candidate matches the user's input intent.
  • The score can be calculated directly during the matching of encoded string candidates, or after all encoded string candidates have been matched. The present invention does not limit the order of implementation; the calculation of matching degree and score may even be integrated into one parameter and performed once.
  • When the matched candidate is a term candidate converted from an encoded string, the word frequency of each term candidate converted from the encoded string candidate must also be considered when the word frequency adjustment is made in calculating the candidate's matching score. For the encoded string candidate "hao", for example, the word frequencies of the term candidates converted from "hao", such as "good", "number" and "consumption", are compared.
  • The lexical matching degree between the term candidate and its upper-level or lower-level term candidates is also considered, thereby determining which one or ones of "good", "number" and "consumption" the converted term candidate should be, and determining the matching score of that term candidate.
  • In addition to the embodiment shown in FIG. 5, in which all candidates with the same recognition starting point are matched first and the next-level candidates of each candidate are then matched in turn (that is, the candidates for the different track segments are matched in a "breadth-first" manner), the present invention can also use a "depth-first" algorithm to match the candidates corresponding to the different track segments.
  • The breadth-first method starts from the recognition starting point of the current track segment and matches all candidates that satisfy that recognition starting point. It then judges, for each candidate separately, whether recognition of the current track segment has ended. If it has not, the track segment following the segment matched by the candidate in the continuous input track is taken as the new current track segment, and the next-level candidates of the candidate are matched from it, until the next-level candidates of every candidate have been matched. Each matched next-level candidate is then taken as the current candidate and its own next-level candidates are matched in turn. The loop continues until the next-level candidates of every candidate that has any have been matched.
  • the depth-first method starts from the recognition starting point of the current track segment, and after matching a candidate, determines whether the track segment has a track segment after the track segment data matches the candidate track segment. Whether the current track segment is recognized or not, if the current track segment is not recognized, the track segment after the track segment in which the track feature data matches the candidate in the continuous input track is taken as the current track segment, and the recognition starting point of the current track segment is determined. And, when a lower-level candidate of the candidate is matched from the recognition starting point, continue to use the lower-level candidate as the current candidate, and continue to perform a lower-level candidate matching the current candidate The operation, so loop.
  • the recognition starting point of the current trajectory segment is returned, and the operation of matching the encoded character string candidate that satisfies the trajectory feature data of the current trajectory segment from the lexicon is continued from the recognition starting point. , so loop until the end is not recognized.
  • For example, the different track segments of the continuous input track shown in FIG. 7a are matched to obtain the encoded character strings arranged in the tree structure shown in FIG. 7b.
  • With the breadth-first method, the candidates are matched in the following order. First, the track starting point of the continuous input track is used as the recognition starting point and the first-level encoded character string candidates "jiu", "ji" and "jiuge" are matched. Each of the three first-level candidates is then taken as the current encoded character string candidate in order to match its next-level candidates, that is, the second-level encoded character string candidates.
  • After the next-level encoded character string candidates of "jiu" have been matched by the method of the present invention, the next-level candidates of "ji" and "jiuge" are matched as well, which yields all second-level encoded character string candidates. The next-level candidates of each second-level candidate are then matched to give the third-level candidates, and so on until the fifth level is recognized. For instance, the next-level encoded character string candidate of "si" is "o", and recognition of the current track segment corresponding to "o" ends; the next-level candidate of "di" is likewise "o", and recognition of its current track segment also ends, so that the entire continuous input track has been recognized.
  • With the depth-first method, besides "hao", other encoded character string candidates whose track feature data matches the current track segment can also be matched from the lexicon, for example "ha".
  • Since recognition of the current track segment has not ended after "ha", the track segment of the continuous input track after the stroked "a" is taken as the new current track segment, and the first next-level encoded character string candidate of "ha" is matched; assume it is "so". Since "o" is the last stroke of the continuous input track, recognition of the current track segment ends, and the method returns to the recognition starting point of the current track segment corresponding to "so".
  • It then judges whether other encoded character string candidates can also be matched from that recognition starting point, that is, whether "ha" has other next-level encoded character string candidates. The recursion continues in this way until the encoded character string candidates corresponding to the track feature data of every track segment of the continuous input track have been matched at every level; the result can be displayed as the tree structure shown in FIG. 7b.
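  • The two traversal orders described above can be compared in a minimal sketch. It assumes a strong simplification that is not in the source: the track is reduced to the string of intended letters, and a candidate matches when it is a literal prefix of the remaining string, whereas the method itself matches on track feature data and a matching degree. The lexicon contents and function names are hypothetical.

```python
from collections import deque

# Hypothetical lexicon of encoded character strings.
LEXICON = ["ji", "jiu", "hen", "henhao", "ha", "hao"]

def match_at(strokes, start):
    """Return (code, end) for every lexicon code that begins at start."""
    return [(code, start + len(code))
            for code in LEXICON if strokes.startswith(code, start)]

def breadth_first(strokes):
    """Match all candidates of one level before any of the next level."""
    order, queue = [], deque([((), 0)])
    while queue:
        path, start = queue.popleft()
        for code, end in match_at(strokes, start):
            order.append(path + (code,))
            queue.append((path + (code,), end))
    return order

def depth_first(strokes, start=0, path=()):
    """Follow one candidate down to the end of the track, then back up."""
    order = []
    for code, end in match_at(strokes, start):
        order.append(path + (code,))
        order.extend(depth_first(strokes, end, path + (code,)))
    return order
```

  • For the string "jiuhenhao" both functions find the same six tree nodes. The breadth-first order reaches ("jiu", "henhao") before any third-level node, while the depth-first order first exhausts the branch under ("jiu", "hen") and reaches ("jiu", "henhao") last.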
  • A word segmentation area may also be set in advance on the keyboard area; in Fig. 13, the word segmentation boundary delimits the word segmentation area. When the user needs to input several consecutive words, then after completing the stroke for the character string of each word or phrase, the user slides out of the letter key area into the word segmentation identification area and then back into the letter key area, and continues the continuous input of the character strings of the following words. This yields a continuous input track that contains at least one word segmentation identification position.
  • The position information of entering and leaving the word segmentation identification area is stored in the track information of the continuous input track and serves as a word segmentation identifier.
  • When the track feature data of the continuous input track is acquired, it includes the start point, the end point, the stroke path and the stroked characters of the track,
  • as well as the data of the word segmentation identifiers that the continuous input track has passed through.
  • The word segmentation area preset on the keyboard area may be one or more designated keys, or a specific region of the keyboard area.
  • The word segmentation identifiers of the continuous input track serve as the basis for dividing it: the continuous input track is divided into several continuous input track segments separated by the word segmentation identifiers.
  • That is, a candidate code may not be formed across a word segmentation identifier. For example, if a word segmentation identifier lies between the stroked letters h and a, the two letters cannot be combined across it, so "ha" does not comply with the candidate formation rules.
  • Within each segment, the matching method described above is still used to match the encoded character string candidates from the lexicon.
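  • The rule that no candidate may span a word segmentation identifier can be sketched as follows. The "|" marker, the prefix-only matching and the function names are assumptions made for illustration; the method itself records the positions where the track enters and leaves the word segmentation area.

```python
SEGMENT_MARK = "|"  # hypothetical marker for one crossing of the word segmentation area

def split_track(strokes):
    """Split the stroked-letter string into segments at every marker."""
    return [part for part in strokes.split(SEGMENT_MARK) if part]

def candidates_in(segment, lexicon):
    """Codes that match the segment from its first letter (prefix match only)."""
    return [code for code in lexicon if segment.startswith(code)]

def match_segments(strokes, lexicon):
    """Match each segment separately, so no code can cross a marker."""
    return [(seg, candidates_in(seg, lexicon)) for seg in split_track(strokes)]
```

  • With the lexicon ["ha", "h", "a"], the input "ha" yields the candidates "ha" and "h", whereas "h|a" yields only "h" for the first segment and "a" for the second.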
  • Referring to FIG. 8, a flow chart of another embodiment of the word recognition method based on multi-word continuous input according to the present invention is shown.
  • This embodiment takes a continuous input track that contains at least one word segmentation identifier as an example to describe the matching process for encoded character string candidates.
  • This embodiment includes:
  • Step 801: Detect a continuous input track on the keyboard area, where the continuous input track corresponds to the key positions, on the keyboard layout, of the character sequence of the encoded character string that the user wants to input.
  • Step 802: Determine track feature data of the continuous input track according to the keyboard layout, where the track feature data includes the track starting point of the continuous input track, the stroke path, the stroked characters, the track end point, and the word segmentation identifiers that the continuous input track passes through.
  • Step 803: Take the track starting point of the continuous input track as the current recognition starting point.
  • Step 804: Determine whether the continuous input track after the current recognition starting point includes a word segmentation identifier; if yes, go to step 805; if no, go to step 810.
  • When the track contains word segmentation identifiers, the track segment between each pair of adjacent word segmentation identifiers, the track segment between the track starting point of the continuous input track and the first word segmentation identifier, and the track segment between the last word segmentation identifier and the track end point are each matched as a separate track segment.
  • Step 805: Determine the word segmentation identifier of the continuous input track that is located after the current recognition starting point and is closest to it, and go to step 806.
  • Because the detected continuous input track may include the data of several word segmentation identifiers, once the current recognition starting point is determined, the word segmentation identifier closest to it must be determined, and the track segment between the current recognition starting point and that word segmentation identifier is used as the input track segment for the candidate matching operation.
  • Step 806: According to the track feature data of the continuous input track between the current recognition starting point and the word segmentation identifier, match from the lexicon at least one candidate whose track starts from the current recognition starting point, take the matched candidate as the current candidate, and go to step 807. The track segment between the current recognition starting point and the nearest word segmentation identifier is treated as a continuous input track,
  • and any of the ways described in the embodiments shown in FIG. 3, 4 or 5 is then used to match, from the lexicon, at least one encoded character string candidate matching the track feature data of the different track segments of that continuous input track.
  • For the specific process, refer to the description of the above embodiments, which is not repeated here.
  • Step 807: Determine whether recognition of the continuous input track between the current recognition starting point and the nearest word segmentation identifier has ended; if yes, go to step 808; if no, go to step 809.
  • Step 808: Set a new current recognition starting point in the continuous input track after the nearest word segmentation identifier, and return to step 804.
  • The nearest word segmentation identifier itself can generally be used as the new current recognition starting point,
  • so that candidates matching the track features of the continuous input track located after that word segmentation identifier are matched next.
  • Step 809: Determine the track segment whose track feature data matches the current candidate, take the part of the continuous input track between the end point of that track segment and the nearest word segmentation identifier as the current track segment, take that end point as the new current recognition starting point, and return to step 806 to obtain the next-level candidates of the current candidate.
  • If it is determined in step 807 that recognition of the track segment between the current recognition starting point and the nearest word segmentation identifier has not ended, a new current track segment must be determined within that track segment for subsequent matching. This process is similar to the previously described embodiments without word segmentation identifiers.
  • Step 810: According to the track feature data of the track segment of the continuous input track located after the current recognition starting point, match from the lexicon at least one candidate whose track starts from the current recognition starting point, take the matched candidate as the current candidate and continue to match its next-level candidates.
  • Here the operations of steps 303 to 306 of the embodiment shown in FIG. 3 may be used, with the track after the current recognition starting point as the current track segment: starting from the current recognition starting point and according to the track feature data of the current track segment, the operations of steps 304 to 306 are repeated to match, from the lexicon, candidates for the track feature data of the different track segments of the continuous input track after the current recognition starting point.
  • Step 810 can include the following sub-steps M1 through M3:
  • M1: match at least one candidate from the lexicon according to the track feature data of the continuous input track segment after the current recognition starting point;
  • M2: determine whether recognition of the track segment located after the current recognition starting point in the continuous input track has ended; if so, at least one candidate matching the track feature data of the different track segments has been obtained; if not, determine the track segment whose track feature data matches the current candidate, and execute step M3.
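  • The control flow of steps 803 to 810 can be sketched as one loop over the recognition starting point. This is a simplified illustration under stated assumptions: the track is a string of letters in which "|" stands for a word segmentation identifier, a candidate matches when it is a literal prefix, and only the longest match is kept at each step, whereas the method itself keeps every candidate that meets the matching degree. The function name is hypothetical.

```python
def recognize(strokes, lexicon, mark="|"):
    """Return one list of matched codes per track segment."""
    results = []
    start = 0                                  # step 803: track start is the first recognition start
    while start < len(strokes):
        boundary = strokes.find(mark, start)   # steps 804/805: nearest identifier after start
        # Step 810 applies when no identifier remains: match up to the track end.
        end = boundary if boundary != -1 else len(strokes)
        pos, words = start, []
        while pos < end:                       # steps 806, 807 and 809
            matches = [code for code in lexicon
                       if strokes.startswith(code, pos) and pos + len(code) <= end]
            if not matches:
                break                          # nothing matches from here in this sketch
            best = max(matches, key=len)       # assumption: keep the longest match only
            words.append(best)
            pos += len(best)                   # step 809: continue after the matched segment
        results.append(words)
        start = end + 1                        # step 808: new start after the identifier
    return results
```

  • For example, recognize("liju|teda", ["liju", "teda", "li", "ju"]) returns [["liju"], ["teda"]], one result list for the track segment on each side of the identifier.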
  • Referring to FIG. 9, a schematic diagram of a continuous input track including the position information of a word segmentation identifier in the present invention is shown. It is assumed that, in the keyboard area,
  • the region outside the letter key area is the default word segmentation area.
  • 901 is the boundary line between the letter key area and the word segmentation area.
  • Suppose the user inputs "torque extra large" (the corresponding pinyin code is "lijuteda"), and the lexicon contains the term candidates "torque" and "extra large"
  • with the corresponding string codes "liju" and "teda", but not "torque extra large" or its string code. During continuous input, the user continuously strokes through the letters of "liju", then slides out of the letter key area into the word segmentation area, and then slides from the word segmentation area back into the letter key area to stroke through the letters of "teda" in turn.
  • In FIG. 9, after the continuous input track passes the "u" key,
  • the dotted line segment is the part of the track drawn in the word segmentation area; the track then returns from the word segmentation area to the letter key area
  • at the "t" key in order to continue sliding through the remaining letters.
  • When the system detects the continuous input track shown in FIG. 9, it can determine that the track includes a word segmentation identifier (that is, the track segment data of the dotted portion is the word segmentation identifier data), and it recognizes the parts of the continuous input track before and after the word segmentation area
  • as two separate track segments.
  • First, candidate matching is performed according to the track feature data of the track segment between the starting point of the continuous input track and the starting point of the dotted segment, which yields the candidate "torque" ("liju").
  • Candidate matching is then performed according to the track feature data of the track segment from the end point of the dotted segment to the end point of the continuous input track (at the "a" letter key), which yields candidates including "extra large" ("teda").
  • When candidates whose track starts from the current recognition starting point are matched from the lexicon according to the track feature data of the current track segment, it may happen that no candidate can be matched over a long stretch of track after the current recognition starting point. Repeatedly starting recognition from that recognition starting point then causes much useless data processing and increases the amount of data the system must process.
  • Therefore, when no candidate matching the track feature data of the current track segment is found in the lexicon, the current recognition starting point is moved backward (that is, further along the track), the track segment of the continuous input track located after the new recognition starting point is taken as the new current track segment, and the operation of matching candidates in the lexicon against the track feature data of that track segment continues.
  • The case in which no encoded character string candidate matching the track feature data of the current track segment is found in the lexicon usually includes the following:
  • if the track matching degree of all candidates starting from the recognition starting point is lower than the preset threshold, it is determined whether recognition of the current track segment has ended; if not, the current recognition starting point is moved backward,
  • the track segment of the continuous input track located after the current recognition starting point is taken as the current track segment, and candidate matching restarts from the new recognition starting point.
  • For example, the recognition starting point can be moved to the position corresponding to the next character of the input track. As shown in FIG. 10, when the user wants to input "nikeyi" ("you can"), the normal input path passes through n, i, k, e, y, i in turn.
  • A typical stroked character sequence is then "njikiuytrertyui". In practice, however, the user accidentally slides to the f key and only then strokes k, e, y, i, so that the stroked character sequence is "njihgfghjkiuytrertyui", from which "ni" and other candidates (if any) are recognized.
  • The new path recognition starting point is then set to the position after i (the starting point for calculating the path matching degree is the same position). The matching degree of every candidate formed from the characters on the path between h and k is found to be too low: by the time the position of f is reached, the matching degree of the track segment has already fallen below the preset matching degree threshold, so the matching degree of any pending encoded candidate that continues to be recognized from this starting point cannot meet the requirement. If the entire path has not yet been recognized, then, to improve fault tolerance, the current recognition starting point is moved backward to the position corresponding to the next letter, and the track segment of the continuous input track after the current recognition starting point is taken as the current track segment.
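  • The backward movement of the recognition starting point can be sketched with the stroked character sequence from this example. The scoring rule below (code length divided by the number of stroked characters the code spans), the threshold value 0.3 and the function names are assumptions for illustration only; the method's actual matching degree is computed from track feature data.

```python
def match_from(strokes, start, code):
    """Return (score, end) if code can be traced from start, else None.

    The first letter must sit exactly at start; the remaining letters must
    follow in order. The score is a toy measure, not the patent's.
    """
    if strokes[start] != code[0]:
        return None
    pos = start
    for letter in code[1:]:
        pos = strokes.find(letter, pos + 1)
        if pos == -1:
            return None
    span = pos - start + 1
    return len(code) / span, pos + 1

def recognize_with_skips(strokes, lexicon, threshold=0.3):
    """Match codes in turn, moving the start point on when nothing qualifies."""
    start, words, skipped = 0, [], []
    while start < len(strokes):
        hits = []
        for code in lexicon:
            hit = match_from(strokes, start, code)
            if hit and hit[0] >= threshold:
                hits.append((hit[0], code, hit[1]))
        if hits:
            _, code, end = max(hits)        # best-scoring candidate from this start
            words.append(code)
            start = end
        else:
            skipped.append(strokes[start])  # no candidate: move start to the next character
            start += 1
    return words, skipped
```

  • Applied to "njihgfghjkiuytrertyui" with the lexicon ["ni", "keyi"], the sketch recognizes "ni", skips the characters "hgfghj" one position at a time, and then recognizes "keyi".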
  • The continuous input track may also be re-recognized after the user triggers a temporary adjustment of the matching degree threshold: at least one encoded character string candidate is matched from the lexicon according to the track feature data of the current track segment under the adjusted threshold.
  • The invention may further comprise:
  • restoring the matching degree threshold to the previously saved matching degree threshold, that is, the system's original matching degree threshold.
  • A button for triggering the adjustment of the matching degree threshold may be set on the keyboard layout.
  • For example, one of the original buttons of the keyboard area may be designated for adjusting the matching degree, or
  • a "Tolerance Recognition" button may be added to the keyboard layout, and the matching degree threshold is adjusted through the designated button or the "Tolerance Recognition" button. Suppose the user's real intention is to input "not urgent" (pinyin code "buji"), but because the system's preset matching degree threshold is too high, the best segmentation result of automatic recognition is two words (corresponding to different path segments), and the "not urgent" ("buji") that the user wants is neither the preferred recognition result nor an optional candidate.
  • Besides lowering the matching degree requirement, temporary tolerance recognition can also include temporarily enlarging the logical coordinate range of the keys, which likewise helps to improve the fault tolerance of the recognition process.
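  • The save, lower and restore sequence for the matching degree threshold can be sketched with a context manager. The class name, method names and threshold values are hypothetical; the sketch only shows one way to guarantee that the original threshold is restored after the temporary re-recognition.

```python
from contextlib import contextmanager

class Recognizer:
    """Toy recognizer that only holds a matching degree threshold."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold

    def accepts(self, score):
        """A candidate is accepted when its score reaches the threshold."""
        return score >= self.threshold

    @contextmanager
    def tolerance(self, lowered):
        saved = self.threshold           # save the system's original threshold
        self.threshold = lowered         # temporary adjustment triggered by the user
        try:
            yield self
        finally:
            self.threshold = saved       # always restore the saved threshold
```

  • A candidate scoring 0.6 is rejected at the default threshold of 0.8, accepted inside "with recognizer.tolerance(0.5):", and rejected again once the block exits.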
  • In the above embodiments, the continuous input track on the keyboard area is detected, the track feature data of the continuous input track is determined according to the keyboard layout, and the obtained track feature data of the continuous input track
  • is used as an example to describe the operation of matching encoded character string candidates from the lexicon.
  • In practical applications, the track feature data of the continuous input track can also be determined only from the coordinate position of each point on the continuous input track; the track feature data acquired in this case
  • does not include the sequence of characters that the continuous input track has passed through.
  • In that case, the sequence of characters stroked by the continuous input track is determined first, and then,
  • in the manner of matching encoded character strings described in the foregoing embodiments, at least one encoded character string candidate matching the different track segments of the continuous input track is matched from the lexicon.
  • The stroked character sequence may also be added, as track feature data, to the track feature data determined for the continuous input track.
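  • Deriving the stroked character sequence from point coordinates and the keyboard layout can be sketched as a nearest-key lookup. The key coordinates below form a hypothetical unit grid with assumed row offsets, and assigning each point to its nearest key centre is an illustrative choice rather than the method's stated rule.

```python
import math

# Hypothetical layout: one unit per key, rows shifted as on a typical QWERTY keyboard.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
OFFSETS = [0.0, 0.5, 1.5]
KEY_CENTRES = {
    key: (col + OFFSETS[row], float(row))
    for row, keys in enumerate(ROWS)
    for col, key in enumerate(keys)
}

def nearest_key(point):
    """Return the key whose centre is closest to the point."""
    return min(KEY_CENTRES, key=lambda k: math.dist(KEY_CENTRES[k], point))

def stroked_sequence(points):
    """Map each track point to a key and drop consecutive repeats."""
    keys = []
    for point in points:
        key = nearest_key(point)
        if not keys or keys[-1] != key:
            keys.append(key)
    return "".join(keys)
```

  • On this grid, the points (6.5, 1.0), (6.5, 1.05), (7.0, 0.0) and (6.0, 0.0) map to the sequence "jiu", since the first two points both fall on the "j" key.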
  • In the above embodiments, after the recognition starting point is set in the track segment, candidates are matched backward from the recognition starting point one by one. In actual application, however, the characters stroked by the continuous input track can also be matched successively beginning with the first stroked character, and when a candidate is matched,
  • track feature data such as the stroke path of the continuous input track, the track starting point, the track end point and the track inflection points is used to determine whether the matched candidate is one whose matching degree meets the requirement.
  • The method can include the following steps:
  • When the track feature data of the continuous input track is determined according to the keyboard layout, the track feature data includes the sequence of characters stroked by the continuous input track. If the obtained track feature data does not include this sequence, it may be determined according to the keyboard layout in combination with the other track feature data of the continuous input track.
  • This process can refer to the operation of retrieving encoded character string candidates from the lexicon described above.
  • The track segment to be matched that corresponds to an encoded character string candidate is determined according to the correspondence between the characters of the candidate and the sequence of characters stroked by the continuous input track. This is done in the same way as the recognition end point is determined in the embodiment shown in Fig. 4, whose description can be referred to. Determining the track segment corresponding to an encoded character string candidate can also be understood as taking the track segment between the recognition starting point and the position in the continuous input track reached when the encoded character string of the current candidate has been matched.
  • Whether the matching degree between the pending encoded character string candidate and the track feature data of the track segment to be matched reaches a preset value is calculated in the same way as the matching degree in the above embodiments; refer to the description of calculating the matching degree in any of the foregoing embodiments.
  • Word input may also be performed according to the user's selection result and/or confirmation operation on the displayed encoded character string candidates.
  • The user can select from the candidates and confirm to input the words as needed. If the candidate currently presented is the one the user wants to input, the user can perform the corresponding confirmation operation to input the words.
  • The process of presenting the candidates includes: displaying the candidates according to the connection order relationship between the track segments matched by the respective candidates and according to the evaluation scores of the candidates.
  • The core of the present invention is to identify the several words that correspond to the different segments of the entire track.
  • The track segments matching the candidates are therefore determined so that the connection order of the candidates can be established.
  • According to the description above, the track segment corresponding to "henhao" runs from the "u" key to the end point of the continuous input track. The track segments corresponding to "jiu" and to "henhao" are therefore two track segments with a connection relationship in the continuous input track, and the track segment corresponding to "jiu" is stroked earlier than the track segment corresponding to "henhao". In other words, in track order, the track segment corresponding to "jiu" lies before the track segment corresponding to "henhao", and the end of the former connects to the head of the latter. When candidates are displayed, "jiu" and "henhao" can be combined into one combined encoded character string candidate for display.
  • "henhao" may share part of its track segment with other encoded character string candidates, and no sequential connection exists between those track segments and the one corresponding to "henhao".
  • For example, the track segment corresponding to "henhao" runs from the "u" key position to the end point of the continuous input track, while the track segment corresponding to "hen" runs from the "u" key position to the "n" key position.
  • The two track segments thus have the same starting point, but their end points differ.
  • Likewise, "jiu", "hen", "ha" and the next-level encoded character string candidate "so" of "ha" are combined in turn, for example as "jiu hen ha so".
  • The other matched encoded character strings are combined in a similar way and are not listed here.
  • That is, when candidates are displayed, the candidates can be combined according to the connection order relationship between the track segments they match.
  • The presentation order of the combination candidates in the candidate column is determined according to the evaluation scores of the candidates being combined, and the combination candidates are presented in that order.
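  • Combining candidates by the connection order of their track segments and ranking the combinations can be sketched as follows. The index-based segment positions, the scores and the use of the mean score for ranking are assumptions for illustration; the source only states that the order follows the evaluation scores of the combined candidates.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    code: str
    start: int    # index where the matched track segment begins
    end: int      # index just past the matched segment (must exceed start)
    score: float  # assumed evaluation score

def combine(candidates, track_end):
    """Chain candidates whose segments connect end to start over the whole track."""
    by_start = {}
    for cand in candidates:
        by_start.setdefault(cand.start, []).append(cand)
    combos = []

    def extend(chain, pos):
        if pos == track_end:
            combos.append(chain)
            return
        for cand in by_start.get(pos, []):
            extend(chain + [cand], cand.end)

    extend([], 0)
    return combos

def ranked(candidates, track_end):
    """Return the combinations as space-separated strings, best mean score first."""
    def mean_score(chain):
        return sum(c.score for c in chain) / len(chain)
    combos = sorted(combine(candidates, track_end), key=mean_score, reverse=True)
    return [" ".join(c.code for c in chain) for chain in combos]

# Made-up candidates for the letter string "jiuhenhao" (length 9).
EXAMPLE = [
    Candidate("jiu", 0, 3, 0.9), Candidate("ji", 0, 2, 0.5),
    Candidate("hen", 3, 6, 0.8), Candidate("henhao", 3, 9, 0.95),
    Candidate("hao", 6, 9, 0.7), Candidate("ha", 6, 8, 0.4),
]
```

  • With these made-up scores, ranked(EXAMPLE, 9) lists "jiu henhao" ahead of "jiu hen hao". Chains that begin with "ji" or end with "ha" are dropped in this sketch because they do not cover the whole track.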
  • The candidate column can be displayed either in a separate area on the keyboard layout or directly at the input target position.
  • When a combination candidate is presented, each candidate it contains is displayed in a distinguishable manner.
  • The distinguishing display includes: using colors to distinguish the candidates contained in the combination candidate; or placing a space or an identifier, such as "-" or "1", between the candidates of the combination candidate; or giving different candidates different colors and fonts to distinguish the candidates contained in the combination candidate.
  • Performing word input according to the user's selection result on the displayed candidates includes the following.
  • A candidate at the same level as the candidate clicked by the user is one whose matched track segment has the same track starting point as the track segment matched by the clicked candidate.
  • For example, if the candidate with the highest score among the encoded character string candidates is "henhao", "jiu" is combined with "henhao" to obtain the combined encoded character string candidate "jiu henhao" (usually shown as the words "is good"), which is then displayed.
  • FIG. 11 is a schematic diagram of how the matched candidates are presented after the continuous input track in FIG. 7a has been recognized, and of how the candidates are adjusted according to the user's selection. Taking "jiuhenhao" as an example, in FIG. 11 the candidates "jiu" and "henhao" are combined into one combination candidate in the candidate column, with the two candidates separated by a space.
  • The pop-up list shows the candidates at the same level as "henhao" (see FIG. 7b), including "henhao", "hen", "gen", "he", "ge" and so on. Among these candidates, "henhao" corresponds to one path segment, "gen" and "hen" correspond to another, and "he" and "ge" correspond to a third.
  • If the user selects "hen" (see FIG. 7b), the preferred next-level candidate "hao" of "hen" is displayed; clicking "hao" pops up a selection list containing "hao" and "ha", from which the user can choose as needed. For a recognition result with many words, changing an earlier candidate may cause a large change in all subsequent recognition results.
  • During recognition, the conversion from codes to terms can be completed so that the calculation can be based on specific terms. For example, for the code "jiu" the optional words include "just", "wine", "old" and so on, and the word frequency of each term and its lexical matching degree with the preceding and following terms may differ. Attributes of the term itself, such as word frequency and part of speech, are therefore used in the recognition process to improve the accuracy of the recognition result.
  • The code and the term can usually be displayed side by side to make editing easier for the user.
  • In practice, the encoded character string with the highest score may not be the candidate corresponding to the word the user most wants to input. To let the user quickly select the encoded character string candidate that is needed, the encoded character string candidates can also be displayed level by level: first the first-level encoded character string candidates, whose recognition starting point is the track starting point, are displayed; then the next-level encoded character string candidates of the candidate selected by the user are displayed; and subsequent candidates continue to be displayed according to the user's selections.
  • Accordingly, another way to display encoded character string candidates is: take the front-to-back connection relationship between the track segments corresponding to the respective encoded character string candidates as the basis for their presentation order, and determine
  • the presentation batch of each encoded character string candidate and/or of the terms corresponding to the encoded character strings; batch by batch, display the encoded character string candidates and/or term candidates of the same presentation batch in the candidate column, and,
  • according to the scores of the encoded character string candidates and/or term candidates of that batch, determine the order in which they are presented in the candidate column.
  • In this case, performing word input according to the user's selection result and/or confirmation operation on the displayed encoded character string candidates comprises: performing the input operation according to the user's selection among the encoded character string candidates and/or term candidates of the same batch displayed in the candidate column, and then presenting the encoded character string candidates and/or term candidates of the next batch that follow the selected result, so that the user continues selecting until the input ends.
  • For example, the first-level encoded character string candidates "jiu", "ji" and "jiuge" may be displayed in the candidate column in order of their scores. If the user selects "jiu", the next-level encoded character string candidates of "jiu", namely "henhao", "hen", "gen", "he" and so on, are displayed; if the user then selects "hen", the next-level encoded character string candidates of "hen" are displayed for the user to select.
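  • The batch-by-batch presentation can be sketched as a walk down a candidate tree. The nested dictionary, the scores and the function names are hypothetical and cover only part of the tree in FIG. 7b.

```python
# Hypothetical candidate tree: code -> (score, next-level candidates).
TREE = {
    "jiu": (0.9, {
        "henhao": (0.95, {}),
        "hen": (0.8, {"hao": (0.7, {}), "ha": (0.4, {})}),
    }),
    "ji": (0.5, {}),
    "jiuge": (0.3, {}),
}

def batch(level):
    """Codes of one presentation batch, highest score first."""
    return sorted(level, key=lambda code: level[code][0], reverse=True)

def select_path(tree, choices):
    """Return every batch shown while the user makes the given choices."""
    shown, level = [], tree
    for choice in choices:
        shown.append(batch(level))
        level = level[choice][1]      # descend to the chosen candidate's next level
    if level:
        shown.append(batch(level))    # batch offered after the last choice
    return shown
```

  • Calling select_path(TREE, ["jiu", "hen"]) shows ["jiu", "ji", "jiuge"] first, then ["henhao", "hen"], and finally ["hao", "ha"] for the user's last selection.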
  • the method further includes:
  • the preset processing instruction includes:
  • delete the candidate to be processed from the candidate presentation area, receive the user's continuous sliding operation on the keyboard area, acquire the continuous input trajectory input at the current time, and replace the candidate to be processed with a candidate matching the trajectory feature data of that continuous input trajectory.
  • the advantage of this design is that the editing operation gesture and the word selection gesture of the candidate column are combined to make the operation more convenient.
  • the sliding operation corresponding to the preset processing instruction may be a clockwise gesture drawn by the user over entries in the candidate column, or a line stroked across them. When the user is detected to perform this sliding operation, the candidates to be processed are determined from the sliding track, combined into a single phrase, and the phrase is added to the thesaurus.
  • the manner of determining the new word to be added according to the sliding trajectory may also be determining the candidate included in the gesture trajectory.
  • the method for determining a candidate to be processed according to the sliding trajectory includes: acquiring the candidates crossed by the stroke, or, when the user draws an enclosing gesture, determining the candidates enclosed by the gesture trajectory.
  • the candidate including the word "very high" can be obtained.
  • the delete operation is performed in a similar manner: for example, when the user strokes from right to left across a candidate, or circles it counterclockwise, the system is triggered to delete the candidate "very good" from the candidate column.
  • Another instruction is to delete a candidate from the candidate column and use the next input recognition result as its replacement. For example, if the user marks the candidate to be replaced with a V-shaped swipe gesture, that candidate is deleted; the system then continues to receive the user's input gesture in the keyboard area and inserts the recognition result at the position of the deleted candidate.
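The three candidate-column instructions described above (merge into a phrase, delete, and replace) can be sketched as plain list operations. This is only an illustration under assumptions: gesture classification is taken as already done, the function names are hypothetical, and candidates are represented by their pinyin codes.

```python
def merge_into_phrase(column: list[str], indices: list[int], thesaurus: set[str]) -> str:
    """Join the selected candidates into one phrase and add it to the thesaurus."""
    phrase = "".join(column[i] for i in sorted(indices))
    thesaurus.add(phrase)
    return phrase


def delete_candidates(column: list[str], indices: list[int]) -> list[str]:
    """Return the candidate column without the selected entries."""
    drop = set(indices)
    return [c for i, c in enumerate(column) if i not in drop]


def replace_candidate(column: list[str], index: int, recognised: str) -> list[str]:
    """Delete one candidate and put the next recognition result in its place."""
    return column[:index] + [recognised] + column[index + 1:]


column = ["jiu", "hen", "hao"]
thesaurus: set[str] = set()

print(merge_into_phrase(column, [1, 2], thesaurus))  # phrase built from "hen" + "hao"
print(delete_candidates(column, [1]))                # column without "hen"
print(replace_candidate(column, 1, "gen"))           # "hen" replaced by new input
```

The functions return new lists rather than mutating the column, which keeps each gesture easy to undo in a user interface.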
  • the code and the word itself are the same.
  • the code is the word itself, while for other input methods the code and the word itself are not the same.
  • the code and the actual word are in correspondence.
  • Most of the calculation in the recognition process may be related to the code, such as the path matching degree of the present invention, the length of the code, and the word frequency; it may also be related to the word itself, such as the word frequency and the lexical rule matching degree.
  • the word frequency of the words corresponding to the code may also be used, such as the word frequency of the most frequent word with that code, or the mean of the word frequencies of the few most frequent words; the score can even be calculated on the basis of the word itself.
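As a small illustration of the two options just mentioned, the frequency assigned to a code can be taken as the highest word frequency among its words, or as the mean of the top few. The function name and the `top_k` parameter are assumptions for this sketch, and the frequencies are made up.

```python
def code_frequency(word_freqs: list[float], top_k: int = 1) -> float:
    """Frequency of a code, derived from the frequencies of the words sharing it.

    top_k=1 uses the most frequent word; top_k>1 averages the top_k most frequent.
    """
    top = sorted(word_freqs, reverse=True)[:top_k]
    return sum(top) / len(top) if top else 0.0


freqs = [50, 30, 10]                     # illustrative frequencies of words with one code
print(code_frequency(freqs))             # highest word frequency
print(code_frequency(freqs, top_k=2))    # mean of the two highest
```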
  • the lexical rule matching degree, when used to calculate the score, is better suited to words: the different words sharing the same code are each scored separately.
  • For interactive editing of the recognition result candidates, the operations can act on either the code or the word, as in the word selection operation of the previous embodiment; alternatively, the user may directly click the candidate word "very good" to replace it with "follow", after which the subsequent selection list includes "good", "ha", etc.
  • the ideas are similar and will not be described again.
  • the recognition calculation and interaction process of the present invention are mainly described by coding, and the processing methods and interaction processes are also applicable to words to some extent, and can be flexibly adopted in implementation.
  • the execution process of matching at least one candidate matching the trajectory feature data of the different trajectory segments in the continuous input trajectory from the preset vocabulary may also be performed on the server side
  • the present invention also provides another word recognition method based on continuous input of multiple words, which is applied to a system composed of an electronic device and a server.
  • after detecting a continuous input trajectory on the keyboard area and obtaining the trajectory feature data of that trajectory, the electronic device sends information including at least the trajectory feature data to the designated server.
  • based on the received information including at least the trajectory feature data, the server matches, from the preset vocabulary, at least one candidate that matches the trajectory feature data of the different trajectory segments of the continuous input trajectory, and returns the matched candidate(s) to the electronic device that detected the continuous input trajectory.
  • the electronic device receives the at least one candidate, returned by the server, that matches the trajectory feature data of the different trajectory segments in the continuous input trajectory.
  • the trajectory feature data included in the information sent to the server may include the start point and end point of the continuous input trajectory, the sequence of characters stroked by the continuous input trajectory on the keyboard layout, the stroked path, and information related to the keyboard layout, such as the key layout information of the keyboard, the entry angle and departure angle of the continuous input track on a particular key, the inflection point information of the track, and the like.
  • the information including at least the trajectory feature data may be sent by the electronic device to the server directly, or the trajectory feature data may first be converted on the electronic device side into a recognition condition that is then sent to the server.
  • the information sent by the electronic device may further include keyboard layout information. This can be arranged by presetting multiple keyboard layouts on the server, for example by setting up on the server the keyboard layouts corresponding to the different common client resolutions and assigning each a corresponding identifier.
  • when the client sends the keyboard layout information, it then only needs to send the keyboard layout identifier according to the agreed protocol.
  • the keyboard layout data may also be the key position arrangement data used by the client device.
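A possible shape for the message the electronic device sends is sketched below. The source lists the kinds of data involved but does not define a wire format, so the JSON encoding, the field names, and the `layout_id` value are all assumptions for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TrajectoryFeatures:
    """Trajectory feature data named in the text; field names are hypothetical."""
    start: tuple[float, float]                    # start point of the trajectory
    end: tuple[float, float]                      # end point of the trajectory
    stroked_chars: str                            # characters stroked on the layout
    path: list[tuple[float, float]]               # sampled points of the stroked path
    inflection_points: list[tuple[float, float]]  # inflection points of the track


def build_request(features: TrajectoryFeatures, layout_id: str) -> str:
    """Serialise the features plus a preset keyboard layout identifier."""
    payload = {"layout_id": layout_id, "features": asdict(features)}
    return json.dumps(payload)


features = TrajectoryFeatures(
    start=(10.0, 20.0),
    end=(90.0, 20.0),
    stroked_chars="jkiuhen",
    path=[(10.0, 20.0), (50.0, 5.0), (90.0, 20.0)],
    inflection_points=[(50.0, 5.0)],
)
print(build_request(features, "layout-480x800"))
```

Sending only an identifier keeps the message small, at the cost of requiring the server to hold every layout the clients may use.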
  • the present invention also provides a word recognition device based on continuous input of multiple words. Referring to FIG. 13, a schematic structural diagram of an embodiment of the word recognition device based on continuous input of multiple words according to the present invention is shown.
  • the apparatus includes: a thesaurus 1310, a track retrieval unit 1320, a track data information obtaining unit 1330, and a word matching unit 1340.
  • the thesaurus 1310 stores a number of word candidates and word frequencies.
  • the word candidate may include a word consisting of a coded string, that is, a coded string candidate, such as a Chinese pinyin code, or an English word, etc.; and a term candidate consisting of one or more Chinese characters.
  • the track retrieval unit 1320 is configured to detect a continuous input trajectory on the keyboard area.
  • the track data information obtaining unit 1330 is configured to acquire track feature data of the continuous input track, wherein the continuous input track is related to a corresponding key sequence position of the code string to be input by the user on the keyboard layout.
  • the word matching unit 1340 is configured to match, from the preset vocabulary, at least one candidate that matches trajectory feature data of different trajectory segments in the continuous input trajectory.
  • the track data information obtaining unit 1330 is configured to determine the track feature data of the continuous input track according to the keyboard layout, where the track feature data includes: the starting point of the continuous input track, the stroked path, the stroked characters, and the end point of the track.
  • referring to FIG. 14, there is shown a schematic structural diagram of another embodiment of the word recognition device based on continuous input of multiple words according to the present invention.
  • the word matching unit 1340 of the embodiment includes:
  • a candidate retrieving unit 1341 configured to match candidates from the lexicon according to the track feature data of the unrecognized track segment in the continuous input track and, when a candidate is matched, to determine the track segment whose track feature data matches the matched candidate;
  • the loop judging unit 1342 is configured to determine, once the candidate retrieving unit has matched a candidate and determined the track segment matching it, whether the part of the continuous input track after that track segment has been recognized; when an unrecognized track segment still remains after that track segment, it returns to the operation of the candidate retrieving unit, until no unrecognized track segment remains in the continuous input track and candidates matching the track feature data of the different track segments have been obtained.
  • the candidate retrieval unit 1341 may include an initial start point setting unit and a candidate loop matching unit.
  • an initial start point setting unit is configured to use the starting point of the continuous input track as the current recognition starting point, use the track segment of the continuous input track located after the current recognition starting point as the current track segment, and trigger the operation of the candidate loop matching unit.
  • a candidate loop matching unit configured to match from the lexicon, according to the track feature data of the current track segment, at least one candidate for a track segment starting from the current recognition starting point, take the matched candidate as the current candidate, and trigger the operation of the loop judging unit.
  • the loop judging unit 1342 includes: a loop judging subunit and a loop starting point setting unit.
  • the loop determining subunit is configured to determine whether recognition of the current track segment has reached its end; if so, obtain at least one candidate that matches the track feature data of the different track segments; if not, determine the track segment whose track feature data matches the current candidate, and trigger the operation of the loop start point setting unit;
  • the loop start point setting unit is configured to use the part of the continuous input track after the track segment matched by the current candidate as the current track segment, use the starting point of the current track segment as the new current recognition starting point, and trigger the operation of the candidate loop matching unit to obtain the next-level candidate of the current candidate.
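The interplay of the candidate loop matching unit, the loop determining subunit, and the loop start point setting unit can be sketched as a recursion over the unrecognized remainder of the track. This is a simplified illustration under stated assumptions: the track is reduced to a string of stroked characters, the matcher is a toy prefix test rather than the path matching of the invention, and all names are hypothetical.

```python
from typing import Callable

# A matcher receives the unrecognized remainder of the track and returns
# (candidate, number_of_track_units_consumed) pairs; consumed must be > 0.
Matcher = Callable[[str], list[tuple[str, int]]]


def segment(track: str, match: Matcher) -> list[list[str]]:
    """Enumerate candidate chains that together cover the whole track."""
    if not track:                       # no unrecognized segment is left
        return [[]]
    chains = []
    for candidate, consumed in match(track):
        # The end of the matched segment becomes the new recognition start,
        # and matching there yields the next-level candidates.
        for rest in segment(track[consumed:], match):
            chains.append([candidate] + rest)
    return chains


def prefix_matcher(lexicon: set[str]) -> Matcher:
    """Toy matcher: a code matches when the track remainder starts with it."""
    def match(track: str) -> list[tuple[str, int]]:
        return [(w, len(w)) for w in sorted(lexicon) if w and track.startswith(w)]
    return match


lexicon = {"jiu", "ji", "hen", "hao", "henhao"}
for chain in segment("jiuhenhao", prefix_matcher(lexicon)):
    print(chain)
```

A candidate such as "ji" is dropped here because nothing in the lexicon matches what follows it, which mirrors the requirement that the loop continue until no unrecognized track segment remains.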
  • the track feature data acquired by the track data information obtaining unit 1330 further includes: at least one word segmentation identifier drawn by the continuous input trajectory;
  • the candidate retrieval unit 1341 further includes:
  • a word segmentation determining unit configured to determine, after the initial start point setting unit has set the current recognition starting point, whether the continuous input track after the current recognition starting point contains a word segmentation identifier; if yes, trigger the operation of the nearest word segmentation identifier determining unit; if not, trigger the operation of the second loop matching unit.
  • a nearest word segmentation identifier determining unit configured to determine the word segmentation identifier that is located after the current recognition starting point in the continuous input track and is closest to the current recognition starting point, and to trigger the operation of the first loop matching unit;
  • the candidate loop matching unit includes:
  • a first loop matching unit configured to match from the lexicon, according to the track feature data of the continuous input track between the current recognition starting point and the nearest word segmentation identifier, at least one candidate for a track segment starting from the current recognition starting point, take the matched candidate as the current candidate, and trigger the operation of the word segmentation loop determining unit;
  • a second loop matching unit configured to match from the lexicon, according to the track feature data of the track segment located after the current recognition starting point in the continuous input track, at least one candidate for a track segment starting from the current recognition starting point, take the matched candidate as the current candidate, and trigger the operation of the loop determining subunit;
  • the apparatus further includes a word segmentation loop determining unit, configured to determine whether recognition of the continuous input track between the current recognition starting point and the nearest word segmentation identifier is complete. If yes, it sets a new current recognition starting point in the continuous input track after the nearest word segmentation identifier and returns to the operation of the word segmentation determining unit. If not, it determines the track segment whose track feature data matches the current candidate, takes the continuous input track between the end point of that track segment and the nearest word segmentation identifier as the current track segment, takes the starting point of the current track segment as the new current recognition starting point, and returns to the operation of the first loop matching unit to obtain the next-level candidate of the current candidate.
  • the loop determining subunit is configured to determine whether recognition of the track segment located after the current recognition starting point in the continuous input track has reached its end; if yes, obtain at least one candidate matching the track feature data of the different track segments; if not, determine the track segment whose track feature data matches the current candidate, and trigger the operation of the loop start point setting unit;
  • the loop start point setting unit is configured to take the starting point of the current track segment as the new current recognition starting point and return to the operation of the second loop matching unit, to obtain the next-level candidate of the current candidate.
  • a trajectory segment determining unit configured to determine, on the current track segment, the recognition end point corresponding to each pending encoded string candidate, and to take the track between the current recognition starting point and that recognition end point as the track segment to be matched;
  • a key data determining unit configured to determine the key position data of each pending encoded string candidate on the keyboard layout;
  • a candidate determining unit configured to, when the matching degree between the key position data of a pending encoded string candidate and the track feature data of the track segment to be matched meets a preset condition, take that pending encoded string candidate and/or the term candidate converted from it as a candidate for the track segment starting at the current recognition starting point;
  • the loop determining subunit is configured to determine whether recognition of the current track segment has reached its end; if so, obtain at least one candidate that matches the track feature data of the different track segments; if not, take the track segment to be matched as the track segment whose track feature data matches the current candidate, and trigger the operation of the loop start point setting unit.
  • the word matching unit 1340 includes: the stroke character string determining unit 1343, the second retrieving unit 1344, the candidate matching unit 1345, and the character string recognition judging unit 1346.
  • the stroke character string determining unit 1343 is configured to determine a sequence of character strings drawn by the continuous input track.
  • the second retrieving unit 1344 is configured to retrieve the encoded character string candidate from the thesaurus according to the sequence of the characters drawn by the continuous input trajectory.
  • the candidate matching unit 1345 is configured to determine the track segment to be matched that corresponds to an encoded string candidate, according to the correspondence between the characters contained in the encoded string candidate and the character sequence stroked by the continuous input trajectory; and, when the encoded string candidate matches the track feature data of the track segment to be matched, to take the encoded string candidate and/or the word candidate converted from it as a matched candidate and as the current candidate.
  • the character string recognition judging unit 1346 is configured to determine whether, in the character sequence stroked by the continuous input track, there are still characters after the last letter of the encoded string corresponding to the current candidate. If not, at least one candidate matching the track feature data of the different track segments has been obtained. If yes, it continues by retrieving encoded string candidates from the lexicon according to the characters that follow that last letter in the stroked character sequence, and triggers the operation of the candidate matching unit, until the last character stroked by the continuous input trajectory has been recognized.
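One step of this character-sequence approach can be sketched as follows: an encoded string is accepted when it begins at the current position of the stroked character sequence and its letters occur there in order, and the characters after its last letter form the remainder still to be recognized. The greedy subsequence test, the function names, and the sample stroked string are assumptions for illustration; the source does not specify the retrieval rule at this level of detail.

```python
from typing import Optional


def match_end(code: str, stroked: str, start: int = 0) -> Optional[int]:
    """Return the index just past the code's last letter, or None if no match.

    The code must begin at `start`; its remaining letters are then located
    greedily, in order, among the stroked characters that follow.
    """
    if not code or start >= len(stroked) or stroked[start] != code[0]:
        return None
    pos = start
    for ch in code:
        pos = stroked.find(ch, pos)
        if pos == -1:
            return None
        pos += 1
    return pos


def candidates_from(stroked: str, lexicon: set[str]) -> list[tuple[str, str]]:
    """List (encoded string, remaining stroked characters) for every match."""
    found = []
    for code in sorted(lexicon):
        end = match_end(code, stroked)
        if end is not None:
            found.append((code, stroked[end:]))
    return found


# "jkiuhen" is a made-up stroked sequence: the track passes over "k" on the way.
for code, remainder in candidates_from("jkiuhen", {"ji", "jiu", "ju", "hen"}):
    print(code, "->", repr(remainder))
```

A non-empty remainder corresponds to the "yes" branch of the judging unit, where retrieval continues on the characters after the last matched letter.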
  • the apparatus may further include:
  • a score determining unit configured to determine, according to a preset evaluation rule, an evaluation score of the candidate
  • a presentation unit configured to display a candidate according to a connection order relationship between the track segments matched with the respective candidates, and an evaluation score of the candidate
  • a word input unit for performing a word input according to a user's selection result and/or confirmation operation for the presented candidate.
  • the score determining unit is configured to calculate the evaluation score of the candidate according to preset evaluation factors, where the evaluation factors include any one or more of the following: the word frequency of the candidate, the lexical rule matching degree of the candidate, the path matching degree between the candidate and its matched track segment, the number of characters in the encoded string corresponding to the candidate, the length of the track segment matched with the candidate, and the number of characters stroked by the track segment matched with the candidate and/or the number of characters stroked by the continuous input track.
  • the evaluation factors may further include any one or more of the following: whether the candidate has a corresponding next-level candidate, and the evaluation score of the next-level candidate of the candidate, where the next-level candidate of a candidate is a candidate matched by taking the end point of the track segment matched by that candidate as the recognition starting point.
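One way to combine a few of the listed evaluation factors is a weighted sum, sketched below. The source names the factors but not how they are combined, so the linear form, the weights, the normalisation, and every identifier here are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvaluationFactors:
    word_frequency: float              # normalised to [0, 1]
    path_match: float                  # path matching degree in [0, 1]
    code_length: int                   # characters in the encoded string
    stroked_length: int                # characters stroked by the whole track
    next_level_score: Optional[float]  # best next-level score, None if there is none


# Hypothetical weights, chosen only so that the example is easy to follow.
WEIGHTS = {"freq": 0.4, "path": 0.4, "coverage": 0.1, "next": 0.1}


def evaluation_score(f: EvaluationFactors) -> float:
    """Weighted sum of the factors; a missing next level contributes nothing."""
    coverage = f.code_length / f.stroked_length if f.stroked_length else 0.0
    next_level = f.next_level_score if f.next_level_score is not None else 0.0
    return (WEIGHTS["freq"] * f.word_frequency
            + WEIGHTS["path"] * f.path_match
            + WEIGHTS["coverage"] * coverage
            + WEIGHTS["next"] * next_level)


with_next = EvaluationFactors(0.5, 1.0, 3, 6, 0.8)
without_next = EvaluationFactors(0.5, 1.0, 3, 6, None)
print(round(evaluation_score(with_next), 2), round(evaluation_score(without_next), 2))
```

Including the next-level score lets a candidate that leads to a well-matched continuation outrank one that fits its own segment equally well but leaves the rest of the track poorly explained.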
  • the present invention also provides a word recognition system based on continuous input of multiple words.
  • a word recognition system based on continuous input of multiple words is provided.
  • the system includes: an electronic device 1 and a server 2;
  • the electronic device 1 is configured to detect a continuous input track on a keyboard area, obtain the track feature data of the continuous input track, send information including at least the track feature data to a designated server, and receive from the server at least one candidate that matches the track feature data of different track segments in the continuous input track; wherein the continuous input track is related to the position, on a keyboard layout, of the sequence of coded characters to be input by the user; and the track feature data includes at least: the starting point of the continuous input track, the path stroked on the keyboard layout, the stroked characters, and the end point of the track;
  • the server 2 is configured to receive the information, sent by the electronic device, that includes at least the track feature data; to match from the lexicon, according to that information, at least one candidate that matches the track feature data of different track segments in the continuous input track; and to return the matched candidate to the electronic device.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Character Discrimination (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present invention relates to a word recognition method, apparatus and system based on continuous input of multiple words. The method comprises: detecting a continuous input track in a keyboard area and obtaining track feature data of the continuous input track, the continuous input track being related to the position, on a keyboard layout, of a sequence of coded characters to be input by a user; and matching, from a preset word library, at least one candidate that matches the track feature data of different track segments in the continuous input track. The method can improve input efficiency and reduce operating complexity for the user.
PCT/CN2013/079492 2012-07-17 2013-07-17 Word recognition method, apparatus and system based on continuous input of multiple words WO2014012485A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2012102484486A CN102880302A (zh) 2012-07-17 2012-07-17 Word recognition method, apparatus and system based on continuous input of multiple words
CN201210248448.6 2012-07-17

Publications (1)

Publication Number Publication Date
WO2014012485A1 (fr) 2014-01-23

Family

ID=47481662

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/079492 WO2014012485A1 (fr) 2012-07-17 2013-07-17 Word recognition method, apparatus and system based on continuous input of multiple words

Country Status (2)

Country Link
CN (1) CN102880302A (fr)
WO (1) WO2014012485A1 (fr)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
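CN102880302A (zh) * 2012-07-17 2013-01-16 重庆优腾信息技术有限公司 Word recognition method, apparatus and system based on continuous input of multiple words
CN105988595B (zh) * 2015-02-17 2019-12-06 上海触乐信息科技有限公司 Sliding input method and device
CN104915020A (zh) * 2014-03-13 2015-09-16 杨文贵 Nine-grid zone-position messaging device and method
CN105260113B (zh) * 2015-09-18 2018-09-21 科大讯飞股份有限公司 Sliding input method, device and terminal equipment
CN105353968B (zh) * 2015-10-23 2018-07-03 武汉悦然心动网络科技股份有限公司 Method for recognizing a sliding input trajectory
CN105468743B (zh) * 2015-11-25 2018-12-28 钟岑 Intelligent retrieval method for diagnosis and surgery codes
CN107765884B (zh) * 2016-08-22 2021-11-02 北京搜狗科技发展有限公司 Sliding input method and device, and electronic equipment
CN107817942B (zh) * 2016-09-14 2021-08-20 北京搜狗科技发展有限公司 Sliding input method and system, and device for sliding input
CN106502995B (zh) * 2016-11-30 2019-10-15 福建榕基软件股份有限公司 Method and device for intelligent recognition of hierarchical information
CN110554780A (zh) * 2018-05-30 2019-12-10 北京搜狗科技发展有限公司 Sliding input method and device
CN109582972B (zh) * 2018-12-27 2023-05-16 信雅达科技股份有限公司 Optical character recognition error correction method based on natural language recognition
CN109745699A (zh) * 2018-12-29 2019-05-14 维沃移动通信有限公司 Method for responding to a touch operation, and terminal equipment
CN110456922B (zh) * 2019-08-16 2021-07-20 清华大学 Input method, input device, input system and electronic equipment
CN111078028B (zh) * 2019-12-09 2023-11-21 科大讯飞股份有限公司 Input method, related device and readable storage medium
CN111046666B (zh) * 2019-12-19 2023-05-05 天津新开心生活科技有限公司 Event recognition method and device, computer-readable storage medium, and electronic equipment
CN111090341A (zh) * 2019-12-24 2020-05-01 科大讯飞股份有限公司 Method for presenting input method candidate results, related device and readable storage medium
CN111142832A (zh) * 2019-12-25 2020-05-12 惠州Tcl移动通信有限公司 Input recognition method and device, storage medium and terminal
CN111596771A (zh) * 2020-05-14 2020-08-28 海信视像科技股份有限公司 Display device and method for moving a selector in an input method
CN117591630A (zh) * 2023-11-21 2024-02-23 北京天防安全科技有限公司 Keyword recognition method, apparatus and device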

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2353159A1 (fr) * 2001-07-17 2003-01-17 Madentec Limited Data entry method by sliding on a touch screen
US20040104896A1 (en) * 2002-11-29 2004-06-03 Daniel Suraqui Reduced keyboards system using unistroke input and having automatic disambiguating and a recognition method using said system
CN1510557A (zh) * 2002-12-20 2004-07-07 �Ҵ���˾ System and method for recognizing word patterns based on a virtual keyboard layout
CN1637686A (zh) * 2004-01-06 2005-07-13 国际商业机器公司 System and method for improved user input on personal computing devices
CN1761989A (zh) * 2003-01-16 2006-04-19 克利福德·A·库什勒 System and method for continuous stroke word-based text input
CN102693090A (zh) * 2012-05-16 2012-09-26 刘炳林 Input method and electronic equipment
CN102736821A (zh) * 2011-03-31 2012-10-17 腾讯科技(深圳)有限公司 Method and device for determining candidate words based on a sliding trajectory
CN102880302A (zh) * 2012-07-17 2013-01-16 重庆优腾信息技术有限公司 Word recognition method, apparatus and system based on continuous input of multiple words

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6871200B2 (en) * 2002-07-11 2005-03-22 Forensic Eye Ltd. Registration and monitoring system
CN102117175A (zh) * 2010-09-29 2011-07-06 北京搜狗科技发展有限公司 Method and device for sliding input of Chinese, and touch-screen input method system

Also Published As

Publication number Publication date
CN102880302A (zh) 2013-01-16

Similar Documents

Publication Publication Date Title
WO2014012485A1 (fr) Word recognition method, apparatus and system based on continuous input of multiple words
US11416679B2 (en) System and method for inputting text into electronic devices
US9417710B2 (en) System and method for implementing sliding input of text based upon on-screen soft keyboard on electronic equipment
US10191654B2 (en) System and method for inputting text into electronic devices
TWI266280B (en) Multimodal disambiguation of speech recognition
KR102221079B1 (ko) 실시간 필기 인식 관리
US7319957B2 (en) Handwriting and voice input with automatic correction
US20050192802A1 (en) Handwriting and voice input with automatic correction
US10444851B2 (en) Character input method, program for character input, recording medium, and information-processing device
CN105074643B (zh) 非词典字符串的手势键盘输入
US8713464B2 (en) System and method for text input with a multi-touch screen
TW200842613A (en) Spell-check for a keyboard system with automatic correction
WO2015024467A1 (fr) Procédé de frappe rapide d'informations fondée sur des mots associés à un contexte d'intérêt
WO2016107317A1 (fr) Procédé et dispositif de commande d'opération de curseur d'un procédé d'entrée
EP2897055A1 (fr) Dispositif de traitement d'informations, procédé de traitement d'informations, et programme
WO2017114002A1 (fr) Dispositif et procédé permettant de saisir un texte manuscrit unidimensionnel
Ouyang et al. Mobile keyboard input decoding with finite-state transducers
CN109478122A (zh) 用于图形键盘的基于压力的手势键入
WO2016131425A1 (fr) Procédé et appareil de saisie de glissement
JP2010026718A (ja) 文字入力装置および方法
JP2013025390A (ja) 手書き入力方法
KR101651909B1 (ko) 음성 인식 텍스트 수정 방법 및 이 방법을 구현한 장치
KR102138095B1 (ko) 음성 명령 기반의 가상 터치 입력 장치
CN115359798A (zh) 语音控制的控件识别方法、装置、设备、介质及程序产品
JP5492316B2 (ja) 文字入力装置および方法

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13820692; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 13820692; Country of ref document: EP; Kind code of ref document: A1)