WO2015161823A1 - Handwriting recognition method and device

Info

Publication number: WO2015161823A1
Application number: PCT/CN2015/077367
Authority: WO
Inventors: 江淑红; 吴波
Original assignee: 夏普株式会社; 江淑红
Priority date: 2014-04-25
Filing date: 2015-04-24
Publication date: 2015-10-29
Also published as: CN105095924A

Abstract

Disclosed are a handwriting recognition method and a corresponding handwriting recognition device. The method comprises: receiving a handwriting stroke sequence continuously input by a user in the same input region; and segmenting the received handwriting stroke sequence into characters based on the confidence that a group of strokes forms an individual character. The disclosed method and device can not only recognize a plurality of characters that a user continuously inputs in the same input region in an overlapping manner, but also ensure relatively high segmentation accuracy and handwriting input efficiency.

Description

Handwriting recognition method and device

Technical field

The present application generally relates to the field of human-computer interaction technology, and in particular to handwriting recognition.

Background

With the development of mobile communication technologies, smart terminals with touch screens have become more and more popular. In order to input information by handwriting by means of a touch screen, handwriting recognition technology has been widely used on these terminals.

Traditionally, smart terminals with limited screen sizes have adopted handwriting recognition based on single-character input. That is, the user writes one character at a time in a predetermined writing area (such as a preset writing box or the entire screen) and waits for system feedback after finishing each character. Once the system returns the recognition result for that character, the writing area is cleared so that the next character can be input. However, this input method does not match people's everyday habit of writing characters continuously, and the time spent lifting the pen and waiting for recognition reduces input efficiency.

In order to improve the user's handwriting experience and improve handwriting input efficiency, an overlapping handwriting input recognition method is needed to identify a plurality of characters that are continuously input by the user in an overlapping manner in the same input area.

To this end, Chinese Patent No. CN102141892 B, entitled "Overlay Handwriting Input Display Method and System", discloses a scheme in which the character to which a stroke belongs is determined according to the handwriting features of the stroke and the positional relationship between adjacent strokes, and in which whether the input strokes constitute the same character is judged based on the pause time between adjacent strokes.

However, segmentation based on the pause time between adjacent strokes is not precise enough. For example, a user may pause or think for a moment while writing a complex character, and segmenting at that pause produces an erroneous recognition result. Although inter-character pauses can be distinguished from intra-character pauses by forcing the user to wait a relatively long time before writing the next character, this does not match people's everyday habit of writing characters continuously and is bound to reduce the speed and efficiency of handwriting input.

Summary of the invention

In view of the above problems and deficiencies of the prior art, the object of the present invention is to propose a new overlapping handwriting recognition scheme, which can not only recognize a plurality of characters continuously input by a user in an overlapping manner in the same input area, but also ensure relatively high segmentation accuracy and handwriting input efficiency.

According to a first aspect of the present invention, there is provided a handwriting recognition method comprising: receiving a sequence of handwritten strokes continuously input by a user in the same input area; and segmenting the received sequence of handwritten strokes based on word confidence.

Segmenting the received handwritten stroke sequence based on the word confidence may include forward segmentation and/or reverse segmentation. The forward segmentation determines the segmentation points of the received handwritten stroke sequence in the same order as the stroke input. The reverse segmentation determines the segmentation points of the received handwritten stroke sequence in the reverse order of the stroke input.

The forward segmentation may include: reading the strokes of the received handwritten stroke sequence that follow the previous segmentation point into a forward segmentation set; calculating, for each stroke in the forward segmentation set, the confidence that the stroke together with the strokes preceding it forms a single character; determining the gap between the stroke with the greatest confidence and the stroke following it as a segmentation point; and repeating the above three steps. When the three steps are performed for the first time, the previous segmentation point is located before the first input stroke.

The reverse segmentation may include: reading the strokes of the received handwritten stroke sequence that precede the previous segmentation point into a reverse segmentation set; calculating, for each stroke in the reverse segmentation set, the confidence that the stroke together with the strokes following it forms a single character; determining the gap between the stroke with the greatest confidence and the stroke preceding it as a segmentation point; and repeating the above three steps. When the three steps are performed for the first time, the previous segmentation point is located after the last input stroke.

If the segmentation points determined by the forward segmentation and the reverse segmentation do not coincide, fine segmentation can be performed on the strokes between the two segmentation points immediately before and after the non-coinciding segmentation point. The fine segmentation may include: enumerating all segmentation possibilities of those strokes, wherein each segmentation possibility corresponds to a segmentation point configuration defined by the number and positions of the segmentation points; for each segmentation possibility, calculating the confidence that the strokes between adjacent segmentation points form a single character, and determining the total confidence of the segmentation possibility from the calculated word confidences; and determining the segmentation point configuration of the segmentation possibility with the greatest total confidence as the segmentation result.

The method may further include: determining whether an overlap region exists between the single characters formed by the strokes between segmentation points, and the size of that overlap region; and determining, based on the result, whether the single characters form a composite character.

The method may further include displaying in a light color, or blanking, the characters already recognized as complete while the user inputs strokes.

Displaying in a light color or blanking the recognized complete characters while the user inputs strokes may include: after the user inputs a new stroke, performing handwriting recognition on the stroke sequence input so far, thereby recognizing a character string; if the newly input stroke is the first stroke of the last character in the character string and the second-to-last character in the character string is the same as the last character of the character string recognized after the user input the previous stroke, or if the newly input stroke is not the first stroke of the last character in the character string and the second-to-last character in the character string is the same as the second-to-last character of the character string recognized after the user input the previous stroke, determining whether the number of strokes of the second-to-last character is greater than 2; and if the number of strokes of the second-to-last character is greater than 2, displaying the second-to-last character and the characters before it in a light color, or blanking them.

The segmentation of the received handwritten stroke sequence may also be based on the degree of matching of some or all of the strokes in the received handwritten stroke sequence with the overlapping character template. Each overlapping character template can be composed of two overlapping characters.

Preferably, text recognition is aided by language and/or writing rules.

According to a second aspect of the present invention, there is provided a handwriting recognition apparatus comprising: a receiving device for receiving a sequence of handwritten strokes continuously input by a user in the same input area; and a segmentation device for segmenting the received sequence of handwritten strokes based on word confidence.

The segmentation device may comprise a forward segmentation device and/or a reverse segmentation device. The forward segmentation device is operative to determine the segmentation points of the received handwritten stroke sequence in the same order as the stroke input. The reverse segmentation device is operative to determine the segmentation points of the received handwritten stroke sequence in the order opposite to the stroke input.

The forward segmentation device may include: a forward segmentation set forming unit, configured to read the strokes of the received handwritten stroke sequence that follow the previous segmentation point into a forward segmentation set; a word confidence calculation unit, configured to calculate, for each stroke in the forward segmentation set, the confidence that the stroke together with the strokes preceding it forms a single character; a segmentation point determination unit, configured to determine the gap between the stroke with the greatest confidence and the stroke following it as a segmentation point; and a control unit for controlling the above three units to perform their respective functions repeatedly. When the forward segmentation set forming unit performs its function for the first time, the previous segmentation point is located before the first input stroke.

The reverse segmentation device may include: a reverse segmentation set forming unit, configured to read the strokes of the received handwritten stroke sequence that precede the previous segmentation point into a reverse segmentation set; a word confidence calculation unit, configured to calculate, for each stroke in the reverse segmentation set, the confidence that the stroke together with the strokes following it forms a single character; a segmentation point determination unit, configured to determine the gap between the stroke with the greatest confidence and the stroke preceding it as a segmentation point; and a control unit for controlling the above three units to perform their respective functions repeatedly. When the reverse segmentation set forming unit performs its function for the first time, the previous segmentation point is located after the last input stroke.

The apparatus may further include a fine segmentation device, configured to perform fine segmentation on the strokes between the two segmentation points immediately before and after a non-coinciding segmentation point in the case that the segmentation points determined by the forward segmentation and the reverse segmentation do not coincide. The fine segmentation device may comprise: a segmentation possibility enumeration unit for enumerating all segmentation possibilities of those strokes, wherein each segmentation possibility corresponds to a segmentation point configuration defined by the number and positions of the segmentation points; a confidence calculation unit, configured to calculate, for each segmentation possibility, the confidence that the strokes between adjacent segmentation points form a single character, and to determine the total confidence of the segmentation possibility from the calculated word confidences; and a segmentation result determining unit, configured to determine the segmentation point configuration of the segmentation possibility with the greatest total confidence as the segmentation result.

The apparatus may also include a post-processing device. The post-processing device includes: an overlap region evaluation unit, configured to determine whether an overlap region exists between the single characters formed by the strokes between segmentation points, and the size of that overlap region; and a composite character determining unit, configured to determine, based on the result, whether the single characters form a composite character.

The post-processing device may be further configured to display in a light color, or blank, the characters already recognized as complete while the user inputs strokes.

The post-processing device may further include: a character string recognition unit, configured to perform handwriting recognition on the stroke sequence input by the user after the user inputs a new stroke, thereby recognizing a character string; a determining unit, configured to determine whether the number of strokes of the second-to-last character is greater than 2, either when the newly input stroke is the first stroke of the last character in the character string and the second-to-last character in the character string is the same as the last character of the character string recognized after the user input the previous stroke, or when the newly input stroke is not the first stroke of the last character in the character string and the second-to-last character in the character string is the same as the second-to-last character of the character string recognized after the user input the previous stroke; and a light-color display or blanking unit, configured to display the second-to-last character and the characters before it in a light color, or blank them, in the case where the number of strokes of the second-to-last character is greater than 2.

The segmentation device may further segment the received handwritten stroke sequence based on the degree of matching of some or all of the strokes in the received handwritten stroke sequence with overlapping character templates. Each overlapping character template can be composed of two overlapping characters.

The post-processing device can be configured to utilize language and/or writing rules to aid in text recognition.

Brief description of the drawings

The above and other objects, features and advantages of the present invention will become apparent from the following description taken with reference to the accompanying drawings, in which:

Figure 1 is a flow chart showing a handwriting recognition method according to the present invention;

Figure 2 is a flow chart showing a forward segmentation operation according to the present invention;

Figure 3 is a flow chart showing a reverse segmentation operation according to the present invention;

Figure 4 is a flow chart showing a fine segmentation operation according to the present invention;

Figure 5 shows the individual strokes of the single characters 切 ("cut") and 分 ("divide");

Figure 6 shows the effect of inputting the characters 切 and 分 in an overlapping manner;

Figure 7 shows an example of applying the segmentation operation according to the present invention to the input stroke sequence of the character string 切分 ("segmentation");

Figure 8 is a flow chart showing how recognized complete characters are displayed in a light color or blanked while the user inputs strokes according to the present invention;

Figure 9 shows the actual effect of applying the fading of preceding characters according to the present invention to the overlapping Japanese input "重叠";

Figure 10 is a block diagram showing an example structure of a handwriting recognition apparatus according to the present invention.

Detailed description

The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. Details and functions that are not essential to the present invention are omitted from the description so as not to obscure its understanding.

First, the process of the handwriting recognition method 100 according to the present invention is described with reference to FIG. 1. As shown in FIG. 1, the handwriting recognition method 100 starts at step s110, in which a sequence of handwritten strokes that the user continuously inputs in the same input area is received. Next, in step s120, the received handwritten stroke sequence is segmented based on the word confidence. To implement step s120, a template matching method may be used for single-character recognition, with the matching distance serving as the word confidence in step s120.

The feature template of the template matching method can be generated using a sample training method based on a learning strategy such as generalized learning vector quantization (GLVQ). Features used in single character recognition may include, for example, stroke direction distribution features, grid stroke features, perimeter orientation features, and the like. Pre-processing before feature extraction may include, for example, equidistant smoothing, centroid-based linear normalization, nonlinear normalization, etc., to normalize all features. In order to improve the recognition speed, a multi-stage cascade matching method can be employed. The above-mentioned contents regarding the template matching method can be found in the Chinese patent CN 101354749 B entitled "Dictionary Making Method, Handwriting Input Method and Apparatus", and will not be described again here.

Compared with the prior art, which segments based on the pause time between adjacent strokes, segmenting the received handwritten stroke sequence based on the word confidence can significantly improve both segmentation accuracy and handwriting input efficiency.

In a specific implementation, step s120 may include forward segmentation and/or reverse segmentation (collectively referred to as coarse segmentation). The forward segmentation determines the segmentation point of the received handwritten stroke sequence in the same order as the stroke input. The reverse segmentation determines the segmentation point of the received handwritten stroke sequence in the reverse order of the stroke input.

Example implementations of the forward and reverse segmentation operations are described below with reference to Figures 2 and 3. As shown in FIG. 2, the forward segmentation starts at step s201, in which the forward segmentation set S is set to the empty set. In step s202, the counter i is initialized to zero.

Next, in step s203, the counter i is incremented by one. In step s204, the stroke sᵢ in the handwritten stroke sequence is added to the forward segmentation set S. In step s205, for each stroke sₖ (k = 1, ..., i-1, i) in the forward segmentation set, single-character recognition is performed on the character formed by that stroke together with the strokes preceding it, and the word confidence Pₖ is calculated. In step s206, it is judged whether the counter i is equal to the total number L of strokes in the received handwritten stroke sequence.

If the determination result in step s206 is YES, the process proceeds to step s207, in which the maximum value max{Pₖ} of the confidences Pₖ is searched for. Next, in step s208, the stroke index K corresponding to max{Pₖ} is recorded, and the gap between that stroke and the stroke following it is recorded as a forward segmentation point. In step s209, the forward segmentation set S is emptied. In step s210, the counter i is set to K, and the process returns to step s203. From this point on, in step s205, k no longer starts from 1 but from the stroke K+1 following the segmentation point, that is, k = K+1, ..., i-1, i. If the determination result in step s206 is NO, the process returns to step s203.
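The forward flow of FIG. 2 can be sketched roughly as follows. This is an illustrative simplification rather than the implementation of the invention: `confidence` is a hypothetical callback standing in for the single-character recognizer, strokes may be any Python objects, and a cut position p is taken to mean the gap between `strokes[p-1]` and `strokes[p]`. In each round the sketch scores every prefix of the strokes after the previous segmentation point and cuts after the best-scoring prefix.

```python
from typing import Callable, List, Sequence


def forward_segmentation(
    strokes: Sequence,
    confidence: Callable[[Sequence], float],  # hypothetical single-character recognizer
) -> List[int]:
    """Greedy forward segmentation; returns cut positions in input order."""
    cuts: List[int] = []
    start = 0  # previous segmentation point (initially before the first stroke)
    total = len(strokes)
    while start < total:
        best_end, best_score = start + 1, float("-inf")
        # Score every candidate character that begins right after the last cut.
        for end in range(start + 1, total + 1):
            score = confidence(strokes[start:end])
            if score > best_score:
                best_end, best_score = end, score
        if best_end < total:  # the end of the sequence is not reported as a cut
            cuts.append(best_end)
        start = best_end
    return cuts
```

With a toy recognizer that only knows the stroke groups "ab" and "cde", calling `forward_segmentation(list("abcde"), toy)` places a single cut after the second stroke.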

As shown in FIG. 3, the reverse segmentation starts at step s301, in which the reverse segmentation set S is set to the empty set. In step s302, the counter i is initialized to L+1.

Next, in step s303, the counter i is decremented by one. In step s304, the stroke sᵢ in the handwritten stroke sequence is added to the reverse segmentation set S. In step s305, for each stroke sₖ (k = i, i+1, ..., L) in the reverse segmentation set, single-character recognition is performed on the character formed by that stroke together with the strokes following it, and the word confidence Pₖ is calculated. In step s306, it is judged whether the counter i is equal to 1.

If the determination result in step s306 is YES, the process proceeds to step s307, in which the maximum value max{Pₖ} of the confidences Pₖ is searched for. Next, in step s308, the stroke index K corresponding to max{Pₖ} is recorded, and the gap between that stroke and the stroke preceding it is recorded as a reverse segmentation point. In step s309, the reverse segmentation set S is emptied. In step s310, the counter i is set to K, and the process returns to step s303. From this point on, in step s305, k no longer runs up to L but ends at the stroke K-1 preceding the reverse segmentation point, that is, k = i, i+1, ..., K-1. If the determination result in step s306 is NO, the process returns to step s303.
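The reverse flow of FIG. 3 mirrors the forward one and can be sketched in the same hedged way. Again, `confidence` is a hypothetical stand-in for the single-character recognizer, and a cut position p denotes the gap between `strokes[p-1]` and `strokes[p]`; neither name comes from the source. Each round scores every suffix of the strokes before the previous segmentation point and cuts in front of the best-scoring suffix.

```python
from typing import Callable, List, Sequence


def reverse_segmentation(
    strokes: Sequence,
    confidence: Callable[[Sequence], float],  # hypothetical single-character recognizer
) -> List[int]:
    """Greedy reverse segmentation; returns cut positions sorted ascending."""
    cuts: List[int] = []
    end = len(strokes)  # previous segmentation point (initially after the last stroke)
    while end > 0:
        best_start, best_score = end - 1, float("-inf")
        # Score every candidate character that ends right before the last cut.
        for start in range(end - 1, -1, -1):
            score = confidence(strokes[start:end])
            if score > best_score:
                best_start, best_score = start, score
        if best_start > 0:  # the start of the sequence is not reported as a cut
            cuts.append(best_start)
        end = best_start
    return sorted(cuts)
```

Returning the cut positions in ascending order makes them directly comparable with positions produced by a forward pass.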

In the case where both the forward segmentation and the reverse segmentation are performed, if the forward segmentation points and the reverse segmentation points coincide completely, the segmentation points can be provisionally fixed. However, the forward and reverse segmentation points may not coincide completely. In this case, fine segmentation is preferably performed on the strokes between the two segmentation points immediately before and after the non-coinciding segmentation point.
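One way to locate the stroke ranges that need fine segmentation is sketched below. The function name and the convention that a cut position p is the gap between stroke p-1 and stroke p are assumptions made for illustration. Cuts found by both passes are treated as fixed boundaries, and any range between two fixed boundaries that contains a cut found by only one pass is reported.

```python
from typing import Iterable, List, Tuple


def spans_needing_fine_segmentation(
    forward_cuts: Iterable[int],
    reverse_cuts: Iterable[int],
    total_strokes: int,
) -> List[Tuple[int, int]]:
    """Return (start, end) stroke ranges that enclose a non-coinciding cut."""
    forward, reverse = set(forward_cuts), set(reverse_cuts)
    agreed = forward & reverse              # provisionally fixed segmentation points
    disputed = (forward | reverse) - agreed  # found by only one of the two passes
    bounds = sorted(agreed | {0, total_strokes})
    spans = []
    for lo, hi in zip(bounds, bounds[1:]):
        if any(lo < cut < hi for cut in disputed):
            spans.append((lo, hi))
    return spans
```

For instance, with eight strokes, forward cuts at 4 and 6 and reverse cuts at 2, 4 and 6, only the range (0, 4) is returned, because the cut at 2 has no forward counterpart.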

The fine segmentation operation according to the present invention is described below with reference to FIG. 4. As shown, the fine segmentation starts at step s401, in which all segmentation possibilities of the strokes are enumerated, each segmentation possibility corresponding to a segmentation point configuration defined by the number and positions of the segmentation points. Next, in step s402, for each segmentation possibility, the confidence that the strokes between adjacent segmentation points form a single character is calculated, and the total confidence of the segmentation possibility is determined from the calculated word confidences. Finally, in step s403, the segmentation point configuration of the segmentation possibility with the greatest total confidence is determined as the fine segmentation result.

Next, the segmentation operation according to the present invention is illustrated by segmenting the handwritten stroke sequence of the character string 切分 ("segmentation") input in an overlapping manner. By way of illustration, Figure 5 shows the individual strokes of the single characters 切 ("cut") and 分 ("divide"), and Figure 6 shows the effect of inputting them in an overlapping manner.

It is assumed that the forward segmentation points a₁ and a₂ are obtained by performing the forward segmentation operation, as shown in Fig. 7(a), and that the reverse segmentation points b₁, b₂ and b₃ are obtained by performing the reverse segmentation operation. The forward segmentation points a₁ and a₂ coincide with the reverse segmentation points b₂ and b₁, respectively, so these segmentation points can be provisionally fixed. The reverse segmentation point b₃ has no corresponding forward segmentation point. Therefore, the two stroke subsequences immediately before and after b₃, which are not separated by any coinciding segmentation point, are taken as a whole and subjected to fine segmentation.

To this end, two potential segmentation points s₁ and s₂ are first added, as shown in Fig. 7(c). The strokes are combined in every possible way to form stroke combinations C₁, C₂, ..., C₉. All possible segmentation paths are then listed, for example: (1) C₁; (2) C₂C₉; (3) C₄C₅; (4) C₄C₈C₉. For each possible segmentation path, single-character recognition is first performed on each combination making up the path and its word confidence is calculated; the total confidence of the segmentation path is then calculated. Next, the segmentation path with the greatest total confidence is selected, and the corresponding segmentation points are determined as the segmentation result. The optimal path can be computed by dynamic programming, an N-best algorithm, or the like.
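The exhaustive enumeration described above can be sketched as follows. The source does not state how the per-character confidences are combined into a total confidence; the sketch assumes the arithmetic mean, and both that choice and the `confidence` callback (a stand-in for the single-character recognizer) are assumptions made for illustration only.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def fine_segmentation(
    strokes: Sequence,
    confidence: Callable[[Sequence], float],  # hypothetical single-character recognizer
) -> Tuple[List[int], float]:
    """Try every segmentation point configuration and keep the best one."""
    n = len(strokes)
    best_cuts: Tuple[int, ...] = ()
    best_total = float("-inf")
    for count in range(n):  # number of segmentation points: 0 .. n-1
        for cuts in combinations(range(1, n), count):
            bounds = (0, *cuts, n)
            scores = [confidence(strokes[a:b]) for a, b in zip(bounds, bounds[1:])]
            total = sum(scores) / len(scores)  # assumed combination rule: mean
            if total > best_total:
                best_cuts, best_total = cuts, total
    return list(best_cuts), best_total
```

Because n strokes admit 2^(n-1) configurations, this brute-force form is only practical for the short stroke ranges handed over by the coarse segmentation; dynamic programming avoids the exponential enumeration.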

In the case of the N-best method, the N most probable segmentation paths are calculated. The starting point of each stroke is defined as a primitive node, and a partial path spanning one primitive or a combination of primitives corresponds to the respective stroke combination. The cost function of each partial path Y is C(Y) = 1 - f(Y), where f(Y) is the segmentation confidence of that partial path; that is, the higher the segmentation confidence, the smaller the cost function value of the partial path. The N-best method selects the best N paths, namely those for which the sum of the cost function values along the path is the smallest, the second smallest, ..., and the N-th smallest.

The N-best method can be implemented in a variety of ways, for example by combining a dynamic programming (DP) method with a stack algorithm to generate multiple candidates. In the embodiment of the present invention, the N-best method includes two steps. The forward search process adopts an improved Viterbi algorithm (the Viterbi algorithm is a dynamic programming method for finding the most likely sequence of hidden states) to record the states of the best N partial paths reaching each primitive node (i.e., the sums of the cost function values along those paths); the state of the m-th primitive node depends only on the state of the (m-1)-th primitive node. The backward search process uses a stack algorithm based on the A* algorithm. For each node m, the heuristic function is the sum of two functions: the "path cost function", which represents the sum of the cost function values of the shortest path from the starting point to the m-th node, and the "heuristic estimation function", which represents an estimate of the path cost from the m-th node to the target node. In the backward search process, the path score in the stack is the computed full-path score, and the optimal path is always at the top of the stack. The algorithm is therefore globally optimal.
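As a rough illustration of the N-best idea with the cost C(Y) = 1 - f(Y), the sketch below keeps the N cheapest partial paths at every node in a single forward dynamic-programming pass. It is a deliberately simplified substitute and does not reproduce the two-step forward Viterbi plus backward A* stack search described above; `confidence` (playing the role of f) and the cut-position convention are assumptions for illustration.

```python
import heapq
from typing import Callable, List, Sequence, Tuple


def n_best_paths(
    strokes: Sequence,
    confidence: Callable[[Sequence], float],  # hypothetical f(Y) in C(Y) = 1 - f(Y)
    n_best: int = 3,
) -> List[Tuple[float, Tuple[int, ...]]]:
    """Return up to n_best (total cost, cut positions) pairs, cheapest first."""
    total = len(strokes)
    # best[m] holds the cheapest partial paths that end exactly at node m.
    best: List[List[Tuple[float, Tuple[int, ...]]]] = [[] for _ in range(total + 1)]
    best[0] = [(0.0, ())]
    for m in range(1, total + 1):
        candidates = []
        for start in range(m):
            segment_cost = 1.0 - confidence(strokes[start:m])
            for cost, cuts in best[start]:
                new_cuts = cuts + (start,) if start > 0 else cuts
                candidates.append((cost + segment_cost, new_cuts))
        best[m] = heapq.nsmallest(n_best, candidates)
    return best[total]
```

Because path costs are additive, pruning each node to its N cheapest partial paths still yields the N cheapest complete paths at the final node.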

Taking the case shown in FIG. 7(c) as an example, the segmentation path C₁ has a greater total confidence than the other segmentation paths. The corresponding segmentation points are therefore selected as the fine segmentation result, and the segmentation point b₃ obtained in the reverse segmentation operation is eliminated.

After the segmentation points are determined, the characters recognized while performing the segmentation operation can be read out as the handwriting recognition result. Still taking FIG. 7 as an example, the segmentation points a₁ = b₂ and a₂ = b₁ are determined after the coarse segmentation, and no new segmentation points are added by the fine segmentation. The handwriting recognition result can therefore be read as the characters 切 ("cut"), 八 ("eight") and 刀 ("knife") recognized during the coarse segmentation operation. The handwriting recognition result can be post-processed to improve recognition accuracy.

In a specific implementation, it may be determined whether an overlap region exists between the single characters formed by the strokes between segmentation points, and how large that overlap region is. Based on the result, it is determined whether the single characters constitute a composite character. In general, the smaller the overlap region, the more likely the characters are to form a composite character; the larger the overlap region, the less likely. For example, since 八 ("eight") and 刀 ("knife") have no overlap region, or only an extremely small one, it can be judged that the two constitute the composite character 分 ("divide").
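A possible form of this overlap test is sketched below. The source does not specify how the overlap region is measured or what threshold is used; the sketch assumes axis-aligned bounding boxes given as (x0, y0, x1, y1), measures overlap relative to the smaller box, and uses a hypothetical threshold `max_ratio`.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed representation


def overlap_ratio(box_a: Box, box_b: Box) -> float:
    """Intersection area divided by the area of the smaller box (0.0 if disjoint)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    width = min(ax1, bx1) - max(ax0, bx0)
    height = min(ay1, by1) - max(ay0, by0)
    if width <= 0 or height <= 0:
        return 0.0
    smaller = min((ax1 - ax0) * (ay1 - ay0), (bx1 - bx0) * (by1 - by0))
    return (width * height) / smaller if smaller > 0 else 0.0


def may_form_composite(box_a: Box, box_b: Box, max_ratio: float = 0.1) -> bool:
    """Small or no overlap suggests two parts of one composite character."""
    return overlap_ratio(box_a, box_b) <= max_ratio
```

Two parts written one above the other, such as boxes (0, 0, 10, 4) and (2, 5, 8, 10), do not overlap and pass the test, whereas two characters written on top of each other do not.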

In addition, text recognition can be assisted by language and/or writing rules. For example, when recognizing a hiragana sequence input in an overlapping manner, the following approach can be used to distinguish full-size kana from small kana: つ versus っ (sokuon, 促音), and やゆよ versus ゃゅょ (yōon, 拗音). Specifically, for ゃゅょ, if the previously input character is one of "きぎしじちぢにひぴびみり" and the size of the current character is significantly smaller than that of the previously input character, the current character is determined to be a small kana; otherwise, it is determined to be a full-size kana. For っ, its size can first be compared with the sizes of several characters in its context, and rules (such as dictionary matching rules) can then be used to determine whether it is small or full-size.
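The yōon rule can be sketched as below. The preceding-character set is taken from the text; the use of character height as the size measure and the threshold `ratio` for "significantly smaller" are hypothetical choices, since the source gives neither.

```python
YOON_PRECEDERS = set("きぎしじちぢにひぴびみり")  # characters listed in the description
SMALL_FORM = {"や": "ゃ", "ゆ": "ゅ", "よ": "ょ"}


def resolve_yoon(
    previous_char: str,
    previous_height: float,
    char: str,
    height: float,
    ratio: float = 0.7,  # hypothetical threshold for "significantly smaller"
) -> str:
    """Return the small kana when the writing rule applies, else the input character."""
    if (
        char in SMALL_FORM
        and previous_char in YOON_PRECEDERS
        and height < ratio * previous_height
    ):
        return SMALL_FORM[char]
    return char
```

Under these assumptions, a や written at half the height of a preceding き resolves to ゃ, while the same や after あ, or at nearly full height, stays full-size.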

To further improve recognition accuracy, overlapping character templates may be trained, and the handwritten stroke sequence may be segmented based on the degree of matching of some or all of the strokes in the received handwritten stroke sequence with the overlapping character templates. Taking the overlap of two hiragana characters as an example, each of the 84 hiragana characters can be combined with each of the 84 hiragana characters to form 84*84 overlapping character templates "ああ", "あい", "あう", "あえ", "あお", ..., "あん", ..., "いう", "いえ", "いお", ..., "いん", ..., and so on.
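The set of template labels is simply the Cartesian product of the character set with itself, as the short sketch below shows. The function name is hypothetical, and the three-character input in the usage note is a shortened stand-in, since the source does not list the 84 hiragana it uses.

```python
from itertools import product
from typing import Iterable, List


def overlapping_template_labels(characters: Iterable[str]) -> List[str]:
    """Label every ordered pair of characters, first character written first."""
    return [first + second for first, second in product(characters, repeat=2)]
```

For the three characters "あいう" this yields nine labels beginning with "ああ", "あい", "あう"; a set of 84 characters yields 84 * 84 = 7056 labels.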

To help the user distinguish the strokes of characters already entered from the strokes of the character currently being written, the handwriting recognition method of the present invention supports displaying in a light color, or blanking, the characters already recognized as complete while the user inputs strokes. An example flow for implementing this function is described below with reference to FIG. 8.

First, in step s801, the counter n is initialized to zero. In step s802, the method waits for the user to input a new stroke; after the user inputs a new stroke, handwriting recognition is performed on the stroke sequence input so far, and the character string C₁C₂...Cₖ is recognized.

Next, in step s803, it is judged whether the newly input stroke is the first stroke of the last character Cₖ in the character string. If yes, the process proceeds to step s804; otherwise, it proceeds to step s805. In step s804, it is determined whether the second-to-last character Cₖ₋₁ in the character string is the same as the last character C′ₖ of the character string recognized after the user input the previous stroke. If they are the same, step s806 is performed; otherwise, step s809 is performed. In step s805, it is determined whether the second-to-last character Cₖ₋₁ in the character string is the same as the second-to-last character C′ₖ₋₁ of the character string recognized after the user input the previous stroke. If they are the same, step s806 is performed; otherwise, step s809 is performed.

In step s806, the counter n is set to 1. Next, step s807 is performed to determine whether the number of strokes of the second-to-last character Cₖ₋₁ is greater than two. If yes, the process proceeds to step s808; otherwise, it returns to step s802.

In step s808, the second-to-last character Cₖ₋₁ and the characters before it are displayed in a light color or blanked. In step s809, n is reset to zero. The process then returns to step s802.
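The decision made in steps s803 to s808 can be condensed into a single predicate, sketched below. The function and parameter names are hypothetical, and the counter n is omitted because the description does not say how its value is used.

```python
from typing import Sequence


def should_fade(
    current: str,                       # string recognized after the new stroke
    previous: str,                      # string recognized after the previous stroke
    new_stroke_starts_last_char: bool,  # outcome of step s803
    stroke_counts: Sequence[int],       # stroke count of each character in `current`
) -> bool:
    """Return True when the second-to-last character and its predecessors may be faded."""
    if len(current) < 2 or not previous:
        return False
    penultimate = current[-2]
    if new_stroke_starts_last_char:
        reference = previous[-1]        # step s804: compare with previous last character
    else:
        if len(previous) < 2:
            return False
        reference = previous[-2]        # step s805: compare with previous second-to-last
    if penultimate != reference:
        return False                    # step s809: recognition changed, do not fade
    return stroke_counts[-2] > 2        # step s807: guard against mis-segmented short characters
```

For example, if the previous result was "切" and a new stroke starts a second character, the four-stroke 切 may be faded, whereas a one-stroke 一 in the same position is left untouched.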

Table 1 gives, in tabular form, a step-by-step breakdown of the fading of preceding characters for the overlapping Japanese input "重叠". The serial number column in Table 1 indicates the number of strokes input by the user (i.e., the number of rounds in which step s802 is performed).

Table 1

It should be noted that, in step s807, the fade processing is performed only when the number of strokes of the second-to-last character is greater than 2. This is based on the following consideration: during segmentation, when only a few strokes have been input, the stroke sequence is often mis-segmented into single-stroke or two-stroke characters (for example, the Chinese characters 一 ("one") and 二 ("two")). If the preceding character were faded at this point, the display would be incorrect. For example, in row No. 2 of Table 1, a mis-segmented character would be faded out.

Fig. 9 shows the actual effect of the above fade display processing. As can be seen from the figure, the user can clearly distinguish between the stroke of the entered character and the stroke of the character currently being written.

Corresponding to the handwriting recognition method described above, the present invention also proposes a related handwriting recognition device. FIG. 10 shows a schematic block diagram of such a handwriting recognition device 1000.

As shown in the figure, the handwriting recognition apparatus 1000 according to the present invention includes a handwriting input device 1100, a handwritten information storage device 1200, a handwritten character string recognition device 1300, an identification candidate selection device 1400, and a display control device 1500.

The handwriting input device 1100 is configured to receive the sequence of strokes input by the user and digitize it to obtain handwriting data for use by the other devices. The handwritten information storage device 1200 is used to store the handwriting data and other information generated during the handwriting process.

The handwritten character string recognition device 1300 may include a handwriting segmentation unit 1310, a single character/overlapping character recognition unit 1320, and a post-processing unit 1330. The handwriting segmentation unit 1310 can invoke the single character/overlapping character recognition unit 1320 to segment the received handwritten stroke sequence into characters based on the single-character credibility and also based on the degree of matching of some or all of the strokes in the received handwritten stroke sequence with the overlapping character templates. The post-processing unit 1330 may determine whether the recognized single characters constitute a composite character; correct the recognition result based on language and/or writing rules; and/or display in a light color or blank the recognized complete characters while the user is inputting strokes.

The recognition candidate selection device 1400 presents recognition candidates to the user so that the user can select the correct recognition result. The display control device 1500 controls the display of constantly changing content such as the handwriting, the recognition candidates, and the final recognition result.

The handwriting recognition method and apparatus according to the present invention can be applied to various electronic devices that support handwriting input, such as an electronic whiteboard, a tablet computer, a desktop computer, a laptop computer, a personal digital assistant, a mobile phone, and the like. In addition, the principle applies not only to Chinese and Japanese but also to a variety of other languages (such as Korean).

It should be noted that, in the above description, the technical solutions of the present invention are shown by way of example only, and the invention is not limited to the above steps and unit structures. Where possible, the steps and unit structures can be adjusted, retained, or omitted as needed. Therefore, certain steps and units are not essential for carrying out the general inventive concept of the invention. The technical features necessary for the present invention are thus limited only by the minimum requirements for realizing the general inventive concept, and not by the above specific examples.

The invention has thus far been described in connection with the preferred embodiments. It will be appreciated that various other changes, substitutions and additions may be made by those skilled in the art without departing from the spirit and scope of the invention. Therefore, the scope of the invention is not limited to the specific embodiments described above, but is defined by the appended claims.

Claims

A handwriting recognition method comprising:

Receiving a sequence of handwritten strokes that the user continuously inputs in the same input area;

Segmenting the received handwritten stroke sequence into characters based on single-character credibility.
The method of claim 1, wherein segmenting the received handwritten stroke sequence based on single-character credibility includes forward segmentation and/or reverse segmentation,

The forward segmentation determines the segmentation point of the received handwritten stroke sequence in the same order as the stroke input.

The reverse segmentation determines the segmentation point of the received handwritten stroke sequence in the reverse order of the stroke input.
The method of claim 2 wherein said forward segmentation comprises:

Reading the strokes of the received handwritten stroke sequence that follow the previous segmentation point into a forward segmentation set;

Calculating, for each stroke in the forward segmentation set, the credibility that the stroke and its preceding strokes form a single character;

Determining the gap between the stroke with the greatest single-character credibility and its subsequent stroke as the segmentation point;

Repeating the above three steps,

Wherein, when the three steps are performed for the first time, the previous segmentation point is located before the first input stroke.
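As an illustration only, the forward segmentation loop can be sketched in Python as follows. The sketch assumes that every stroke after the most recently determined segmentation point is read into the set, and that a scoring function returns the credibility that a group of strokes forms one character. The names `forward_segment`, `toy_confidence` and `KNOWN` are hypothetical, and the toy scorer merely stands in for a real single-character recognizer.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Stroke = str  # stand-in for real stroke data (point lists)
Scorer = Callable[[Sequence[Stroke]], float]


def forward_segment(strokes: Sequence[Stroke], confidence: Scorer) -> List[int]:
    """Return segmentation points as indices into `strokes`.

    A value c means "cut in the gap after stroke c-1, before stroke c".
    """
    cuts: List[int] = []
    start = 0  # first round: the segmentation point lies before the first stroke
    while start < len(strokes):
        best_end, best_score = start + 1, float("-inf")
        # Each candidate ends at one stroke of the set and includes all
        # strokes since the most recently determined segmentation point.
        for end in range(start + 1, len(strokes) + 1):
            score = confidence(strokes[start:end])
            if score > best_score:
                best_end, best_score = end, score
        cuts.append(best_end)  # gap after the most credible stroke
        start = best_end       # repeat from the new segmentation point
    return cuts


# Toy scorer (assumption): a lookup table instead of a trained recognizer.
KNOWN: Dict[Tuple[Stroke, ...], float] = {
    ("a", "b"): 0.9,
    ("c",): 0.8,
    ("a",): 0.4,
}


def toy_confidence(group: Sequence[Stroke]) -> float:
    return KNOWN.get(tuple(group), 0.05)
```

Reverse segmentation would mirror this loop, starting after the last input stroke and growing each candidate backwards.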
The method of claim 2 wherein said reverse segmentation comprises:

Reading the strokes of the received handwritten stroke sequence that precede the previous segmentation point into a reverse segmentation set;

Calculating, for each stroke in the reverse segmentation set, the credibility that the stroke and its subsequent strokes form a single character;

Determining the gap between the stroke with the greatest single-character credibility and its previous stroke as the segmentation point;

Repeating the above three steps,

Wherein, when the three steps are performed for the first time, the previous segmentation point is located after the last input stroke.
The method according to claim 2, wherein if the segmentation points determined by the forward segmentation and the reverse segmentation do not coincide, fine segmentation is performed on the strokes lying between the two segmentation points before and after the non-coinciding segmentation point, wherein the fine segmentation includes:

Listing all the segmentation possibilities of the strokes, wherein each segmentation possibility corresponds to a segmentation point configuration defined by the number and positions of the segmentation points;

For each segmentation possibility, calculating the credibility that the strokes between the segmentation points form a single character, and determining the total credibility of the segmentation from the calculated single-character credibilities;

Determining the segmentation point configuration corresponding to the segmentation possibility with the largest total credibility as the segmentation result.
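As an illustration only, the exhaustive enumeration used by fine segmentation can be sketched in Python. The text does not state how single-character credibilities are combined into a total; the sketch assumes their arithmetic mean. The names `fine_segment`, `toy_confidence` and `KNOWN` are hypothetical, and the toy scorer stands in for a real recognizer.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Stroke = str  # stand-in for real stroke data
Scorer = Callable[[Sequence[Stroke]], float]


def fine_segment(strokes: Sequence[Stroke],
                 confidence: Scorer) -> Tuple[List[int], float]:
    """Try every segmentation point configuration and keep the best one.

    Returns (internal cut indices, total credibility). A cut index c means
    "cut between stroke c-1 and stroke c".
    """
    n = len(strokes)
    if n == 0:
        return [], 0.0
    best_cuts: List[int] = []
    best_total = float("-inf")
    # A configuration is any subset of the n-1 internal gaps.
    for k in range(n):
        for cuts in combinations(range(1, n), k):
            bounds = [0, *cuts, n]
            scores = [confidence(strokes[a:b])
                      for a, b in zip(bounds, bounds[1:])]
            total = sum(scores) / len(scores)  # assumption: mean credibility
            if total > best_total:
                best_cuts, best_total = list(cuts), total
    return best_cuts, best_total


# Toy scorer (assumption): a lookup table instead of a trained recognizer.
KNOWN: Dict[Tuple[Stroke, ...], float] = {
    ("a", "b"): 0.9,
    ("c", "d"): 0.8,
    ("a",): 0.4,
    ("b", "c", "d"): 0.5,
}


def toy_confidence(group: Sequence[Stroke]) -> float:
    return KNOWN.get(tuple(group), 0.05)
```

The number of configurations grows as 2^(n-1) for n strokes, which is presumably why the enumeration is restricted to the short span where the forward and reverse results disagree.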
The method of claim 1 further comprising:

Determining whether there is an overlap area between the single characters formed by the strokes between the segmentation points, and the size of the overlap area;

Based on the determination, determining whether the single characters constitute a composite character.
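As an illustration only, the overlap test can be sketched in Python using bounding boxes. This passage does not say how the size of the overlap area maps to the decision. The sketch assumes that, because separate characters are written on top of one another in the same input area, two neighbouring segments whose boxes barely overlap are more likely to be parts of one composite character. The function names and the `max_ratio` threshold of 0.2 are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
StrokePoints = List[Point]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def bounding_box(strokes: List[StrokePoints]) -> Box:
    """Smallest axis-aligned box containing every point of every stroke."""
    xs = [x for stroke in strokes for x, _ in stroke]
    ys = [y for stroke in strokes for _, y in stroke]
    return min(xs), min(ys), max(xs), max(ys)


def box_area(box: Box) -> float:
    return (box[2] - box[0]) * (box[3] - box[1])


def overlap_area(a: Box, b: Box) -> float:
    """Area of the intersection of two boxes (0.0 if they are disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def overlap_ratio(a: Box, b: Box) -> float:
    """Overlap area relative to the smaller of the two boxes."""
    smaller = min(box_area(a), box_area(b))
    return overlap_area(a, b) / smaller if smaller > 0 else 0.0


def is_composite(first: List[StrokePoints],
                 second: List[StrokePoints],
                 max_ratio: float = 0.2) -> bool:
    """Assumed rule: little or no overlap suggests one composite character."""
    ratio = overlap_ratio(bounding_box(first), bounding_box(second))
    return ratio < max_ratio
```

A practical system would likely combine this geometric cue with the recognizer's credibility for the merged strokes rather than rely on a fixed threshold alone.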
The method of claim 1 further comprising: displaying in a light color or blanking the recognized complete characters while the user is inputting strokes.
The method according to claim 7, wherein displaying in a light color or blanking the recognized complete characters while the user is inputting strokes comprises:

After the user newly inputs a stroke, performing handwriting recognition on the sequence of strokes input by the user, thereby recognizing a character string;

If the newly input stroke is the first stroke of the last character in the character string and the second-to-last character in the character string is the same as the last character of the character string recognized after the user input the previous stroke, or if the newly input stroke is not the first stroke of the last character in the character string and the second-to-last character in the character string is the same as the second-to-last character of the character string recognized after the user input the previous stroke, determining whether the number of strokes of the second-to-last character is greater than 2;

If the number of strokes of the second-to-last character is greater than 2, displaying in a light color or blanking the second-to-last character and the characters before it.
The method of claim 1 wherein segmenting the received handwritten stroke sequence is further based on a degree of matching of some or all of the strokes in the received sequence of handwritten strokes with the overlapping character template.
The method of claim 9 wherein each overlapping character template consists of two overlapping characters.
The method of claim 1 wherein character recognition is aided by language and/or writing rules.
A handwriting recognition device comprising:

a receiving device, configured to receive a sequence of handwritten strokes continuously input by the user in the same input area;

a segmentation device, configured to segment the received handwritten stroke sequence into characters based on single-character credibility.
The apparatus of claim 12 wherein said segmentation device comprises a forward segmentation device and/or a reverse segmentation device,

The forward segmentation device is configured to determine the segmentation points of the received handwritten stroke sequence in the same order as the stroke input.

The reverse segmentation device is configured to determine the segmentation points of the received handwritten stroke sequence in the order opposite to the stroke input.
The apparatus of claim 13 wherein said forward segmentation device comprises:

a forward segmentation set forming unit, configured to read the strokes of the received handwritten stroke sequence that follow the previous segmentation point into a forward segmentation set;

a single-character credibility calculation unit, configured to calculate, for each stroke in the forward segmentation set, the credibility that the stroke and its preceding strokes form a single character;

a segmentation point determining unit for determining the gap between the stroke with the greatest single-character credibility and its subsequent stroke as a segmentation point;

a control unit for controlling the above three units to repeatedly perform their respective functions,

Wherein, when the forward segmentation set forming unit performs its function for the first time, the previous segmentation point is located before the first input stroke.
The apparatus of claim 13 wherein said reverse segmentation device comprises:

a reverse segmentation set forming unit, configured to read the strokes of the received handwritten stroke sequence that precede the previous segmentation point into a reverse segmentation set;

a single-character credibility calculation unit, configured to calculate, for each stroke in the reverse segmentation set, the credibility that the stroke and its subsequent strokes form a single character;

a segmentation point determining unit for determining the gap between the stroke with the greatest single-character credibility and its previous stroke as a segmentation point;

a control unit for controlling the above three units to repeatedly perform their respective functions,

Wherein, when the reverse segmentation set forming unit performs its function for the first time, the previous segmentation point is located after the last input stroke.
The apparatus according to claim 13, further comprising: a fine segmentation device for, in the case where the segmentation points determined by the forward segmentation and the reverse segmentation do not coincide, performing fine segmentation on the strokes lying between the two segmentation points before and after the non-coinciding segmentation point, wherein

The fine segmentation device comprises:

A segmentation possibility enumeration unit for enumerating all the segmentation possibilities of the strokes, wherein each segmentation possibility corresponds to a segmentation point configuration defined by the number and positions of the segmentation points;

A credibility calculation unit, configured to calculate, for each segmentation possibility, the credibility that the strokes between the segmentation points form a single character, and to determine the total credibility of the segmentation from the calculated single-character credibilities; and

A segmentation result determining unit, configured to determine the segmentation point configuration corresponding to the segmentation possibility with the largest total credibility as the segmentation result.
The apparatus of claim 12, further comprising a post-processing device, the post-processing device comprising:

An overlap area evaluation unit, configured to determine whether there is an overlap area between the single characters formed by the strokes between the segmentation points, and the size of the overlap area;

A composite character determining unit, configured to determine, based on the determination, whether the single characters constitute a composite character.
The apparatus according to claim 12, wherein said post-processing means is further configured to display in a light color or blank out the recognized complete characters while the user is inputting strokes.
The apparatus of claim 18, wherein the post-processing device further comprises:

a character string recognition unit, configured to perform handwriting recognition on the stroke sequence input by the user after the user newly inputs a stroke, thereby recognizing a character string;

a judging unit, configured to determine whether the number of strokes of the second-to-last character is greater than 2, in the case where the newly input stroke is the first stroke of the last character in the character string and the second-to-last character in the character string is the same as the last character of the character string recognized after the user input the previous stroke, or in the case where the newly input stroke is not the first stroke of the last character in the character string and the second-to-last character in the character string is the same as the second-to-last character of the character string recognized after the user input the previous stroke;

a light color display or blanking unit for displaying in a light color or blanking the second-to-last character and the characters before it in the case where the number of strokes of the second-to-last character is greater than 2.
The apparatus according to claim 12, wherein said segmentation means further performs segmentation hyphenation on the received sequence of handwritten strokes based on a degree of matching of some or all of the strokes in the received sequence of handwritten strokes with the overlapping character templates.
The apparatus of claim 20 wherein each overlapping character template is comprised of two overlapping characters.
The apparatus of claim 12, further comprising: post-processing means configured to utilize language and/or writing rules to assist in character recognition.