WO2006019537A1 - Method and system for handwriting recognition using background pixels - Google Patents

Method and system for handwriting recognition using background pixels

Info

Publication number
WO2006019537A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature vectors
background
pixels
foreground
characters
Prior art date
Application number
PCT/US2005/023320
Other languages
French (fr)
Inventor
Feng Jun Guo
Yong Ge
Li-Xin Zhen
Original Assignee
Motorola Inc.
Priority date
Filing date
Publication date
Application filed by Motorola Inc.
Publication of WO2006019537A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/18: Extraction of features or characteristics of the image
    • G06V30/182: Extraction of features or characteristics of the image by coding the contour of the pattern

Abstract

A method and system for handwriting recognition uses independent and redundant measurements of features of input characters (100), resulting in improved accuracy. The system includes a microprocessor (163), a read only memory (ROM) (154) operatively connected to the microprocessor (163), a programmable memory (166) operatively connected to the microprocessor (163), and an input tablet (169) operatively connected to the microprocessor (163). The microprocessor (163) is operative to execute code stored in the ROM (154) to receive (step 205) a representation of a handwritten input character (100) scribed on the input tablet (169), the input character (100) defined by foreground pixels (105) adjacent background pixels (110), extract (step 210) foreground directional feature vectors from the foreground pixels (105), extract (step 215) background concavity feature vectors from the background pixels (110), and determine (step 220) a matching candidate character by matching the foreground directional feature vectors and the background concavity feature vectors with templates of model characters.

Description

METHOD AND SYSTEM FOR HANDWRITING RECOGNITION USING BACKGROUND PIXELS
FIELD OF THE INVENTION
The invention relates generally to handwriting recognition processes.
The invention is particularly useful for, but not necessarily limited to, recognizing handwriting scribed on handheld electronic devices.
BACKGROUND OF THE INVENTION
Computer recognition of printed text has presented vexing technical difficulties for several decades. Many optical character recognition (OCR) techniques have only recently been refined to enable scanned text documents to be processed with a high level of accuracy. However, such OCR techniques generally are able to interpret only text that is printed by systems that create highly uniform and clear characters, such as electronic printers. Computer recognition of handwritten characters remains a very difficult technical challenge.
Existing methods for recognizing handwritten characters often include high-resolution templates that capture a time dimension related to the act of writing text in addition to the two physical dimensions of the written text.
Such templates are usually created using electronic tablets that record text strokes when a pen, stylus or finger contacts the tablet. Other techniques include the use of electronic pens that record the movement of the pen tip as characters are written. Directional feature vectors corresponding to the input handwritten characters are then matched with templates of model characters using various pattern recognition techniques. The templates of model characters comprise the statistical average values of directional feature vectors from numerous input samples of each character.
When handwriting on an electronic tablet, ideally a writer is able to see what he or she has written — such as through the use of electronic "ink" on a graphical user interface — so that the writer can control the legibility of the text. However, some handheld electronic devices such as mobile phones and personal digital assistants (PDAs) include small touch pads where users must write characters one on top of the other without receiving any feedback, for example feedback from a graphical user interface that displays how a device has interpreted the written input. Thus characters are written "blind." Not surprisingly, characters written on such touch pads tend to include significant distortions in the character shapes, particularly concerning more complex ideographic characters such as many Chinese and Japanese characters. The accurate electronic recognition of characters written on such touch pads is thus especially difficult.
There is therefore a need for an improved method and system for handwriting recognition that is able to overcome significant character shape distortions, such as distortions introduced through blind writing on touch pads, and to recognize the intended written characters.
SUMMARY OF THE INVENTION
According to one aspect, the present invention is therefore an improved method of handwriting recognition. It includes receiving a representation of a handwritten input character scribed on a user interface of an electronic device, the input character defined by foreground pixels adjacent background pixels on the user interface. Foreground directional feature vectors are then extracted from the foreground pixels, and background concavity feature vectors are extracted from the background pixels. A matching candidate character is then determined by matching the foreground directional feature vectors and the background concavity feature vectors with templates of model characters.
Preferably, the invention further includes preprocessing the handwritten input character by performing smoothing, noise removal and size normalization processes. Preferably, the step of determining a matching candidate character further includes the steps of providing a first short list of candidate characters and corresponding first vector distances (d1) by matching the foreground directional feature vectors with the templates of model characters, providing a second short list of candidate characters and corresponding second vector distances (d2) by matching the background concavity feature vectors with the templates of model characters, and combining the first vector distances (d1) and the second vector distances (d2) according to the following formula: dcomb = W1*d1 + W2*d2, where dcomb is a weighted vector distance, W1 and W2 are empirical weighting factors, and W1 + W2 = 1, wherein the foreground directional feature vectors and the background concavity feature vectors are matched with the templates of model characters using the weighted vector distance dcomb.
Preferably, the method also includes reducing a size of the representation of the handwritten input character before extracting background concavity feature vectors from the background pixels.
Preferably, the step of extracting background concavity feature vectors from the background pixels comprises searching for foreground pixels in four directions away from each background pixel. Preferably, the templates of model characters comprise the statistical average values of directional feature vectors from numerous input samples.
Preferably, the method further includes re-sampling the input character to eliminate duplicate pixels and to normalize irregularities in pixel density.
Preferably, the empirical weighting factors W1 and W2 are defined during an iterative learning process based on the writing idiosyncrasies of an individual user of the electronic device.
According to another aspect, the present invention is a system for handwriting recognition including a microprocessor, a read only memory (ROM) operatively connected to the microprocessor, a programmable memory operatively connected to the microprocessor, and an input tablet operatively connected to the microprocessor. The microprocessor operates to execute code stored in the ROM to receive a representation of a handwritten input character scribed on the input tablet, the input character defined by foreground pixels adjacent background pixels, extract foreground directional feature vectors from the foreground pixels, extract background concavity feature vectors from the background pixels, and determine a matching candidate character by matching the foreground directional feature vectors and the background concavity feature vectors with templates of model characters.
In this specification, including the claims, the terms "comprises," "including," "comprising" or similar terms are intended to mean a non-exclusive inclusion, such that a method or apparatus that comprises a list of elements does not include those elements solely, but may well include other elements not listed.
BRIEF DESCRIPTION OF THE DRAWINGS In order that the invention may be readily understood and put into practical effect, reference will now be made to a preferred embodiment as illustrated with reference to the accompanying drawings, wherein like reference numbers refer to like elements, in which:
FIG. 1 is a schematic diagram illustrating a representation of a handwritten input character, in the form of a lower case Roman letter "e", that is scribed on an electronic device according to an embodiment of the present invention;
FIG. 2 is a generalized flow diagram illustrating a method of handwriting recognition according to an embodiment of the present invention;
FIG. 3 is another schematic diagram of an input character as scribed on a pixelated tablet of an electronic device, which further illustrates stroke directions used to form the character, according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating eight stroke directions used to define directional feature vectors according to an embodiment of the present invention;
FIG. 5 is another schematic diagram of an input character in the form of a lower case Roman letter "e", illustrating a boundary rectangle and background pixels, according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating four Freeman directions used in searching for foreground pixels according to an embodiment of the present invention;
FIG. 7 is a table that represents a background concavity feature vector according to an embodiment of the present invention;
FIGS. 8 and 9 are generalized flow diagrams illustrating a more detailed description of a method of handwriting recognition according to an embodiment of the present invention; and
FIG. 10 is a schematic diagram illustrating a prior art mobile telephone including an input tablet on which representations of handwritten characters may be scribed.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT OF THE INVENTION
Referring to FIG. 1, there is a schematic diagram illustrating a representation of a handwritten input character 100 that is scribed on an electronic device according to an embodiment of the present invention. The character 100, in the form of a lower case Roman letter "e", comprises discrete foreground pixels 105 and background pixels 110. The foreground pixels 105 are generally of one colour such as black and form the lines and shapes of the input character 100. The background pixels 110 are generally of a sharply contrasting colour such as white. The size of the pixels 105, 110 varies depending on an image resolution setting of the electronic device, where a higher resolution results in smaller pixels 105, 110.
Referring to FIG. 2, there is a generalized flow diagram illustrating a method 200 of handwriting recognition according to an embodiment of the present invention. The method 200 determines at least one matching model character candidate that best matches a representation of a handwritten input character 100 that is scribed on an electronic device. The method 200 begins at step 205 where the representation of the handwritten input character 100 is received when a user scribes the character 100 on the device using, for example, a stylus or finger. Next, at step 210 foreground directional feature vectors, as described in further detail below, are extracted from the foreground pixels 105. At step 215 background concavity feature vectors are extracted from the background pixels 110. The background concavity feature vectors include information about the shape of character strokes formed by the foreground pixels 105 based on searches for foreground pixels 105, where the searches begin from individual background pixels 110. That process is also described in more detail below. Finally, at step 220 the method 200 determines a matching candidate character by matching the foreground directional feature vectors and the background concavity feature vectors with templates of model characters. Methods for constructing templates of model characters were described briefly above and are well known in the art.
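The overall flow of method 200 can be summarized in code. The following is a minimal sketch, not the patented implementation: the function names, the template layout, and the equal default weights are assumptions, and the two feature extractors are sketched further below.

```python
import numpy as np

def recognize_character(strokes, bitmap, templates, w1=0.5, w2=0.5):
    """Minimal sketch of method 200. `strokes` are the recorded pen points,
    `bitmap` is a 0/1 NumPy array of the rendered character, and `templates`
    maps each model character to its (directional, concavity) feature vectors."""
    fg = extract_directional_features(strokes)   # step 210, sketched below
    bg = extract_concavity_features(bitmap)      # step 215, sketched below
    # Step 220: rank model characters by the weighted distance W1*d1 + W2*d2.
    best = None
    for char, (t_fg, t_bg) in templates.items():
        d1 = np.linalg.norm(fg - t_fg)           # Euclidean; city-block also works
        d2 = np.linalg.norm(bg - t_bg)
        d_comb = w1 * d1 + w2 * d2
        if best is None or d_comb < best[0]:
            best = (d_comb, char)
    return best[1]
```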
The method 200 may be incorporated into handheld electronic devices such as personal digital assistants (PDAs) and mobile telephones to provide an improved handwriting recognition capability. Because there is no direct correlation between the data used to create the foreground directional feature vectors and the data used to create the background concavity feature vectors, the method 200 includes a redundancy that improves accuracy. The redundancy enables errors in one type of vector, either in the foreground directional feature vectors or in the background concavity feature vectors, to be compensated by accuracies in the other type of vector. Further details of the method 200 are now described below.
Referring to FIG. 3, there is another schematic diagram of an input character 100 in the form of a lower case Roman letter "e" as scribed on a pixelated tablet of an electronic device, which further illustrates a stroke direction 305 used to form the input character 100. The letter "e" is an example of a character 100 that is provided at step 205 of method 200, where an electronic device receives a representation of a handwritten input character 100. Using the stroke directions 305, at step 210 the method 200 extracts directional feature vectors from the input character 100 according to methods well known in the art. For example, foreground pixels 105 of the input character 100 may be first fit to an NxN grid and normalized so that the size of the input character 100 is the same as the size of model characters used to create model character directional feature vectors. Each element of the NxN grid is then subdivided into an even finer mesh that is then analysed to extract a foreground directional feature vector. One example of a foreground directional feature vector is an 8-dimension directional feature vector. Each dimension of an 8-dimension foreground directional feature vector corresponds to a stroke direction 305 used to create the input character 100. Referring to FIG. 4, there is a diagram illustrating that the eight stroke directions 305 may be created by dividing a circle into 45-degree increments. Those skilled in the art will recognize that more or fewer dimensions for the foreground directional feature vectors may also be used according to the present invention. Each element of the mesh that contains foreground pixels 105 from a handwritten stroke is then assigned to one of the eight directions 305 based on a best fit of the actual stroke direction inside that element. The directional dimensions of the elements of the mesh are then summed to create a directional feature vector. An 8-dimension directional feature vector may be defined as
V = {v1, v2, v3, v4, v5, v6, v7, v8}, where the value of vi is the count of the i-th directional dimension in the mesh, and 1 <= i <= 8. All the 8-dimension directional feature vectors in each element of the NxN grid are then averaged. Finally, an 8xNxN-dimension directional feature vector is derived for the entire input character.
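A sketch of this extraction follows. The grid size, the normalized box size, and the use of raw stroke segments to estimate direction (rather than the finer mesh described above) are simplifying assumptions.

```python
import numpy as np

def extract_directional_features(strokes, n=4, size=64):
    """Sketch: accumulate an 8-bin direction histogram per N x N grid cell.
    `strokes` are lists of (x, y) points already normalized into a
    size x size box; each segment votes for one of eight 45-degree bins."""
    hist = np.zeros((n, n, 8))
    for stroke in strokes:
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            angle = np.arctan2(y1 - y0, x1 - x0)        # segment direction, -pi..pi
            d = int(round(angle / (np.pi / 4))) % 8     # quantize to 45-degree bins
            row = min(int(y0 * n / size), n - 1)        # grid cell of segment start
            col = min(int(x0 * n / size), n - 1)
            hist[row, col, d] += 1
    return hist.reshape(-1)                             # 8 x N x N dimensions
```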
Next, at step 215 of the method 200, background concavity feature vectors are extracted from the background pixels 110. Step 215 is further explained with reference to FIGS. 5-7. Referring to FIG. 5, there is again a schematic diagram of an input character 100 in the form of a lower case Roman letter "e" as scribed on a pixelated tablet of an electronic device. A normalisation step, as described in more detail below, identifies a region surrounding the input character 100 by defining a boundary rectangle 505 that circumscribes the input character 100. The background pixels 110 inside the boundary rectangle 505, such as pixels 110 "q" and "p" shown in FIG. 5, are then analysed as follows.
A search for foreground pixels 105 is conducted beginning at each background pixel 110 and extending away from each background pixel 110 in four directions until either the boundary rectangle 505 is reached or foreground pixels 105 are reached. The four directions that are searched, known as
Freeman directions, may be labelled 0, 1, 2, and 3, respectively, as shown in FIG. 6. The number of Freeman directions in which foreground pixels 105 are located then gives a rough approximation of the extent to which a background pixel 110 is located inside a closed contour defined by foreground pixels 105. Such an approximation is called a measure of concavity, and the location and extent of concavities can be used effectively in character recognition. That is, a measure of the concavity features of an input character 100 can be compared with a measure of the concavity features of templates of model characters to determine whether there is a match. According to the present invention, a background concavity feature vector defines the measure of concavity of an input character 100. Referring to FIG. 7, there is a table that represents a background concavity feature vector for the input character 100 shown in FIG. 5. The bottom row 710 of the table assigns a unique number to each column of the table, where each column represents a unique possible permutation that defines the results of a search from a background pixel 110. The top row 715 includes the number of directions in which a foreground pixel 105 was reached during searches from each background pixel 110. The third row 720 then identifies which directions did not reach any foreground pixels 105. Finally, the second row 725 is a counter that is incremented for each background pixel 110 that satisfies the definitions of the top row 715 and the third row 720.
For example, a search in four directions extending away from the background pixel 110 labelled "q" in FIG. 5 reaches foreground pixels 105 in two directions, directions 1 and 2. Thus the search does not reach foreground pixels 105 in the other two directions 3 and 0. Therefore the permutation 3 in row 710 defines pixel "q" and its associated counter in row 725 would be incremented by 1. Similarly, the background pixel 110 labelled "p" in FIG. 5 reaches foreground pixels 105 in three directions, directions 0, 2, and 3. Thus the search does not reach foreground pixels 105 in the remaining direction 1.
Therefore the permutation 5 in row 710 defines pixel "p", and its associated counter in row 725 would be incremented by 1.
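A sketch of the search is below. The coordinate convention for the four Freeman directions is an assumption (FIG. 6 fixes the actual labelling), and all sixteen hit/miss permutations get their own counter rather than the grouped columns of the FIG. 7 table.

```python
import numpy as np

def extract_concavity_features(bitmap):
    """Sketch of step 215: from every background pixel inside the boundary
    rectangle, scan along the four Freeman directions until a foreground
    pixel or the rectangle edge is hit, then count the hit/miss pattern."""
    h, w = bitmap.shape
    steps = [(0, 1), (-1, 0), (0, -1), (1, 0)]  # assumed 0=right, 1=up, 2=left, 3=down
    counts = np.zeros(16)                       # one counter per permutation
    for r in range(h):
        for c in range(w):
            if bitmap[r, c]:                    # foreground pixels are not search origins
                continue
            pattern = 0
            for d, (dr, dc) in enumerate(steps):
                rr, cc = r + dr, c + dc
                while 0 <= rr < h and 0 <= cc < w:
                    if bitmap[rr, cc]:          # foreground reached in direction d
                        pattern |= 1 << d
                        break
                    rr, cc = rr + dr, cc + dc
            counts[pattern] += 1                # e.g. pixel "q" hits in two directions
    return counts
```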
The resolution of the image of the input character 100 inside the boundary rectangle 505 is preferably reduced before conducting the searches from the background pixels 110. Such resolution reduction decreases the number of background pixels 110 inside the boundary rectangle 505 and thereby accelerates the search process. Because the number of background pixels 110 will generally be much greater than the number of foreground pixels 105, reducing the resolution of the image does not significantly degrade the quality of the background concavity feature vectors.
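One simple way to implement the reduction is sketched here, under the assumption that a block counts as foreground if any of its pixels do:

```python
def reduce_resolution(bitmap, factor=2):
    """Sketch: shrink a 0/1 NumPy bitmap by `factor` so that the concavity
    search visits far fewer background pixels. Any foreground pixel in a
    factor x factor block marks the whole block as foreground."""
    h, w = bitmap.shape
    h2, w2 = h // factor, w // factor
    trimmed = bitmap[:h2 * factor, :w2 * factor]        # drop ragged edges
    return trimmed.reshape(h2, factor, w2, factor).max(axis=(1, 3))
```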
After the background concavity feature vectors are extracted at step 215 of the method 200, the background concavity feature vectors are matched with templates of model characters according to template matching processes known in the art.
Referring now to FIG. 8, there is a generalized flow diagram illustrating a more detailed description of a first part of the method 200 of handwriting recognition according to an embodiment of the present invention. After a handwritten input character 100 is received at step 205, the method 200 continues to step 805 where the character 100 is re-sampled to eliminate duplicate pixels and to normalize any irregularity in pixel density. For example, such irregularities may occur when a writing speed is faster during one interval (resulting in fewer foreground pixels 105) and slower during another interval (resulting in more foreground pixels 105). Re-sampling techniques known in the art, such as equi-distant re-sampling that forces a minimum Euclidean distance between consecutive data points, result in foreground pixels 105 that are uniformly spaced. Such uniform spacing can enhance the character recognition accuracy of the present invention. Next, at step 810, additional pre-processing techniques are applied to the received input character 100. These include smoothing, noise removal and size normalization processes. Such pre-processing also increases the uniformity of the input characters 100, which leads to better character recognition results. The method 200 then continues at step 815 where the input character is fit to an NxN grid. At step 820 an 8-dimension directional feature vector is then defined and extracted. Next, at step 825, the resolution of the image of the input character 100 is decreased so as to reduce the number of background pixels 110 in preparation for extracting the background concavity feature vectors. At step 830 a search is conducted from each of the background pixels 110, as described above, in order to extract the background concavity feature vectors.
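A sketch of the equi-distant re-sampling of step 805 follows; the minimum-distance threshold is an arbitrary assumption.

```python
import numpy as np

def resample_stroke(points, min_dist=3.0):
    """Sketch of step 805: discard duplicate points and keep only points at
    least `min_dist` (Euclidean) from the last kept point, so fast and slow
    writing intervals end up with a similar pixel density."""
    kept = [points[0]]
    for x, y in points[1:]:
        if np.hypot(x - kept[-1][0], y - kept[-1][1]) >= min_dist:
            kept.append((x, y))
    return kept
```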
Referring now to FIG. 9, there is a generalized flow diagram that illustrates a continuation of the method 200 from FIG. 8. At step 935, the foreground directional feature vectors are matched with templates of model characters. That step is called foreground feature classification. A first short list of candidate characters is then provided at step 940, including first vector distances (d1) between the foreground directional feature vectors and the templates of model candidate characters. The distances between vectors may be based, for example, on either Euclidean distances or city-block distances. A similar process is then completed for the background concavity feature vectors. At step 945 the background concavity feature vectors are matched with templates of model characters. That step is called background feature classification. A second short list of candidate characters is then provided at step 950, including second vector distances (d2) between the background concavity feature vectors and the templates of model candidate characters.
At step 955, the first vector distances (d1) and the second vector distances (d2) are combined according to the following formula: dcomb = W1*d1 + W2*d2. Here dcomb is a weighted vector distance that is used, as described below, to determine a final matching candidate character for use in a handwriting recognition system. W1 and W2 are weighting factors that are determined using empirical data based on the relative performance of the foreground feature classification when compared with the background feature classification. Generally W1 + W2 = 1. The relative performance of the two classification steps depends on variables such as the alphabet of the input character 100 (e.g., Roman, Chinese, Japanese, etc.) and the writing styles of individuals. Thus particular embodiments of the present invention may automatically define the weighting factors W1 and W2 as part of an iterative learning process that tunes the method of the present invention to the writing idiosyncrasies of individual users.
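The combination of the two shortlists might look like the following sketch. Padding candidates missing from one list with that list's worst distance is an assumption, since the patent does not say how such candidates are handled.

```python
def combine_shortlists(fg_list, bg_list, w1, w2, top_k=5):
    """Sketch of steps 935-960: rank candidates by dcomb = W1*d1 + W2*d2,
    with W1 + W2 = 1. `fg_list` and `bg_list` are (candidate, distance)
    pairs from the foreground and background classifications."""
    d1, d2 = dict(fg_list), dict(bg_list)
    worst1, worst2 = max(d1.values()), max(d2.values())
    scored = sorted(
        (w1 * d1.get(ch, worst1) + w2 * d2.get(ch, worst2), ch)
        for ch in set(d1) | set(d2)
    )
    return [ch for _, ch in scored[:top_k]]     # single best or a short list
```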
Finally, at step 960, the method 200 is completed when a matching candidate character is provided. Depending on the needs of a particular handwriting recognition system, a single candidate character or a list of several candidate characters may be provided at step 960.
With reference to FIG. 10, there is a schematic diagram illustrating a mobile telephone 151 such as may be used to implement the above described method of the present invention. The telephone 151 comprises a radio frequency communications unit 152 coupled to be in communication with a processor 153. Input interfaces in the form of a screen display 155, a keypad 156, and an input tablet 169 are also coupled to be in communication with the processor 153. Those skilled in the art will appreciate that the input tablet 169 may be integrated into other components of the telephone 151 such as the screen display 155. Users are then able to input handwritten text by scribing characters on the screen display 155.
The processor 153 includes an encoder/decoder 161 with an associated code Read Only Memory (ROM) 162 for storing data for encoding and decoding voice or other signals that may be transmitted or received by the mobile telephone 151. The processor 153 also includes a micro-processor 163 coupled, by a common data and address bus 167, to the encoder/decoder 161 and an associated character Read Only Memory (ROM) 164, a Random Access Memory (RAM) 154, static programmable memory 166, and a removable SIM module 168. The static programmable memory 166 and SIM module 168 can each store, amongst other things, model character feature vectors and representations of input characters entered using the input tablet 169.
The radio frequency communications unit 152 is a combined receiver and transmitter having a common antenna 157. The communications unit 152 has a transceiver 158 coupled to antenna 157 via a radio frequency amplifier 159. The transceiver 158 is also coupled to a combined modulator/demodulator 160 that couples the communications unit 152 to the processor 153.
The micro-processor 163 has ports for coupling, for example, to the keypad 156, the screen 155 and to the input tablet 169. The character Read Only Memory 164 stores code for performing handwriting recognition, as described above, on representations of handwritten characters written on the input tablet 169 using, for example, a pen, stylus or finger. A user of the telephone 151 may therefore write one or more characters on the input tablet 169, and the telephone 151 will save the characters in the Random Access Memory (RAM) 154, in the static programmable memory 166, and/or in the removable SIM module 168. The user of the telephone 151 may then issue a command, for example using the keypad 156, requesting that the handwritten characters entered using the input tablet 169 be recognized.
The command to recognize the handwritten characters may be processed by the microprocessor 163. Using code saved in the code ROM 162, the microprocessor 163 will then execute the method 200 of the present invention as described above to determine, for each input character, at least one matching candidate character by matching foreground directional feature vectors and background concavity feature vectors with templates of model characters. Depending on the requirements of a particular system, the microprocessor 163 may then execute further commands based on the recognized input characters. Such further commands may include, for example, transmitting text messages comprising the recognized input characters or entering into an address book information comprising the recognized input characters.
The present invention is therefore an improved method and system for recognizing handwritten characters that are scribed on a user interface of an electronic device. Because there is no direct correlation between the data used to create the foreground directional feature vectors and the data used to create the background concavity feature vectors, the method 200 includes independent and redundant measurements that result in improved accuracy. The independent measurements enable errors in one type of vector, either in the foreground directional feature vectors or in the background concavity feature vectors, to be compensated by accuracies in the other type of vector. By analysing both foreground directional feature vectors and background concavity feature vectors associated with an input character 100, the present invention increases the probability that an input character 100 will be recognized correctly.
The above detailed description provides a preferred exemplary embodiment only, and is not intended to limit the scope, applicability, or configuration of the present invention. Rather, the detailed description of the preferred exemplary embodiment provides those skilled in the art with an enabling description for implementing the preferred exemplary embodiment of the invention. It should be understood that various changes can be made in the function and arrangement of elements and steps without departing from the spirit and scope of the invention as set forth in the appended claims.

Claims

WE CLAIM:
1. A method of handwriting recognition comprising the steps of: receiving a representation of a handwritten input character scribed on a user interface of an electronic device, the input character defined by foreground pixels adjacent background pixels on the user interface; extracting foreground directional feature vectors from the foreground pixels; extracting background concavity feature vectors from the background pixels; and determining a matching candidate character by matching the foreground directional feature vectors and the background concavity feature vectors with templates of model characters.
2. The method of claim 1, further comprising the step of preprocessing the handwritten input character by performing smoothing, noise removal and size normalization processes.
3. The method of claim 1, wherein the step of determining a matching candidate character further comprises the steps of: providing a first short list of candidate characters and corresponding first vector distances (d1) by matching the foreground directional feature vectors with the templates of model characters; providing a second short list of candidate characters and corresponding second vector distances (d2) by matching the background concavity feature vectors with the templates of model characters; and combining the first vector distances (d1) and the second vector distances (d2) according to the following formula: dcomb = W1*d1 + W2*d2, where dcomb is a weighted vector distance, W1 and W2 are empirical weighting factors, and W1 + W2 = 1; wherein the foreground directional feature vectors and the background concavity feature vectors are matched with the templates of model characters using the weighted vector distance dcomb.
4. The method of claim 1, further comprising the step of reducing a size of the representation of the handwritten input character before extracting background concavity feature vectors from the background pixels.
5. The method of claim 1, wherein the step of extracting background concavity feature vectors from the background pixels comprises searching for foreground pixels in four directions away from each background pixel.
6. The method of claim 1, wherein the templates of model characters comprise the statistical average values of directional feature vectors from numerous input samples.
7. The method of claim 1, further comprising the step of re-sampling the input character to eliminate duplicate pixels and to normalize irregularities in pixel density.
8. The method of claim 3, wherein the empirical weighting factors W1 and W2 are defined during an iterative learning process based on the writing idiosyncrasies of an individual user of the electronic device.
9. A system for handwriting recognition comprising: a microprocessor; a read only memory (ROM) operatively connected to the microprocessor; a programmable memory operatively connected to the microprocessor; and an input tablet operatively connected to the microprocessor; the microprocessor operative to execute code stored in the ROM to receive a representation of a handwritten input character scribed on the input tablet, the input character defined by foreground pixels adjacent background pixels, extract foreground directional feature vectors from the foreground pixels, extract background concavity feature vectors from the background pixels, and determine a matching candidate character by matching the foreground directional feature vectors and the background concavity feature vectors with templates of model characters.
10. The system of claim 9, wherein the microprocessor is further operative to preprocess the handwritten input character by performing smoothing, noise removal and size normalization processes.
11. The system of claim 9, wherein the microprocessor further provides, when it determines a matching candidate character, a first short list of candidate characters and corresponding first vector distances (d1) by matching the foreground directional feature vectors with the templates of model characters, provides a second short list of candidate characters and corresponding second vector distances (d2) by matching the background concavity feature vectors with the templates of model characters, and combines the first vector distances (d1) and the second vector distances (d2) according to the following formula: dcomb = W1*d1 + W2*d2, where dcomb is a weighted vector distance, W1 and W2 are empirical weighting factors, and W1 + W2 = 1, wherein the foreground directional feature vectors and the background concavity feature vectors are matched with the templates of model characters using the weighted vector distance dcomb.
12. The system of claim 9, wherein the microprocessor further reduces a size of the representation of the handwritten input character before extracting background concavity feature vectors from the background pixels.
13. The system of claim 9, wherein the microprocessor searches for foreground pixels in four directions away from each background pixel when it extracts background concavity feature vectors from the background pixels.
14. The system of claim 9, wherein the templates of model characters comprise the statistical average values of directional feature vectors from numerous input samples.
15. The system of claim 9, wherein the microprocessor re-samples the input character to eliminate duplicate pixels and to normalize irregularities in pixel density.
16. The system of claim 11, wherein the empirical weighting factors W1 and W2 are defined during an iterative learning process based on the writing idiosyncrasies of an individual user of the electronic device.
PCT/US2005/023320 2004-07-22 2005-06-30 Method and system for handwriting recognition using background pixels WO2006019537A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CNB2004100544346A CN100343864C (en) 2004-07-22 2004-07-22 Hand writing identification method and system using background picture element
CN200410054434.6 2004-07-22

Publications (1)

Publication Number Publication Date
WO2006019537A1 (en)

Family

ID=35907701

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/023320 WO2006019537A1 (en) 2004-07-22 2005-06-30 Method and system for handwriting recognition using background pixels

Country Status (2)

Country Link
CN (1) CN100343864C (en)
WO (1) WO2006019537A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106326887B (en) * 2016-08-29 2019-05-21 东方网力科技股份有限公司 A kind of method of calibration and device of optical character identification result
CN110929749B (en) * 2019-10-15 2022-04-29 平安科技(深圳)有限公司 Text recognition method, text recognition device, text recognition medium and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1161906C (en) * 2001-06-04 2004-08-11 华为技术有限公司 Hand-written script radio-transmitting and recognizing method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5841902A (en) * 1994-09-27 1998-11-24 Industrial Technology Research Institute System and method for unconstrained on-line alpha-numerical handwriting recognition
US5970170A (en) * 1995-06-07 1999-10-19 Kodak Limited Character recognition system indentification of scanned and real time handwritten characters

Also Published As

Publication number Publication date
CN100343864C (en) 2007-10-17
CN1725228A (en) 2006-01-25

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase