WO2000062244A1 - Handwriting coding and recognition - Google Patents

Publication number
WO2000062244A1
Authority
WO
WIPO (PCT)
Prior art keywords
intervals
handwriting
input parameter
input
recording
Prior art date
Application number
PCT/GB2000/001340
Other languages
French (fr)
Inventor
Ross Walker
Sean Kavanagh
Charalampos Ferekidis
Original Assignee
New Transducers Limited
Priority date
Filing date
Publication date
Application filed by New Transducers Limited
Priority to AU39793/00A (patent AU3979300A)
Priority to EP00919036A (patent EP1173823A1)
Priority to JP2000611237A (patent JP2002541597A)
Publication of WO2000062244A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/30: Writer recognition; Reading and verifying signatures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/14: Image acquisition
    • G06V30/142: Image acquisition using hand-held instruments; Constructional details of the instruments
    • G06V30/1423: Image acquisition using hand-held instruments; Constructional details of the instruments, the instrument generating sequences of position coordinates corresponding to handwriting

Definitions

  • This invention relates to handwriting coding and recognition, and in particular to a system that uses a coding method to generate data for recognition.
  • Handwriting recognition can be divided up into distinct types of recognition problem.
  • the main distinction is between continuous writing, where there are no gaps between letters, and script, where the characters are discrete. Since there are always gaps between words in continuous writing, it is logical for a recognition system to recognise whole words, which are easier to isolate and probably easier to recognise.
  • TESPAR Time Encoded Signal Processing and Recognition
  • a coding method for handwriting comprising the steps of recording the variation of an input parameter, identifying the intervals between the occurrences of the input parameter crossing a given value, and quantising the lengths of the intervals, identifying the number of complex zeroes of the input parameter, up to a predetermined rank, in the said intervals, and recording the quantised lengths of the intervals and a measure of the said number of complex zeroes up to a predetermined rank as a representation of the variation of the input parameter.
  • a predetermined rank of 1 has been found to give good results.
  • the method records the number of first rank zeroes, i.e. positive minima or negative maxima. This information may provide sufficient detail for useful characterisation without requiring excessive calculation.
  • the method thus parameterises the shape of the input parameter function. If the parameter rises smoothly to a maximum and then falls smoothly to the next zero, there will be no positive minimum so said number will be zero. If the function has an "M" shape, rising to a maximum, falling to a minimum and then rising to another maximum before passing through zero, then there will be one positive minimum so the said number will be one.
  • the number parameterises the number of oscillations of the input parameter between zeroes.
  • the reason that the positive minima or negative maxima are known as complex zeroes of a function is that they correspond to zeroes of the function for complex number inputs to the function.
  • the first rank zeroes occur at real values being the real values of the complex numbers for which the function has a value zero.
  • the coding method may be a TESPAR method.
  • the method may further comprise the step of generating a code number taking one of a set of predetermined values representing the duration of the interval and number of maxima and minima for at least some of the said intervals.
  • the code numbers may be further parameterised.
  • an S matrix may be calculated.
  • the S matrix records the number of instances of each code number in the recorded variation of the input parameter.
  • an A matrix recording the number of instances of a first code number following a second code number with a predetermined lag may be calculated.
  • a further alternative is to calculate a DZ matrix recording the number of instances of amplitude, length of interval and number of maxima and minima increasing, decreasing or staying the same in the next epoch.
  • the S, A and/or DZ matrices may be stored or evaluated.
  • the above approach may be used to record at least two input parameters.
  • the input parameters may be selected from x and y coordinates, speed, one or more coordinates of velocity (e.g. speed and angle, or x coordinate and y coordinate), angular velocity, or radius of curvature.
  • the lengths of the intervals may be measured in time, i.e. the recorded length represents the time between successive crossings of the given value.
  • the recorded length is a distance.
  • the said given value is preferably a predetermined value.
  • the value may be a zero value, particularly for a coordinate such as the vertical (y) component of velocity which passes through zero.
  • the given value may also be a value corresponding to a median value of the handwriting.
  • This approach can be used to parameterise the y-coordinate position, for example, in which case the value can be the centre line of the handwriting, i.e. the median y-coordinate assuming that the handwriting is written in the x direction.
  • a writing tablet for capturing the input parameter or parameters.
  • the writing tablet may also output position data which can be converted into other parameters using a computer.
  • the method may be used on handwriting that is already written, by scanning the handwriting, and then generating the input parameters. This may be done by using known techniques for vectorising bitmap data.
  • the first aspect of the invention also envisages a method of handwriting recognition, including the steps of coding the handwriting as described above, comparing the coded representations with representations corresponding to a number of different characters to find the best match and outputting the character corresponding to the best match.
  • the comparing step may be carried out using a matrix method in which the correlations between matrices representing the coded data and matrices representing each of the said plurality of different characters are compared.
  • a neural network approach to the comparison could be used. Either approach may require a training period to generate suitable matrices.
  • a handwriting coding apparatus comprising a data input means for measuring the variation of an input parameter, an identifying means for identifying the intervals between the occurrences of the input parameter crossing a given value, and quantising the intervals, identifying the number of maxima and minima of the input parameter in said intervals, and a recording means for recording the quantised intervals and the said number of maxima and minima as a representation of the variation of the input parameter.
  • the above approach is used to record at least two input parameters.
  • the input parameters may be selected from x and y coordinates, speed, one or more coordinates of velocity (e.g. speed and angle, or x coordinate and y coordinate), angular velocity, or radius of curvature.
  • a writing tablet is provided for capturing the input parameters.
  • the writing tablet may be of a conventional form.
  • the method may be used on handwriting that is already written, by providing a scanner for scanning the handwriting, and then generating the input parameters in the data input means by using known techniques, such as those used for vectorising bitmap data.
  • the second aspect of the invention may also provide a handwriting recognition apparatus, including a handwriting coding apparatus as described above, wherein the apparatus further comprises a processing means adapted to compare the coded representations of the characters with representations corresponding to a number of different characters to find the best match and outputting the character corresponding to the best match.
  • the identifying means and the processing means are both the central processing unit of a computer.
  • a computer program recorded on a data carrier operable to control a computer having a data store, an input device and a processor, the computer program operable to cause the said computer to carry out in cooperation with the input device a method for coding handwriting comprising the steps of recording the variation of an input parameter, identifying the intervals between the occurrences of the input parameter crossing a given value, and quantising the intervals, identifying the number of first rank zeroes of the input parameter in said intervals, and recording the quantised intervals and the said number of first rank zeroes as a representation of the variation of the input parameter.
  • the input device may be a writing tablet or a scanner, for example.
  • the computer program recorded on a data carrier preferably also causes the computer to carry out the step of comparing the coded representation with representations corresponding to a number of different characters to find the best match and outputting the character corresponding to the best match.
  • FIG. 1 shows a schematic diagram of an apparatus according to the invention
  • Figure 2 shows a flow diagram of a coding method in accordance with the invention
  • Figure 3 shows a handwriting recognition scheme in accordance with the invention for recognising dynamic text
  • Figure 4 shows a handwriting recognition scheme in accordance with the invention for recognising static text.
  • a writing tablet (1) records the x- and y-coordinates of a stylus (2) used to write thereupon.
  • the tablet is connected to a computer (3) having a display (5), a processor (9) and a data store (11).
  • the computer is connected to the tablet by a connector (13).
  • a program is stored in the data store (11) to cause the system to carry out a data coding method, schematically shown in Fig. 2.
  • data corresponding to input coordinates is captured by the tablet (1), which records the position of the stylus as a user writes with the stylus (2) on the tablet (1).
  • the tablet (1) outputs the x- and y-coordinates of the stylus (2).
  • the x and y components of the velocity are then calculated by the computer.
  • the method then continues using the y-coordinate of position and the x- and y-components of velocity as the three input parameters. These input parameters are not exclusive, and for some arrangements other sets of input parameters may be more suitable.
  • the computer then calculates the centre line of the writing, and uses that as the given value for the y-coordinate input parameter.
  • the zero value of the x and y components of the velocity is used for those components. For convenience, all these given values will be referred to as "zero values" hereinafter.
  • the data is parameterised by quantising the time interval between zero crossings, i.e. the time between the consecutive occasions that the input data parameter crosses the zero value. Then, the number of first rank zeroes in each time interval is recorded. The number of first rank zeroes is the number of negative maxima or positive minima. The interval and number of zeroes are then coded using a single integer between 1 and 28.
  • the integer 1 for example corresponds to an interval of length up to a predetermined time and no first rank zeroes.
  • Each range of intervals and number of zeroes is assigned to a predetermined integer, as set out in GB 2162024.
  • the complete writing sample is then parameterised using an "S" matrix, as used in the TESPAR method.
  • the "S" matrix records the number of instances of each code number in the data sample.
  • the method compares the recorded values with values for each possible character of the alphabet and numerals 1 to 9, to record which character has been written.
  • the character that gives the best match for each of the three input parameters is output, and the computer then goes on to evaluate the next character.
  • the method includes the so-called "TESPAR" method.
  • a speech encoder known as TESPAR is known from GB2162024.
  • the method of the invention may carry out the coding using the TESPAR method and specific hardware adapted to implement the method. A brief summary of TESPAR will now be given.
  • Any band limited signal can be represented exactly in terms of its real and complex zeros, such that [equation lost in extraction]
  • [symbol lost in extraction] are the zeros of the function.
  • Real zeros correspond to times where the function f(t) crosses the zero line.
  • Complex zeros occur in conjugate pairs for a real function and can be classified by considering the effect of differentiation with respect to time.
  • the rank of a complex zero is defined as the number of differentiations required to produce a real zero at the same time ordinate as the complex zero.
  • positive minima (minima at positive values) and negative maxima (maxima at negative values) of the waveform correspond to first rank complex zeros because these stationary points become zero crossings when the waveform is differentiated once.
  • the waveform in TESPAR may be divided up into sections using the real zeros as the boundary points between sections. Each interval between two real zeros is then referred to as an epoch. This is not the only method available, but it is simple and usually effective.
  • Once the epochs have been defined, some relevant information needs to be recorded about them. This conventionally includes the length of the interval (duration) of the epoch, its maximum amplitude and the number of first rank zeros. At this point the duration is usually represented as the nearest integer number of samples of the waveform rather than the exact time duration. The result is a matrix of values of size (3 x number of epochs) containing some of the information contained in the waveform.
  • the time encoding can be done.
  • schemes for encoding a signal once it has been stored in epoch form as described above called the natural TESPAR stream.
  • the general idea, however, is to take each epoch in turn, or several epochs at a time, and produce a code number depending upon the information stored in the epoch(s).
  • a common encoding scheme considers the duration and number of first rank zeroes and returns a single code number, dependent upon these values, ranging between 1 and 28.
  • the matrix is actually a vector of length equal to the number of TESPAR codes used to describe all the different types of epoch. Each element of the vector contains the number of times that an epoch with that code occurred in the signal. The resulting vector contains information about the content of the signal but no information about the ordering of the epochs, which means the signal cannot be regenerated without extra information.
  • the 'S' matrix can be further refined. Rather than incrementing an element of the vector by one for each epoch with the appropriate code, the element can be incremented by an amount that depends upon some characteristic of the epoch. This leads to the duration weighted 'S' matrix, where the elements are incremented by an amount dependent on the duration of the epoch, and the amplitude weighted 'S' matrix, where the maximum epoch amplitude is used as weighting.
  • the final matrix itself can be modified. For example the number in each element of the 'S' matrix can be doubled to produce a matrix that is more heavily weighted by the number of epochs that occur in the waveform.
  • the 'A' matrix is a two dimensional matrix that is generated by considering pairs of epochs in turn. These can be adjacent epochs or they can have a specified separation (called the lag). Each epoch in the pair will have a TESPAR code associated with it, which gives two ordinates that specify the element of the matrix that is to be incremented.
  • the 'A' matrix is similar to the 'S' matrix but some information as to the ordering of the epochs is retained. An example of this matrix is shown in figure 2. As with the 'S' matrix, the elements of the 'A' matrix can be weighted according to some combination of epoch parameters.
  • the 'DZ' matrix is generated by considering pairs of epochs and looking at how they change. Specifically the maximum amplitude, duration and number of complex first rank zeros are examined in each epoch. Each of these parameters may increase, decrease or stay the same, which gives 27 possible combinations and hence a 27 element vector.
  • the 'DZ' matrix looks similar to an 'S' matrix. Since duration and amplitude are nearly continuous it is usual to specify a range of changes for these values that are taken to be the same for the purposes of encoding the 'DZ' matrix.
  • an archetypal matrix can be generated by simply adding the matrices together and dividing by the number of matrices used. To test to see if a signal is the same as the one represented by the archetype the signal needs to be encoded in the same matrix format as the archetype and then compared, usually by finding the correlation score. In this way many archetypes can be included and ranked as to which are most likely to be the same as the signal.
  • a neural net may be used. Since a signal of any length will always produce a matrix of the same size when it is time encoded (assuming the same matrix is used, of course), a neural net can be designed with a fixed number of inputs corresponding to the number of elements in the matrix. With sufficient training examples the neural net can be trained to recognise a number of standard signals.
  • the (x,y) co-ordinates provide two waveforms that vary with time and are orthogonal. These can be encoded separately to provide two separate matrices or they can be used to derive other parameters. An example would be differentiating x(t) and y(t) once to produce the velocities. In each case the waveform chosen has to have a "zero" value that the waveform crosses so that the epochs can be defined. In the case of the velocities this could be the actual zero value whereas for the y ordinate a line through the middle of the text would be required.
  • Figure 3 illustrates one possible implementation of a system to recognise text as it is being written.
  • data of the position and velocity of the pen on a writing tablet is obtained.
  • the y coordinate of the writing implement about a mean level and the velocity of the implement in the x direction as a function of time are then extracted from the data in the next step 33.
  • These two waveforms are then time encoded 35 using the TESPAR encoding scheme and compared with a store of previous examples. The closest match between the known examples and the input data is then output.
  • a specific application of the above recognition example is signature recognition.
  • the person in question signs their name on a tablet that records the position as a function of time. This signal can then be time-encoded and compared to known time-encoded signatures to verify the identity of that person.
  • the recognition method may also be applied to static text.
  • When the text to be recognised has already been written, the information as to how it was written is not available.
  • this case provides a subset of the information available in the dynamic case, where the text is analysed as it is written.
  • One representation of written script is as a bitmap in which the pixels are either black (foreground colour) or white (background colour).
  • this waveform can be encoded.
  • One method is to encode each column (or row) into matrix form and then use the resulting set of matrices as the input for a neural net.
  • the array could be turned into a one-dimensional vector by writing each row (or column) sequentially.
  • the resulting vector could then be encoded to produce a single matrix that could be used to characterise the image.
  • This example describes one method of implementing a system to recognise scanned images of static text.
  • the flow diagram for this system is shown in Figure 4.
  • an image of the text to be classified is scanned (41) to render it in electronic form and stored as a two dimensional bitmap (43).
  • This image is then read (45) both one row after another and one column after another to produce two long waveforms corresponding to the row and column scans.
  • a pixel that is clear is represented by minus 1 and a filled pixel by plus 1. This produces a rectangular waveform with well defined zero crossing points.
  • the resulting encoded waveforms are then compared (47) with previously determined examples to assign probabilities to each.
  • the most likely is then returned (49) as the recognised text.
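The row and column scan of a bitmap described above can be sketched as follows. This is a minimal illustration only: the function name `scan_waveforms` and the representation of the bitmap as a list of rows of 0/1 values are assumptions, not part of the source.

```python
def scan_waveforms(bitmap):
    """Turn a two dimensional bitmap into two long +/-1 waveforms.

    Assumption: `bitmap` is a list of equal-length rows, where a truthy
    pixel is filled (+1) and a falsy pixel is clear (-1).
    """
    # Read one row after another.
    row_scan = [1 if pixel else -1 for row in bitmap for pixel in row]
    # Read one column after another; zip(*bitmap) yields the columns.
    column_scan = [1 if pixel else -1 for column in zip(*bitmap) for pixel in column]
    return row_scan, column_scan


rows, columns = scan_waveforms([[0, 1, 1],
                                [0, 0, 1]])
print(rows)     # row-wise rectangular waveform
print(columns)  # column-wise rectangular waveform
```

Because every sample is either plus 1 or minus 1, each change of pixel colour is a zero crossing, so the two waveforms can be passed to the same epoch-based coding as any other input parameter.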

Abstract

A coding and recognition method for handwriting identifies intervals between occurrences of an input parameter, such as pen position and velocities, crossing a given value, and records the lengths of the intervals. The number of maxima and minima in each interval is then recorded.

Description

TITLE: HANDWRITING CODING AND RECOGNITION
DESCRIPTION
This invention relates to handwriting coding and recognition, and in particular to a system that uses a coding method to generate data for recognition.
The problem of recognising handwriting has attracted much attention in recent years, particularly for use as a system for computer input, especially for small notebook-type computer systems and so-called "palm-top" computer systems.
Handwriting recognition can be divided up into distinct types of recognition problem. The main distinction is between continuous writing, where there are no gaps between letters, and script, where the characters are discrete. Since there are always gaps between words in continuous writing, it is logical for a recognition system to recognise whole words, which are easier to isolate and probably easier to recognise.
Another important distinction in a handwriting recognition scheme is whether the letters are produced as they are being analysed or whether the characters have been produced beforehand. In the former case more information will be available.
In a different field, it is known to time encode speech using a Time Encoded Signal Processing and Recognition (TESPAR) coder. Aspects of this coder are described in GB 2020517, GB 2084433, GB 2162024, GB 2162025, GB 2187586, GB 2179183, WO 92/15089, WO 97/31368, WO 97/45831 and WO 98/08188.
According to the first aspect of the invention there is provided a coding method for handwriting, comprising the steps of recording the variation of an input parameter, identifying the intervals between the occurrences of the input parameter crossing a given value, and quantising the lengths of the intervals, identifying the number of complex zeroes of the input parameter, up to a predetermined rank, in the said intervals, and recording the quantised lengths of the intervals and a measure of the said number of complex zeroes up to a predetermined rank as a representation of the variation of the input parameter.
A predetermined rank of 1 has been found to give good results. In this case the method records the number of first rank zeroes, i.e. positive minima or negative maxima. This information may provide sufficient detail for useful characterisation without requiring excessive calculation.
The method thus parameterises the shape of the input parameter function. If the parameter rises smoothly to a maximum and then falls smoothly to the next zero, there will be no positive minimum so said number will be zero. If the function has an "M" shape, rising to a maximum, falling to a minimum and then rising to another maximum before passing through zero, then there will be one positive minimum so the said number will be one.
Thus, the number parameterises the number of oscillations of the input parameter between zeroes.
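The counting of first rank zeroes within one interval can be illustrated with a short sketch. It assumes the interval is given as a list of samples from which the given value has already been subtracted, so that all samples lie on one side of zero; the function name `count_first_rank_zeros` is hypothetical.

```python
def count_first_rank_zeros(epoch):
    """Count positive minima and negative maxima inside one interval.

    Assumption: `epoch` holds the samples between two crossings of the
    given value, measured relative to that value.
    """
    count = 0
    for prev, cur, nxt in zip(epoch, epoch[1:], epoch[2:]):
        if cur > 0 and cur < prev and cur <= nxt:
            count += 1  # positive local minimum
        elif cur < 0 and cur > prev and cur >= nxt:
            count += 1  # negative local maximum
    return count


smooth_arch = [1, 3, 5, 3, 1]  # rises to one maximum, then falls
m_shape = [1, 4, 2, 4, 1]      # two maxima with a positive minimum between
print(count_first_rank_zeros(smooth_arch), count_first_rank_zeros(m_shape))
```

The asymmetric comparison (`<` on one side, `<=` on the other) is a design choice so that a flat-bottomed minimum is counted once rather than once per sample.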
The reason that the positive minima or negative maxima are known as complex zeroes of a function is that they correspond to zeroes of the function for complex number inputs to the function. The first rank zeroes occur at real values being the real values of the complex numbers for which the function has a value zero.
The coding method may be a TESPAR method.
The method may further comprise the step of generating a code number taking one of a set of predetermined values representing the duration of the interval and number of maxima and minima for at least some of the said intervals.
The code numbers may be further parameterised. In one approach an S matrix may be calculated. The S matrix records the number of instances of each code number in the recorded variation of the input parameter. Alternatively or additionally an A matrix recording the number of instances of a first code number following a second code number with a predetermined lag may be calculated. A further alternative is to calculate a DZ matrix recording the number of instances of amplitude, length of interval and number of maxima and minima increasing, decreasing or staying the same in the next epoch.
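The three matrices can be sketched as follows, under stated assumptions. The code numbers are taken to run from 1 to 28; the orientation of the A matrix (rows for the earlier epoch, columns for the later one), the base-3 indexing of the DZ vector and the optional tolerances are illustrative choices, and all function names are hypothetical.

```python
def s_matrix(codes, n_codes=28):
    """Count how often each code number (1..n_codes) occurs."""
    s = [0] * n_codes
    for code in codes:
        s[code - 1] += 1
    return s


def a_matrix(codes, lag=1, n_codes=28):
    """Count pairs of code numbers separated by `lag` epochs.

    Assumption: rows index the earlier code, columns the later code.
    """
    a = [[0] * n_codes for _ in range(n_codes)]
    for earlier, later in zip(codes, codes[lag:]):
        a[earlier - 1][later - 1] += 1
    return a


def trend(delta, tolerance):
    """Return 0, 1 or 2 for a decrease, no change or an increase."""
    if delta > tolerance:
        return 2
    if delta < -tolerance:
        return 0
    return 1


def dz_matrix(epochs, tolerances=(0.0, 0, 0)):
    """Count the 27 possible change patterns between consecutive epochs.

    Each epoch is a tuple (max_amplitude, duration, n_first_rank_zeros).
    """
    dz = [0] * 27
    for prev, nxt in zip(epochs, epochs[1:]):
        index = 0
        for before, after, tol in zip(prev, nxt, tolerances):
            index = index * 3 + trend(after - before, tol)
        dz[index] += 1
    return dz


codes = [1, 3, 1, 3, 3]
print(s_matrix(codes)[:3])
print(a_matrix(codes)[0][2])
print(sum(dz_matrix([(1.0, 5, 0), (2.0, 5, 1), (2.0, 3, 1)])))
```

Whatever the length of the writing sample, each function returns a structure of fixed size, which is what makes the results directly comparable between samples.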
The S, A and/or DZ matrices may be stored or evaluated.
The above approach may be used to record at least two input parameters. The input parameters may be selected from x and y coordinates, speed, one or more coordinates of velocity (e.g. speed and angle, or x coordinate and y coordinate), angular velocity, or radius of curvature.
The lengths of the intervals may be measured in time, i.e. the recorded length represents the time between successive crossings of the given value. However, other parameters, such as the x coordinate, could also be used. In the latter case the recorded length is a distance. TESPAR works on a single-valued function of a variable.
The said given value is preferably a predetermined value. The value may be a zero value, particularly for a coordinate such as the vertical (y) component of velocity which passes through zero. The given value may also be a value corresponding to a median value of the handwriting.
This approach can be used to parameterise the y-coordinate position, for example, in which case the value can be the centre line of the handwriting, i.e. the median y-coordinate assuming that the handwriting is written in the x direction.
Preferably, a writing tablet is provided for capturing the input parameter or parameters. The writing tablet may also output position data which can be converted into other parameters using a computer. Alternatively, the method may be used on handwriting that is already written, by scanning the handwriting, and then generating the input parameters. This may be done by using known techniques for vectorising bitmap data.
The first aspect of the invention also envisages a method of handwriting recognition, including the steps of coding the handwriting as described above, comparing the coded representations with representations corresponding to a number of different characters to find the best match and outputting the character corresponding to the best match.
The comparing step may be carried out using a matrix method in which the correlations between matrices representing the coded data and matrices representing each of the said plurality of different characters are compared. Alternatively, a neural network approach to the comparison could be used. Either approach may require a training period to generate suitable matrices.
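A correlation-based comparison of this kind might look like the sketch below. It assumes each matrix has been flattened to a list of numbers, builds one archetype per character by averaging training matrices, and uses the Pearson correlation coefficient as the score; the text does not state which correlation measure is used, so that choice and all names are assumptions.

```python
import math


def archetype(matrices):
    """Average several equally sized (flattened) training matrices."""
    n = len(matrices)
    return [sum(column) / n for column in zip(*matrices)]


def correlation(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if undefined)."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    numerator = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    denominator = math.sqrt(
        sum((a - mean_u) ** 2 for a in u) * sum((b - mean_v) ** 2 for b in v)
    )
    return numerator / denominator if denominator else 0.0


def best_match(sample, archetypes):
    """Return the character whose archetype correlates best with `sample`."""
    return max(archetypes, key=lambda ch: correlation(sample, archetypes[ch]))


# Toy training data: two 3-element "matrices" per character.
archetypes = {
    "a": archetype([[4, 0, 1], [6, 0, 1]]),
    "b": archetype([[0, 5, 1], [0, 3, 1]]),
}
print(best_match([3, 0, 1], archetypes))
```

The training period mentioned above corresponds here to collecting the example matrices that are averaged by `archetype`.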
According to a second aspect of the invention there is provided a handwriting coding apparatus, comprising a data input means for measuring the variation of an input parameter, an identifying means for identifying the intervals between the occurrences of the input parameter crossing a given value, and quantising the intervals, identifying the number of maxima and minima of the input parameter in said intervals, and a recording means for recording the quantised intervals and the said number of maxima and minima as a representation of the variation of the input parameter.
Preferably, the above approach is used to record at least two input parameters. The input parameters may be selected from x and y coordinates, speed, one or more coordinates of velocity (e.g. speed and angle, or x coordinate and y coordinate), angular velocity, or radius of curvature.
Preferably, a writing tablet is provided for capturing the input parameters. The writing tablet may be of a conventional form. Alternatively, the method may be used on handwriting that is already written, by providing a scanner for scanning the handwriting, and then generating the input parameters in the data input means by using known techniques, such as those used for vectorising bitmap data.
The second aspect of the invention may also provide a handwriting recognition apparatus, including a handwriting coding apparatus as described above, wherein the apparatus further comprises a processing means adapted to compare the coded representations of the characters with representations corresponding to a number of different characters to find the best match and outputting the character corresponding to the best match.
Preferably, the identifying means and the processing means are both the central processing unit of a computer.
According to a third aspect of the invention there is provided a computer program recorded on a data carrier, operable to control a computer having a data store, an input device and a processor, the computer program operable to cause the said computer to carry out in cooperation with the input device a method for coding handwriting comprising the steps of recording the variation of an input parameter, identifying the intervals between the occurrences of the input parameter crossing a given value, and quantising the intervals, identifying the number of first rank zeroes of the input parameter in said intervals, and recording the quantised intervals and the said number of first rank zeroes as a representation of the variation of the input parameter.
The input device may be a writing tablet or a scanner, for example.
The computer program recorded on a data carrier according to the third aspect preferably also causes the computer to carry out the step of comparing the coded representation with representations corresponding to a number of different characters to find the best match and outputting the character corresponding to the best match.
For a better understanding of the invention an embodiment will now be described, purely by way of example, with reference to the accompanying drawings, in which
Figure 1 shows a schematic diagram of an apparatus according to the invention, Figure 2 shows a flow diagram of a coding method in accordance with the invention,
Figure 3 shows a handwriting recognition scheme in accordance with the invention for recognising static text, and Figure 4 shows a handwriting recognition scheme in accordance with the invention for recognising dynamic text .
Referring to Fig. 1, a writing tablet (1) records the x- and y-coordinates of a stylus (2) used to write thereupon. The tablet is connected to a computer (3) having a display (5), a processor (9) and a data store (11). The computer is connected to the tablet by a connector (13).
A program is stored in the data store (11) to cause the system to carry out a data coding method, schematically shown in Fig. 2.
Firstly, data corresponding to input coordinates is captured by the tablet (1), which records the position of the stylus (2) as a user writes on the tablet (1). The tablet (1) outputs the x- and y-coordinates of the stylus (2). The x and y components of the velocity are then calculated by the computer. The method then continues using the y-coordinate of position and the x and y components of velocity as the three input parameters. These input parameters are not exclusive, and for some arrangements other sets of input parameters may be more suitable. The computer then calculates the centre line of the writing, and uses that as the given value for the y-coordinate input parameter. The zero value of the x and y components of the velocity is used as the given value for those components. For convenience, all these given values will be referred to as "zero values" hereinafter.
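The preparation of the three input parameters can be sketched as follows. This is only an illustrative sketch: the function names, the finite-difference velocity estimate, the use of the mean as the centre line and the sampling period are all assumptions, not details taken from the description.

```python
from statistics import mean


def velocity(samples, dt):
    """Finite-difference velocity between consecutive position samples."""
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]


def centre(y):
    """Shift y so that the centre line of the writing becomes the zero value.

    The mean is used here as an assumed estimate of the centre line.
    """
    c = mean(y)
    return [v - c for v in y]


# Hypothetical tablet output, sampled at an assumed period dt (arbitrary units).
x = [0.0, 1.0, 3.0, 6.0]
y = [2.0, 4.0, 2.0, 0.0]
dt = 0.5

vx = velocity(x, dt)  # x component of velocity
vy = velocity(y, dt)  # y component of velocity
y0 = centre(y)        # y position about the centre line
```

After this step all three waveforms cross a common zero value, so the same interval-finding procedure can be applied to each.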
It is also possible to eliminate any offsets by differentiating the input once, then parameterising the input.
The data is parameterised by quantising the time interval between zero crossings, i.e. the time between the consecutive occasions that the input data parameter crosses the zero value. Then, the number of first rank zeroes in each time interval is recorded. The number of first rank zeroes is the number of negative maxima or positive minima. The interval and number of zeroes are then coded using a single integer between 1 and 28. The integer 1, for example, corresponds to an interval of length up to a predetermined time and no first rank zeroes. Each range of intervals and number of zeroes is assigned to a predetermined integer, as set out in GB 2162024.
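The mapping from an interval and its number of first rank zeroes to a single integer can be illustrated with a small lookup. The actual 28-symbol table is set out in GB 2162024 and is not reproduced here; the bin edges and the regular 7 x 4 grid below are purely hypothetical and chosen only so that the codes run from 1 to 28.

```python
from bisect import bisect_left

# Hypothetical upper edges of the duration bins, in samples (7 bins).
DURATION_EDGES = [2, 4, 8, 16, 32, 64, 128]
# Hypothetical cap on the zero count: counts 0..3 give 4 levels, 7 * 4 = 28 codes.
MAX_ZEROS = 3


def epoch_code(duration, n_zeros):
    """Map (duration, number of first rank zeroes) to an integer in 1..28."""
    # Durations longer than the last edge fall into the last bin.
    d_bin = min(bisect_left(DURATION_EDGES, duration), len(DURATION_EDGES) - 1)
    z = min(n_zeros, MAX_ZEROS)
    return 1 + d_bin * (MAX_ZEROS + 1) + z
```

With this illustrative table a short interval with no first rank zeroes, such as `epoch_code(1, 0)`, maps to code 1, in line with the example given in the text.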
The complete writing sample is then parameterised using an "S" matrix, as used in the TESPAR method. The "S" matrix records the number of instances of each code number in the data sample.
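As a minimal sketch, the "S" matrix is a histogram over the code numbers. The function name and the default of 28 codes are assumptions made for illustration.

```python
def s_matrix(codes, n_codes=28):
    """Count how often each code number (1..n_codes) occurs in the sample."""
    s = [0] * n_codes
    for c in codes:
        s[c - 1] += 1  # code numbers start at 1, list indices at 0
    return s
```

Whatever the length of the writing sample, the result always has `n_codes` elements, which makes samples of different lengths directly comparable.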
The method compares the recorded values with values for each possible character of the alphabet and numerals 1 to 9, to record which character has been written. The character that gives the best match for each of the three input parameters is output, and the computer then goes on to evaluate the next character.
In the specific example the method includes the so-called "TESPAR" method. A speech encoder known as TESPAR is known from GB 2162024. The method of the invention may carry out the coding using the TESPAR method and specific hardware adapted to implement the method. A brief summary of TESPAR will now be given.
Any band limited signal can be represented exactly in terms of its real and complex zeros, such that
[Equation image not reproduced in this text: f(t) expressed in terms of its zeros τ_i]
where τ_i are the zeros of the function. Real zeros correspond to times where the function f(t) crosses the zero line. Complex zeros occur in conjugate pairs for a real function and can be classified by considering the effect of differentiation with respect to time. The rank of a complex zero is defined as the number of differentiations required to produce a real zero at the same time ordinate as the complex zero. Thus positive minima (minima at positive values) and negative maxima (maxima at negative values) of the waveform correspond to first rank complex zeros because these stationary points become zero crossings when the waveform is differentiated once.
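A simple worked example, chosen here for illustration and not taken from the description, shows why a positive minimum corresponds to a first rank complex zero. Consider a parabola with a positive minimum at t = 0:

```latex
\begin{align*}
f(t) &= t^{2} + a^{2}, \qquad a > 0, \\
f(t) = 0 &\iff t = \pm i a, \\
f'(t) &= 2t = 0 \iff t = 0 .
\end{align*}
```

The function never crosses zero, so its zeros form a conjugate pair with real part 0. One differentiation produces a real zero at the same time ordinate, t = 0, so the pair is of first rank.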
The waveform in TESPAR may be divided up into sections using the real zeros as the boundary points between sections. Each interval between two real zeros is then referred to as an epoch. This is not the only method available, but it is simple and usually effective.
Once the epochs have been defined some relevant information needs to be recorded about them. This conventionally includes the length of the interval (duration) of the epoch; its maximum amplitude and the number of first rank zeros. At this point the duration is usually represented as the nearest integer number of samples of the waveform rather than the exact time duration. The result is a matrix of values of size (3 x number of epochs) containing some of the information contained in the waveform.
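The extraction of the three values per epoch can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name is hypothetical, a sample equal to zero is treated as positive, plateaus are not counted as stationary points, and the first and last epochs are simply cut off at the ends of the data.

```python
def epochs(samples):
    """Split samples into epochs between sign changes.

    Returns one (duration, max_amplitude, first_rank_zeros) tuple per epoch,
    with the duration given as a whole number of samples.
    """
    result = []
    start = 0
    for i in range(1, len(samples) + 1):
        # An epoch ends at the end of the data or where the sign changes.
        if i == len(samples) or (samples[i] >= 0) != (samples[start] >= 0):
            mags = [abs(v) for v in samples[start:i]]
            # Interior local minima of |f| are positive minima or
            # negative maxima of f, i.e. first rank zeros.
            zeros = sum(
                1
                for k in range(1, len(mags) - 1)
                if mags[k] < mags[k - 1] and mags[k] < mags[k + 1]
            )
            result.append((len(mags), max(mags), zeros))
            start = i
    return result
```

For the samples `[1, 3, 2, 4, 1, -2, -1, -3, 2]` the sketch finds three epochs: a positive one of five samples with one positive minimum, a negative one of three samples with one negative maximum, and a final positive epoch of one sample.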
It should be noted that at this point an approximation to the original waveform could still be regenerated. It would not be exact since the duration of each epoch has been quantised and only the presence of first rank zeros has been noted. The position of the zeros and the existence of higher rank zeros are not recorded. However, there is sufficient information stored to allow a good approximation to the original waveform to be generated in the sense that a regenerated speech signal is easily understood.
Once the epochs have been defined and their parameters recorded the time encoding can be done. There are many schemes for encoding a signal once it has been stored in epoch form as described above (called the natural TESPAR stream). The general idea however is to take each epoch in turn, or several epochs at a time, and produce a code number depending upon the information stored in the epoch(s). A common encoding scheme considers the duration and number of first rank zeroes and returns a single code number, dependent upon these values, ranging between 1 and 28.
When the epochs have been encoded some more information has been lost, but a signal can still be regenerated. However, at this point the aim is not usually to regenerate the signal but to produce a compact format that will allow the waveform to be characterised and compared with other waveforms. To this end the code numbers for each epoch are combined into one of a variety of possible matrices. The more common types of matrix are described below.
The 'S' matrix is actually a vector of length equal to the number of TESPAR codes used to describe all the different types of epoch. Each element of the vector contains the number of times that an epoch with that code occurred in the signal. The resulting vector contains information about the content of the signal but no information about the ordering of the epochs, which means the signal cannot be regenerated without extra information.
The 'S' matrix can be further refined. Rather than incrementing an element of the vector by one for each epoch with the appropriate code, the element can be incremented by an amount that depends upon some characteristic of the epoch. This leads to the duration weighted 'S' matrix, where the elements are incremented by an amount dependent on the duration of the epoch, and the amplitude weighted 'S' matrix, where the maximum epoch amplitude is used as the weighting. In addition the final matrix itself can be modified. For example the number in each element of the 'S' matrix can be doubled to produce a matrix that is more heavily weighted by the number of epochs that occur in the waveform.
The 'A' matrix is a two dimensional matrix that is generated by considering pairs of epochs in turn. These can be adjacent epochs or they can have a specified separation (called the lag). Each epoch in the pair will have a TESPAR code associated with it, which gives two ordinates that specify the element of the matrix that is to be incremented. The 'A' matrix is similar to the 'S' matrix but some information as to the ordering of the epochs is retained. An example of this matrix is shown in figure 2. As with the 'S' matrix, the elements of the 'A' matrix can be weighted according to some combination of epoch parameters.
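The weighted 'S' matrix and the 'A' matrix can be sketched as follows. The function names, the list-of-lists layout and the default of 28 codes are assumptions made for illustration; the weights may be epoch durations or maximum amplitudes.

```python
def weighted_s_matrix(codes, weights, n_codes=28):
    """'S' matrix in which each epoch adds its weight instead of one."""
    s = [0.0] * n_codes
    for c, w in zip(codes, weights):
        s[c - 1] += w
    return s


def a_matrix(codes, lag=1, n_codes=28):
    """Count pairs of codes separated by `lag` epochs (lag >= 1).

    Element [i][j] counts how often code i + 1 is followed, `lag`
    epochs later, by code j + 1.
    """
    a = [[0] * n_codes for _ in range(n_codes)]
    for first, second in zip(codes, codes[lag:]):
        a[first - 1][second - 1] += 1
    return a
```

With a lag of 1 the 'A' matrix counts adjacent pairs, so the code sequence 1, 2, 1, 2 increments the element for the pair (1, 2) twice and the element for the pair (2, 1) once.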
It is possible to extend the idea of the 'A' matrix by considering groups of three epochs at a time to produce a three dimensional matrix. Indeed this can be extended up to the number of epochs in the signal, at which point there is one non-zero element in a matrix with the same number of dimensions as there are epochs in the waveform.
The 'DZ' matrix is generated by considering pairs of epochs and looking at how they change. Specifically the maximum amplitude, duration and number of complex first rank zeros are examined in each epoch. Each of these parameters may increase, decrease or stay the same, which gives 27 possible combinations and hence a 27 element vector. Thus the 'DZ' matrix looks similar to an 'S' matrix. Since duration and amplitude are nearly continuous it is usual to specify a range of changes for these values that are taken to be the same for the purposes of encoding the 'DZ' matrix.
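A minimal sketch of the 'DZ' matrix follows. The ordering of the 27 elements, the tolerance parameters and the (duration, amplitude, zeros) tuple layout are assumptions made for illustration; the description only fixes that each of the three parameters may increase, decrease or stay the same.

```python
def trend(prev, curr, tol=0):
    """Return +1 for an increase, -1 for a decrease, 0 if within tolerance."""
    if curr - prev > tol:
        return 1
    if prev - curr > tol:
        return -1
    return 0


def dz_matrix(epoch_list, amp_tol=0.0, dur_tol=0):
    """Build the 27-element vector from consecutive epoch pairs.

    Each epoch is a (duration, max_amplitude, first_rank_zeros) tuple.
    """
    dz = [0] * 27
    for (d0, a0, z0), (d1, a1, z1) in zip(epoch_list, epoch_list[1:]):
        index = (
            9 * (trend(a0, a1, amp_tol) + 1)
            + 3 * (trend(d0, d1, dur_tol) + 1)
            + (trend(z0, z1) + 1)
        )
        dz[index] += 1
    return dz
```

The tolerances implement the "range of changes taken to be the same": with `amp_tol=1.5`, for instance, a drop in maximum amplitude from 4.0 to 3.0 is recorded as no change.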
To illustrate how an end-point finding approach might work, consider a speech waveform with individual words separated by silence. The signal is divided up into short time segments (say 20ms) and each time segment is encoded into an 'S' matrix. The elements of each of these 'S' matrices are then summed to give one number per time slice reflecting how many epochs are present in each time slice. Note that the silence may include a lot of small amplitude epochs as unavoidable background noise. In this case some form of suppression is required, such as ignoring all epochs with maximum amplitude less than a certain value. Plotted against time, these numbers will then have low values for regions of silence and high values in regions where a word is being spoken, which allows the end points of the words to be estimated.
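The end-point finding idea can be sketched as follows. Since the sum of the elements of an unweighted 'S' matrix equals the number of epochs, the sketch counts epochs per slice directly. The function names, the rule assigning an epoch to the slice in which it starts, and the thresholds are assumptions made for illustration.

```python
def epoch_counts(samples, slice_len, min_amp):
    """Count epochs per time slice, ignoring epochs below min_amp."""
    counts = [0] * ((len(samples) + slice_len - 1) // slice_len)
    start = 0
    for i in range(1, len(samples) + 1):
        # An epoch ends at the end of the data or where the sign changes.
        if i == len(samples) or (samples[i] >= 0) != (samples[start] >= 0):
            if max(abs(v) for v in samples[start:i]) >= min_amp:
                # The epoch is credited to the slice in which it starts.
                counts[start // slice_len] += 1
            start = i
    return counts


def active_regions(counts, min_count=1):
    """Return (first_slice, last_slice) pairs for runs of busy slices."""
    regions, begin = [], None
    for i, c in enumerate(counts):
        if c >= min_count and begin is None:
            begin = i
        elif c < min_count and begin is not None:
            regions.append((begin, i - 1))
            begin = None
    if begin is not None:
        regions.append((begin, len(counts) - 1))
    return regions


# Hypothetical signal: low-level noise, a "word", then noise again.
noise = [0.01, -0.01] * 4
word = [1.0, -1.0] * 4
counts = epoch_counts(noise + word + noise, slice_len=4, min_amp=0.1)
regions = active_regions(counts)
```

In this toy signal only the two slices covering the "word" contain epochs above the amplitude threshold, so a single active region is reported.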
Once the appropriate section of a waveform has been located and encoded in some form of matrix it usually needs to be categorised as an example of a previously known signal. The two most common approaches are to use archetypal matrices or to use a neural net.
Given a set of examples of a particular signal an archetypal matrix can be generated by simply adding the matrices together and dividing by the number of matrices used. To test to see if a signal is the same as the one represented by the archetype the signal needs to be encoded in the same matrix format as the archetype and then compared, usually by finding the correlation score. In this way many archetypes can be included and ranked as to which are most likely to be the same as the signal.
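Archetype generation and matching can be sketched as follows. The description does not say which correlation measure is used; the Pearson correlation coefficient below is an assumption, as are the function names and the dictionary of archetypes.

```python
from math import sqrt


def archetype(matrices):
    """Element-wise average of a set of equally sized matrices (as vectors)."""
    n = len(matrices)
    return [sum(column) / n for column in zip(*matrices)]


def correlation(u, v):
    """Pearson correlation between two vectors (0.0 if either is constant)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    denom = sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return sum(a * b for a, b in zip(du, dv)) / denom if denom else 0.0


def best_match(sample, archetypes):
    """Return the name of the archetype that correlates best with sample."""
    return max(archetypes, key=lambda name: correlation(sample, archetypes[name]))
```

Sorting the archetypes by their correlation score instead of taking the maximum gives the ranking of likely candidates mentioned in the text.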
If better discrimination is required than that afforded by using archetype matrices, a neural net may be used. Since a signal of any length will always produce a matrix of the same size when it is time encoded (assuming the same matrix is used, of course), a neural net can be designed with a fixed number of inputs corresponding to the number of elements in the matrix. With sufficient training examples the neural net can be trained to recognise a number of standard signals.
If the handwriting is analysed as it is being written (or the relevant data is recorded) the position of the pen as a function of time is available. The (x,y) co-ordinates provide two waveforms that vary with time and are orthogonal. These can be encoded separately to provide two separate matrices or they can be used to derive other parameters. An example would be differentiating x(t) and y(t) once to produce the velocities. In each case the waveform chosen has to have a "zero" value that the waveform crosses so that the epochs can be defined. In the case of the velocities this could be the actual zero value whereas for the y ordinate a line through the middle of the text would be required.
An additional complication arises when the pen is lifted off the surface of whatever is being written on. This could happen for example when the crossbar on a "t" is written after the vertical line has been drawn. Such an event will be characterised by gaps in the plots at certain times. Possible methods of incorporating these gaps include setting the value of the functions to zero over the period of the gaps or simply joining the separate segments together. In either case there will be discontinuous changes in the waveforms. However, this should not be a problem for time encoding.
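The two ways of handling pen lifts can be sketched as follows. The representation of the writing as a list of strokes and the gap lengths measured in samples are assumptions made for illustration.

```python
def join_strokes(strokes):
    """Join the separate segments together, dropping the gaps."""
    return [v for stroke in strokes for v in stroke]


def zero_fill(strokes, gap_lengths):
    """Set the waveform to zero over each gap between strokes.

    gap_lengths[i] is the number of samples between stroke i and stroke i + 1.
    """
    out = []
    for i, stroke in enumerate(strokes):
        out.extend(stroke)
        if i < len(gap_lengths):
            out.extend([0.0] * gap_lengths[i])
    return out
```

Joining keeps the waveform short but discards the timing of the pen lift, whereas zero filling preserves the overall duration of the writing.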
Figure 3 illustrates one possible implementation of a system to recognise text as it is being written. In the first step 31, data of the position and velocity of the pen on a writing tablet is obtained. The y coordinate of the writing implement about a mean level and the velocity of the implement in the x direction as a function of time are then extracted from the data in the next step 33. These two waveforms are then time encoded 35 using the TESPAR encoding scheme and compared with a store of previous examples. The closest match between the known examples and the input data is then output.
A specific application of the above recognition example is signature recognition. The person in question signs their name on a tablet that records the position as a function of time. This signal can then be time-encoded and compared to known time-encoded signatures to verify the identity of that person.
The recognition method may also be applied to static text. When the text to be recognised has already been written the information as to how it was written is not available. Hence this case provides a subset of the information available in the dynamic case, where the text is analysed as it is written.
One representation of written script is as a bitmap in which the pixels are either black (foreground colour) or white (background colour). This gives a two-dimensional array of points each of which takes the value 0 or 1. This can be thought of as an infinitely clipped two dimensional waveform. (An alternative would be to assign a number to each colour that can be represented to produce a function that varies over a range of values.) When this waveform has been generated it can be encoded. One method is to encode each column (or row) into matrix form and then use the resulting set of matrices as the input for a neural net. Alternatively the array could be turned into a one-dimensional vector by writing each row (or column) sequentially. The resulting vector could then be encoded to produce a single matrix that could be used to characterise the image.
This example describes one method of implementing a system to recognise scanned images of static text. The flow diagram for this system is shown in Figure 4. Initially, an image of the text to be classified is scanned (41) to render it in electronic form and stored as a two dimensional bitmap (43). This image is then read (45) both one row after another and one column after another to produce two long waveforms corresponding to the row and column scans. A pixel that is clear is represented by minus 1 and a filled pixel by plus 1. This produces a rectangular waveform with well defined zero crossing points.
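The row and column scans can be sketched as follows. The function name and the representation of the bitmap as a list of rows of 0/1 values are assumptions made for illustration.

```python
def scan_waveforms(bitmap):
    """Read a 0/1 bitmap row by row and column by column.

    Clear pixels (0) become -1 and filled pixels (1) become +1, giving
    two rectangular waveforms with well defined zero crossings.
    """
    def level(pixel):
        return 1 if pixel else -1

    row_scan = [level(p) for row in bitmap for p in row]
    col_scan = [level(p) for column in zip(*bitmap) for p in column]
    return row_scan, col_scan
```

Both waveforms contain the same number of samples, one per pixel, but a stroke that is continuous in one scan direction is broken up in the other, so the two scans yield different interval patterns.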
The resulting encoded waveforms are then compared (47) with previously determined examples to assign probabilities to each. The most likely is then returned (49) as the recognised text.

Claims

1. A coding method for handwriting, comprising the steps of recording the variation of an input parameter, identifying the intervals between the occurrences of the input parameter crossing a given value, and quantising the lengths of the intervals, identifying the number of complex zeroes, of the input parameter up to a predetermined rank, in the said intervals, and recording the quantised lengths of the intervals and a measure of the said number of complex zeroes up to a predetermined rank as a representation of the variation of the input parameter.
2. A method according to claim 1 wherein the predetermined rank is 1 and the steps of recording a representation of the variation of the input parameter records the number of first rank zeroes.
3. A method according to claim 2 wherein the coding method is a TESPAR method.
4. A method according to any preceding claim further comprising the step of generating a code number taking one of a set of predetermined values representing the duration of the interval and the said number of complex zeroes for at least some of the said intervals.
5. A method according to claim 4 further comprising the step of calculating an S matrix recording the number of instances of each code number in the recorded variation of the input parameter.
6. A method according to claim 4 or 5 further comprising the step of calculating an A matrix recording the number of instances of a first code number following a second code number with a predetermined lag.
7. A method according to any of claims 4 to 6 further comprising the step of calculating a DZ matrix recording the number of instances of amplitude, length of interval and number of maxima and minima increasing, decreasing or staying the same in the next epoch.
8. A method according to any preceding claim wherein at least two input parameters are recorded.
9. A method according to any preceding claim wherein the intervals are time intervals so that the recorded lengths represent the time between successive crossings of the given value.
10. A method according to any preceding claim wherein the given value is a median value of the handwriting.
11. A method according to any preceding claim including capturing the input parameter or parameters with a writing tablet.
12. A method according to any of claims 1 to 10 including scanning the handwriting, and then generating the input parameters.
13. A method of handwriting recognition, including the steps of coding the handwriting according to any preceding claim, comparing the coded representations with representations corresponding to a number of different characters to find the best match and outputting the character corresponding to the best match.
14. A handwriting coding apparatus, comprising a data input means for measuring the variation of an input parameter, an identifying means for identifying the intervals between the occurrences of the input parameter crossing a given value, and quantising the intervals, and identifying the number of complex zeroes up to a predetermined rank of the input parameter in said intervals, and a recording means for recording the quantised intervals and a measure of the said number of complex zeroes as a representation of the variation of the input parameter.
15. A handwriting coding apparatus according to claim 14 wherein the predetermined rank is one.
16. A handwriting coding apparatus according to claim 14 or 15 further comprising a writing tablet for capturing the input parameters.
17. A handwriting recognition apparatus, including a handwriting coding apparatus according to any of claims 14 or 16 and further comprising a processing means adapted to compare the coded representations of the characters with representations corresponding to a number of different characters to find the best match and outputting the character corresponding to the best match.
18. A handwriting recognition apparatus according to claim 17 wherein the identifying means and the processing means are both the central processing unit of a computer.
19. A computer program for controlling a computer having a data storage means and a processor, connected to an input means for providing handwriting data input, the computer program operable to cause the said computer to carry out in cooperation with the input means a method for coding handwriting comprising the steps of recording the variation of an input parameter, identifying the intervals between the occurrences of the input parameter crossing a given value, and quantising the intervals, identifying the number of complex zeroes of first rank of the input parameter in said intervals, and recording the quantised intervals and a measure of the said number of complex zeroes as a representation of the variation of the input parameter.
PCT/GB2000/001340 1999-04-14 2000-04-14 Handwriting coding and recognition WO2000062244A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
AU39793/00A AU3979300A (en) 1999-04-14 2000-04-14 Handwriting coding and recognition
EP00919036A EP1173823A1 (en) 1999-04-14 2000-04-14 Handwriting coding and recognition
JP2000611237A JP2002541597A (en) 1999-04-14 2000-04-14 Coding and recognition of handwriting

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9908462.6 1999-04-14
GBGB9908462.6A GB9908462D0 (en) 1999-04-14 1999-04-14 Handwriting coding and recognition

Publications (1)

Publication Number Publication Date
WO2000062244A1 true WO2000062244A1 (en) 2000-10-19

Family

ID=10851464

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2000/001340 WO2000062244A1 (en) 1999-04-14 2000-04-14 Handwriting coding and recognition

Country Status (6)

Country Link
EP (1) EP1173823A1 (en)
JP (1) JP2002541597A (en)
CN (1) CN1347534A (en)
AU (1) AU3979300A (en)
GB (1) GB9908462D0 (en)
WO (1) WO2000062244A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3818443A (en) * 1972-04-28 1974-06-18 Burroughs Corp Signature verification by zero-crossing characterization
GB2062323A (en) * 1979-10-26 1981-05-20 Nat Res Dev Apparatus for Recognising Handwritten Signs
WO1997031368A1 (en) * 1996-02-20 1997-08-28 Domain Dynamics Limited Signal processing arrangements

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GORSE D ET AL: "A modular pRAM architecture for the classification of TESPAR-encoded speech signals", NEUROFUZZY'96. IEEE EUROPEAN WORKSHOP, PRAGUE, CZECH REPUBLIC, 16-18 APRIL 1996, vol. 6, no. 3, Neural Network World, 1996, UIVT AV CR - NNW, Czech Republic, pages 299 - 304, XP000925748, ISSN: 1210-0552 *
PAQUET T ET AL: "RECOGNITION OF HANDWRITTEN SENTENCES USING A RESTRICTED LEXICON", PATTERN RECOGNITION,US,PERGAMON PRESS INC. ELMSFORD, N.Y, vol. 26, no. 3, 1 March 1993 (1993-03-01), pages 391 - 407, XP000367312, ISSN: 0031-3203 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1460522A2 (en) * 2003-03-17 2004-09-22 Samsung Electronics Co., Ltd. Handwriting trajectory recognition system and method
EP1460522A3 (en) * 2003-03-17 2008-08-27 Samsung Electronics Co., Ltd. Handwriting trajectory recognition system and method
US7474771B2 (en) 2003-03-17 2009-01-06 Samsung Electronics Co., Ltd. Handwriting trajectory recognition system and method
US8897511B2 (en) 2004-08-21 2014-11-25 Softpro Gmbh Method and device for detecting a hand-written signature or mark and for recognising the authenticity of said signature or mark
EP1792266B1 (en) * 2004-08-21 2017-07-12 Softpro GmbH Method and device for detecting a hand-written signature or mark and for recognising the authenticity of said signature or mark

Also Published As

Publication number Publication date
JP2002541597A (en) 2002-12-03
EP1173823A1 (en) 2002-01-23
GB9908462D0 (en) 1999-06-09
AU3979300A (en) 2000-11-14
CN1347534A (en) 2002-05-01

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 00806186.6

Country of ref document: CN

AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2000919036

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2000 611237

Country of ref document: JP

Kind code of ref document: A

WWP Wipo information: published in national office

Ref document number: 2000919036

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWW Wipo information: withdrawn in national office

Ref document number: 2000919036

Country of ref document: EP