WO2010150383A1 - Recognition device - Google Patents

Recognition device

Info

Publication number
WO2010150383A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
feature vector
unit
stroke
identification
Prior art date
Application number
PCT/JP2009/061615
Other languages
French (fr)
Japanese (ja)
Inventor
Yojiro Tonouchi (登内 洋次郎)
Original Assignee
Toshiba Corporation (株式会社東芝)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corporation
Priority to PCT/JP2009/061615
Publication of WO2010150383A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/32: Digital ink


Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)

Abstract

An acquiring section acquires coordinate data of an input stroke in time-series order. A creating section creates real strokes, each being the stroke from a pen-down to a pen-up, and creates virtual strokes, each virtually connecting the coordinate data of the end point of a real stroke and the coordinate data of the start point of the next real stroke with a line segment. An extracting section extracts feature vectors of the real strokes for each of the divided regions defined by dividing the region where the stroke exists and obtains a first feature vector composed of the extracted feature vectors; it further extracts feature vectors of the virtual strokes for each of the divided regions and obtains a second feature vector composed of the extracted feature vectors. A recognizing section recognizes the stroke using the distance between the first feature vector and the vector corresponding to the real strokes out of the vectors corresponding to an identification code, and the distance between the second feature vector and the vector corresponding to the virtual strokes out of those vectors.

Description

Recognition device
The present invention relates to a technique for recognizing characters from input strokes.
There are known techniques that perform character recognition using real strokes, which are the strokes actually input, and virtual strokes, which connect consecutive real strokes.
Such character recognition techniques perform recognition on a single stroke sequence obtained by combining the real strokes and the virtual strokes. For example, Patent Documents 1 and 2 disclose techniques that, to prevent misrecognition, assign a large weight to the real strokes (thickening the handwriting) and a small weight to the virtual strokes (thinning the handwriting), and perform recognition on a single feature vector of the weighted strokes.
[Patent Document 1] Japanese Patent No. 4099248
[Patent Document 2] Japanese Patent No. 4048716
However, the conventional techniques described above extract and recognize a feature vector without distinguishing between the real strokes and the virtual strokes, and may therefore misrecognize the input.
The present invention has been made in view of the above circumstances, and an object of the invention is to provide a recognition device that improves recognition performance by clearly distinguishing real strokes from virtual strokes during recognition.
To solve the above problem, the recognition device of the present invention comprises: an acquisition unit that acquires, in time-series order, coordinate data of handwriting input by a user through an input unit; a creation unit that creates real strokes, each being the handwriting from pen-down until pen-up, and creates virtual strokes, each virtually connecting the coordinate data of the end point of a real stroke to the coordinate data of the start point of the next real stroke with a line segment; an extraction unit that extracts a feature vector of the real strokes for each divided region obtained by dividing the region where the handwriting exists, obtaining a first feature vector composed of the extracted feature vectors, and extracts a feature vector of the virtual strokes for each divided region, obtaining a second feature vector composed of the extracted feature vectors; and a recognition unit that recognizes the handwriting using the distance between the first feature vector and the vector corresponding to the real strokes among the vectors associated with an identification code, and the distance between the second feature vector and the vector corresponding to the virtual strokes among the vectors associated with the identification code.
According to the present invention, recognition performance can be improved by clearly distinguishing real strokes from virtual strokes during recognition.
FIG. 1 shows the recognition device of the first embodiment. FIG. 2 shows an example of real and virtual strokes. FIG. 3 shows an example of the first feature vector. FIG. 4 shows an example of the second feature vector. FIG. 5 shows an example of the dictionary stored in the dictionary storage unit of the first embodiment. FIG. 6 is a flowchart showing the operation of the first embodiment. FIG. 7 shows the recognition device of the second embodiment. FIG. 8 shows an example of the dictionary stored in the dictionary storage unit of the second embodiment. FIG. 9 is a flowchart showing the operation of the second embodiment.
Embodiments of the recognition device of the present invention will now be described with reference to the drawings.
(First Embodiment)
In this embodiment, the first feature vector of the real strokes and the second feature vector of the virtual strokes are extracted separately, and recognition is performed on a combined vector that preserves the characteristics of each extracted feature vector.
FIG. 1 is a block diagram showing an example of the configuration of the recognition device 1 of this embodiment. As shown in FIG. 1, the recognition device 1 comprises an input unit 10, a display unit 20, a storage unit 30, an acquisition unit 40, a creation unit 50, an extraction unit 60, a combining unit 70, a recognition unit 80, and a display control unit 90.
The input unit 10 receives handwriting input by a user with a pen or other device, or with a finger. It can be realized by an existing coordinate input device such as a touch pad, a touch panel, or a tablet.
The display unit 20 displays the recognition result of the recognition unit 80, described later, and other information according to instructions from the display control unit 90, also described later. It can be realized by an existing display device such as a liquid crystal display, a plasma display, an organic EL display, or a touch-panel display.
The storage unit 30 stores information used in the various processes performed by the recognition device 1, and can be realized by an existing storage medium capable of magnetic, electrical, or optical storage. The storage unit 30 includes a dictionary storage unit 32, which is described in detail later.
The acquisition unit 40 acquires the coordinate data of the handwriting input to the input unit 10 in time-series order. While a pen or the like is in contact with the input surface of the input unit 10, the acquisition unit 40 acquires handwriting coordinate data at regular time intervals.
The creation unit 50 creates real strokes, each being the handwriting input to the input unit 10 from pen-down until pen-up. Together with these, it creates virtual strokes, each virtually connecting the coordinate data of the end point of a real stroke to the coordinate data of the start point of the next real stroke with a line segment. A real stroke is the handwriting of a single stroke, that is, the handwriting from when the pen or the like touches the input surface of the input unit 10 until it leaves the surface (from pen-down until pen-up).
The creation unit 50 creates a real stroke by connecting, with line segments in time-series order, the coordinate data acquired between the moment the pen or the like touches the input surface of the input unit 10 and the moment it leaves. For example, the coordinate data from the start point to the end point of the i-th real stroke can be expressed as the two-dimensional time series (X_i[1], Y_i[1]), (X_i[2], Y_i[2]), ..., (X_i[N_i], Y_i[N_i]). The creation unit 50 connects these coordinates with line segments in time-series order to form the real stroke. Here, i is a natural number, and N_i is a natural number of 1 or more indicating the number of coordinate points of the i-th real stroke.
The creation unit 50 also creates a virtual stroke by virtually connecting the coordinate data of the end point of a real stroke to the coordinate data of the start point of the following real stroke with a line segment. For example, the coordinate data of the end point of the i-th real stroke is (X_i[N_i], Y_i[N_i]), and the coordinate data of the start point of the (i+1)-th real stroke is (X_{i+1}[1], Y_{i+1}[1]). The creation unit 50 virtually connects these two points with a line segment to create the i-th virtual stroke (the virtual stroke between the i-th and (i+1)-th real strokes).
The above describes an example in which the creation unit 50 creates a virtual stroke by virtually connecting two points, the end point of one real stroke and the start point of the next, with a line segment. Alternatively, a virtual stroke may be created by interpolating between the two points and virtually connecting three or more points with line segments. The interpolation may be linear, or it may take the continuity of the direction change at the start and end points into account.
FIG. 2 shows an example of the real and virtual strokes created by the creation unit 50. In the example of FIG. 2, the dots indicate coordinate data, the solid lines connecting the dots indicate real strokes, and the broken arrows indicate virtual strokes.
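As a rough illustration of the stroke model just described, here is a minimal Python sketch (the helper name is hypothetical, not from the patent) that derives the virtual strokes from a list of real strokes segmented at pen-up:

```python
from typing import List, Tuple

Point = Tuple[float, float]

def make_virtual_strokes(real_strokes: List[List[Point]]) -> List[List[Point]]:
    """Connect the end point of each real stroke to the start point of the
    next real stroke with a straight line segment."""
    virtual = []
    for prev, nxt in zip(real_strokes, real_strokes[1:]):
        end = prev[-1]    # (X_i[N_i], Y_i[N_i])
        start = nxt[0]    # (X_{i+1}[1], Y_{i+1}[1])
        virtual.append([end, start])
    return virtual

# Example: the two real strokes of "=" (coordinates are made up).
real = [[(0.0, 1.0), (1.0, 1.0)],   # upper bar
        [(0.0, 0.5), (1.0, 0.5)]]   # lower bar
print(make_virtual_strokes(real))   # [[(1.0, 1.0), (0.0, 0.5)]]
```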
Returning to FIG. 1, the extraction unit 60 extracts a feature vector of the real strokes for each divided region obtained by dividing the region where the handwriting input to the input unit 10 exists, and obtains a first feature vector composed of the extracted feature vectors. It likewise extracts a feature vector of the virtual strokes for each divided region and obtains a second feature vector composed of the extracted feature vectors.
Specifically, the extraction unit 60 extracts, for each divided region, the appearance frequency distribution of the directions between consecutive coordinate points of the real strokes, and obtains the first feature vector by arranging the extracted distributions in a single row. It similarly extracts, for each divided region, the appearance frequency distribution of the directions between coordinate points of the virtual strokes, and obtains the second feature vector by arranging those distributions in a single row. In this embodiment, the direction component density feature is used as the example of the appearance frequency distribution.
The direction component density feature is represented by a three-dimensional array F[d][x][y] indexed by the x direction, the y direction, and the direction between coordinate points (the direction of the real or virtual stroke). For example, if the number of quantization levels is Nx in the x direction, Ny in the y direction, and D for the direction between coordinate points, the array F[d][x][y] corresponds to the Nx*Ny*D-dimensional vector (F[1][1][1], F[1][1][2], ..., F[1][1][Ny], F[1][2][1], ..., F[D][Nx][Ny-1], F[D][Nx][Ny]). Nx, Ny, and D are all natural numbers of 2 or more.
The extraction unit 60 computes the direction component density features of the real strokes and of the virtual strokes separately to obtain the first and second feature vectors, respectively. The direction component density feature can be computed, for example, by the method disclosed in "Feature selection type character recognition by minimum classification error learning", IEICE Transactions D-II (Information and Systems), Vol. 81, No. 12, December 1998, pp. 2749-2756.
FIG. 3 shows an example of the first feature vector extracted by the extraction unit 60; specifically, the direction component density feature of the real strokes shown in FIG. 2. FIG. 4 shows an example of the second feature vector; specifically, the direction component density feature of the virtual strokes shown in FIG. 2. In the examples of FIGS. 3 and 4, the number of quantization levels is 8 in the x direction, 8 in the y direction, and 4 for the direction between coordinate points, so the array F is represented by a 128-dimensional vector.
The divided regions may be determined in advance, or the extraction unit 60 may divide the region where the handwriting exists into a predetermined number of regions (specifically, according to the values of Nx and Ny).
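The patent defers the actual computation to the cited reference, but a rough sketch of one plausible form of the direction component density feature might look as follows. The undirected quantization into d bins, the count-based accumulation, and the helper name are all assumptions, not details from the patent:

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def direction_density(strokes: List[List[Point]],
                      nx: int = 8, ny: int = 8, d: int = 4) -> List[float]:
    """Accumulate, per spatial cell, a histogram over d quantized directions
    of the segments between consecutive coordinate points, and return it
    flattened as an nx*ny*d-dimensional feature vector."""
    pts = [p for s in strokes for p in s]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    x0, y0 = min(xs), min(ys)
    w = max(max(xs) - x0, 1e-9)   # guard against zero-size bounding boxes
    h = max(max(ys) - y0, 1e-9)

    f = [[[0.0] * ny for _ in range(nx)] for _ in range(d)]
    for s in strokes:
        for (ax, ay), (bx, by) in zip(s, s[1:]):
            ang = math.atan2(by - ay, bx - ax) % math.pi  # undirected, [0, pi)
            di = min(int(ang / math.pi * d), d - 1)       # direction bin
            mx, my = (ax + bx) / 2, (ay + by) / 2         # segment midpoint
            xi = min(int((mx - x0) / w * nx), nx - 1)
            yi = min(int((my - y0) / h * ny), ny - 1)
            f[di][xi][yi] += 1.0
    # Flatten in the (F[1][1][1], ..., F[D][Nx][Ny]) order given in the text.
    return [f[k][i][j] for k in range(d) for i in range(nx) for j in range(ny)]
```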
Returning to FIG. 1, the combining unit 70 combines the first and second feature vectors extracted by the extraction unit 60 to obtain a combined vector, which it produces by arranging the two vectors in a single row.
For example, let the array F representing the first feature vector be the L-dimensional vector (a[1], a[2], ..., a[L]) and the array F representing the second feature vector be the M-dimensional vector (b[1], b[2], ..., b[M]), where L and M are natural numbers of 8 or more. The combining unit 70 then arranges the two vectors in a single row to obtain the (L+M)-dimensional combined vector (a[1], a[2], ..., a[L], b[1], b[2], ..., b[M]).
When combining the first and second feature vectors, the combining unit 70 may weight the elements of each vector, as in the sketch below.
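A minimal sketch of the combination step, with optional per-part element weights (the weight values are placeholders, not values from the patent):

```python
from typing import List

def combine(first: List[float], second: List[float],
            w1: float = 1.0, w2: float = 1.0) -> List[float]:
    """Arrange the first and second feature vectors in a single row to form
    the (L+M)-dimensional combined vector, optionally weighting each part."""
    return [w1 * a for a in first] + [w2 * b for b in second]
```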
The dictionary storage unit 32 is now described. For each identification code, the dictionary storage unit 32 stores, in association with the code, an identification vector obtained by combining the first and second feature vectors of that identification code.
FIG. 5 shows an example of the dictionary stored in the dictionary storage unit 32. In the example of FIG. 5, a character code (SJIS) serving as the identification code is associated with an identification vector obtained by combining the first and second feature vectors of that character code. The identification vectors are combined by the same method as the combined vectors. P1, P2, ... denote the identification vectors of the character codes, P11, P21, ... their first feature vectors, and P12, P22, ... their second feature vectors.
Returning to FIG. 1, the recognition unit 80 recognizes the handwriting input to the input unit 10 using the first and second feature vectors extracted by the extraction unit 60. The recognition unit 80 calculates the distance between the combined vector produced by the combining unit 70 and each identification vector stored in the dictionary storage unit 32, and recognizes the identification code associated with the identification vector at the minimum distance as the handwriting input to the input unit 10. In effect, this calculates both the distance between the first feature vector and the portion of the stored identification vector corresponding to the real strokes, and the distance between the second feature vector and the portion corresponding to the virtual strokes, and judges the two distances jointly to identify the character.
For example, let the combined vector be the N-dimensional vector (s[1], s[2], ..., s[N]) and the identification vector be the N-dimensional vector (t[1], t[2], ..., t[N]). The recognition unit 80 calculates the distance between the two vectors using the Euclidean distance R of Equation (1), where N is a natural number of 16 or more.
R = sqrt( (s[1]-t[1])^2 + (s[2]-t[2])^2 + ... + (s[N]-t[N])^2 )   ...(1)
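Recognition with Equation (1) amounts to a nearest-neighbor search over the dictionary. A minimal sketch, assuming the dictionary is simply a mapping from identification codes to identification vectors:

```python
import math
from typing import Dict, List

def recognize(combined: List[float], dictionary: Dict[str, List[float]]) -> str:
    """Return the identification code whose identification vector lies at the
    minimum Euclidean distance R (Equation (1)) from the combined vector."""
    def euclid(s: List[float], t: List[float]) -> float:
        return math.sqrt(sum((si - ti) ** 2 for si, ti in zip(s, t)))
    return min(dictionary, key=lambda code: euclid(combined, dictionary[code]))
```

With the concatenated layout above, the real-stroke and virtual-stroke portions of the identification vector each contribute their own terms to R, which is what lets the two kinds of stroke be judged jointly while remaining distinct.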
The display control unit 90 displays the recognition result of the recognition unit 80 on the display unit 20. Specifically, it causes the display unit 20 to display the character or other symbol indicated by the identification code that the recognition unit 80 recognized as the handwriting input to the input unit 10.
The acquisition unit 40, the creation unit 50, the extraction unit 60, the combining unit 70, the recognition unit 80, and the display control unit 90 can be realized by an existing arithmetic device such as a CPU.
FIG. 6 is a flowchart showing an example of the flow of the recognition process performed by the recognition device 1 of this embodiment.
In step S100, handwriting is input to the input unit 10 with a pen or the like.
In step S102, the acquisition unit 40 acquires the coordinate data of the input handwriting in time-series order.
In step S104, the creation unit 50 creates the real strokes that constitute the input handwriting.
In step S106, the creation unit 50 creates the virtual strokes, each virtually connecting the coordinate data of the end point of a real stroke to the coordinate data of the start point of the next real stroke with a line segment.
In step S108, the extraction unit 60 extracts a feature vector of the real strokes for each divided region of the area where the input handwriting exists, and obtains the first feature vector composed of the extracted feature vectors.
In step S110, the extraction unit 60 extracts a feature vector of the virtual strokes for each divided region and obtains the second feature vector composed of the extracted feature vectors.
In step S112, the combining unit 70 combines the first and second feature vectors extracted by the extraction unit 60 to obtain the combined vector.
In step S114, the recognition unit 80 calculates the distance between the combined vector and each identification vector stored in the dictionary storage unit 32, and recognizes the identification code associated with the identification vector at the minimum distance as the handwriting input to the input unit 10.
In step S116, the display control unit 90 causes the display unit 20 to display the recognition result of the recognition unit 80.
In this embodiment, the first feature vector of the real strokes and the second feature vector of the virtual strokes are extracted separately and combined while preserving the characteristics of each, and recognition is performed on the combined vector. According to this embodiment, therefore, real strokes and virtual strokes can be clearly distinguished during recognition, and recognition accuracy can be improved. For example, characters such as "=" and "Z", which are difficult to recognize unless real and virtual strokes are clearly distinguished, can be recognized correctly.
(Second Embodiment)
In this embodiment, the first feature vector of the real strokes and the second feature vector of the virtual strokes are extracted separately, and recognition is performed on each extracted feature vector individually. The description below focuses on the differences from the first embodiment; components with the same functions keep the same names and reference numerals, and their description is omitted.
FIG. 7 is a block diagram showing an example of the configuration of the recognition device 101 of this embodiment. The recognition device 101 shown in FIG. 7 differs from the recognition device 1 of the first embodiment in the dictionary data stored in the dictionary storage unit 132 of the storage unit 130, in the processing performed by the recognition unit 180, and in that it does not include the combining unit 70.
For each identification code, the dictionary storage unit 132 stores, in association with each other, a first identification vector, which is the first feature vector of that identification code, and a second identification vector, which is its second feature vector.
FIG. 8 shows an example of the dictionary stored in the dictionary storage unit 132. In the example of FIG. 8, a character code (SJIS) serving as the identification code is associated with the first and second identification vectors of that character code. P11, P21, ... denote the first identification vectors of the character codes, and P12, P22, ... their second identification vectors.
Returning to FIG. 7, the recognition unit 180 calculates the distance between the first feature vector extracted by the extraction unit 60 and each first identification vector stored in the dictionary storage unit 132, and the distance between the second feature vector extracted by the extraction unit 60 and each second identification vector, and integrates the two distances by a predetermined method. It recognizes the identification code associated with the first and second identification vectors that minimize the integrated distance as the handwriting input to the input unit 10.
Specifically, the recognition unit 180 integrates the distance R1 between the first feature vector and a first identification vector and the distance R2 between the second feature vector and the corresponding second identification vector using an integration function f(R1, R2), obtaining the integrated distance R. For the integration function f(R1, R2), a linear sum of the two distances such as Equation (2) can be used.
f(R1, R2) = (k1*R1 + k2*R2) / (k1 + k2)   ...(2)
The recognition unit 180 then recognizes the identification code associated with the first and second identification vectors that minimize the integrated distance R as the handwriting input to the input unit 10.
The recognition unit 180 may also integrate the distance R1 with priority over the distance R2. For example, when R1 is smaller than a threshold, the recognition unit 180 may treat R2 as 0 in the integration; that is, it may use R1 itself as the integrated distance R. The threshold may be a single fixed value, or it may be varied for each identification code.
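A sketch of this embodiment's decision rule, assuming linear-sum integration per Equation (2) together with the optional threshold rule described above:

```python
from typing import Optional

def integrated_distance(r1: float, r2: float,
                        k1: float = 1.0, k2: float = 1.0,
                        threshold: Optional[float] = None) -> float:
    """Integrate R1 and R2 into R by Equation (2); when a threshold is given
    and R1 falls below it, R1 itself is used as the integrated distance."""
    if threshold is not None and r1 < threshold:
        return r1
    return (k1 * r1 + k2 * r2) / (k1 + k2)
```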
FIG. 9 is a flowchart showing an example of the operation of the recognition device 101 of this embodiment.
The processing from step S200 to step S210 is the same as that from step S100 to step S110 in FIG. 6.
In step S212, the recognition unit 180 calculates the distance between the first feature vector extracted by the extraction unit 60 and each first identification vector stored in the dictionary storage unit 132 and the distance between the second feature vector and each second identification vector, integrates them by the predetermined method, and recognizes the identification code associated with the first and second identification vectors that minimize the integrated distance as the handwriting input to the input unit 10.
In step S214, the display control unit 90 causes the display unit 20 to display the recognition result of the recognition unit 180.
Thus, in this embodiment, the first feature vector of the real strokes and the second feature vector of the virtual strokes are extracted separately, and recognition is performed on each feature vector individually. In this embodiment, too, real and virtual strokes can therefore be clearly distinguished during recognition, improving accuracy. In particular, according to this embodiment, either the first or the second feature vector can be given priority according to the nature of the handwriting (for example, the first feature vector can be used as the primary cue and the second feature vector as a supplementary one), further increasing recognition accuracy.
The recognition devices of the first and second embodiments comprise a control device such as a CPU, storage devices such as a ROM and a RAM, external storage devices such as an HDD and a removable drive, a display device such as a display, and input devices such as a keyboard and a mouse; that is, they have a hardware configuration using an ordinary computer.
(Modifications)
The present invention is not limited to the above embodiments as they stand; in practice, the constituent elements may be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the constituent elements disclosed in the embodiments: some elements may be deleted from the full set shown in an embodiment, and elements of different embodiments may be combined as appropriate.
For example, in the above embodiments, the first and second feature vectors may be dimensionally reduced using principal component analysis (PCA) or a similar method. This reduces the amount of data in the first and second feature vectors.
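A sketch of the dimensionality reduction, implemented here with a plain NumPy SVD rather than any particular PCA library; the target dimension of 32 is only an example:

```python
import numpy as np

def pca_reduce(vectors: np.ndarray, k: int) -> np.ndarray:
    """Project row vectors onto their top-k principal components."""
    centered = vectors - vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)  # rows of vt = PCs
    return centered @ vt[:k].T

feats = np.random.rand(100, 128)    # e.g., a batch of 128-dimensional features
print(pca_reduce(feats, 32).shape)  # (100, 32)
```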
 Further, for example, in the above embodiments, each feature vector extracted by the extraction unit 60 may be blurred. For example, using Equation (3), the feature vector of a given divided region (x, y) can be blurred with the feature vectors of the neighboring divided regions (x-1, y), (x+1, y), (x, y-1), and (x, y+1). This smooths out idiosyncrasies of the person inputting the handwriting and prevents recognition accuracy from deteriorating.

F[d][x][y] = (4*F[d][x][y] + F[d][x-1][y] + F[d][x+1][y] + F[d][x][y-1] + F[d][x][y+1]) / 8   …(3)

Here, d is any value from 1 to D.
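 Applied over the whole grid, Equation (3) is a small smoothing kernel. Below is a minimal Python sketch, assuming the features are held in a NumPy array F of shape (D, X, Y) indexed as F[d][x][y]; the handling of border regions is an assumption, since the embodiment does not specify it.

```python
import numpy as np

def blur_features(F):
    """Apply the Equation (3) smoothing to a (D, X, Y) feature array.

    Each interior cell becomes a weighted average of itself (weight 4)
    and its four neighbours (weight 1 each), divided by 8. Border cells
    are left unchanged here, which is an assumption of this sketch.
    """
    G = F.copy()
    G[:, 1:-1, 1:-1] = (4 * F[:, 1:-1, 1:-1]
                        + F[:, :-2, 1:-1] + F[:, 2:, 1:-1]    # (x-1, y) and (x+1, y)
                        + F[:, 1:-1, :-2] + F[:, 1:-1, 2:]    # (x, y-1) and (x, y+1)
                        ) / 8
    return G
```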
 In the above embodiments, character recognition was described as an example, but the same technique can also be applied to gesture recognition and the like.
 Further, for example, the functions of the recognition apparatuses of the first and second embodiments may be realized by executing a recognition program.
 In this case, the recognition program executed by the recognition apparatuses of the first and second embodiments is stored in a computer-readable storage medium in an installable or executable file format and provided as a computer program product. The recognition program executed by the recognition apparatuses of the first and second embodiments may also be provided by being incorporated in advance into a ROM or the like.
 The recognition program executed by the recognition apparatuses of the first and second embodiments has a module configuration for realizing the above-described units on a computer. In terms of actual hardware, the CPU reads the recognition program from the HDD or the like onto the RAM and executes it, whereby the above units are realized on the computer.
DESCRIPTION OF SYMBOLS
 1, 101 Recognition apparatus
 10 Input unit
 20 Display unit
 30, 130 Storage unit
 32, 132 Dictionary storage unit
 40 Acquisition unit
 50 Creation unit
 60 Extraction unit
 70 Combining unit
 80, 180 Recognition unit
 90 Display control unit

Claims (6)

  1.  A recognition apparatus comprising:
     an acquisition unit that acquires, in time-series order, coordinate data of handwriting input by a user from an input unit;
     a creation unit that creates actual strokes, each being the handwriting from a pen-down until the next pen-up, and creates virtual strokes, each virtually connecting, by a line segment, the coordinate data of the end point of an actual stroke and the coordinate data of the start point of the next actual stroke;
     an extraction unit that extracts a feature vector of the actual strokes for each divided region obtained by dividing the region where the handwriting exists so as to obtain a first feature vector composed of the extracted feature vectors, and extracts a feature vector of the virtual strokes for each divided region so as to obtain a second feature vector composed of the extracted feature vectors; and
     a recognition unit that recognizes the handwriting using a distance between the first feature vector and the vector corresponding to the actual strokes among vectors corresponding to an identification code, and a distance between the second feature vector and the vector corresponding to the virtual strokes among the vectors corresponding to the identification code.
  2.  The recognition apparatus according to claim 1, further comprising:
     a combining unit that combines the first feature vector and the second feature vector extracted by the extraction unit to obtain a combined vector; and
     a dictionary storage unit that stores, for each identification code, an identification vector obtained by combining the first feature vector and the second feature vector of that identification code by the same method used to obtain the combined vector, in association with the identification code,
     wherein the recognition unit calculates the distance between the combined vector and each identification vector, and recognizes, as the handwriting, the identification code associated with the identification vector for which the calculated distance is minimized.
  3.  The recognition apparatus according to claim 1, further comprising a dictionary storage unit that stores, for each identification code, a first identification vector that is the first feature vector of that identification code and a second identification vector that is the second feature vector of that identification code in association with each other,
     wherein the recognition unit calculates the distance between the first feature vector extracted by the extraction unit and each first identification vector and the distance between the second feature vector extracted by the extraction unit and each second identification vector, integrates the two distances by a predetermined method, and recognizes, as the handwriting, the identification code associated with the first identification vector and the second identification vector for which the integrated distance is minimized.
  4.  The recognition apparatus according to claim 3, wherein the recognition unit integrates the distance between the first feature vector and the first identification vector with priority over the distance between the second feature vector and the second identification vector.
  5.  The recognition apparatus according to claim 1, wherein the extraction unit extracts, for each divided region, the appearance frequency distribution of directions between the coordinate data of the actual strokes and obtains the first feature vector by arranging the extracted appearance frequency distributions in a single column, and extracts, for each divided region, the appearance frequency distribution of directions between the coordinate data of the virtual strokes and obtains the second feature vector by arranging the extracted appearance frequency distributions in a single column. (An illustrative sketch of this extraction follows the claims.)
  6.  The recognition apparatus according to claim 1, further comprising:
     an input unit through which the user inputs the handwriting; and
     a display unit that displays the recognition result of the recognition unit.
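For illustration, the per-region direction histograms described in claim 5 might be computed as in the following Python sketch. The eight-direction quantization, the 4x4 grid of divided regions, coordinates normalized to [0, 1), and assigning each segment to the region of its midpoint are all assumptions chosen for this example, not specifics fixed by the claims.

```python
import numpy as np

def directional_feature_vector(strokes, grid=4, n_dirs=8):
    """Build a feature vector of per-region direction histograms.

    strokes: list of strokes, each an (N, 2) array of (x, y) points
             normalized to [0, 1). Pass the actual strokes to obtain
             the first feature vector, the virtual strokes for the second.
    Returns a 1-D vector of length grid * grid * n_dirs.
    """
    hist = np.zeros((grid, grid, n_dirs))
    for pts in strokes:
        for p, q in zip(pts[:-1], pts[1:]):
            dx, dy = q[0] - p[0], q[1] - p[1]
            angle = np.arctan2(dy, dx)                        # range (-pi, pi]
            d = int((angle + np.pi) / (2 * np.pi) * n_dirs) % n_dirs
            mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2     # segment midpoint
            gx = min(int(mx * grid), grid - 1)                # divided region index
            gy = min(int(my * grid), grid - 1)
            hist[gx, gy, d] += 1
    return hist.reshape(-1)  # arrange the histograms in a single column
```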
PCT/JP2009/061615 2009-06-25 2009-06-25 Recognition device WO2010150383A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2009/061615 WO2010150383A1 (en) 2009-06-25 2009-06-25 Recognition device

Publications (1)

Publication Number Publication Date
WO2010150383A1

Family

ID=43386177

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/061615 WO2010150383A1 (en) 2009-06-25 2009-06-25 Recognition device

Country Status (1)

Country Link
WO (1) WO2010150383A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH01213772A (en) * 1988-02-22 1989-08-28 Oki Electric Ind Co Ltd On-line character recognizing system
JP2005165662A (en) * 2003-12-02 2005-06-23 Ntt Docomo Inc Image processor and image processing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
AKINORI KAWAMURA: "Saisho Bunrui Ayamari Gakushu ni yoru Tokucho Sentakugata Moji Ninshiki" [Character recognition with feature selection by minimum classification error learning], THE TRANSACTIONS OF THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS, vol. J81-D-II, no. 12, 25 December 1998 (1998-12-25), pages 2749 - 2756 *

Similar Documents

Publication Publication Date Title
US7013046B2 (en) Apparatus, method, and program for handwriting recognition
EP1564675B1 (en) Apparatus and method for searching for digital ink query
CN102449640B (en) Recognizing handwritten words
JP5071914B2 (en) Recognition graph
US8768062B2 (en) Online script independent recognition of handwritten sub-word units and words
US20080240569A1 (en) Character input apparatus and method and computer readable storage medium
WO1995008158A1 (en) Universal symbolic handwriting recognition system
US10346681B2 (en) Method and computing device for optically recognizing mathematical expressions
JP2009543204A (en) Handwritten symbol recognition method and apparatus
CN108701215B (en) System and method for identifying multi-object structures
JP4817297B2 (en) Character search device
US11393231B2 (en) System and method for text line extraction
JP2017090998A (en) Character recognizing program, and character recognizing device
US9250802B2 (en) Shaping device
JP6081606B2 (en) Electronic apparatus and method
Singh et al. Online handwritten Gurmukhi words recognition: An inclusive study
CN115311674A (en) Handwriting processing method and device, electronic equipment and readable storage medium
WO2010150383A1 (en) Recognition device
CN107912062B (en) System and method for overlaying handwriting
Sreeraj et al. k-NN based On-Line Handwritten Character recognition system
Abuzaraida et al. Online recognition of Arabic handwritten words system based on Alignments matching Algorithm
AU2020103527A4 (en) IPDN- Read Handwriting: Intelligent Process to Read Handwriting Using Deep Learning and Neural Networks
KR101438271B1 (en) Device and methode for handwriting character
JP2023164293A (en) Author determination system and author determination method
JP3221488B2 (en) Online character recognition device with verification function

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09846515

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09846515

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP