CN111062398B

CN111062398B - Character recognition method, character recognition device, computer equipment and storage medium

Info

Publication number: CN111062398B
Application number: CN202010189305.7A
Authority: CN
Inventors: 周柔刚; 周才健; 盛锦华
Original assignee: Guangdong Guangyuan Intelligent Technology Co ltd; Jinhua Mstar Intelligent Technology Co ltd; Suzhou Huicui Intelligent Technology Co ltd; Hangzhou Huicui Intelligent Technology Co ltd
Current assignee: Guangdong Guangyuan Intelligent Technology Co ltd; Jinhua Mstar Intelligent Technology Co ltd; Suzhou Huicui Intelligent Technology Co ltd; Hangzhou Huicui Intelligent Technology Co ltd
Priority date: 2020-03-18
Filing date: 2020-03-18
Publication date: 2021-01-19
Anticipated expiration: 2040-03-18
Also published as: CN111062398A

Abstract

The application relates to a character recognition method, a character recognition device, a computer device and a storage medium. The method comprises the following steps: normalizing the character skeleton to be recognized and the reference character skeleton to the same pixel range to obtain a normalized character skeleton to be recognized and a normalized reference character skeleton; translating each point in the normalized character skeleton to be recognized according to the distance between that point and the nearest point in the normalized reference character skeleton to obtain a translated character skeleton to be recognized; dividing the translated character skeleton to be recognized and the normalized reference character skeleton into grids of uniform size, and calculating the similarity between the two skeletons according to the number of points of the translated character skeleton to be recognized and the number of points of the normalized reference character skeleton in each grid; and determining the character corresponding to the reference character image with the maximum similarity as the character in the character image to be recognized. The method can improve character recognition accuracy.

Description

Character recognition method, character recognition device, computer equipment and storage medium

Technical Field

The present application relates to the field of image processing technologies, and in particular, to a character recognition method, apparatus, computer device, and storage medium.

Background

With the development of image processing technology, character recognition technology has emerged, and the characters in an acquired image can be obtained by performing character recognition on that image. Existing character recognition methods generally train a classifier. Such methods depend on the collected character samples, and different samples lead to different recognition performance after training.

However, while conventional character recognition methods can stably recognize standard printed characters, their accuracy is low for handwritten input.

Disclosure of Invention

In view of the above, it is necessary to provide a character recognition method, apparatus, computer device and storage medium capable of improving character detection accuracy.

A method of character recognition, the method comprising:

acquiring a character skeleton to be recognized in a character image to be recognized and a reference character skeleton in a reference character image;

normalizing the character skeleton to be recognized and the reference character skeleton to the same pixel range to obtain a normalized character skeleton to be recognized and a normalized reference character skeleton;

translating each point in the normalized character skeleton to be recognized according to the distance between that point and the nearest point in the normalized reference character skeleton to obtain a translated character skeleton to be recognized;

dividing the translated character skeleton to be recognized and the normalized reference character skeleton into grids of uniform size, and calculating the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton according to the number of points of the translated character skeleton to be recognized and the number of points of the normalized reference character skeleton in each grid;

and determining the character corresponding to the reference character image with the maximum similarity as the character in the character image to be recognized.

In one embodiment, the character skeleton comprises bones and bone nodes, a bone node being an intersection point between bones; before the character skeleton to be recognized and the reference character skeleton are normalized to the same pixel range to obtain the normalized character skeleton to be recognized and the normalized reference character skeleton, the method comprises: removing the non-node bones and the single-node bones of the character skeleton to be recognized and of the reference character skeleton.

In one embodiment, the character skeleton comprises bones and bone nodes, a bone node being an intersection point between bones; before the character skeleton to be recognized and the reference character skeleton are normalized to the same pixel range to obtain the normalized character skeleton to be recognized and the normalized reference character skeleton, the method comprises: removing, from the character skeleton to be recognized and from the reference character skeleton, the bones whose length is smaller than a preset value.

In one embodiment, removing the bones whose length is smaller than a preset value from the character skeleton to be recognized and the reference character skeleton includes: calculating the total bone length L1 and the number of bones N1 of the character skeleton to be recognized; calculating the total bone length L2 and the number of bones N2 of the reference character skeleton; calculating the average bone length of the character skeleton to be recognized from the total bone length L1 and the number of bones N1, $\bar{L}_1 = L_1 / N_1$; calculating the average bone length of the reference character skeleton from the total bone length L2 and the number of bones N2, $\bar{L}_2 = L_2 / N_2$; removing the bones of the character skeleton to be recognized whose length is less than 0.1 times the average bone length $\bar{L}_1$; and removing the bones of the reference character skeleton whose length is less than 0.1 times the average bone length $\bar{L}_2$.

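In one embodiment, normalizing the character skeleton to be recognized and the reference character skeleton to the same pixel range to obtain a normalized character skeleton to be recognized and a normalized reference character skeleton includes:

recording the coordinates of each pixel point in the character skeleton to be recognized as $(x_i, y_i)$, where the minimum x coordinate value among the pixel points is $x_{\min}$, the maximum x coordinate value is $x_{\max}$, the minimum y coordinate value is $y_{\min}$, and the maximum y coordinate value is $y_{\max}$; and normalizing to an N × N pixel range to obtain the coordinates $(x_i', y_i')$ of each pixel point in the normalized character skeleton to be recognized, wherein N is a positive integer and

$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \times N, \qquad y_i' = \frac{y_i - y_{\min}}{y_{\max} - y_{\min}} \times N;$$

recording the coordinates of each pixel point in the reference character skeleton as $(X_j, Y_j)$, where the minimum X coordinate value among the pixel points is $X_{\min}$, the maximum X coordinate value is $X_{\max}$, the minimum Y coordinate value is $Y_{\min}$, and the maximum Y coordinate value is $Y_{\max}$; and normalizing to the N × N pixel range to obtain the coordinates $(X_j', Y_j')$ of each pixel point in the normalized reference character skeleton, wherein

$$X_j' = \frac{X_j - X_{\min}}{X_{\max} - X_{\min}} \times N, \qquad Y_j' = \frac{Y_j - Y_{\min}}{Y_{\max} - Y_{\min}} \times N.$$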

In one embodiment, translating each point in the normalized character skeleton to be recognized according to the distance between that point and the closest point in the normalized reference character skeleton to obtain a translated character skeleton to be recognized includes: obtaining, for each point in the normalized character skeleton to be recognized, the distance $O_{i1}$ to the closest point of the normalized reference character skeleton in the horizontal-axis direction and the distance $O_{i2}$ to the closest point in the vertical-axis direction; calculating, from the distances $O_{i1}$ and $O_{i2}$ of the points in the normalized character skeleton to be recognized, the average closest-point distance in the horizontal-axis direction and the average closest-point distance in the vertical-axis direction; and taking the average closest-point distance in the horizontal-axis direction and the average closest-point distance in the vertical-axis direction as the translation adjustment vector, and translating each point in the normalized character skeleton to be recognized to obtain the translated character skeleton to be recognized.

In one embodiment, dividing the translated character skeleton to be recognized and the normalized reference character skeleton into grids of uniform size, and calculating the similarity between the two skeletons according to the number of points of each skeleton in the grids, includes: dividing the translated character skeleton to be recognized and the normalized reference character skeleton into grids of uniform size according to the pixel range, the number of pixels in each grid being the same; for each grid that contains points of the translated character skeleton to be recognized, calculating the number $n_i$ of points of the translated character skeleton to be recognized and the number $m_i$ of points of the normalized reference character skeleton in that grid; calculating the ratio of the number $m_i$ to the number $n_i$; taking the minimum of the ratio and 1 as the grid score of that grid; and calculating the average of the grid scores of all grids that contain points of the translated character skeleton to be recognized, to obtain the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton.

An apparatus for character recognition, the apparatus comprising:

the skeleton extraction module is used for acquiring a character skeleton to be recognized in the character image to be recognized and a reference character skeleton in the reference character image;

the normalization module is used for normalizing the character skeleton to be recognized and the reference character skeleton to the same pixel range to obtain a normalized character skeleton to be recognized and a normalized reference character skeleton;

the translation module is used for translating each point in the normalized character skeleton to be recognized according to the distance between that point and the nearest point in the normalized reference character skeleton to obtain a translated character skeleton to be recognized;

the similarity calculation module is used for dividing the translated character skeleton to be recognized and the normalized reference character skeleton into grids of uniform size, and calculating the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton according to the number of points of the translated character skeleton to be recognized and the number of points of the normalized reference character skeleton in each grid;

and the character acquisition module is used for determining the character corresponding to the reference character image with the maximum similarity as the character in the character image to be recognized.

In one embodiment, the character skeleton includes bones and bone nodes, a bone node being an intersection between bones. The character recognition apparatus further includes: a culling module for removing the non-node bones and the single-node bones of the character skeleton to be recognized and of the reference character skeleton.

In one embodiment, the character skeleton includes bones and bone nodes, a bone node being an intersection between bones. The character recognition apparatus further includes: a culling module for removing, from the character skeleton to be recognized and from the reference character skeleton, the bones whose length is smaller than a preset value.

In one embodiment, the culling module comprises: a first bone length and number calculating unit for calculating the total bone length L1 and the number of bones N1 of the character skeleton to be recognized; a second bone length and number calculating unit for calculating the total bone length L2 and the number of bones N2 of the reference character skeleton; a first average length calculating unit for calculating the average bone length of the character skeleton to be recognized from the total bone length L1 and the number of bones N1, $\bar{L}_1 = L_1 / N_1$; a second average length calculating unit for calculating the average bone length of the reference character skeleton from the total bone length L2 and the number of bones N2, $\bar{L}_2 = L_2 / N_2$; a first culling unit for removing the bones of the character skeleton to be recognized whose length is less than 0.1 times the average bone length $\bar{L}_1$; and a second culling unit for removing the bones of the reference character skeleton whose length is less than 0.1 times the average bone length $\bar{L}_2$.

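In one embodiment, the normalization module comprises: a first normalization unit for recording the coordinates of each pixel point in the character skeleton to be recognized as $(x_i, y_i)$, where the minimum x coordinate value among the pixel points is $x_{\min}$, the maximum x coordinate value is $x_{\max}$, the minimum y coordinate value is $y_{\min}$, and the maximum y coordinate value is $y_{\max}$, and normalizing to an N × N pixel range to obtain the coordinates $(x_i', y_i')$ of each pixel point in the normalized character skeleton to be recognized, wherein N is a positive integer and

$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \times N, \qquad y_i' = \frac{y_i - y_{\min}}{y_{\max} - y_{\min}} \times N;$$

a second normalization unit for recording the coordinates of each pixel point in the reference character skeleton as $(X_j, Y_j)$, where the minimum X coordinate value among the pixel points is $X_{\min}$, the maximum X coordinate value is $X_{\max}$, the minimum Y coordinate value is $Y_{\min}$, and the maximum Y coordinate value is $Y_{\max}$, and normalizing to the N × N pixel range to obtain the coordinates $(X_j', Y_j')$ of each pixel point in the normalized reference character skeleton, wherein

$$X_j' = \frac{X_j - X_{\min}}{X_{\max} - X_{\min}} \times N, \qquad Y_j' = \frac{Y_j - Y_{\min}}{Y_{\max} - Y_{\min}} \times N.$$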

In one embodiment, the translation module comprises: a closest-point distance obtaining unit for obtaining, for each point in the normalized character skeleton to be recognized, the distance $O_{i1}$ to the closest point of the normalized reference character skeleton in the horizontal-axis direction and the distance $O_{i2}$ to the closest point in the vertical-axis direction; an average closest-point distance calculating unit for calculating, from the distances $O_{i1}$ and $O_{i2}$ of the points in the normalized character skeleton to be recognized, the average closest-point distance in the horizontal-axis direction and the average closest-point distance in the vertical-axis direction; and a translation unit for taking the average closest-point distance in the horizontal-axis direction and the average closest-point distance in the vertical-axis direction as the translation adjustment vector and translating each point in the normalized character skeleton to be recognized to obtain the translated character skeleton to be recognized.

In one embodiment, the similarity calculation module includes: a grid dividing unit for dividing the translated character skeleton to be recognized and the normalized reference character skeleton into grids of uniform size according to the pixel range, the number of pixels in each grid being the same; a point counting unit for calculating, for each grid that contains points of the translated character skeleton to be recognized, the number $n_i$ of points of the translated character skeleton to be recognized and the number $m_i$ of points of the normalized reference character skeleton in that grid; a ratio calculating unit for calculating the ratio of the number $m_i$ to the number $n_i$; a grid score calculating unit for taking the minimum of the ratio and 1 as the grid score of that grid; and a similarity calculating unit for calculating the average of the grid scores of all grids that contain points of the translated character skeleton to be recognized, to obtain the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton.

A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the steps of the character recognition method described above when executing the computer program.

A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of the character recognition method described above.

According to the character recognition method, the character recognition device, the computer equipment and the storage medium, similarity is calculated on extracted character skeletons, so the influence of stroke thickness on character recognition can be avoided; moreover, characters in pictures of different sizes can be recognized because the character skeleton to be recognized and the reference character skeleton are normalized to the same pixel range; meanwhile, because the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton is calculated grid by grid, the method is more tolerant of differences in the character image to be recognized; finally, the character corresponding to the reference character image with high similarity is obtained by threshold judgment, so that the accuracy and comprehensiveness of character recognition are improved.

Drawings

FIG. 1 is a flow diagram illustrating a character recognition method in one embodiment;

FIG. 2 is a schematic illustration of skeleton extraction in one embodiment;

FIG. 3 is a diagram of a character skeleton in one embodiment;

FIG. 4 is a schematic illustration of an embodiment of removing a single node bone;

FIG. 5 is a diagram illustrating translation of the normalized character skeleton to be recognized in one embodiment;

FIG. 6 is a diagram illustrating the division of a character skeleton into grids according to one embodiment;

FIG. 7 is a block diagram showing the structure of a character recognition apparatus according to an embodiment;

FIG. 8 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

In one embodiment, as shown in fig. 1, a character recognition method is provided. The method is described here as applied to a server, and includes the following steps:

step S110, acquiring a character skeleton to be recognized in the character image to be recognized and a reference character skeleton in the reference character image.

The character image to be recognized is an image containing a character to be recognized. It includes character images obtained by shooting and character images input by handwriting, for example a picture containing characters shot by a camera, characters stored in a picture format, or a picture of characters copied by a young child in early education. The reference character image is a character picture stored in a database and used as the reference for character recognition.

The specific process of acquiring the character skeleton of a character image is as follows: after the segmented character image is acquired, threshold segmentation is applied to it to obtain a binary image of the character. In the present application the binary image is assumed to show a black character on a white background (the opposite polarity is processed similarly), and skeleton extraction (also called image thinning) is performed on the binary image. Because only the character skeleton is extracted, the thickness of the character has little influence on the extraction result, and therefore the influence of stroke thickness on character recognition can be avoided. The method for acquiring the character skeleton to be recognized in the character image to be recognized is the same as the method for acquiring the reference character skeleton in the reference character image.
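As a rough illustration of the threshold segmentation step, the following Python sketch binarizes a grayscale image. The fixed threshold of 128, the list-of-lists image representation, and the function name `binarize` are assumptions made for this example only; the application does not state how the threshold is chosen.

```python
def binarize(gray, threshold=128):
    """Return a binary image: 1 for stroke pixels, 0 for background.

    Assumes dark strokes on a light background, so a pixel counts as
    stroke when its gray value is below the (assumed) threshold.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray]


# A 2 x 2 toy image: two dark pixels and two light pixels.
binary = binarize([[0, 200], [130, 100]])
```

For light strokes on a dark background the comparison would simply be reversed.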

In this embodiment, the classical Zhang fast parallel thinning algorithm is adopted to extract the character skeleton of a character image, as shown in fig. 2. In the extracted character skeleton, each bone is a single pixel wide, and the intersection points of bones are called bone nodes. As shown in fig. 3, the bones in a skeleton can be divided into non-node bones, single-node bones and double-node bones according to the number of bone nodes they have.
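The thinning step can be sketched in pure Python as follows. This is a minimal sketch of the widely published two-subiteration form of Zhang's parallel thinning (often called Zhang-Suen thinning), not necessarily the exact variant used in the embodiment. It assumes a binary image with 1 for stroke pixels and 0 for background, surrounded by at least one row and column of background; the function name is hypothetical.

```python
def zhang_suen_thinning(image):
    """Thin a binary image (1 = stroke, 0 = background) toward a 1-pixel-wide skeleton."""
    img = [list(row) for row in image]  # work on a copy
    rows, cols = len(img), len(img[0])

    def neighbours(r, c):
        # P2..P9, clockwise, starting from the pixel directly above.
        return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
                img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r][c] != 1:
                        continue
                    p = neighbours(r, c)
                    b = sum(p)  # number of stroke neighbours
                    # number of 0 -> 1 transitions around the pixel
                    a = sum(1 for i in range(8) if p[i] == 0 and p[(i + 1) % 8] == 1)
                    p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((r, c))
            # Parallel update: clear all flagged pixels after the scan.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img
```

Pixels are flagged during a full scan and cleared only afterwards, which is what makes each subiteration parallel rather than order-dependent.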

Step S120, normalizing the character skeleton to be recognized and the reference character skeleton to the same pixel range to obtain a normalized character skeleton to be recognized and a normalized reference character skeleton.

Because character images differ in size, the character skeleton to be recognized and the reference character skeleton may differ in size as well. To make the two skeletons comparable, both are normalized to the same pixel range. Here the pixel range refers to the image size, for example 100 × 100 or 200 × 200, and can be set as required.

Step S130, translating each point in the normalized character skeleton to be recognized according to the distance between that point and the nearest point in the normalized reference character skeleton to obtain the translated character skeleton to be recognized.

All points in the normalized character skeleton to be recognized and in the normalized reference character skeleton are pixel points, and each pixel point has a corresponding position in the character image. One option is to select a representative point in the normalized character skeleton to be recognized (such as the point at the center of the character skeleton), calculate the distance between the representative point and the nearest point in the normalized reference character skeleton, and then translate all points of the normalized character skeleton to be recognized by that distance, where the translation includes left-right translation and up-down translation. Another option is to calculate the distance between each point in the normalized character skeleton to be recognized and the nearest point in the normalized reference character skeleton, average the distances over all points, and then translate all points of the normalized character skeleton to be recognized by the average distance. The translated points form the translated character skeleton to be recognized.

Step S140, dividing the translation character skeleton to be recognized and the regular reference character skeleton into grids with uniform sizes, and calculating the similarity between the translation character skeleton to be recognized and the regular reference character skeleton according to the number of points in the translation character skeleton to be recognized and the number of points in the regular reference character skeleton in the grids.

The grid cells may be triangular, quadrilateral, or hexagonal; the specific shape is not limited. For example, as shown in fig. 6, the translated character skeleton to be recognized and the normalized reference character skeleton are divided into quadrilateral grids, 8 × 8 in number. The closer the number of points of the translated character skeleton to be recognized is to the number of points of the normalized reference character skeleton within the grids, the closer the two skeletons are and the higher the similarity. The number of grids may be 8 × 8, 5 × 5 or 10 × 10, and other numbers are also possible. The finer the grid, the stricter the comparison, but the less robust the recognition result is to interference; conversely, the coarser the grid, the looser the comparison and the more robust the recognition result is to interference.

For example, in each grid the ratio of the number of points of the translated character skeleton to be recognized to the number of points of the normalized reference character skeleton may be calculated, and the average of the ratios over all grids may be taken as the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton. Grids that contain neither points of the translated character skeleton to be recognized nor points of the normalized reference character skeleton may be excluded, with no ratio calculated for them. Alternatively, the ratio may be the number of points of the normalized reference character skeleton divided by the number of points of the translated character skeleton to be recognized. As another example, each grid may be scored according to the number of points of the translated character skeleton to be recognized and the number of points of the normalized reference character skeleton it contains, and the average score over all grids taken as the similarity between the two skeletons.
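One of the scoring rules described in this application (the ratio $m_i / n_i$ capped at 1 and averaged over the grids that contain points of the translated skeleton) can be sketched as below. The function and parameter names are hypothetical, square grids are assumed, and points that fall outside the pixel range after translation are clamped to the nearest border cell, which is a choice made for this example rather than something the application specifies.

```python
from collections import Counter


def grid_similarity(points, ref_points, n, cells):
    """Average capped ratio m_i / n_i over cells occupied by `points`.

    points     -- (x, y) points of the translated skeleton to be recognized
    ref_points -- (x, y) points of the normalized reference skeleton
    n          -- side length of the N x N pixel range
    cells      -- number of grid cells per side (e.g. 8 for an 8 x 8 grid)
    """
    size = n / cells

    def cell_of(point):
        # Clamp so that coordinates equal to n, or slightly out of range,
        # still land in a border cell (an assumption of this sketch).
        col = min(max(int(point[0] // size), 0), cells - 1)
        row = min(max(int(point[1] // size), 0), cells - 1)
        return row, col

    n_counts = Counter(cell_of(p) for p in points)
    m_counts = Counter(cell_of(p) for p in ref_points)
    if not n_counts:
        return 0.0
    scores = [min(m_counts[cell] / n_i, 1.0) for cell, n_i in n_counts.items()]
    return sum(scores) / len(scores)
```

With a 100 × 100 range and a 2 × 2 grid, a skeleton with two points in the top-left cell and one in the bottom-right cell, compared with a reference that has one point in the top-left cell and none in the bottom-right cell, scores (0.5 + 0) / 2 = 0.25.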

Because the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton is calculated grid by grid, crooked characters are recognized more easily, in particular characters copied by young children in early education, and tolerance to character differences is improved.

Step S150, determining the character corresponding to the reference character image with the largest similarity as the character in the character image to be recognized.

The reference character image with the maximum similarity has the highest matching degree with the character image to be recognized, and therefore the corresponding character is used as the character in the character image to be recognized.
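The selection in step S150 amounts to taking the reference with the highest similarity. A minimal sketch, assuming the reference skeletons are held in a dictionary keyed by character and that a similarity function is supplied by the caller (both are illustrative choices, not details from the application):

```python
def recognize(skeleton, references, similarity):
    """Return (character, score) for the reference most similar to `skeleton`.

    references -- dict mapping each candidate character to its reference skeleton
    similarity -- callable taking (skeleton, reference_skeleton) and returning a number
    """
    best_char, best_score = None, float("-inf")
    for char, ref_skeleton in references.items():
        score = similarity(skeleton, ref_skeleton)
        if score > best_score:
            best_char, best_score = char, score
    return best_char, best_score
```

When two references tie, this sketch keeps the first one encountered; the application does not say how ties are resolved.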

In the character recognition method, similarity is calculated on extracted character skeletons, so the influence of stroke thickness on character recognition can be avoided; moreover, characters in pictures of different sizes can be recognized because the character skeleton to be recognized and the reference character skeleton are normalized to the same pixel range; meanwhile, because the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton is calculated grid by grid, the method is more tolerant of differences in the character image to be recognized; finally, the character corresponding to the reference character image with high similarity is obtained by threshold judgment, so that the accuracy and comprehensiveness of character recognition are improved.

In one embodiment, as shown in fig. 3, the character skeleton includes bones and bone nodes, a bone node being an intersection between bones. Before step S120, the method includes: removing the non-node bones and the single-node bones of the character skeleton to be recognized and of the reference character skeleton. For example, the skeleton in fig. 3 contains a non-node bone, which may be caused by a stray stroke; removing it improves the accuracy of character recognition. As shown in fig. 4, removing the single-node bones makes the character skeleton to be recognized conform better to the shape of the character, which also improves the accuracy of character recognition.
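Removing non-node and single-node bones can be sketched as a simple filter once each bone knows how many bone nodes it touches. The `Bone` record and its `node_count` field are hypothetical; the application does not describe how bones are stored or how the node count is obtained.

```python
from dataclasses import dataclass


@dataclass
class Bone:
    pixels: list      # ordered (x, y) pixels along the bone
    node_count: int   # number of bone nodes on this bone: 0, 1 or 2


def remove_loose_bones(bones):
    """Keep only double-node bones, dropping non-node and single-node bones."""
    return [bone for bone in bones if bone.node_count >= 2]
```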

In one embodiment, as shown in fig. 3, the character skeleton includes bones and bone nodes, a bone node being an intersection between bones. Before step S120, the method includes: removing, from the character skeleton to be recognized and from the reference character skeleton, the bones whose length is smaller than a preset value. The preset value can be set as required, for example 0.1 times the average bone length. In this embodiment, removing the bones whose length is smaller than the preset value prevents non-character strokes from affecting the similarity calculated between the translated character skeleton to be recognized and the normalized reference character skeleton, and improves the accuracy of character recognition.

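In one embodiment, removing the bones whose length is smaller than the preset value includes: calculating the total bone length L1 and the number of bones N1 of the character skeleton to be recognized; calculating the total bone length L2 and the number of bones N2 of the reference character skeleton; calculating the average bone length of the character skeleton to be recognized, $\bar{L}_1 = L_1 / N_1$; calculating the average bone length of the reference character skeleton, $\bar{L}_2 = L_2 / N_2$; removing the bones of the character skeleton to be recognized whose length is less than 0.1 times the average bone length $\bar{L}_1$; and removing the bones of the reference character skeleton whose length is less than 0.1 times the average bone length $\bar{L}_2$.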

The total bone length and number of bones of the character skeleton to be recognized and those of the reference character skeleton may be calculated in either order or simultaneously. Likewise, the bones that do not meet the length requirement may be removed from the character skeleton to be recognized and from the reference character skeleton in either order or simultaneously.
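The length-based removal can be sketched as follows. Each bone is represented here as a list of pixels and its length is taken to be its pixel count; both are assumptions of this example, since the application does not define how bone length is measured. The factor 0.1 is the example value given in the text.

```python
def prune_short_bones(bones, factor=0.1):
    """Drop bones shorter than `factor` times the average bone length.

    bones -- list of bones, each a list of (x, y) pixels; the pixel count
             stands in for the bone length (an assumption of this sketch).
    """
    if not bones:
        return []
    total_length = sum(len(bone) for bone in bones)   # L
    average_length = total_length / len(bones)        # L / N
    return [bone for bone in bones if len(bone) >= factor * average_length]
```

The same function would be applied separately to the character skeleton to be recognized and to the reference character skeleton, each with its own average.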

recording each image in the character skeleton to be recognizedThe coordinates of the prime point are

The minimum x coordinate value in the pixel points is

The maximum x coordinate value is

The minimum y coordinate value is

The maximum y coordinate value is

(ii) a Wherein, the N is a positive integer,

；

The minimum X coordinate value in the pixel points is

The maximum X coordinate value is

The minimum Y coordinate value is

The maximum Y coordinate value is

(ii) a Wherein the content of the first and second substances,

。

The letters in this application do not denote a particular meaning unless otherwise indicated; they are not to be construed as limiting the application and may be replaced with other letters if desired. When the pixel range is 100 × 100, N is 100; when the pixel range is 200 × 200, N is 200. When the pixel range is 100 × 100 and N is 100, the coordinates $(x_i, y_i)$ of each pixel point in the character skeleton to be recognized are normalized accordingly, giving the coordinates $(x_i', y_i')$ of each pixel point in the normalized character skeleton to be recognized, where

$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \times 100, \qquad y_i' = \frac{y_i - y_{\min}}{y_{\max} - y_{\min}} \times 100.$$
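A minimal Python sketch of the normalization step follows. It assumes plain min-max scaling of each axis onto the range 0 to N, which is one reading of the description; the function name `normalize_skeleton` and the guard for a zero-width or zero-height skeleton are assumptions and are not stated in the application.

```python
def normalize_skeleton(points, n=100):
    """Scale skeleton points (x, y) into an n x n pixel range by min-max scaling."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Assumption: a skeleton with zero width or height keeps that axis at 0.
    width = (x_max - x_min) or 1
    height = (y_max - y_min) or 1
    return [((x - x_min) / width * n, (y - y_min) / height * n)
            for x, y in points]


# Illustrative data: three skeleton points scaled into a 100 x 100 range.
normalized = normalize_skeleton([(10, 20), (30, 60), (20, 40)], n=100)
```

Applying the same function with the same `n` to both skeletons places them in the same pixel range, so that the later nearest-point and grid comparisons operate on comparable coordinates.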

In one embodiment, the step S130 includes: obtaining, for each point in the normalized character skeleton to be recognized, the distance $O_{i1}$ to the closest point of the normalized reference character skeleton in the horizontal-axis direction and the distance $O_{i2}$ to the closest point in the vertical-axis direction; calculating, from the distances $O_{i1}$ and $O_{i2}$ of the points in the normalized character skeleton to be recognized, the average closest-point distance in the horizontal-axis direction and the average closest-point distance in the vertical-axis direction; and taking the average closest-point distance in the horizontal-axis direction and the average closest-point distance in the vertical-axis direction as the translation adjustment vector, and translating each point in the normalized character skeleton to be recognized to obtain the translated character skeleton to be recognized.

For example, in fig. 5 the thick solid line represents the normalized character skeleton to be recognized and the thin solid line represents the normalized reference character skeleton. In practical applications, the position of the normalized character skeleton to be recognized within the character image may not coincide with that of the normalized reference character skeleton, as shown on the left side of fig. 5. To make the two skeletons fit better, the position of the normalized character skeleton to be recognized is adjusted with the normalized reference character skeleton as the reference. As shown in fig. 5, for each point $P_i$ on the normalized character skeleton to be recognized, the nearest point in the normalized reference character skeleton is found along the horizontal direction and along the vertical direction respectively. In the horizontal direction, the closest distance of point $P_i$ is recorded as $O_{i1}$ (the distance takes a positive value if the nearest point in the normalized reference character skeleton is on the right side of the point, and a negative value otherwise); in the vertical direction, the closest distance of point $P_i$ is recorded as $O_{i2}$ (the distance takes a positive value if the nearest point in the normalized reference character skeleton is below the point, and a negative value otherwise). If there are N points in total on the normalized character skeleton to be recognized, the adjustment vector in the horizontal and vertical directions is:

$$\left( \frac{1}{N} \sum_{i=1}^{N} O_{i1},\ \frac{1}{N} \sum_{i=1}^{N} O_{i2} \right);$$

translating each point by this vector gives the translated character skeleton to be recognized (the shape indicated by the thick solid line on the right side of fig. 5).
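The translation step can be sketched in Python as follows. The sketch assumes integer pixel coordinates with y increasing downward, so that "right" and "below" give positive distances. The application does not say what happens when a point has no reference point on its row or column; here such points are simply left out of the corresponding average, which is an assumption. The function names are hypothetical.

```python
def signed_nearest(value, candidates):
    """Signed offset from `value` to the closest candidate, or None if there is none."""
    if not candidates:
        return None
    return min((c - value for c in candidates), key=abs)


def translation_vector(points, reference):
    """Average signed nearest-point distances in the horizontal and vertical directions."""
    rows, cols = {}, {}
    for X, Y in reference:
        rows.setdefault(Y, []).append(X)  # reference x values found on each row
        cols.setdefault(X, []).append(Y)  # reference y values found on each column

    horizontal, vertical = [], []
    for x, y in points:
        o_i1 = signed_nearest(x, rows.get(y, []))  # positive: nearest point is to the right
        o_i2 = signed_nearest(y, cols.get(x, []))  # positive: nearest point is below
        if o_i1 is not None:
            horizontal.append(o_i1)
        if o_i2 is not None:
            vertical.append(o_i2)

    # Assumption: points without a match are excluded from the average.
    dx = sum(horizontal) / len(horizontal) if horizontal else 0.0
    dy = sum(vertical) / len(vertical) if vertical else 0.0
    return dx, dy


def translate_skeleton(points, reference):
    """Shift every point of the skeleton by the translation adjustment vector."""
    dx, dy = translation_vector(points, reference)
    return [(x + dx, y + dy) for x, y in points]


# Illustrative data: the skeleton sits two pixels to the left of the reference.
points = [(0, 0), (0, 1)]
reference = [(2, 0), (2, 1)]
shifted = translate_skeleton(points, reference)
```

Because a single average vector is applied to every point, the shape of the skeleton is preserved and only its position relative to the reference changes.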

In one embodiment, the step S140 includes: dividing the translated character skeleton to be recognized and the normalized reference character skeleton into grids of uniform size according to the pixel range, where the number of pixels within each grid is the same; for each grid that contains points of the translated character skeleton to be recognized, counting the number $n_i$ of points of the translated character skeleton to be recognized and the number $m_i$ of points of the normalized reference character skeleton in that grid; calculating the ratio of the number $m_i$ of points of the normalized reference character skeleton to the number $n_i$ of points of the translated character skeleton to be recognized; taking the smaller of the ratio and 1 as the grid score of that grid; and calculating the average of the grid scores of all grids that contain points of the translated character skeleton to be recognized, to obtain the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton.

For example, as shown in fig. 6, the translated character skeleton to be recognized and the normalized reference character skeleton are divided into 8 × 8 rectangular grids of uniform size. In the i-th grid, the number of skeleton points to be compared is $n_i$ and the number of reference skeleton points is $m_i$. The similarity between the translated character skeleton to be recognized and the normalized reference character skeleton is recorded as:

$$\text{Score} = \frac{1}{K} \sum_{i \in G} s_i;$$

where

$$s_i = \min\left( \frac{m_i}{n_i},\ 1 \right);$$

$$G = \{\, i : n_i > 0 \,\}, \qquad K = |G|.$$

The value of Score ranges from 0 to 1, and the higher the value of Score, the more similar the translated character skeleton to be recognized and the normalized reference character skeleton are. If there are several normalized reference character skeletons, the translated character skeleton to be recognized is compared with each of them to obtain the respective similarities, and the character corresponding to the normalized reference character skeleton with the highest similarity is selected as the recognition result.
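The grid-based similarity can be sketched in Python as follows, assuming both skeletons are lists of (x, y) points in a pixel range of n × n. Clamping points that fall outside the range after translation into the border cells is an assumption of this sketch, and the function name `grid_similarity` is hypothetical.

```python
from collections import Counter


def grid_similarity(points, reference, n=100, grid=8):
    """Average of min(m_i / n_i, 1) over the grid cells that contain skeleton points."""
    cell = n / grid

    def cell_of(point):
        x, y = point
        # Assumption: points outside the pixel range are clamped into border cells.
        col = min(max(int(x // cell), 0), grid - 1)
        row = min(max(int(y // cell), 0), grid - 1)
        return row, col

    n_counts = Counter(cell_of(p) for p in points)     # n_i per occupied cell
    m_counts = Counter(cell_of(p) for p in reference)  # m_i per cell
    if not n_counts:
        return 0.0
    scores = [min(m_counts[c] / n_i, 1.0) for c, n_i in n_counts.items()]
    return sum(scores) / len(scores)


# Illustrative data: one cell scores 1/2, the other scores 0.
points = [(5, 5), (6, 6), (50, 50)]
reference = [(5, 5), (90, 90)]
score = grid_similarity(points, reference)
```

Only cells occupied by the skeleton to be recognized enter the average, so reference points lying in other cells (such as (90, 90) above) do not affect the score.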

After the normalized reference character skeleton with the highest similarity is determined, the similarity between the translated character skeleton to be recognized and that normalized reference character skeleton can also be used as an indicator of whether the character in the character image to be recognized is written in a standard way. A threshold on the similarity (for example, 0.3) can be set; if the similarity is lower than the threshold, the character skeleton to be recognized is considered to differ considerably from the normalized reference character skeleton, that is, the character in the character image to be recognized is not sufficiently standard compared with the character in the reference character image. The degree of standardization of a character can therefore be assessed from the similarity, for example to automatically judge how well children copy characters in the early-education industry.
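The selection of the recognition result and the threshold check can be sketched in Python as follows. The sketch assumes the similarities have already been computed and are passed in as a mapping from candidate character to score; the function name `recognize` and the example scores are illustrative only.

```python
def recognize(similarities, threshold=0.3):
    """Return (character, similarity, is_standard) for the best-matching reference."""
    if not similarities:
        raise ValueError("at least one reference character is required")
    best = max(similarities, key=similarities.get)
    best_score = similarities[best]
    # A similarity below the threshold is treated as "not sufficiently standard".
    return best, best_score, best_score >= threshold


# Illustrative scores for two candidate reference characters.
result_good = recognize({"A": 0.82, "B": 0.41})
result_poor = recognize({"A": 0.2, "B": 0.1})
```

The threshold value 0.3 is the example given in the description; a deployment would presumably tune it for the characters and writers concerned.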

It should be understood that although the steps in fig. 1 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the order of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 7, there is provided a character recognition apparatus including: a skeleton extraction module 210, a normalization module 220, a translation module 230, a similarity calculation module 240, and a character acquisition module 250, wherein:

The skeleton extraction module 210 is configured to obtain a character skeleton to be recognized in the character image to be recognized and a reference character skeleton in the reference character image.

The normalization module 220 is configured to normalize the character skeleton to be recognized and the reference character skeleton to the same pixel range, so as to obtain a normalized character skeleton to be recognized and a normalized reference character skeleton.

The translation module 230 is configured to translate the position of each point in the normalized character skeleton to be recognized according to the distance between the point and the closest point in the normalized reference character skeleton, so as to obtain a translated character skeleton to be recognized.

The similarity calculation module 240 is configured to divide the translated character skeleton to be recognized and the normalized reference character skeleton into grids of uniform size, and to calculate the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton according to the number of points of the translated character skeleton to be recognized and the number of points of the normalized reference character skeleton in the grids.

The character acquisition module 250 is configured to determine the character corresponding to the reference character image with the largest similarity as the character in the character image to be recognized.

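In one embodiment, the culling module includes: a first bone length and number calculating unit for calculating the total bone length L1 and the number of bones N1 of the character skeleton to be recognized; a second bone length and number calculating unit for calculating the total bone length L2 and the number of bones N2 of the reference character skeleton; a first average length calculating unit for calculating the average bone length $\bar{L}_1 = L_1 / N_1$ of the character skeleton to be recognized according to the total bone length L1 and the number of bones N1; a second average length calculating unit for calculating the average bone length $\bar{L}_2 = L_2 / N_2$ of the reference character skeleton according to the total bone length L2 and the number of bones N2; a first culling unit for removing the bones of the character skeleton to be recognized whose length is less than 0.1 times the average bone length $\bar{L}_1$; and a second culling unit for removing the bones of the reference character skeleton whose length is less than 0.1 times the average bone length $\bar{L}_2$.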

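In one embodiment, the normalization module 220 includes: a first normalization unit for recording the coordinates of each pixel point in the character skeleton to be recognized as $(x_i, y_i)$, where the minimum x coordinate value among the pixel points is $x_{\min}$, the maximum x coordinate value is $x_{\max}$, the minimum y coordinate value is $y_{\min}$, and the maximum y coordinate value is $y_{\max}$, and for normalizing to a pixel range of N × N to obtain the coordinates $(x_i', y_i')$ of each pixel point in the normalized character skeleton to be recognized, where N is a positive integer and

$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \times N, \qquad y_i' = \frac{y_i - y_{\min}}{y_{\max} - y_{\min}} \times N;$$

and a second normalization unit for recording the coordinates of each pixel point in the reference character skeleton as $(X_j, Y_j)$, where the minimum X coordinate value among the pixel points is $X_{\min}$, the maximum X coordinate value is $X_{\max}$, the minimum Y coordinate value is $Y_{\min}$, and the maximum Y coordinate value is $Y_{\max}$, and for normalizing to the pixel range of N × N to obtain the coordinates $(X_j', Y_j')$ of each pixel point in the normalized reference character skeleton, where

$$X_j' = \frac{X_j - X_{\min}}{X_{\max} - X_{\min}} \times N, \qquad Y_j' = \frac{Y_j - Y_{\min}}{Y_{\max} - Y_{\min}} \times N.$$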

In one embodiment, the translation module 230 includes: a closest point distance obtaining unit for obtaining, for each point in the normalized character skeleton to be recognized, the distance $O_{i1}$ to the closest point of the normalized reference character skeleton in the horizontal-axis direction and the distance $O_{i2}$ to the closest point in the vertical-axis direction; an average closest point distance calculating unit for calculating, from the distances $O_{i1}$ and $O_{i2}$ of the points in the normalized character skeleton to be recognized, the average closest-point distance in the horizontal-axis direction and the average closest-point distance in the vertical-axis direction; and a translation unit for taking the average closest-point distance in the horizontal-axis direction and the average closest-point distance in the vertical-axis direction as the translation adjustment vector, and translating each point in the normalized character skeleton to be recognized to obtain the translated character skeleton to be recognized.

In one embodiment, the similarity calculation module 240 includes: a grid dividing unit for dividing the translated character skeleton to be recognized and the normalized reference character skeleton into grids of uniform size according to the pixel range, where the number of pixels within each grid is the same; a point number calculating unit for counting, for each grid that contains points of the translated character skeleton to be recognized, the number $n_i$ of points of the translated character skeleton to be recognized and the number $m_i$ of points of the normalized reference character skeleton in that grid; a ratio calculating unit for calculating the ratio of the number $m_i$ of points of the normalized reference character skeleton to the number $n_i$ of points of the translated character skeleton to be recognized; a grid score calculating unit for taking the smaller of the ratio and 1 as the grid score of that grid; and a similarity calculating unit for calculating the average of the grid scores of all grids that contain points of the translated character skeleton to be recognized, to obtain the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton.

For the specific limitations of the character recognition apparatus, reference may be made to the limitations of the character recognition method above, which are not repeated here. Each module in the character recognition apparatus described above may be implemented in whole or in part by software, by hardware, or by a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing reference character image data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a character recognition method.

Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the steps of the character recognition method of the above embodiments when executing the computer program.

In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon which, when executed by a processor, performs the steps of the character recognition method of the above embodiments.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.

The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this specification.

The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims

1. A method of character recognition, the method comprising:

translating the position of each point in the structured character skeleton to be recognized according to the distance between each point in the structured character skeleton to be recognized and the nearest point in the structured reference character skeleton to obtain a translated character skeleton to be recognized; wherein the distance between the point and the nearest point in the regular reference character skeleton is as follows: the average value of the distance between each point in the structured character skeleton to be recognized and the nearest point in the structured reference character skeleton;

determining the character corresponding to the reference character image with the maximum similarity as the character in the character image to be recognized;

wherein dividing the translated character skeleton to be recognized and the normalized reference character skeleton into grids of uniform size, and calculating the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton according to the number of points of the translated character skeleton to be recognized and the number of points of the normalized reference character skeleton in the grids, comprises:

dividing the translated character skeleton to be recognized and the normalized reference character skeleton into grids of uniform size according to the pixel range, the number of pixels within each grid being the same;

for each grid that contains points of the translated character skeleton to be recognized, counting the number $n_i$ of points of the translated character skeleton to be recognized and the number $m_i$ of points of the normalized reference character skeleton in that grid;

calculating the ratio of the number $m_i$ of points of the normalized reference character skeleton to the number $n_i$ of points of the translated character skeleton to be recognized;

taking the smaller of the ratio and 1 as the grid score of that grid;

and calculating the average of the grid scores of all grids that contain points of the translated character skeleton to be recognized, to obtain the similarity between the translated character skeleton to be recognized and the normalized reference character skeleton.

2. The method of claim 1, wherein the character skeleton comprises bones and bone nodes, a bone node being an intersection between bones;

before normalizing the character skeleton to be recognized and the reference character skeleton to the same pixel range to obtain the normalized character skeleton to be recognized and the normalized reference character skeleton, the method comprises:

removing the non-node bones and the single-node bones of the character skeleton to be recognized and of the reference character skeleton.

3. The method of claim 1, wherein the character skeleton comprises bones and bone nodes, a bone node being an intersection between bones;

removing, from the character skeleton to be recognized and from the reference character skeleton, the bones whose length is smaller than a preset value.

4. The method according to claim 3, wherein removing, from the character skeleton to be recognized and from the reference character skeleton, the bones whose length is smaller than a preset value comprises:

calculating the total bone length L1 and the number of bones N1 of the character skeleton to be recognized;

calculating the total bone length L2 and the number of bones N2 of the reference character skeleton;

calculating the average bone length of the character skeleton to be recognized according to the total bone length L1 and the number of bones N1:

$$\bar{L}_1 = \frac{L_1}{N_1};$$

calculating the average bone length of the reference character skeleton according to the total bone length L2 and the number of bones N2:

$$\bar{L}_2 = \frac{L_2}{N_2};$$

removing the bones of the character skeleton to be recognized whose length is less than 0.1 times the average bone length $\bar{L}_1$;

removing the bones of the reference character skeleton whose length is less than 0.1 times the average bone length $\bar{L}_2$.

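5. The method according to claim 1, wherein normalizing the character skeleton to be recognized and the reference character skeleton to the same pixel range to obtain a normalized character skeleton to be recognized and a normalized reference character skeleton comprises:

recording the coordinates of each pixel point in the character skeleton to be recognized as $(x_i, y_i)$, where the minimum x coordinate value among the pixel points is $x_{\min}$, the maximum x coordinate value is $x_{\max}$, the minimum y coordinate value is $y_{\min}$, and the maximum y coordinate value is $y_{\max}$; normalizing to a pixel range of N × N to obtain the coordinates $(x_i', y_i')$ of each pixel point in the normalized character skeleton to be recognized, where N is a positive integer and

$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \times N, \qquad y_i' = \frac{y_i - y_{\min}}{y_{\max} - y_{\min}} \times N;$$

recording the coordinates of each pixel point in the reference character skeleton as $(X_j, Y_j)$, where the minimum X coordinate value among the pixel points is $X_{\min}$, the maximum X coordinate value is $X_{\max}$, the minimum Y coordinate value is $Y_{\min}$, and the maximum Y coordinate value is $Y_{\max}$; normalizing to the pixel range of N × N to obtain the coordinates $(X_j', Y_j')$ of each pixel point in the normalized reference character skeleton, where

$$X_j' = \frac{X_j - X_{\min}}{X_{\max} - X_{\min}} \times N, \qquad Y_j' = \frac{Y_j - Y_{\min}}{Y_{\max} - Y_{\min}} \times N.$$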

6. The method according to claim 1, wherein translating the position of each point in the normalized character skeleton to be recognized according to the distance between each point in the normalized character skeleton to be recognized and the nearest point in the normalized reference character skeleton to obtain a translated character skeleton to be recognized comprises:

obtaining, for each point in the normalized character skeleton to be recognized, the distance $O_{i1}$ to the closest point of the normalized reference character skeleton in the horizontal-axis direction and the distance $O_{i2}$ to the closest point in the vertical-axis direction;

calculating, from the distances $O_{i1}$ and $O_{i2}$ of the points in the normalized character skeleton to be recognized, the average closest-point distance in the horizontal-axis direction and the average closest-point distance in the vertical-axis direction;

and taking the average closest-point distance in the horizontal-axis direction and the average closest-point distance in the vertical-axis direction as the translation adjustment vector, and translating each point in the normalized character skeleton to be recognized to obtain the translated character skeleton to be recognized.

7. An apparatus for character recognition, the apparatus comprising:

the translation module is used for translating the position of each point in the structured character skeleton to be recognized according to the distance between each point in the structured character skeleton to be recognized and the nearest point in the structured reference character skeleton to obtain a translated character skeleton to be recognized; wherein the distance between the point and the nearest point in the regular reference character skeleton is as follows: the average value of the distance between each point in the structured character skeleton to be recognized and the nearest point in the structured reference character skeleton;

the character acquisition module is used for determining the character corresponding to the reference character image with the maximum similarity as the character in the character image to be recognized;

8. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.

9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.