CN109325483B - Method and device for processing internal short pen section - Google Patents

Publication number
CN109325483B
Authority
CN
China
Prior art keywords
segments
skeleton
internal short
segment
pen
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811057035.3A
Other languages
Chinese (zh)
Other versions
CN109325483A (en
Inventor
安维华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING LANGUAGE AND CULTURE UNIVERSITY
Original Assignee
BEIJING LANGUAGE AND CULTURE UNIVERSITY
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/142Image acquisition using hand-held instruments; Constructional details of the instruments
    • G06V30/1423Image acquisition using hand-held instruments; Constructional details of the instruments the instrument generating sequences of position coordinates corresponding to handwriting
    • G06T5/70

Abstract

The invention discloses a method and a device for processing an internal short stroke section. The method comprises the following steps: deleting the internal short stroke section under the condition that the skeleton stroke section of the Chinese character is judged to be the internal short stroke section, wherein the internal short stroke section is the skeleton stroke section of which two ends are respectively provided with two adjacent stroke sections; deleting skeleton pen segments with preset lengths from skeleton pen segments adjacent to the internal short pen segments to obtain partial adjacent pen segments, wherein the skeleton pen segments with the preset lengths are partial skeleton pen segments within a preset threshold range of the internal short pen segments; and smoothly connecting the partial adjacent stroke sections to form a new skeleton stroke section, wherein the new skeleton stroke section is matched with the central line of the Chinese character binary image. The invention solves the technical problem that the ambiguous distortion of the stroke segment cannot be automatically processed in the disambiguation process of the Chinese character stroke segment in the prior art.

Description

Method and device for processing internal short pen section
Technical Field
The invention relates to the technical field of computer application, in particular to a method and a device for processing an internal short stroke.
Background
The works of famous calligraphers of past generations are all static, and most calligraphy copybooks on the market are static as well. However, the dynamic process of writing Chinese characters has clear value for calligraphy appreciation, writing instruction, and similar uses. It is therefore necessary to dynamically restore the writing process of Chinese characters from static copybooks.
Two key problems must be solved to dynamically restore the writing process of static copybook Chinese characters: extracting the skeleton stroke segments of the copybook Chinese characters, and disambiguating and ordering the resulting sequence of skeleton stroke segments. Many methods already exist for extracting skeleton stroke segments of Chinese characters, for example contour methods, mathematical morphology methods, segmentation methods, region decomposition methods, fuzzy region detection methods, thinning-based methods, distance-based methods, direction-run-length-based methods, coding-based methods, and neural-network-based methods. These methods were mostly proposed for recognizing printed Chinese characters, so they are strongly limited in the Chinese characters they can handle and place strong constraints on the size of the character set and on the visual appearance of the glyphs.
Comparatively little research addresses the disambiguation and ordering of Chinese character strokes. Three approaches are mainly used. First, the stroke order of a Chinese character is generated from defined rules and corrected by computing its similarity to the stroke order in a standard template; this approach has difficulty distinguishing similar characters that share the same stroke order, so its discriminating power is low. Second, a method for reconstructing the writing order of handwritten digits recovers the writing trajectory by searching for a minimum-cost Hamiltonian path; it works only for low-complexity character sets such as digits and does not extend to the varied strokes of Chinese fonts. Third, a stroke disambiguation rule is established for each Chinese character; this approach cannot automatically identify and process ambiguous stroke distortion, requires a large amount of data, adapts poorly, and cannot eliminate ambiguities that the rules do not cover.
Aiming at the problem that the prior art cannot automatically process the ambiguous distortion of the stroke segment in the disambiguation of the Chinese character stroke segment, an effective solution is not provided at present.
Disclosure of Invention
The embodiment of the invention provides a method and a device for processing an internal short stroke segment, which at least solve the technical problem that the ambiguous distortion of the stroke segment cannot be automatically processed in the disambiguation process of the Chinese character stroke segment in the prior art.
According to an aspect of the embodiments of the present invention, there is provided a method for processing an internal short segment, including: deleting the internal short stroke section under the condition that the skeleton stroke section of the Chinese character is judged to be the internal short stroke section, wherein the internal short stroke section is the skeleton stroke section of which two ends of the stroke section are respectively provided with two adjacent stroke sections; deleting skeleton pen segments with preset lengths from other skeleton pen segments adjacent to the internal short pen segment to obtain partial adjacent pen segments, wherein the skeleton pen segments with the preset lengths are partial skeleton pen segments within a preset threshold range of the internal short pen segment; and smoothly connecting the partial adjacent stroke segments to form a new skeleton stroke segment, wherein the new skeleton stroke segment is matched with the central line of the Chinese character binary image.
Further, deleting skeleton segments with preset lengths from other skeleton segments adjacent to the internal short segment to obtain partial adjacent segments, wherein the deleting step comprises the following steps: judging the number of all internal short pen sections communicated with the internal short pen sections; and determining the preset threshold value according to the number of the connected internal short pen segments.
Further, when the number of all internal short segments communicated with the internal short segments is judged to be one, the predetermined threshold is determined according to the length of the internal short segments.
Further, in the case that it is determined that the number of all internal short segments communicating with the internal short segment is not one, the predetermined threshold is determined according to the average width of the internal short segments.
Further, smoothly connecting the part of adjacent segments to form a new skeleton segment comprises: judging the included angle of the end tangent vectors of any two adjacent segments; and if the included angle of the tangent vector accords with a preset angle, smoothly connecting the adjacent segments of the two parts to form a new skeleton segment.
Further, smoothly connecting the part of adjacent segments to form a new skeleton segment comprises: judging the included angle of the end tangent vectors of any two adjacent segments; and if the included angle of the tangent vector does not accord with a preset angle, extending the part of the adjacent pen segments to the position intersected with the internal short pen segment.
Further, smoothly connecting the part of adjacent segments to form a new skeleton segment comprises: judging whether any two partially adjacent pen segments can be fitted into a smooth line segment without an inflection point; if so, smoothly connecting the two adjacent partial pen segments to form a new skeleton pen segment.
Further, smoothly connecting the part of adjacent segments to form a new skeleton segment comprises: judging whether any two partially adjacent pen segments can be fitted into a smooth line segment without an inflection point; and if not, extending the two partially adjacent pen segments to a position intersected with a preset straight line, wherein the preset straight line is a straight line corresponding to the center point coordinate of the bounding box of the internal short pen segment.
According to an aspect of the embodiments of the present invention, there is provided an internal short segment processing apparatus, including: the first deleting module is used for deleting the internal short stroke section under the condition that the skeleton stroke section of the Chinese character is judged to be the internal short stroke section, wherein the internal short stroke section is the skeleton stroke section of which two ends of the stroke section are respectively provided with two adjacent stroke sections; the second deleting module is used for deleting skeleton pen segments with preset lengths from other skeleton pen segments adjacent to the internal short pen segment to obtain partial adjacent pen segments, wherein the skeleton pen segments with the preset lengths are partial skeleton pen segments within a preset threshold range of the internal short pen segment; and the construction module is used for smoothly connecting the partial adjacent stroke sections to form a new skeleton stroke section, wherein the new skeleton stroke section is matched with the central line of the Chinese character binary image.
According to an aspect of an embodiment of the present invention, there is provided a storage medium characterized in that the storage medium includes a stored program, wherein the program performs the above-described method.
In the embodiment of the invention, the internal short segment is deleted when a skeleton segment of the Chinese character is judged to be an internal short segment, wherein an internal short segment is a skeleton segment whose two ends are each provided with two adjacent segments; skeleton segments of a preset length are deleted from the other skeleton segments adjacent to the internal short segment to obtain partial adjacent segments, wherein the skeleton segments of the preset length are the portions of skeleton segments lying within a preset threshold range of the internal short segment; and the partial adjacent segments are smoothly connected to form a new skeleton segment that matches the center line of the Chinese character binary image. This solves the technical problem that the prior art cannot automatically process ambiguous segment distortion during disambiguation of Chinese character segments: when a segment is detected to be an internal short segment, it can be deleted automatically and its neighbors smoothly connected at a suitable position, so that subsequent ordering and other processing run more smoothly.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of a method of processing internal short segments according to an embodiment of the present invention;
FIG. 2 is a schematic view of an internal short segment handling device according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating an optional collected copybook Chinese character image according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating another optional collected copybook Chinese character image according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating binarization results of a copybook Chinese character image according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating a result of thinning a Chinese character image of a copybook according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating the classification of pixel points in a Chinese character skeleton according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of extracted skeleton segments according to an embodiment of the invention;
FIG. 9 is a schematic diagram of skeleton segment classification according to an embodiment of the invention;
FIG. 10 is a flow diagram of a "glitch determination module" according to an embodiment of the present invention;
FIG. 11 is a diagram illustrating an example of identification results of skeleton segments of burr type according to an embodiment of the present invention;
FIG. 12 is a flow chart of an "internal short segment determination module" according to an embodiment of the present invention;
FIG. 13 is a first schematic diagram of triangle rule according to an embodiment of the present invention;
FIG. 14 is a second schematic diagram of triangle rule according to an embodiment of the present invention;
FIG. 15 is a third schematic diagram of triangle rule according to an embodiment of the present invention;
FIG. 16 is a fourth schematic diagram of triangle rules according to an embodiment of the present invention;
FIG. 17 is a fifth schematic diagram of triangle rules according to an embodiment of the present invention;
FIG. 18 is a sixth schematic of triangle rule according to an embodiment of the present invention;
FIG. 19 is a schematic illustration of an internal short segment type according to an embodiment of the present invention;
FIG. 20 is a graph comparing processing effects when only one internal short segment is included in a packet according to an embodiment of the present invention;
FIG. 21 is a graph comparing processing effects when a plurality of internal short segments are included in a packet according to an embodiment of the present invention;
FIG. 22 is a schematic diagram of a coordinate system according to an embodiment of the invention;
FIG. 23 is a schematic diagram of a spur and its direction vector during a "stroke break spur" disambiguation process according to an embodiment of the invention;
FIG. 24 is a schematic diagram of skeleton segment end points during a "stroke break spur" disambiguation process in accordance with an embodiment of the present invention;
FIG. 25 is a schematic diagram of a connection point during a "stroke break spur" disambiguation process in accordance with an embodiment of the present invention;
FIG. 26 is a schematic diagram of new segment generation during the "stroke break spur" disambiguation process according to an embodiment of the invention;
FIG. 27 is a cross-sectional diagram of a stroke break spur process in accordance with an embodiment of the present invention;
FIG. 28 is a schematic diagram of new end points of skeleton segments during a "stroke adhesion burr" disambiguation process according to an embodiment of the invention;
FIG. 29 is a schematic diagram of key points during a "stroke stuck burr" disambiguation process according to an embodiment of the invention;
FIG. 30 is a schematic illustration of a comparison of line fits during a "stroke stuck burr" disambiguation process according to an embodiment of the invention;
FIG. 31 is a diagram illustrating the results of new stroke segment generation during the "stroke sticky burr" disambiguation process according to an embodiment of the present invention;
FIG. 32 is a comparison graph of "stroke stick burr" processing before and after;
FIG. 33 is a comparison of trifurcation before and after adjustment according to an embodiment of the present invention;
FIG. 34 is a schematic diagram of a dynamic reproduction and results of a Song writing process according to an embodiment of the present invention;
FIG. 35 is a diagram illustrating glyph structure information of a standard word "Song" according to an embodiment of the present invention;
FIG. 36 is a diagram illustrating sample point information of a standard word "Song" according to an embodiment of the present invention;
FIG. 37 is a schematic view of the collection results of "Song" according to an embodiment of the present invention;
FIG. 38 is a flow chart of a binarization algorithm flow according to an embodiment of the present invention;
FIG. 39 is a diagram of a template for smoothing handwriting edge pixels, according to an embodiment of the invention;
FIG. 40 is a schematic representation of the results of pre-processing of "Song" digital images according to embodiments of the present invention;
FIG. 41 is a schematic diagram of a refined skeleton result of "Song" words according to an embodiment of the present invention;
FIG. 42 is a diagram illustrating the result of the skeleton extraction of the shape of the Chinese character "Song" according to an embodiment of the present invention;
FIG. 43 is a diagram illustrating the classification result of skeleton segments of "Song" font according to an embodiment of the present invention;
FIG. 44 is a schematic diagram of processing of short segments within the "Song" word according to an embodiment of the present invention;
FIG. 45 is a graph showing the effect of processing short segments within the "Song" font according to the embodiment of the present invention;
FIG. 46 is a schematic diagram of a process for stroke sticking burrs according to an embodiment of the present invention;
FIG. 47 is a process diagram of stroke break burrs according to an embodiment of the present invention;
FIG. 48 is a graph showing a comparison of treatment effects of the brush segments of the Song's line according to the embodiment of the present invention;
FIG. 49 is a schematic view of processing the triple point of the long segment in the "Song" font according to an embodiment of the present invention;
FIG. 50 is a graph comparing the effect of processing long segments of "Song" according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In accordance with an embodiment of the present invention, a method embodiment for processing internal short segments is provided. It should be noted that the steps illustrated in the flowchart of the figure may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in a different order.
FIG. 1 is a flowchart of a method for processing an internal short segment according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
step S102, deleting the internal short segment when a skeleton segment of the Chinese character is judged to be an internal short segment, wherein an internal short segment is a skeleton segment whose two ends are each provided with two adjacent segments;
step S104, deleting skeleton segments of a preset length from the other skeleton segments adjacent to the internal short segment to obtain partial adjacent segments, wherein the skeleton segments of the preset length are the portions of skeleton segments lying within a preset threshold range of the internal short segment;
step S106, smoothly connecting the partial adjacent segments to form a new skeleton segment, wherein the new skeleton segment matches the center line of the Chinese character binary image.
When a skeleton segment is judged to be of the internal short segment type, it needs disambiguation, and the prior art offers no disambiguation method for this type. In this embodiment, deleting the internal short segment introduces no new errors or deformities: the deleted portion is reconnected as a smooth segment that matches the center line of the Chinese character binary image, and the connectivity of the correct original segments (those corresponding to the standard character segments in the standard character library) is preserved. This solves the technical problem that the prior art cannot automatically process ambiguous segment distortion during disambiguation of Chinese character segments. When a segment is detected to be an internal short segment, it can be deleted automatically and its neighbors smoothly connected at a suitable position, so that subsequent ordering and other processing run more smoothly and the dynamic restoration effect remains faithful.
During disambiguation, internal short segments may be divided into two categories that are eliminated differently. The classification may be performed before skeleton segments of a preset length are deleted from the other skeleton segments adjacent to the internal short segment to obtain the partial adjacent segments. In an optional embodiment, the number of all internal short segments connected with the internal short segment is determined, and the preset threshold is determined according to that number.
In the case that the number of all internal short segments communicated with the internal short segment is judged to be one, in an alternative embodiment, the predetermined threshold may be determined according to the length of the internal short segment.
In the case that the number of all internal short segments communicated with the internal short segment is judged not to be one, in an alternative embodiment, the predetermined threshold may be determined according to an average width of the internal short segments.
Using a different threshold calculation for each type of internal short segment makes the method more discriminating and the disambiguation more accurate.
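The threshold selection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `ShortSegment` type, its `stroke_width` field, and the choice to use the length or the average width directly (without a scaling factor) are assumptions, since the text only says which quantity the threshold is derived from.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShortSegment:
    """Hypothetical record for one internal short segment."""
    length: float        # arc length of the skeleton segment, in pixels
    stroke_width: float  # assumed: local stroke width measured on the binary image


def trim_threshold(group: List[ShortSegment]) -> float:
    """Return the trimming threshold for a group of connected internal short segments."""
    if not group:
        raise ValueError("group must contain at least one internal short segment")
    if len(group) == 1:
        # Exactly one internal short segment: derive the threshold from its length.
        return group[0].length
    # Several connected internal short segments: derive it from their average width.
    return sum(s.stroke_width for s in group) / len(group)
```

A real implementation would probably scale these values by an empirically chosen factor; the text does not say.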
When the number of internal short segments connected with the internal short segment is judged to be one, whether the partial adjacent segments are smoothly connected into a new skeleton segment is decided by an included angle. In an optional embodiment, the included angle between the end-point tangent vectors of any two partial adjacent segments is determined; if the included angle satisfies a preset angle condition, the two partial adjacent segments are smoothly connected to form a new skeleton segment.
In an optional embodiment, the smoothly connecting the part of the adjacent segments to form a new skeleton segment includes: judging the included angle of the end tangent vectors of any two adjacent segments; and if the included angle of the tangent vector does not accord with a preset angle, extending the part of the adjacent pen segments to the position intersected with the internal short pen segment.
By the method, the stroke intersection condition in the static Chinese character can be processed more accurately. The method can effectively identify and connect the intersecting strokes, and enables the intersecting strokes to be smoothly and accurately fitted with the central line of the original Chinese character handwriting.
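The tangent-angle test can be sketched as below. The preset angle is not specified in the text, so this sketch assumes that two segments continue the same stroke when their end tangents, each pointing toward the gap, are nearly opposite; the function names, the 30-degree tolerance, and the 3-step tangent estimate are all illustrative assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def end_tangent(points: Sequence[Point], k: int = 3) -> Point:
    """Unit tangent at the last point of a polyline, estimated over its last k steps."""
    if len(points) < 2:
        raise ValueError("need at least two points to estimate a tangent")
    k = min(k, len(points) - 1)
    (x0, y0), (x1, y1) = points[-1 - k], points[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = math.hypot(dx, dy)
    if norm == 0:
        raise ValueError("degenerate segment end")
    return (dx / norm, dy / norm)


def tangent_angle_deg(a: Point, b: Point) -> float:
    """Included angle between two unit vectors, in degrees."""
    dot = max(-1.0, min(1.0, a[0] * b[0] + a[1] * b[1]))
    return math.degrees(math.acos(dot))


def can_connect_smoothly(seg_a: Sequence[Point], seg_b: Sequence[Point],
                         max_deviation_deg: float = 30.0) -> bool:
    """Both segments are ordered so that their last point is the trimmed end facing the gap."""
    angle = tangent_angle_deg(end_tangent(seg_a), end_tangent(seg_b))
    # Assumed criterion: tangents pointing at each other (angle near 180 degrees).
    return abs(180.0 - angle) <= max_deviation_deg
```

Pairs that fail the test would then be extended to the internal short segment instead of being joined to each other, as the preceding paragraphs describe.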
When the number of internal short segments connected with the internal short segment is judged not to be one, the partial adjacent segments are smoothly connected to form a new skeleton segment as follows. In an optional embodiment, it is first determined whether any two partial adjacent segments can be fitted into a smooth line segment without an inflection point; if so, the two partial adjacent segments are smoothly connected to form a new skeleton segment. If two partial adjacent segments cannot be fitted into a smooth line segment without an inflection point, in an optional embodiment they are extended to the position where they intersect a preset straight line, wherein the preset straight line is a straight line corresponding to the center point coordinates of the bounding box of the internal short segment.
By the method, the condition that 3 strokes or more in static Chinese characters are jointed together can be more accurately processed. Aiming at the condition, the method can effectively identify and connect the intersected strokes, and simultaneously properly prolong the connected strokes, thereby finally obtaining a series of stroke segments which can accurately fit the central line of the original Chinese character handwriting.
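One possible reading of "a smooth line segment without an inflection point" is that the joined curve never changes its turning direction. The sketch below checks that on sampled points; it is an assumption about the criterion, not the fitting procedure of the patent, and it expects smoothed or subsampled coordinates rather than raw pixel chains, whose staircase pattern would produce spurious sign changes.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def turning_signs(points: Sequence[Point]) -> List[int]:
    """Sign of the turn (+1 or -1) at each interior vertex; straight vertices are skipped."""
    signs = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if cross > 0:
            signs.append(1)
        elif cross < 0:
            signs.append(-1)
    return signs


def joins_without_inflection(seg_a: Sequence[Point], seg_b: Sequence[Point]) -> bool:
    """seg_a ends where the gap begins; seg_b starts where the gap ends."""
    signs = turning_signs(list(seg_a) + list(seg_b))
    # No inflection when the curve only ever turns one way (or not at all).
    return len(set(signs)) <= 1
```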
All the above steps are described below by way of example in connection with an alternative embodiment:
First, collecting a copybook Chinese character image as the Chinese character to be processed: copybook Chinese characters by different authors and in different fonts available on the market are converted into digital images with image acquisition equipment such as a scanner or a camera for subsequent processing. Digital images collected from two Chinese character copybooks are shown in FIGS. 3 and 4.
Second, binarizing the copybook Chinese character image. This step removes noise from the copybook Chinese character image and separates the foreground Chinese character region from the background region. Specifically, the copybook Chinese character image is converted into a binary image containing only black and white, as shown in FIG. 5. The binarization comprises the following steps: converting the copybook Chinese character image into a grayscale image, which removes the color information; converting the grayscale image into a binary image, in which the foreground color is black and represents the Chinese character; and denoising the binary image, which removes isolated noise points and smooths the edges of the foreground Chinese character.
Third, thinning the binarized image. A thinning operation is performed on the binary image obtained in step two to obtain a Chinese character skeleton image of single-pixel width. The Chinese character skeleton should coincide with the center line of the binary image as closely as possible. As shown in FIG. 6, the black region is the Chinese character writing area, and the white line in the middle is the Chinese character skeleton.
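The binarization step can be illustrated with the pure-Python sketch below. It is a simplified stand-in: the fixed threshold of 128, the BT.601 luma weights, and the "no foreground neighbor" definition of an isolated noise point are assumptions, and the edge smoothing mentioned above is omitted.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def to_gray(image: List[List[RGB]]) -> List[List[int]]:
    """Drop color information using ITU-R BT.601 luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in image]


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Return 1 for foreground (dark ink) and 0 for background."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def remove_isolated(binary: List[List[int]]) -> List[List[int]]:
    """Clear foreground pixels that have no foreground pixel among their 8 neighbors."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    out = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                out[y][x] = 0
    return out
```

Here foreground is stored as 1 rather than as the color black; only the labeling convention differs from the description.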
Fourth, extracting the skeleton segments:
The skeleton segments of the copybook Chinese character are extracted from the result of step three. The extraction comprises the following two steps:
(1) Traverse all pixel points on the skeleton. For a skeleton pixel P, make the following determination: if the number of other skeleton pixels in its 8-connected neighbourhood is 2, P is marked as an internal point of a skeleton segment; otherwise, P is marked as an end point of a skeleton segment. Fig. 7 shows four enlarged local regions of the skeleton image. In local regions A and B, black pixels represent internal points of a skeleton segment, and diagonally shaded pixels represent end points. According to this rule, the skeleton image can be represented as a set S containing n skeleton segments: S = {s1, s2, …, sn}, where si = {u, w1, w2, …, wj, …, v}, i = 1, 2, …, n. Here, u and v belong to the set of end points, and the wj are the internal points between u and v, listed in order so that consecutive points are adjacent.
(2) The skeleton segments obtained in the previous step are further subdivided as follows: traverse each skeleton segment si, find all inflection points in it, and split the segment at those points. For example, in local region C of fig. 7 the diagonally shaded pixel is an inflection point, which divides the original skeleton segment into two parts. Local region D of fig. 7 contains no inflection point; its pixels are all internal points of the skeleton segment. These two steps yield the list of all skeleton segments to be processed. Fig. 8 shows the final result of processing fig. 6: the shaded dots represent all end points of the skeleton segments, and two adjacent shaded dots together with the white line between them represent one skeleton segment.
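The end-point rule of step (1) can be sketched as follows. This is a minimal illustration, assuming the skeleton is available as a set of (row, column) pixel coordinates; the function name `classify_skeleton_pixels` and the small Y-shaped test skeleton are hypothetical and not part of the original method description.

```python
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]  # (row, column); this coordinate layout is an assumption

# Offsets of the 8-connected neighbourhood around a pixel.
NEIGHBOURS_8 = [(-1, -1), (-1, 0), (-1, 1),
                (0, -1),           (0, 1),
                (1, -1),  (1, 0),  (1, 1)]


def classify_skeleton_pixels(skeleton: Set[Pixel]) -> Dict[Pixel, str]:
    """Label each skeleton pixel as 'internal' (exactly 2 skeleton
    neighbours in its 8-neighbourhood) or 'endpoint' (any other count)."""
    labels: Dict[Pixel, str] = {}
    for (r, c) in skeleton:
        count = sum((r + dr, c + dc) in skeleton for dr, dc in NEIGHBOURS_8)
        labels[(r, c)] = "internal" if count == 2 else "endpoint"
    return labels


# Hypothetical Y-shaped skeleton: two diagonal arms and one vertical arm
# meeting at (2, 2).
skeleton = {(0, 0), (1, 1), (0, 4), (1, 3), (2, 2), (3, 2), (4, 2)}
labels = classify_skeleton_pixels(skeleton)
```

In this example the three arm tips and the junction pixel are labelled as end points, and the pixels between them as internal points; tracing from end point to end point through internal points would then give the segments si.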
Fifth, disambiguating the different types of skeleton segments:
The skeleton segments obtained in step four are not consistent with the standard character segments in the standard character library, for two reasons: (1) the thinning operation in step three distorts the skeleton where strokes intersect, turn, or overlap; (2) the skeleton is over-segmented at stroke junctions. As a result, the skeleton segments are divided too finely to be matched to the standard character segments, so the distortion must be removed. The output of step four is processed as follows: all skeleton segments are classified, and a processing rule is designed for each type to eliminate its distortion. The goal of this step is to eliminate all distortion in the skeleton segments and to obtain a segment list consistent with the standard character segments, without altering the valid information of the Chinese character skeleton.
I. Classifying the skeleton segments:
In this step, all skeleton segments are divided into three types: burr type, internal short segment type, and long segment type. The classification method is shown in fig. 9, a flowchart of the overall procedure that contains three modules; the detailed algorithm of the "burr determination module" is shown in fig. 10, and that of the "internal short segment determination module" in fig. 11. The specific steps are: traverse the skeleton segment list; for an unclassified skeleton segment s, if its adjacency attribute is (0,2) or (2,0), pass it to the "burr determination module"; if its adjacency attribute is (2,2), pass it to the "internal short segment determination module"; in all other cases, namely (0,0), (0,1), (1,0), (1,1), (1,2), and (2,1), mark it directly as "long segment type". The "burr determination module" in fig. 9 determines whether the skeleton segment s (that is, si in the foregoing text; the subscript is dropped when a single segment is under consideration) is of the burr type. The specific flow is shown in fig. 10 and is as follows: if s satisfies either of the following two conditions, it is marked as burr type; otherwise it is marked as long segment type.
1) The length of the skeleton segment s is less than the threshold W × α1 (W is the average stroke width of the Chinese character, and α1 is a real number in [0, 1]);
2) The length of the skeleton segment s is greater than or equal to the threshold W × α1 and less than the threshold W × β1 (β1 is a real number in [1.0, 1.5]), and the width of s is greater than the threshold W × γ1 (γ1 is a real number greater than 1).
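The two burr conditions can be written as a small predicate. The sketch below is illustrative only: the function name `is_burr` and the default values chosen for α1, β1, and γ1 are assumptions picked from the ranges stated above, not values given in the text.

```python
def is_burr(length: float, width: float, W: float,
            alpha1: float = 0.5, beta1: float = 1.2, gamma1: float = 1.5) -> bool:
    """Return True if a segment with adjacency attribute (0,2) or (2,0)
    should be marked as burr type.

    length -- path length of the skeleton segment s
    width  -- stroke width measured at s
    W      -- average stroke width of the character
    alpha1, beta1, gamma1 -- thresholds; the defaults are assumed values
    """
    # Condition 1: the segment is very short.
    if length < W * alpha1:
        return True
    # Condition 2: the segment is moderately short and unusually wide.
    if W * alpha1 <= length < W * beta1 and width > W * gamma1:
        return True
    return False
```

A segment for which this returns False would be marked as long segment type, as described above.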
Fig. 11 gives an example of the recognition result of the "burr determination module", in which the skeleton segments circled with dotted lines are marked as "burr type". The "internal short segment determination module" in fig. 9 determines whether the skeleton segment s is of the internal short segment type. The specific flow is shown in fig. 12 and is as follows: if the skeleton segment s satisfies any one of the following conditions, it is marked as internal short segment type; otherwise, it is marked as long segment type.
1) The length of s is less than the threshold W × α2 (α2 is a real number in [0, 1]);
2) The length of s is greater than or equal to the threshold W × α2 but less than the threshold W × β2 (β2 is a real number greater than 1), and its width is greater than the threshold W × γ2 (γ2 is a real number greater than 1);
3) The skeleton segment s and its adjacent segments satisfy the triangle rule.
Whether the "triangle rule" in fig. 12 is satisfied is determined as follows:
1) For the skeleton segment s to be processed, let its path length be Ls.
2) Starting from one end point P of s, perform a depth-first traversal of the adjacent skeleton segments and find the set of all points whose path distance from P is Ls:
Figure BDA0001796054350000081
3) Starting from the other end point Q of s, perform a depth-first traversal of the adjacent skeleton segments and find the set of all points whose path distance from Q is Ls:
Figure BDA0001796054350000082
4) Traverse the following triangles:
Figure BDA0001796054350000091
For the binary image from step two, if every pixel inside all of these triangles is a foreground pixel (a black pixel of the binary image, that is, part of the handwriting), the skeleton segment s satisfies the triangle rule; otherwise, it does not.
The triangle rule is illustrated in fig. 13: the end points of the skeleton segment s to be processed are P and Q, and its length is Ls; the set of points at path distance Ls from P is {E1, E2}; the set of points at path distance Ls from Q is {E3, E4, E5}. The triangles to be traversed are ΔPQE1 (fig. 14), ΔPQE2 (fig. 15), ΔPQE3 (fig. 16), ΔPQE4 (fig. 17), and ΔPQE5 (fig. 18). Fig. 19 shows the recognition result of the "internal short segment determination module", in which the skeleton segments circled with dotted lines are marked as "internal short segment type".
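The foreground test at the heart of the triangle rule can be sketched as below. The sketch assumes that the point sets at path distance Ls from P and Q have already been found, that the binary image is a list of rows with truthy values for foreground pixels, and that points are (x, y) pairs; it also treats pixels on the triangle edges as part of the triangle. All names are hypothetical.

```python
from typing import Iterable, Sequence, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates (an assumed convention)


def _cross(o: Point, u: Point, v: Point) -> int:
    """z-component of the cross product (u - o) x (v - o)."""
    return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])


def point_in_triangle(p: Point, a: Point, b: Point, c: Point) -> bool:
    """True if p lies inside or on the edges of triangle abc."""
    d1, d2, d3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def triangle_is_foreground(image: Sequence[Sequence[int]],
                           a: Point, b: Point, c: Point) -> bool:
    """True if every pixel covered by triangle abc is a foreground pixel."""
    xs, ys = (a[0], b[0], c[0]), (a[1], b[1], c[1])
    for y in range(min(ys), max(ys) + 1):
        for x in range(min(xs), max(xs) + 1):
            if point_in_triangle((x, y), a, b, c) and not image[y][x]:
                return False
    return True


def satisfies_triangle_rule(image: Sequence[Sequence[int]], P: Point, Q: Point,
                            points_from_P: Iterable[Point],
                            points_from_Q: Iterable[Point]) -> bool:
    """Check every triangle P-Q-E for E in either point set."""
    candidates = list(points_from_P) + list(points_from_Q)
    return all(triangle_is_foreground(image, P, Q, E) for E in candidates)


# Hypothetical 6x6 all-foreground image.
image = [[1] * 6 for _ in range(6)]
ok = satisfies_triangle_rule(image, (1, 1), (4, 1), [(2, 4)], [(4, 4)])
```

Clearing a single pixel inside any of the triangles would make the check fail, which corresponds to the case where the region between the segments is not fully covered by ink.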
II. Removing distortion from each type of skeleton segment:
The general idea for handling the three types of skeleton segments is as follows:
The thinning operation in the steps above has the following limitation: where skeleton segments adjoin one another, the skeleton deviates severely from the center line of the original handwriting, which leaves flaws in the final dynamic restoration. Therefore, after the burr segments and internal short segments are deleted, a small portion of each adjacent segment (its length controlled by a threshold) is also deleted, and the remaining segments are then joined smoothly with a parametric curve from computer graphics (the Hermite curve). This ensures that the final skeleton segments are smooth and coincide with the center line of the handwriting, so that the dynamic restoration looks realistic.
(I) Processing the internal short segment type
The internal short segments are grouped according to their adjacency. The grouping rule is as follows: for an internal short segment si, traverse the other internal short segments adjacent to it; if an internal short segment sj can be reached after a series of depth-first traversal steps, si and sj belong to the same group. Let G denote the set of internal short segments in one group; si and sj are elements of this set. Within G, the internal short segments are connected to one another through adjacency, so the number of internal short segments in G may be 1, 2, 3, and so on.
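The grouping rule amounts to finding connected components among the internal short segments. A minimal sketch follows, assuming the adjacency between internal short segments is given as a dictionary from segment identifier to the set of adjacent identifiers; the identifiers and the function name are hypothetical.

```python
from typing import Dict, Hashable, List, Set


def group_internal_short_segments(
        adjacency: Dict[Hashable, Set[Hashable]]) -> List[Set[Hashable]]:
    """Group internal short segments that can reach one another through
    a chain of adjacent internal short segments (depth-first traversal)."""
    groups: List[Set[Hashable]] = []
    visited: Set[Hashable] = set()
    for start in adjacency:
        if start in visited:
            continue
        group: Set[Hashable] = set()
        stack = [start]
        while stack:
            seg = stack.pop()
            if seg in visited:
                continue
            visited.add(seg)
            group.add(seg)
            # Continue the traversal through unvisited neighbours.
            stack.extend(n for n in adjacency.get(seg, ()) if n not in visited)
        groups.append(group)
    return groups


# Hypothetical example: s1-s2-s3 form a chain, s4 is isolated.
adjacency = {"s1": {"s2"}, "s2": {"s1", "s3"}, "s3": {"s2"}, "s4": set()}
groups = group_internal_short_segments(adjacency)
```

Each returned set plays the role of one group G; its size decides which of the two processing procedures described next applies.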
For a group G containing exactly one element, the processing is as follows:
(1) Let LG be the length of the single internal short segment in group G, and let the set of all other skeleton segments adjacent to G be
Figure BDA0001796054350000092
(2) For each skeleton segment
Figure BDA0001796054350000093
delete the part of its skeleton that is adjacent to G, over a length of LG × αG (αG is a scaling factor); (3) traverse every pair of skeleton segments in SG,
Figure BDA0001796054350000094
and
Figure BDA0001796054350000095
and, if the angle between their end tangent vectors is greater than θ (θ is a real number in [90°, 180°]), connect them with a Hermite curve and thereby merge them into one skeleton segment; (4) extend the skeleton segments in SG that cannot be merged until they intersect G.
For a group G containing more than one element, the processing is as follows:
(1) Let the set of all skeleton segments adjacent to G be
Figure BDA0001796054350000096
(2) Compute the coordinates (m, n) of the center of the bounding box of group G; (3) for each skeleton segment
Figure BDA0001796054350000097
delete the part of its skeleton that is adjacent to G, over a length of W × βG (βG is a scaling factor, and W is the average stroke width of the Chinese character); (4) traverse every pair of skeleton segments in SG,
Figure BDA0001796054350000098
and
Figure BDA0001796054350000099
and, if they can be fitted to a smooth line without inflection points, connect them with a Hermite curve and merge them into one skeleton segment; (5) extend the remaining skeleton segments in SG that cannot be merged until they intersect the straight line x = n or y = m. Fig. 20 compares the result before and after processing when the internal short segment group has only one element; fig. 21 does the same when the group has more than one element. In each left-hand image, the dotted circle marks the internal short segments. In fig. 20 there is only one internal short segment, which forms a group by itself; the algorithm produces the right-hand image, in which the internal short segment has been deleted and the other segments have been connected and merged wherever possible. In fig. 21 there are three internal short segments, which belong to one group; the algorithm produces the right-hand image, in which the group has been deleted and the other segments have been connected and merged wherever possible.
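The smooth connection used when merging two segments can be illustrated with a standard cubic Hermite curve, which interpolates two end points with prescribed tangent vectors. This is a generic sketch of that curve, not the patent's specific implementation; the function name, the sample count, and the example end points and tangents are assumptions.

```python
from typing import List, Tuple

Vec = Tuple[float, float]


def hermite_curve(p0: Vec, m0: Vec, p1: Vec, m1: Vec,
                  samples: int = 10) -> List[Vec]:
    """Sample a cubic Hermite curve from p0 (tangent m0) to p1 (tangent m1).

    Returns samples + 1 points, including both end points.
    """
    points: List[Vec] = []
    for i in range(samples + 1):
        t = i / samples
        # Cubic Hermite basis functions.
        h00 = 2 * t**3 - 3 * t**2 + 1
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        x = h00 * p0[0] + h10 * m0[0] + h01 * p1[0] + h11 * m1[0]
        y = h00 * p0[1] + h10 * m0[1] + h01 * p1[1] + h11 * m1[1]
        points.append((x, y))
    return points


# Hypothetical gap between two trimmed segment ends.
curve = hermite_curve((0.0, 0.0), (10.0, 10.0), (10.0, 0.0), (10.0, -10.0))
```

Because the curve matches both the positions and the tangent directions at its ends, the merged skeleton segment has no visible kink where the deleted portion used to be.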
(II) Processing the burr type
For a burr segment s, let its end points be M and N, its adjacency attribute be (2,0), and its length be Ls. In addition, let the two other skeleton segments adjacent to s be
Figure BDA0001796054350000101
s is then further classified in the following steps. Let θ be the angle between
Figure BDA0001796054350000102
If θ is smaller than a threshold ε, s is classified as a "stroke turning burr" and the determination ends; otherwise, proceed to the next step. The coordinate system used in the present invention is shown in fig. 22. If the direction of the vector
Figure BDA0001796054350000103
lies in the first or third quadrant, s is classified as a "stroke adhesion burr" and the determination ends; if the direction of the vector
Figure BDA0001796054350000104
lies in the fourth quadrant, s is classified as a "stroke turning burr" and the determination ends; otherwise, proceed to the next step. Of the segments
Figure BDA0001796054350000105
and
Figure BDA0001796054350000106
the one closest to the vertical direction is denoted s'. Compute the included angle θ between s' and the burr s; if θ is smaller than a threshold ε', s is classified as a "stroke turning burr"; otherwise it is classified as a "stroke adhesion burr".
I. A stroke turning burr s is processed in the following steps:
(1) As shown in figs. 23 and 24, for each skeleton segment adjacent to s,
Figure BDA0001796054350000107
delete the part of its skeleton that is adjacent to s, over a length of Ls × αs (αs is a scaling factor), to obtain two segments with end points A and B:
Figure BDA0001796054350000108
(2) As shown in fig. 25, extend the burr along the direction of s to the boundary of the binary image, and select a suitable point C on the extended burr. (3) As shown in fig. 26, connect A and C with a Hermite curve and merge AC with
Figure BDA0001796054350000109
into one skeleton segment; connect B and C with a Hermite curve and merge BC with
Figure BDA00017960543500001010
into one skeleton segment; then delete the burr segment s. Fig. 27 shows the complete result of processing a "stroke turning burr": the burr at the lower left corner of the "cloud" character has been removed, and the adjacent segments are joined into a single smooth turning stroke.
II. A stroke adhesion burr s is processed in the following steps:
(1) As shown in fig. 28, for each skeleton segment adjacent to s,
Figure BDA00017960543500001011
delete the part of its skeleton that is adjacent to s, over a length of Ls × αs (αs is a scaling factor), to obtain two segments with end points A' and B':
Figure BDA00017960543500001012
(2) As shown in fig. 29, extend the burr along the direction of s to the boundary of the binary image, and select a suitable point C' on the extended burr; (3) as shown in figs. 30 and 31, of the two groups {A'C',
Figure BDA00017960543500001013
} and {B'C',
Figure BDA00017960543500001014
}, select the one with the better straight-line fit, smoothly connect and merge that group into one skeleton segment with a Hermite curve, and denote the result
Figure BDA00017960543500001015
Extend the remaining skeleton segment until it intersects
Figure BDA00017960543500001016
and then delete the burr segment s. Fig. 32 shows the complete result of processing a stroke adhesion burr: the segments around the burr have been processed into a normal vertical stroke.
(III) Processing the long segment type
Definition: a coordinate point P that is simultaneously an end point of three long segments is called a "trifurcation point". Each trifurcation point H in the skeleton is processed as follows: (1) let the set of long skeleton segments connected to the trifurcation point be
Figure BDA0001796054350000111
(2) For each long skeleton segment
Figure BDA0001796054350000112
delete the part of its skeleton that is adjacent to the trifurcation point H, over a length of W × αH (αH is a scaling factor); (3) traverse every pair of long skeleton segments
Figure BDA0001796054350000113
and
Figure BDA0001796054350000114
and, if they can be fitted to a smooth straight line segment, connect them with a Hermite curve and merge them into one skeleton segment; (4) extend the remaining long skeleton segments in SH that cannot be merged appropriately at their end points. Fig. 33 compares the result before and after the trifurcation point is adjusted: the long segments are now correctly divided into strokes that conform to the standard character library.
Sixth, ordering the disambiguated skeleton segments
The skeleton segments obtained in step five agree in number with the standard character segments but differ in order. This step rearranges the skeleton segments of the copybook image according to the order of the standard character segments, so as to obtain correctly ordered skeleton segments. The specific operations are: first, match the skeleton segments obtained in step five with the standard segments of the corresponding Chinese character in the standard character library; then, adjust the order of the skeleton segments, and the order of the sampling points within each skeleton segment, according to the order of the standard segments, to obtain a skeleton segment list arranged according to the writing rules.
Seventh, dynamic restoration of the copybook Chinese character image
Based on the result of step six, the original copybook Chinese character image is displayed as an animation. The dynamic restoration of the writing process comprises the following steps: (1) prepare a blank image of the same size as the original copybook image; (2) traverse each pixel point of the ordered skeleton segments; (3) for a skeleton pixel point A, compute its handwriting radius wA; (4) generate a circle on the blank image with A as its center and wA as its radius, and copy the pixel values inside this circular region of the original copybook image into the blank image. These steps produce a dynamic writing effect for the copybook Chinese character image. The effect is shown in fig. 34.
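Steps (1) to (4) can be sketched as a generator that yields one animation frame per skeleton pixel. The sketch assumes grayscale images stored as lists of rows, a white (255) blank canvas, integer radii, and a caller-supplied function that returns the handwriting radius at a point; these choices and all names are illustrative assumptions.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates (an assumed convention)
Image = List[List[int]]


def reveal_strokes(original: Sequence[Sequence[int]],
                   ordered_points: Sequence[Point],
                   radius_of: Callable[[Point], int]) -> Iterator[Image]:
    """Yield successive frames in which the original image is revealed
    disc by disc along the ordered skeleton points."""
    height, width = len(original), len(original[0])
    # Step (1): blank image of the same size (white is an assumption).
    canvas: Image = [[255] * width for _ in range(height)]
    # Step (2): traverse the skeleton pixels in writing order.
    for (x, y) in ordered_points:
        r = radius_of((x, y))  # step (3): handwriting radius at this point
        # Step (4): copy the disc of radius r centred at (x, y).
        for yy in range(max(0, y - r), min(height, y + r + 1)):
            for xx in range(max(0, x - r), min(width, x + r + 1)):
                if (xx - x) ** 2 + (yy - y) ** 2 <= r * r:
                    canvas[yy][xx] = original[yy][xx]
        yield [row[:] for row in canvas]


# Hypothetical 5x5 all-black original and a two-point skeleton.
original = [[0] * 5 for _ in range(5)]
frames = list(reveal_strokes(original, [(1, 2), (3, 2)], lambda p: 1))
```

Displaying the frames in sequence gives the impression of the character being written stroke by stroke with its original ink texture.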
The embodiment of the present invention is described in detail below using the character "Song" as a specific example.
First, preparing the standard character library
The standard character library stores the glyph information of all Chinese characters. The glyph information of each standard Chinese character comprises its components, strokes, and standard character segments. Fig. 35 shows the glyph information of the standard character "Song". This character contains two components, seven strokes, and eight standard character segments. The first component contains three strokes and the second contains four. The third stroke contains two standard character segments, and every other stroke contains one. Each standard character segment stores a series of sampling points, as shown in fig. 36, where the dots represent sampling points, the black lines represent the writing path, and the numbers are the serial numbers of the segments.
To obtain the standard character library, the embodiment of the invention starts from the Heiti (black-body) TrueType font and obtains all strokes and segments of the Chinese characters by manual tracing; information such as components and stroke order is then added to the standard characters by manual annotation. In this way, the embodiment of the invention obtains the glyph information of 3027 standard Chinese characters.
Second, acquiring the copybook Chinese character images
A scanner is used to convert Chinese characters from copybooks by different authors and in different fonts into static images; for example, the acquired image of the copybook character "Song" is shown in fig. 37.
Third, image preprocessing
As shown in fig. 38, this step applies graying and binarization to the copybook Chinese character image and converts it into a binary image. In this embodiment, the graying operation uses the weighted average grayscale method, and the binarization uses the OTSU algorithm. For the binary image, this embodiment smooths the edges of the Chinese character with a template method, as follows. The left image in fig. 39 is the template for filling in pixels: the binary image is traversed with this template, and when a region of the binary image matches the template, the central pixel is set to a foreground pixel. The right image in fig. 39 is the template for removing pixels: the binary image is traversed with this template, and when a region of the binary image matches the template, the central pixel is set to a background pixel. Note that during traversal, both templates may also be rotated clockwise by 90°, 180°, and 270°. Fig. 40 shows the result after preprocessing.
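The graying and binarization part of this step can be sketched in plain Python as below. The text names the weighted average grayscale method and the OTSU algorithm but does not give the weights, so the common 0.299/0.587/0.114 luma weights are used here as an assumption; the function names and the input layout (rows of RGB tuples) are also hypothetical.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]


def to_gray(rgb: RGB) -> int:
    """Weighted-average grayscale; the weights are an assumed choice."""
    r, g, b = rgb
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def otsu_threshold(gray_values: Sequence[int]) -> int:
    """Return the threshold t that maximizes the between-class variance
    when values <= t form one class and values > t the other."""
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(i * hist[i] for i in range(256))
    sum_low, count_low = 0, 0
    best_t, best_var = 0, -1.0
    for t in range(256):
        count_low += hist[t]
        if count_low == 0:
            continue
        count_high = total - count_low
        if count_high == 0:
            break
        sum_low += t * hist[t]
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        var_between = count_low * count_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels: Sequence[Sequence[RGB]]) -> List[List[int]]:
    """Return a binary image: 1 for dark (ink) pixels, 0 for background."""
    gray = [[to_gray(p) for p in row] for row in pixels]
    t = otsu_threshold([v for row in gray for v in row])
    return [[1 if v <= t else 0 for v in row] for row in gray]


# Hypothetical 2x2 image: dark ink on the diagonal, light paper elsewhere.
tiny = [[(10, 10, 10), (240, 240, 240)],
        [(250, 250, 250), (20, 20, 20)]]
binary = binarize(tiny)
```

The template-based edge smoothing described above would then be applied to the resulting binary image; it is not covered by this sketch.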
Fourth, image thinning
The binary image is thinned with the Rosenfeld algorithm, which is simple to implement and efficient; it also preserves the 8-neighbourhood connectivity of the thinning result, which avoids broken strokes. Thinning the copybook Chinese character image with the Rosenfeld algorithm yields the Chinese character skeleton, which this embodiment stores as a point sequence: P = {p1, p2, …, pn}. The thinning result is shown in fig. 41.
Fifth, skeleton segment extraction
Traverse the skeleton point sequence P and, for each pixel, count its neighbouring points in P within its 8-connected neighbourhood. Pixels whose neighbour count is not 2 are taken as the end points for the first division of the skeleton into segments, and the mutually adjacent pixels between end points are taken as the internal points of the skeleton segments. This gives a preliminary segment list L = {l1, l2, …}. For each skeleton segment li, this embodiment computes the inflection points T of the segment with a dynamic ray algorithm (Huang Xiang, Cheng Nu, Yang Bo, et al. Natural handwritten Chinese character preprocessing subsystem [J]. Journal of Chongqing University, 2000, 23(4): 33-37), and li is further divided at T. This yields the final skeleton segment set S. At this point, all segments in S are simple segments, which facilitates subsequent processing and adjustment. As shown in fig. 42, the shaded dots represent all end points of the skeleton segments, and two adjacent shaded dots together with the white line between them represent one skeleton segment.
Sixth, skeleton segment disambiguation
This step disambiguates the skeleton segments. The specific operations are as follows.
A. Computing the average stroke width: traverse the skeleton segment list and compute the length of each skeleton segment. This embodiment selects the three longest skeleton segments, computes the stroke width at each of their points, and takes the mean of these widths as the average stroke width W of the whole Chinese character.
B. Identifying the skeleton segment types: the skeleton segments are classified according to the technical scheme of the invention. The classification result for the "Song" character image is shown in fig. 43. The skeleton segment group G is of the internal short segment type; the segments M1N1 and M2N2 are of the burr type; all other skeleton segments are long segments; point H is a trifurcation point formed by three long segments.
C. Disambiguating the skeleton segments: the skeleton segments of the "Song" character are disambiguated according to the technical scheme of the invention, in the following order: the internal short segment type is processed first, then the burr type, and finally the long segment type.
(a) Processing the internal short segments
In fig. 43, the "Song" character contains one internal short segment group G, and the number of short segments in G is greater than 1. For ease of illustration, this region is enlarged in fig. 44. The set of skeleton segments adjacent to G is denoted
Figure BDA0001796054350000131
The coordinates of the center of the bounding box of group G (the gray filled dot in fig. 44) are denoted (m, n). According to the technical scheme of the invention, G is processed as follows:
1) For each skeleton segment
Figure BDA0001796054350000132
delete the part of its skeleton adjacent to G. In this embodiment, the deletion length is the smaller of the following two values: W × 1.5, where W is the average stroke width; and, for the skeleton segment
Figure BDA0001796054350000133
20% of its path length;
2) For all skeleton segments in SG, denote the end points nearest G as
Figure BDA0001796054350000134
and denote the tangent vectors at these end points as
Figure BDA0001796054350000135
In this embodiment, the skeleton segments that can be merged are selected according to the following rules:
i. Two skeleton segments
Figure BDA0001796054350000136
and
Figure BDA0001796054350000137
are judged mergeable if they satisfy both of the following conditions: (1) the included angle between the tangent vectors
Figure BDA0001796054350000138
and
Figure BDA0001796054350000139
is greater than the threshold of 170°; (2) the distance from
Figure BDA00017960543500001310
to the straight line defined by
Figure BDA00017960543500001311
and
Figure BDA00017960543500001312
is less than W × 0.875;
ii. Two skeleton segments
Figure BDA00017960543500001313
and
Figure BDA00017960543500001314
are judged mergeable if they satisfy both of the following conditions: (1)
Figure BDA00017960543500001315
and
Figure BDA00017960543500001316
are adjacent to the same short segment in G, and the included angle between the tangent vectors
Figure BDA00017960543500001317
and
Figure BDA00017960543500001318
is greater than 160°; (2) the distance from
Figure BDA00017960543500001319
to the straight line defined by
Figure BDA00017960543500001320
and
Figure BDA00017960543500001321
is less than W.
3) Applying the above rules to fig. 44 shows that
Figure BDA00017960543500001322
and
Figure BDA00017960543500001323
can be merged, and that
Figure BDA00017960543500001324
and
Figure BDA00017960543500001325
can be merged. For
Figure BDA00017960543500001326
and
Figure BDA00017960543500001327
the merging method is as follows: construct a Hermite curve from the end points of the two segments and the tangent vectors at those end points, and use it to connect
Figure BDA00017960543500001328
and
Figure BDA00017960543500001329
and merge them into one segment. In the same way,
Figure BDA00017960543500001330
and
Figure BDA00017960543500001331
are merged into one skeleton segment;
4) The skeleton segments
Figure BDA00017960543500001332
and
Figure BDA00017960543500001333
are extended. Each is extended along the tangent vector at its end point until it intersects the line y = m or x = n.
The results before and after processing the internal short segment group in the "Song" character are shown in fig. 45.
(b) Processing the burr type
In fig. 43, the burr segments are M1N1 and M2N2. According to the scheme of the invention, M1N1 is a stroke adhesion burr and M2N2 is a stroke turning burr. As shown in fig. 46, the burr segment M1N1 is processed in the following steps:
1) Denote the length of the burr segment M1N1 as LM1N1, and denote the skeleton segments adjacent to M1N1 as
Figure BDA00017960543500001334
and
Figure BDA00017960543500001335
2) For each skeleton segment
Figure BDA00017960543500001336
delete the part of its skeleton adjacent to M1N1. In this embodiment, the deletion length is the smaller of the following two values: LM1N1 × 0.5; and, for the skeleton segment
Figure BDA00017960543500001337
40% of its path length. This gives two segments with end points A and B:
Figure BDA00017960543500001338
3) Extend the burr segment M1N1 to obtain the point C at a distance of 0.5 × W from the outer boundary of the stroke;
4) Compute the included angle between AC and
Figure BDA00017960543500001339
and the included angle between BC and
Figure BDA00017960543500001340
Select the pair with the larger included angle for merging. As in fig. 31, it can be determined here that AC and
Figure BDA0001796054350000141
can be merged into one segment. Therefore, a Hermite curve is computed from the coordinates of points A and C and the tangent directions of
Figure BDA0001796054350000142
and M1N1, and is used to connect them;
5) Extend
Figure BDA0001796054350000143
until it intersects the new skeleton segment.
As shown in fig. 47, the burr segment M2N2 is processed in the following steps:
1) Extend the burr M2N2 to the outer boundary of the stroke and compute its length, denoted LM2N2. Let the skeleton segments adjacent to M2N2 be
Figure BDA0001796054350000144
and
Figure BDA0001796054350000145
2) For each skeleton segment
Figure BDA0001796054350000146
delete the part of its skeleton adjacent to M2N2. In this embodiment, the deletion length is the smaller of the following two values: LM2N2 × 0.5; and, for the skeleton segment
Figure BDA0001796054350000147
40% of its path length. This gives two segments with end points A and B:
Figure BDA0001796054350000148
3) On the extended burr segment, find the point C at a distance of 0.5 × W from the outer boundary of the stroke;
4) Connect points A and C with a Hermite curve and merge the result with the segment
Figure BDA0001796054350000149
to form a new skeleton segment; connect points B and C with a Hermite curve and merge the result with
Figure BDA00017960543500001410
to form a new skeleton segment.
Fig. 48 compares the burr segments before and after processing.
(c) Processing the long segment type
In fig. 43, there is a trifurcation point H formed by three adjacent long segments. It is enlarged in fig. 49. The skeleton segments adjacent to the trifurcation point H are denoted
Figure BDA00017960543500001411
According to the technical scheme of the invention, H is adjusted as follows:
1) For each skeleton segment
Figure BDA00017960543500001412
delete the part of its skeleton adjacent to the point H. In this embodiment, the deletion length is the smaller of the following two values: W × 1.2; and, for the skeleton segment
Figure BDA00017960543500001413
40% of its path length;
2) Compute the included angle between
Figure BDA00017960543500001414
and
Figure BDA00017960543500001415
the included angle between
Figure BDA00017960543500001416
and
Figure BDA00017960543500001417
and the included angle between
Figure BDA00017960543500001418
and
Figure BDA00017960543500001419
Find the two segments with the largest included angle, here
Figure BDA00017960543500001420
and
Figure BDA00017960543500001421
and connect them with the Hermite curve
Figure BDA00017960543500001422
thereby merging them into one segment;
3) The remaining segment (here,
Figure BDA00017960543500001423
) is extended until it intersects the other segments.
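The selection in step 2), choosing the pair of segments with the largest included angle, can be sketched as follows. The sketch assumes each segment is represented by a 2D tangent vector at its end nearest H, pointing away from H, so that an included angle near 180° means the two segments continue each other; the segment labels and function names are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, Tuple

Vec = Tuple[float, float]


def included_angle(u: Vec, v: Vec) -> float:
    """Included angle between two non-zero vectors, in degrees (0 to 180)."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(u[0], u[1]) * math.hypot(v[0], v[1])
    cos_value = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_value))


def pick_merge_pair(tangents: Dict[str, Vec]) -> Tuple[str, str]:
    """Return the two segment labels whose tangent vectors form the
    largest included angle."""
    return max(combinations(sorted(tangents), 2),
               key=lambda pair: included_angle(tangents[pair[0]],
                                               tangents[pair[1]]))


# Hypothetical tangents of three long segments meeting at H.
tangents = {"a": (1.0, 0.0), "b": (-1.0, 0.05), "c": (0.0, 1.0)}
pair = pick_merge_pair(tangents)
```

Here segments "a" and "b" point in nearly opposite directions, so they would be joined by the Hermite curve, and "c" would be the remaining segment that is extended.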
FIG. 50 is a graph comparing results before and after "Song" long stroke type treatment.
Thus, disambiguation processing of all skeleton segments in the Song' character is completed.
Seven, ordering of skeleton segments
As shown in the right-hand diagram of FIG. 50, the skeleton segments obtained after disambiguation are the final, unambiguous skeleton segments of the character "Song". These skeleton segments correspond one-to-one with the standard segments of the character "Song" in the standard character library. To establish this correspondence, the embodiment uses a relaxation matching algorithm (Cheng F H, Hsu W H, Kuo M C. Recognition of handprinted Chinese characters via stroke relaxation [J]. Pattern Recognition, 1993, 26(4): 579-). The skeleton segments, and the point sequence within each segment, are then reordered according to the matching result, finally yielding a sequence of Chinese character skeleton segments arranged in writing order.
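The reordering step can be sketched as follows, assuming the matching result is already available. The relaxation matching itself is not implemented here; the `match` dictionary format (skeleton segment index mapped to a writing-order position and a flag indicating whether the point sequence must be reversed) is a hypothetical representation chosen for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def reorder_segments(segments: List[List[Point]],
                     match: Dict[int, Tuple[int, bool]]) -> List[List[Point]]:
    """Arrange segments in writing order and orient each point sequence.

    match[i] = (position of segment i in the writing order,
                True if the stored points run from end to start).
    """
    ordered = []
    for seg_idx, (_, reverse) in sorted(match.items(), key=lambda kv: kv[1][0]):
        pts = list(segments[seg_idx])
        ordered.append(pts[::-1] if reverse else pts)
    return ordered


# Hypothetical example: three extracted segments and a made-up matching result.
segments = [[(0, 0), (1, 0)], [(5, 5), (5, 0)], [(2, 2), (3, 3)]]
match = {0: (2, False), 1: (0, True), 2: (1, False)}
ordered = reorder_segments(segments, match)
```

Here segment 1 is written first and its points are reversed so that they run from the start position to the end position.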
Eight, dynamic restoration
After matching and sorting, the embodiment of the invention obtains the skeleton segments of "Song" arranged in the correct writing order. The point sequence within each stroke is likewise arranged from the start position to the end position in the correct writing order. The writing process of "Song" can then be dynamically restored as described in the technical solution of the invention.
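Once the segments and their points are in writing order, the writing process can be replayed point by point. The generator below is only a sketch of that idea: the name `replay` and the frame format are assumptions, and rendering each frame is left to whatever drawing layer is used.

```python
from typing import Iterator, List, Tuple

Point = Tuple[float, float]


def replay(ordered_segments: List[List[Point]]) -> Iterator[Tuple[int, List[Point]]]:
    """Yield (stroke index, points of that stroke drawn so far), one point at a time."""
    for k, seg in enumerate(ordered_segments):
        for n in range(1, len(seg) + 1):
            yield k, seg[:n]


# Hypothetical two-stroke input, already in writing order.
frames = list(replay([[(0, 0), (1, 0)], [(2, 2)]]))
```

Each frame extends the current stroke by one point, so drawing the frames in sequence reproduces the strokes in the order in which they are written.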
The embodiment of the invention also provides a processing device for internal short segments, whose functions can be realized by the acquisition unit 22, the determination unit 24, the calculation unit 26 and the processing unit 28. The device may be used to execute the processing method for internal short segments provided by the embodiment of the invention, and that method may likewise be executed by this device. FIG. 2 is a schematic diagram of a processing device for internal short segments according to an embodiment of the invention. As shown in FIG. 2, the processing device for internal short segments comprises:
a first deleting module 22, configured to delete an internal short segment when a skeleton segment of a Chinese character is determined to be an internal short segment, where an internal short segment is a skeleton segment each of whose two ends adjoins two adjacent segments;
a second deleting module 24, configured to delete a skeleton portion of predetermined length from each skeleton segment adjacent to the internal short segment to obtain partial adjacent segments, where the deleted portion is the part of the skeleton segment lying within a predetermined threshold range of the internal short segment;
and a construction module 26, configured to smoothly connect the partial adjacent segments to form a new skeleton segment, where the new skeleton segment matches the centerline of the binary image of the Chinese character.
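The adjacency condition checked by the first deleting module can be sketched on a simple graph representation of the skeleton. The representation is hypothetical: each segment is stored as a pair of junction identifiers, and the condition is read as "exactly two other segments meet at each end". Any additional length criterion for "short" is outside this sketch.

```python
from typing import Hashable, List, Tuple

Segment = Tuple[Hashable, Hashable]  # (junction id at one end, junction id at the other)


def neighbours_at(node: Hashable, seg_index: int, segments: List[Segment]) -> List[int]:
    """Indices of the other segments that share the given end node."""
    return [j for j, (a, b) in enumerate(segments)
            if j != seg_index and node in (a, b)]


def both_ends_have_two_neighbours(seg_index: int, segments: List[Segment]) -> bool:
    """True if each end of the segment adjoins exactly two other segments."""
    a, b = segments[seg_index]
    return (len(neighbours_at(a, seg_index, segments)) == 2
            and len(neighbours_at(b, seg_index, segments)) == 2)


# Hypothetical skeleton: segment 0 joins junctions P and Q, and two further
# segments meet at each of those junctions.
skeleton = [("P", "Q"), ("A", "P"), ("B", "P"), ("C", "Q"), ("D", "Q")]
```

In this example only segment 0 satisfies the condition; the four outer segments each have a free end with no neighbours.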
The embodiment of the invention provides a storage medium, which comprises a stored program, wherein when the program runs, a device on which the storage medium is positioned is controlled to execute the method.
The embodiment of the invention provides a processor, which comprises a processing program, wherein when the program runs, a device where the processor is located is controlled to execute the method.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (7)

1. A processing method of an internal short segment is characterized by comprising the following steps:
deleting the internal short stroke section under the condition that the skeleton stroke section of the Chinese character is judged to be the internal short stroke section, wherein the internal short stroke section is the skeleton stroke section of which two ends of the stroke section are respectively provided with two adjacent stroke sections;
classifying the categories, judging the number of all internal short segments communicated with the internal short segments, and determining a preset threshold value according to the number of the communicated internal short segments; wherein, when the number of all internal short segments communicated with the internal short segments is judged to be one, the predetermined threshold is determined according to the length of the internal short segments, and when the number of all internal short segments communicated with the internal short segments is judged not to be one, the predetermined threshold is determined according to the average width of the internal short segments;
deleting skeleton pen segments with preset lengths from other skeleton pen segments adjacent to the internal short pen segment to obtain partial adjacent pen segments, wherein the skeleton pen segments with the preset lengths are partial skeleton pen segments within a preset threshold range of the internal short pen segment;
and smoothly connecting the partial adjacent stroke segments to form a new skeleton stroke segment, wherein the new skeleton stroke segment is matched with the central line of the Chinese character binary image.
2. The method of claim 1, wherein smoothly connecting the partially adjacent segments into a new skeleton segment comprises:
judging the included angle of the end tangent vectors of any two adjacent segments;
and if the included angle of the tangent vector accords with a preset angle, smoothly connecting the adjacent segments of the two parts to form a new skeleton segment.
3. The method of claim 1, wherein smoothly connecting the partially adjacent segments into a new skeleton segment comprises:
judging the included angle of the end tangent vectors of any two adjacent segments;
and if the included angle of the tangent vector does not accord with a preset angle, extending the part of the adjacent pen segments to the position intersected with the internal short pen segment.
4. The method of claim 1, wherein smoothly connecting the partially adjacent segments into a new skeleton segment comprises:
judging whether any two partially adjacent pen segments can be fitted into a smooth line segment without an inflection point;
if so, smoothly connecting the two adjacent partial pen segments to form a new skeleton pen segment.
5. The method of claim 1, wherein smoothly connecting the partially adjacent segments into a new skeleton segment comprises:
judging whether any two partially adjacent pen segments can be fitted into a smooth line segment without an inflection point;
and if not, extending the two partially adjacent pen segments to a position intersected with a preset straight line, wherein the preset straight line is a straight line corresponding to the center point coordinate of the bounding box of the internal short pen segment.
6. A device for processing internal short segments, comprising:
the first deleting module is used for deleting the internal short stroke section under the condition that the skeleton stroke section of the Chinese character is judged to be the internal short stroke section, wherein the internal short stroke section is the skeleton stroke section of which two ends of the stroke section are respectively provided with two adjacent stroke sections;
the second deleting module is used for carrying out category division, judging the number of all internal short segments communicated with the internal short segments and determining a preset threshold value according to the number of the communicated internal short segments; wherein, when the number of all internal short segments communicated with the internal short segments is judged to be one, the predetermined threshold is determined according to the length of the internal short segments, and when the number of all internal short segments communicated with the internal short segments is judged not to be one, the predetermined threshold is determined according to the average width of the internal short segments; deleting skeleton pen segments with preset lengths from other skeleton pen segments adjacent to the internal short pen segment to obtain partial adjacent pen segments, wherein the skeleton pen segments with the preset lengths are partial skeleton pen segments within a preset threshold range of the internal short pen segment;
and the construction module is used for smoothly connecting the partial adjacent stroke sections to form a new skeleton stroke section, wherein the new skeleton stroke section is matched with the central line of the Chinese character binary image.
7. A storage medium, characterized in that the storage medium comprises a stored program, wherein the program performs the method of any one of claims 1 to 5.
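The threshold rule recited in claims 1 and 6 can be sketched as a small function. The claims state only which quantity the predetermined threshold is determined from; the scaling factors `length_factor` and `width_factor`, and the function name, are hypothetical placeholders for values the claims do not specify.

```python
def predetermined_threshold(connected_count: int,
                            short_length: float,
                            average_width: float,
                            length_factor: float = 1.0,   # assumed, not from the claims
                            width_factor: float = 1.0) -> float:  # assumed, not from the claims
    """Choose the basis of the threshold from the number of connected internal short segments."""
    if connected_count == 1:
        # A single internal short segment: derive the threshold from its length.
        return length_factor * short_length
    # Otherwise: derive the threshold from the average width.
    return width_factor * average_width
```

For example, with one connected internal short segment of length 8.0 the sketch returns a length-based value, while with three connected segments and an average width of 5.0 it returns a width-based value.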
CN201811057035.3A 2018-09-11 2018-09-11 Method and device for processing internal short pen section Active CN109325483B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811057035.3A CN109325483B (en) 2018-09-11 2018-09-11 Method and device for processing internal short pen section

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811057035.3A CN109325483B (en) 2018-09-11 2018-09-11 Method and device for processing internal short pen section

Publications (2)

Publication Number Publication Date
CN109325483A CN109325483A (en) 2019-02-12
CN109325483B true CN109325483B (en) 2021-05-07

Family

ID=65264936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811057035.3A Active CN109325483B (en) 2018-09-11 2018-09-11 Method and device for processing internal short pen section

Country Status (1)

Country Link
CN (1) CN109325483B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175539B (en) * 2019-05-10 2022-05-20 广东智媒云图科技股份有限公司 Character creating method and device, terminal equipment and readable storage medium
CN116580129A (en) * 2023-04-18 2023-08-11 南京信息工程大学 Method, device and storage medium for improving calligraphy character skeleton based on distance transformation

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101699518B (en) * 2009-10-30 2012-06-20 华南理工大学 Method for beautifying handwritten Chinese character based on trajectory analysis
CN102147600B (en) * 2011-04-30 2012-09-19 上海交通大学 Numerical control interpolation system for real-time generation of curvature-continuous path
JP6270565B2 (en) * 2014-03-18 2018-01-31 株式会社東芝 Electronic apparatus and method
CN104063723B (en) * 2014-06-25 2017-06-06 北京语言大学 The stroke restoring method and device of the Off-line Handwritten Chinese

Also Published As

Publication number Publication date
CN109325483A (en) 2019-02-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant