CN104156725A - Novel Chinese character stroke combination method based on angle between stroke segments - Google Patents

Info

Publication number
CN104156725A
Authority
CN
China
Prior art keywords
stroke
point
crossing
stroke section
angle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410400326.3A
Other languages
Chinese (zh)
Other versions
CN104156725B (en)
Inventor
董乐
吕娜
封宁
谢山山
张宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201410400326.3A
Publication of CN104156725A
Application granted
Publication of CN104156725B
Legal status: Active

Landscapes

  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The invention provides a novel Chinese character stroke combination method based on the angle between stroke segments, and aims to solve the stroke combination problem for printed Chinese characters. The method comprises the following steps: the input Chinese character picture is binarized to obtain a binary image; the binary image is split by scanning it in the horizontal and vertical directions and separating the parts that have no connecting pixels; a skeleton graph is extracted from the split binary image; the intersections of the skeleton graph are extracted; for each intersection, stroke segments are combined according to the combination rules; and the final stroke combination is output according to the combinations marked in the previous step.

Description

A novel Chinese character stroke combination method based on the angle between stroke segments
Technical field
The present invention relates to the automatic extraction of Chinese character strokes, and in particular to the technical field of finding the combination patterns of stroke segments in the crossing regions of Chinese characters during automatic stroke extraction and formulating corresponding combination rules. It provides a novel Chinese character stroke combination method based on the angle between stroke segments.
Background
The stroke is the smallest component unit of a Chinese character. Automatic stroke extraction is an important preprocessing technique in Chinese character shape analysis and processing, and is widely used in character recognition, computer calligraphy and the automatic generation of Chinese characters. Traditional stroke extraction algorithms combine the segmented stroke segments in arbitrary pairs and then use some method to judge whether each combination is correct. Because every combination has to be evaluated, this approach is very time-consuming. We studied in depth how the stroke segments of commonly used characters combine, identified the regularities, and established corresponding rules. The four basic strokes of Chinese characters, "horizontal, vertical, left-falling and right-falling", account for 28%, 18%, 15% and 13% of stroke occurrences respectively, far more than other strokes such as "turning" and "hook". A Chinese character can therefore be regarded as composed mainly of these four basic strokes. The four basic strokes can be regarded as four directions, and directional features based on the directions of the basic strokes have been increasingly applied in handwriting recognition systems in recent years. This project proposes an effective Chinese character stroke combination method which, after the crossing points have been extracted from the character structure, decides how segments combine according to their angles.
Summary of the invention
The object of the invention is to solve the stroke combination problem for printed Chinese characters. To improve the accuracy of stroke combination, a stroke combination model for printed characters based on the angles between strokes is proposed. Building on automatic stroke extraction, the model combines strokes quickly: given the diversity of Chinese character strokes and of the ways they combine, a carefully designed combination scheme decides whether two segments are combined according to the angle between them, so that strokes are combined quickly and accurately.
To achieve the above object, the present invention adopts the following technical solution:
A novel Chinese character stroke combination method based on the angle between stroke segments, characterized by comprising the following steps:
Step 1: preprocess the input Chinese character picture,
1-a, first binarize the input Chinese character picture so that it becomes a binary image,
1-b, split the binary image: scan the image in the horizontal and vertical directions and separate the parts that have no connecting pixels,
1-c, extract the skeleton graph of the split binary image;
Step 2: extract the crossing points of the skeleton graph;
Step 3: for each crossing point, combine the stroke segments connected at that crossing point according to the combination rule;
Step 4: according to the stroke segments obtained in step 3, combine the stroke segments for each complete character and output the final stroke segments.
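Steps 1-a and 1-b can be illustrated with a minimal Python sketch. It is not the patented implementation: the threshold of 128, the assumption of dark ink on a light background, and the function names binarize, runs and split_parts are all hypothetical. The split is interpreted here as one scan over columns followed by one scan over rows, cutting wherever a whole column or row contains no ink. Skeleton extraction (step 1-c) is not shown.

```python
def binarize(gray, threshold=128):
    """Return a 0/1 image: 1 where a pixel is darker than the threshold (ink).

    The threshold value and the dark-ink convention are assumptions.
    """
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def runs(profile):
    """Return (start, end) ranges, end exclusive, where the profile is non-zero."""
    spans, start = [], None
    for i, v in enumerate(profile):
        if v and start is None:
            start = i                      # a run of ink begins
        elif not v and start is not None:
            spans.append((start, i))       # an empty line ends the run
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def split_parts(binary):
    """Split a binary image into boxes (top, bottom, left, right).

    Columns are scanned first; each column run is then scanned row by row.
    """
    height = len(binary)
    width = len(binary[0]) if binary else 0
    col_profile = [sum(binary[r][c] for r in range(height)) for c in range(width)]
    boxes = []
    for left, right in runs(col_profile):
        row_profile = [sum(binary[r][left:right]) for r in range(height)]
        for top, bottom in runs(row_profile):
            boxes.append((top, bottom, left, right))
    return boxes
```

Each returned box is a part of the character that is separated from the others by at least one empty column or row; parts that touch only diagonally or overlap in projection would need a connected-component pass instead.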
The crossing points described in step 2 are labelled p_i (i = 1, ..., N), where N is the number of crossing points. The stroke segments cut off by a crossing point are denoted k_j (j = 1, ..., M), where M is the number of stroke segments produced by the split. For each crossing point p_i, a circle of radius R pixels is drawn with the crossing point as its centre, and the intersection of this circle with each stroke segment is denoted f_j. If the total number of intersections equals M, the vector b_j, with the crossing point p_i as its origin and f_j as its end point, represents stroke segment k_j.
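The construction of the direction vectors can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's code: each stroke segment is assumed to be available as a list of (x, y) skeleton pixels ordered outward from the crossing point, the circle intersection is approximated by the first pixel whose distance from the crossing point reaches R, and the function name segment_vectors is hypothetical.

```python
import math


def segment_vectors(crossing, segments, radius):
    """Return one vector per stroke segment, from the crossing point to the
    point where the segment meets a circle of the given radius.

    crossing: (x, y) of the crossing point.
    segments: list of pixel lists, each ordered outward from the crossing.
    Returns None when the number of intersections is smaller than the
    number of segments, i.e. some segment never reaches the circle.
    """
    px, py = crossing
    vectors = []
    for pixels in segments:
        # First pixel at distance >= radius approximates the circle intersection.
        hit = next(
            ((x, y) for x, y in pixels if math.hypot(x - px, y - py) >= radius),
            None,
        )
        if hit is None:
            return None
        vectors.append((hit[0] - px, hit[1] - py))
    return vectors
```

Sampling the direction at a fixed radius rather than at the first neighbouring pixel makes the vector less sensitive to the small distortions that skeletonization tends to produce near a crossing.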
The angle between any two of the stroke segments b_a and b_b separated by p_i is evaluated, and the angle between two stroke segments is specified to be an acute angle. If the angle between b_a and b_b lies between 160° and 180°, b_a and b_b belong to the same stroke: the two segments are removed from the set of stroke segments separated at crossing point p_i, and it is recorded that they are connected. After all segment pairs with an angle of 160°-180° have been found, the angles between the remaining stroke segments separated by p_i are compared. If the angle between b_c and b_d is close to 45°, these two segments are likewise removed from the set of stroke segments separated at p_i and recorded as connected. Any stroke segment separated by p_i that still remains is regarded as an independent stroke. The next crossing point is then processed, until all crossing points have been combined.
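A minimal Python sketch of this two-pass rule at a single crossing point is given below. Several points are assumptions rather than statements of the patent: because the thresholds reach 180°, the angle is taken here as the non-reflex angle in the range 0° to 180°; "close to 45°" is interpreted as within a hypothetical tolerance of 10°; pairs are chosen greedily in index order; and the names angle_deg and combine_at_crossing are illustrative.

```python
import math


def angle_deg(u, v):
    """Angle between two 2-D vectors in degrees, in the range [0, 180]."""
    dot = u[0] * v[0] + u[1] * v[1]
    cross = u[0] * v[1] - u[1] * v[0]
    return math.degrees(math.atan2(abs(cross), dot))


def _find_pair(vectors, remaining, lo, hi):
    """Return the first pair of remaining indices whose angle lies in [lo, hi]."""
    for i, a in enumerate(remaining):
        for b in remaining[i + 1:]:
            if lo <= angle_deg(vectors[a], vectors[b]) <= hi:
                return a, b
    return None


def combine_at_crossing(vectors, straight_min=160.0, diag=45.0, diag_tol=10.0):
    """Pair the segment vectors at one crossing point.

    Pass 1 pairs segments whose angle is between straight_min and 180 degrees.
    Pass 2 pairs leftover segments whose angle is within diag_tol of diag
    (the tolerance is an assumption; the text only says "close to 45 degrees").
    Returns (pairs, independent): connected index pairs and unpaired indices.
    """
    remaining = list(range(len(vectors)))
    pairs = []
    for lo, hi in ((straight_min, 180.0), (diag - diag_tol, diag + diag_tol)):
        while True:
            pair = _find_pair(vectors, remaining, lo, hi)
            if pair is None:
                break
            pairs.append(pair)
            # Paired segments are removed from further consideration here.
            remaining = [j for j in remaining if j not in pair]
    return pairs, remaining
```

For a plain cross of a horizontal and a vertical stroke, the four vectors pair up as left with right and up with down, and nothing is left over. Only the pairs at one crossing are compared, which is where the saving over exhaustive pairwise combination comes from.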
Because it adopts the above technical solution, the present invention has the following beneficial effects:
Building on traditional Chinese character stroke combination approaches, the present invention proposes a framework that combines the strokes of printed characters using the angles between strokes. It has the following advantages:
1. Based on a study of how strokes combine, and given that the stroke crossing points have been extracted, only the angles between the strokes separated at each crossing point are calculated, and the combination of strokes is decided according to the set rules. The method therefore greatly reduces time cost and resource overhead while maintaining combination accuracy.
2. In view of the diversity of Chinese characters and of their strokes, the combination rules between strokes were studied and an angle-based stroke combination rule for printed characters was summarized. The rule generalizes over the strokes of printed characters and therefore ensures the accuracy of stroke combination.
3. To verify the effect of the invention, we manually selected 500 Chinese character pictures for an experiment. The 500 characters were first binarized to obtain binary images, and skeleton graphs were extracted from the binary images. Because a skeleton graph is one pixel wide, extracting its crossing points yields the strokes separated by those crossing points. The proposed method was tested on these strokes, and the results show that our stroke combination method achieves very good results.
Brief description of the drawings
Fig. 1 shows the preprocessing of an input printed character;
Fig. 2 is a flowchart of the angle-based stroke combination method;
Fig. 3 shows example skeleton graphs of some fonts;
Fig. 4 shows an example of the standard character library;
Fig. 5 shows an example of the angle-based stroke combination.
Detailed description
To make the objects, technical solution and beneficial effects of the present invention clearer, the invention is described in more detail below with reference to a specific example and the accompanying drawings.
The present invention addresses stroke combination for printed Chinese characters. Based on a study of printed characters and of how Chinese character strokes combine, we first analysed the latest research results in the field of automatic stroke extraction, then designed an algorithm that combines the strokes obtained after the crossing points of a printed character have been extracted, and propose, on the basis of automatic stroke extraction, a framework for combining the strokes of printed characters based on the angle between strokes. The method avoids the tedious work of exhaustively enumerating stroke combinations and reduces time cost while keeping the combinations accurate.
Feature learning is performed on large-scale image data, receptive field selection and a classification algorithm are designed, and a large-scale image classification framework based on deep hierarchical feature learning is proposed on the big data processing platform Hadoop. The method avoids the tedious work of manually designing large-scale image features and reduces training time while maintaining classification accuracy. The results of this framework are of great significance for character recognition, computer calligraphy and the automatic generation of Chinese characters.
Our test environment was as follows:
Hardware environment:
Computer type: desktop computer;
CPU: Pentium(R) Dual-Core CPU E5600 @ 2.93GHz
Memory: 4.00 GB (3.49 GB usable)
System type: 32-bit operating system
Graphics card: integrated graphics card
Software environment:
IDE: Microsoft Visual Studio 2010
Image processing SDK: OpenCV
Development language: C++;
The invention provides a novel Chinese character stroke combination method based on the angle between stroke segments, comprising the following steps:
Step 1: preprocess the original Chinese character picture. Based on research on traditional automatic stroke extraction, we summarized the preprocessing to be carried out before crossing points are extracted: binarizing the input Chinese character picture and extracting the skeleton graph from the binarized image. This preprocessing ensures that, when crossing points are extracted, the image contains only key information and every pixel value is 0 or 1, which is very important for the subsequent operations.
Step 2: extract the crossing points of the binarized image and combine the stroke segments separated by the crossing points. We split the Chinese character and extract the crossing regions of each connected part; the stroke segments cut apart at each crossing point then need to be combined. The crossing points divide the stroke structure of a Chinese character into many segments. The traditional method combines the segments produced by the crossing points at random, so n segments give rise to many different combinations, each of which must then be checked for validity. In practice the success rate of such random combination is very low and wastes a great deal of computing resources. Based on this analysis and on the regularities of Chinese character strokes themselves, we propose a novel combination rule. From an analysis of a large number of Chinese characters we identified a regularity that most stroke segments satisfy: the probability that two stroke segments belong to the same stroke is related to the angle between them. Following this regularity, our rule calculates the angles between the stroke segments at the same crossing point and uses them to judge the probability that two segments can be combined. Once two segments reach the probability at which they can be combined, no further combinations of these two segments with other segments at the same crossing point are attempted. The proposed segment combination rule therefore not only saves a large amount of computing time but also improves the success rate of segment combination.
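The text does not say how crossing points are located on the skeleton. One common choice, assumed here purely for illustration, is a crossing-number test: walk around the eight neighbours of a skeleton pixel and count the transitions from background to skeleton; three or more transitions indicate that at least three branches meet at the pixel. The function name crossing_points and the 0/1 list-of-lists image format are hypothetical.

```python
def crossing_points(skeleton):
    """Return (row, col) positions of skeleton pixels where >= 3 branches meet.

    skeleton: list of rows of 0/1 values, one-pixel-wide lines marked with 1.
    The crossing-number test is an assumed technique, not taken from the patent.
    """
    # The eight neighbours in clockwise order, starting from north.
    ring = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
    height = len(skeleton)
    width = len(skeleton[0]) if skeleton else 0

    def at(r, c):
        # Pixels outside the image count as background.
        return skeleton[r][c] if 0 <= r < height and 0 <= c < width else 0

    points = []
    for r in range(height):
        for c in range(width):
            if not skeleton[r][c]:
                continue
            vals = [at(r + dr, c + dc) for dr, dc in ring]
            # Count background-to-skeleton transitions around the ring.
            transitions = sum(
                1 for i in range(8) if vals[i] == 0 and vals[(i + 1) % 8] == 1
            )
            if transitions >= 3:
                points.append((r, c))
    return points
```

Counting transitions rather than simply counting neighbours avoids flagging the pixels right next to a crossing, which touch several branch pixels diagonally without being branch points themselves.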
Embodiment 1
Fig. 1 and Fig. 2 show the system framework of the invention for combining the strokes of printed characters. The stroke combination steps are as follows:
Step 1: to extract the key information from the input Chinese character and discard interfering information, the input picture is first preprocessed, as shown in Fig. 1. The picture is first binarized so that it becomes a binary image; because a binary image consists of pixels with the values 0 and 1, the structural information of the original picture is preserved. We then split the binary image: the image is scanned in the horizontal and vertical directions and the parts that have no connecting pixels are separated, which helps the subsequent work. Finally, we extract the skeleton graph of the split binary image. The skeleton graph is one pixel wide; it retains the key information while discarding interfering information, which is very important for the subsequent extraction of crossing points.
Step 2: crossing points are extracted from the skeleton graph obtained in the previous step using a general method, laying the foundation for the combination rule of the next step.
Step 3: for each crossing point, combine the strokes according to the combination rule. The crossing points obtained in the previous step are labelled p_i (i = 1, ..., N), where N is the number of crossing points, and the stroke segments cut off by a crossing point are denoted k_j (j = 1, ..., M), where M is the number of stroke segments produced by the split. For each crossing point p_i, we draw a circle of radius R pixels centred on the crossing point and denote the intersection of this circle with each stroke segment as f_j. If the total number of intersections equals M, the vector b_j, with the crossing point p_i as its origin and f_j as its end point, represents stroke segment k_j. We evaluate the angle between any two of the stroke segments separated by p_i; here the angle between two stroke segments is specified to be an acute angle. If the angle between b_a and b_b (a, b = 1, ..., M) is close to 180° (we regard an angle of 160° or more as close to 180°), we consider that b_a and b_b belong to the same stroke: the two segments are removed from the set of stroke segments separated at crossing point p_i, and it is recorded that they are connected. After all segment pairs with an angle close to 180° have been found, the angles between the remaining stroke segments separated by p_i are compared. If the angle between b_c and b_d (c and d ranging over the segments left after the pairs with an angle close to 180° have been removed) is close to 45°, these two segments are likewise removed from the set of stroke segments separated at p_i and recorded as connected. Any stroke segment separated by p_i that still remains is regarded as an independent stroke. The next crossing point is then processed, until all crossing points have been combined. The detailed process is shown in Fig. 5.
Step 4: output the final stroke combination according to the stroke combinations marked in the previous step.
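Step 4 can be sketched as a grouping problem: every "connected" record made at a crossing point links two stroke segments, and each final stroke is a chain of linked segments. The sketch below uses a union-find structure for this; that choice, the global integer segment identifiers, and the function name merge_strokes are assumptions for illustration, since the text only says that the marked combinations are output.

```python
def merge_strokes(num_segments, connections):
    """Group stroke segments into strokes.

    num_segments: total number of stroke segments in the character,
                  identified by the integers 0 .. num_segments - 1.
    connections:  (a, b) pairs recorded as connected at the crossing points.
    Returns a sorted list of strokes, each a sorted list of segment ids.
    """
    parent = list(range(num_segments))

    def find(x):
        # Follow parent links to the representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in connections:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = {}
    for segment in range(num_segments):
        groups.setdefault(find(segment), []).append(segment)
    return sorted(groups.values())
```

A segment that was never recorded as connected ends up in a group of its own, which matches the rule that leftover segments are independent strokes.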

Claims (3)

1. A novel Chinese character stroke combination method based on the angle between stroke segments, characterized by comprising the following steps:
Step 1: preprocess the input Chinese character picture,
1-a, first binarize the input Chinese character picture so that it becomes a binary image,
1-b, split the binary image: scan the image in the horizontal and vertical directions and separate the parts that have no connecting pixels,
1-c, extract the skeleton graph of the split binary image;
Step 2: extract the crossing points of the skeleton graph and mark each stroke segment cut apart by the crossing points;
Step 3: for each crossing point, combine the stroke segments connected at that crossing point according to the combination rule;
Step 4: according to the stroke segments obtained in step 3, combine the stroke segments for each complete character and output the final stroke segments.
2. The novel Chinese character stroke combination method based on the angle between stroke segments according to claim 1, characterized in that
the crossing points described in step 2 are labelled p_i (i = 1, ..., N), where N is the number of crossing points; the stroke segments cut off by a crossing point are denoted k_j (j = 1, ..., M), where M is the number of stroke segments produced by the split; for each crossing point p_i, a circle of radius R pixels is drawn with the crossing point as its centre, and the intersection of this circle with each stroke segment is denoted f_j; if the total number of intersections equals M, the vector b_j, with the crossing point p_i as its origin and f_j as its end point, represents stroke segment k_j.
3. The novel Chinese character stroke combination method based on the angle between stroke segments according to claim 1 or 2, characterized in that
the combination rule described in step 2 is specifically: the angle between any two of the stroke segments b_a and b_b separated by p_i is evaluated, and the angle between two stroke segments is specified to be an acute angle; if the angle between b_a and b_b lies between 160° and 180°, b_a and b_b belong to the same stroke, the two segments are removed from the set of stroke segments separated at crossing point p_i, and it is recorded that they are connected; after all segment pairs with an angle of 160°-180° have been found, the angles between the remaining stroke segments separated by p_i are compared; if the angle between b_c and b_d is close to 45°, these two segments are likewise removed from the set of stroke segments separated at p_i and recorded as connected; any stroke segment separated by p_i that still remains is regarded as an independent stroke; the next crossing point is then processed, until all crossing points have been combined.
CN201410400326.3A 2014-08-14 2014-08-14 Novel Chinese character stroke combination method based on angle between stroke segments Active CN104156725B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410400326.3A CN104156725B (en) 2014-08-14 2014-08-14 Novel Chinese character stroke combination method based on angle between stroke segments

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410400326.3A CN104156725B (en) 2014-08-14 2014-08-14 Novel Chinese character stroke combination method based on angle between stroke segments

Publications (2)

Publication Number Publication Date
CN104156725A true CN104156725A (en) 2014-11-19
CN104156725B CN104156725B (en) 2017-05-10

Family

ID=51882222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410400326.3A Active CN104156725B (en) 2014-08-14 2014-08-14 Novel Chinese character stroke combination method based on angle between stroke segments

Country Status (1)

Country Link
CN (1) CN104156725B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060072824A1 (en) * 2003-09-16 2006-04-06 Van Meurs Pim System and method for Chinese input using a joystick
CN102375994A (en) * 2010-08-10 2012-03-14 广东因豪信息科技有限公司 Method and device for detecting and reducing correctness of order of strokes of written Chinese character

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈睿等: ""基于笔画段分割和组合的汉字笔画提取模型"", 《计算机科学》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108804397A (en) * 2018-06-12 2018-11-13 华南理工大学 A method of the Chinese character style conversion based on a small amount of target font generates
CN108804397B (en) * 2018-06-12 2021-07-20 华南理工大学 Chinese character font conversion generation method based on small amount of target fonts
CN109410291A (en) * 2018-09-11 2019-03-01 北京语言大学 The treating method and apparatus of burr type pen section
CN109410291B (en) * 2018-09-11 2023-03-07 北京语言大学 Processing method and device for burr type pen segments
CN111523622A (en) * 2020-04-26 2020-08-11 重庆邮电大学 Method for simulating handwriting by mechanical arm based on characteristic image self-learning
CN111523622B (en) * 2020-04-26 2023-01-31 重庆邮电大学 Method for simulating handwriting by mechanical arm based on characteristic image self-learning
CN112016419A (en) * 2020-08-19 2020-12-01 浙江无极互联科技有限公司 Intelligent handwritten Chinese character planimetric algorithm
CN112434763A (en) * 2020-11-24 2021-03-02 伍曙光 Chinese character skeleton generating method based on computer
CN115841671A (en) * 2023-02-21 2023-03-24 南京信息工程大学 Calligraphy character skeleton correction method, system and storage medium

Also Published As

Publication number Publication date
CN104156725B (en) 2017-05-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant