CN104346390A - Method and device for forming word stock - Google Patents

Method and device for forming word stock

Info

Publication number
CN104346390A
Authority
CN
China
Prior art keywords
contour
character outline
character
curve
contour curve
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310332045.4A
Other languages
Chinese (zh)
Other versions
CN104346390B (en)
Inventor
王玉欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New Founder Holdings Development Co ltd
Original Assignee
Founder Information Industry Holdings Co Ltd
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Founder Information Industry Holdings Co Ltd, Peking University Founder Group Co Ltd, Beijing Founder Electronics Co Ltd
Priority to CN201310332045.4A
Publication of CN104346390A
Application granted
Publication of CN104346390B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Controls And Circuits For Display Device (AREA)

Abstract

The invention provides a method for forming a character library (word stock). The method comprises: 1) obtaining a character manuscript; 2) scanning the manuscript into a manuscript image; 3) digitally fitting the manuscript image to form a character outline; 4) automatically processing the character outline to remove redundant contour curves in the character outline and redundant points on the contour curves; 5) assembling the character library and testing it. The invention further provides a corresponding device for forming the character library. A personal character library formed with the method and device has a smaller data volume than existing personal character libraries.

Description

Method and device for forming a character library
Technical field
The present invention relates to the technical field of computer character library (font library) software development, and in particular to a method and device for forming a character library.
Background
At present, the production of a computer character library is roughly divided into the following stages: designing the character manuscript; scanning the designed manuscript into a computer; digitally fitting the scanned manuscript; manual glyph retouching; quality inspection; and integration into a library. Although digitally fitting the scanned manuscript by computer is very efficient, a computer cannot replace the human brain and can only complete the preliminary work. Whether in the quality of the characters (for a Chinese character library, a character is a letter, digit, symbol, or Chinese character used in a computer) or in their structure, a satisfactory character library can be formed only after subsequent manual retouching. Manual retouching is an enormous undertaking: a Simplified Chinese character library contains six to seven thousand Chinese characters, a Traditional Chinese character library contains about 14,000, and a GBK character library contains more than 20,000. After manual retouching, every character must also pass a strict quality inspection, carried out character by character and even point by point. A well-made character must not only have a smooth outline and a sound structure but also follow a complete set of technical specifications: for example, an extreme point must be added at the farthest end of each curve, and each stroke must be described with the fewest possible points, so as to minimize the stored information and improve the restoration rate.
As computer character libraries have developed, and with the diversification of the information age, "character library" products have attracted growing attention from individuals who love calligraphy and are interested in the use of Chinese characters. In developing new fonts, font manufacturers have also seen great demand for character libraries outside traditional publishing and distribution. At the same time, the revolution in information dissemination brought by the Internet era poses a new problem for the use of computer character libraries: media such as personal blogs and personalized publications keep emerging, and they require the "computer font", an important carrier of communication, to serve personalized expression and the display of individuality to a greater degree. A personal character library, distinct from the traditional computer character library, has therefore appeared. It is a computer character library product that converts an individual's handwriting into a true font. It is produced to meet the demand of individual calligraphers and their fans, and its birth signals that the "computer character library" is entering an era of personalization as a "consumer product".
However, a personal character library differs from a traditional computer character library: it is positioned mainly as a "consumer product" and is not subject to mandatory, uniform technical specifications. Manufacturers therefore use a different production flow for personal character libraries than for their own premium character libraries. To reduce production cost and shorten the production cycle, the existing production flow for personal character libraries omits the manual retouching and quality inspection stages of the traditional flow. The author first writes all required characters in the manuscript; the manuscript is then scanned into a computer, digitally fitted, and integrated directly into a character library. If a wrong character is found in this process, only that character is corrected; all other characters keep the raw data from digital fitting, with no manual retouching or quality inspection. As a result, the character outlines in the resulting personal character library are rough and each stored outline contains too many points, so the data volume of the library (that is, the size of the personal character library file) is large, more than twice that of an ordinary computer character library. Personal character libraries made from brush calligraphy are the largest and may reach 3 to 4 times that size. This is because brush calligraphy is generally written on rice paper: the edges in the scanned manuscript image have many burrs and are not smooth, so the outlines formed by digital fitting contain an especially large number of points. The root cause of the large data volume is therefore that each character outline contains too many points, spaced too closely, and is not smooth enough. How to effectively reduce the file size of a personal character library while ensuring the quality of its characters and without raising production cost is a problem that font manufacturers urgently need to solve when releasing personal character libraries as "consumer products".
Summary of the invention
The technical problem to be solved by the present invention is, in view of the above defects in the prior art, to provide a method and device for forming a character library such that the personal character library formed has a smaller data volume than existing personal character libraries.
The technical solution adopted to solve this problem is as follows.
The method for forming a character library comprises the following steps:
1) obtaining a character manuscript;
2) scanning the manuscript into a manuscript image;
3) digitally fitting the manuscript image to form a character outline;
4) automatically processing the character outline to remove redundant contour curves in the character outline and redundant points on the contour curves;
5) assembling the character library and testing it.
Preferably, in step 2), the manuscript image is a binary image.
Preferably, the following steps are performed after step 3):
3A. Set a density threshold and determine whether, in any region of the character outline formed in step 3), the density of the points on all contour curves exceeds the density threshold; if so, perform step 3B; if not, perform step 4).
3B. Construct an approximating function from the points on all contour curves in the region identified in step 3A to form a corresponding fitted curve, replace all contour curves in that region with the fitted curve, and return to step 3A.
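The loop formed by steps 3A and 3B can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text does not define how density is measured or which approximating function is used, so `point_density` (points per unit of region area), the rectangular `Region`, and the stand-in `keep_every_other` refit are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Region = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed rectangular


def point_density(points: List[Point], region: Region) -> float:
    """Assumed density measure: points inside the region divided by the region's area."""
    x0, y0, x1, y1 = region
    inside = sum(1 for x, y in points if x0 <= x <= x1 and y0 <= y <= y1)
    return inside / ((x1 - x0) * (y1 - y0))


def reduce_until_sparse(
    points: List[Point],
    region: Region,
    density_threshold: float,
    refit: Callable[[List[Point]], List[Point]],
    max_rounds: int = 20,
) -> List[Point]:
    """Repeat step 3B while step 3A finds the region denser than the threshold."""
    for _ in range(max_rounds):
        if point_density(points, region) <= density_threshold:
            break  # step 3A answered "no": go on to step 4
        points = refit(points)  # step 3B: replace the curves with the fitted curve
    return points


def keep_every_other(points: List[Point]) -> List[Point]:
    """Hypothetical stand-in for the approximating function: keep both ends, drop alternate points."""
    if len(points) <= 2:
        return points
    return points[:-1:2] + [points[-1]]


dense = [(float(i), 0.0) for i in range(9)]
print(reduce_until_sparse(dense, (0.0, -1.0, 8.0, 1.0), 0.25, keep_every_other))
```

The `max_rounds` guard is an addition of this sketch; it keeps the loop from running forever if a refit cannot bring the density below the threshold.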
Preferably, in step 4), removing redundant contour curves in the character outline specifically comprises:
removing the contour curves in the character outline that digital fitting produced from noise in the manuscript image;
and/or presetting a first area threshold and, if the area of a closed figure formed by multiple contour curves in the character outline is less than the first area threshold, removing the contour curves that form that closed figure.
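The first-area-threshold rule can be illustrated with a short sketch. It assumes, for simplicity, that each closed figure is approximated by the polygon through its endpoints (curved segments are treated as straight), and it computes the area with the shoelace formula; the function names and the threshold value are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: List[Point]) -> float:
    """Absolute area of a closed polygon, computed with the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the figure
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def drop_small_closed_figures(
    figures: List[List[Point]], first_area_threshold: float
) -> List[List[Point]]:
    """Keep only the closed figures whose area reaches the first area threshold."""
    return [f for f in figures if polygon_area(f) >= first_area_threshold]


stroke = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]  # area 100
speck = [(2.0, 2.0), (3.0, 2.0), (3.0, 3.0), (2.0, 3.0)]       # area 1
print(drop_small_closed_figures([stroke, speck], first_area_threshold=4.0))
```

In practice the threshold would be chosen from the stroke features of the typeface, as the description notes, so that small intentional counters are not removed along with specks.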
Preferably, in step 4), removing redundant points on the contour curves in the character outline specifically comprises:
presetting a radius-of-curvature threshold and, if the minimum radius of curvature of any contour curve in the character outline is greater than the threshold, removing the control points of that contour curve;
and/or, if multiple successively connected contour curves in the character outline all lie on the same straight line, removing the shared endpoint of every two connected contour curves among them;
and/or presetting a distance threshold and, if the distance between the two endpoints of any contour curve in the character outline is less than the distance threshold, removing either of the two endpoints of that contour curve;
and/or presetting a second area threshold and, if the area of a small protrusion formed by multiple successively connected contour curves in any region of the character outline is less than the second area threshold, removing the control points of each contour curve forming the protrusion and the shared endpoint of every two connected contour curves.
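As an illustration of the second rule in the list above (straight lines connected in succession on the same line), the sketch below merges such segments by dropping their shared endpoints. It assumes an open chain of straight segments given as an ordered endpoint list; the tolerance `tol` and the same-direction check are choices of this sketch, not details from the text.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def merge_collinear(endpoints: List[Point], tol: float = 1e-9) -> List[Point]:
    """Remove shared endpoints between successive straight segments that lie on one line."""
    if len(endpoints) < 3:
        return list(endpoints)
    kept = [endpoints[0]]
    for i in range(1, len(endpoints) - 1):
        prev, cur, nxt = kept[-1], endpoints[i], endpoints[i + 1]
        ux, uy = cur[0] - prev[0], cur[1] - prev[1]
        vx, vy = nxt[0] - cur[0], nxt[1] - cur[1]
        collinear = abs(ux * vy - uy * vx) <= tol   # cross product near zero
        same_direction = ux * vx + uy * vy > 0      # no fold-back at the shared endpoint
        if not (collinear and same_direction):
            kept.append(cur)  # a real corner: the shared endpoint stays
    kept.append(endpoints[-1])
    return kept


# Three segments on one line, then a corner: two shared endpoints disappear.
print(merge_collinear([(0, 0), (1, 0), (2, 0), (3, 0), (3, 2)]))
```

Scanned outlines are rarely exactly collinear, so a real implementation would probably use a looser, scale-dependent tolerance.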
The present invention also provides a device for forming a character library, comprising an acquiring unit, a scanning unit, a digital fitting unit, an automatic processing unit, and an integration testing unit.
The acquiring unit obtains the character manuscript and sends it to the scanning unit.
The scanning unit scans the manuscript into a manuscript image and sends the manuscript image to the digital fitting unit.
The digital fitting unit digitally fits the manuscript image to form a character outline and sends the character outline to the automatic processing unit.
The automatic processing unit automatically processes the character outline to remove redundant contour curves in the character outline and redundant points on the contour curves, and sends the processed character outline to the integration testing unit.
The integration testing unit assembles the processed character outlines into a character library and tests it.
Preferably, the manuscript image produced by the scanning unit is a binary image.
Preferably, the device further comprises a judging unit and an approximation fitting unit.
A density threshold is preset in the judging unit, which determines whether, in any region of the character outline formed by the digital fitting unit, the density of the points on all contour curves exceeds the density threshold.
If not, the judging unit determines that all contour curves in the region meet the requirement and sends a pass signal to the automatic processing unit, so that the automatic processing unit processes the conforming character outline.
If so, the judging unit determines that the contour curves in the region do not meet the requirement and sends a fail signal to the approximation fitting unit.
On receiving the fail signal, the approximation fitting unit constructs an approximating function from the points on all contour curves in the region to form a corresponding fitted curve and replaces all contour curves in the region with the fitted curve. The judging unit then determines again whether the point density in the region exceeds the density threshold, and this repeats until the judging unit determines that the fitted curve in the region meets the requirement.
Preferably, the automatic processing unit removes redundant contour curves in the character outline specifically by:
removing the contour curves, in the character outline formed by the digital fitting unit, that digital fitting produced from noise in the manuscript image;
and/or, with a first area threshold preset in the unit, removing the contour curves that form a closed figure if the area of that closed figure, formed by multiple contour curves in the character outline formed by the digital fitting unit, is less than the first area threshold.
Preferably, the automatic processing unit removes redundant points on the contour curves in the character outline specifically by:
with a radius-of-curvature threshold preset in the unit, removing the control points of any contour curve in the character outline formed by the digital fitting unit whose minimum radius of curvature is greater than the threshold;
and/or, if multiple successively connected contour curves in the character outline all lie on the same straight line, removing the shared endpoint of every two connected contour curves among them;
and/or, with a distance threshold preset in the unit, removing either of the two endpoints of any contour curve in the character outline formed by the digital fitting unit if the distance between those two endpoints is less than the distance threshold;
and/or, with a second area threshold preset in the unit, if the area of a small protrusion formed by multiple successively connected contour curves in any region of the character outline formed by the digital fitting unit is less than the second area threshold, removing the control points of each contour curve forming the protrusion and the shared endpoint of every two connected contour curves.
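The radius-of-curvature test in the first rule above can be sketched for a quadratic curve. The text does not name the curve type, so this example assumes a quadratic Bézier curve with endpoints `p0`, `p2` and one control point `p1`, and it estimates the minimum radius by sampling the parameter; the function names, sample count, and threshold are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def min_curvature_radius(p0: Point, p1: Point, p2: Point, samples: int = 50) -> float:
    """Smallest radius of curvature of a quadratic Bezier, estimated at sampled parameters."""
    ax, ay = p1[0] - p0[0], p1[1] - p0[1]
    bx, by = p2[0] - p1[0], p2[1] - p1[1]
    ddx, ddy = 2 * (bx - ax), 2 * (by - ay)  # second derivative (constant for a quadratic)
    best = math.inf                           # a perfectly straight curve has infinite radius
    for i in range(samples + 1):
        t = i / samples
        dx = 2 * ((1 - t) * ax + t * bx)      # first derivative at t
        dy = 2 * ((1 - t) * ay + t * by)
        cross = abs(dx * ddy - dy * ddx)
        if cross > 0:
            best = min(best, math.hypot(dx, dy) ** 3 / cross)  # radius = 1 / curvature
    return best


def flatten_if_nearly_straight(p0: Point, p1: Point, p2: Point, radius_threshold: float):
    """Drop the control point (leaving a straight line) when the curve is nearly straight."""
    if min_curvature_radius(p0, p1, p2) > radius_threshold:
        return (p0, p2)
    return (p0, p1, p2)


print(flatten_if_nearly_straight((0, 0), (50, 1), (100, 0), radius_threshold=1000))
print(flatten_if_nearly_straight((0, 0), (50, 50), (100, 0), radius_threshold=1000))
```

A cubic curve would need its own derivative formulas; the comparison against the threshold would stay the same.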
Beneficial effects:
In a character library formed by the method and device of the present invention, the character outlines are smoother and contain fewer contour curves (that is, fewer points), so the data volume of the character library is smaller. In other words, compared with existing personal character libraries, a personal character library formed by the present invention preserves character quality without raising production cost and yields a smaller font file, which effectively solves the problem that existing personal character library files are too large.
Brief description of the drawings
Fig. 1 is a flowchart of the method for forming a personal character library in Embodiment 1 of the present invention;
Fig. 2 is the manuscript image of the Chinese character " ";
Fig. 3 is the character outline formed by digitally fitting the manuscript image shown in Fig. 2;
Fig. 4 is the character outline obtained by automatically processing the character outline shown in Fig. 3;
Fig. 5 is the character outline obtained by manually retouching the character outline shown in Fig. 4;
Fig. 6 is a flowchart of the method for forming a personal character library in Embodiment 2 of the present invention;
Fig. 7 is a structural diagram of the device for forming a personal character library in Embodiment 3 of the present invention;
Fig. 8 is a structural diagram of the device for forming a personal character library in Embodiment 4 of the present invention.
Detailed description of the embodiments
To help those skilled in the art better understand the technical solution of the present invention, the method and device for forming a character library are described in further detail below with reference to the drawings and embodiments.
Note that a character outline consists of multiple contour curves, each of which is a quadratic curve, a cubic curve, or a straight line. In the present invention, a "point on a contour curve" does not mean that the point lies on the curve; it means a "point corresponding to the contour curve". This is because a contour curve is recorded (stored) as points, which comprise endpoints and control points. The endpoints lie at the head and tail of each contour curve and are its start point and end point. The control points determine the shape of each contour curve and in most cases do not lie on it. A quadratic contour curve corresponds to one control point and two endpoints, which together record the curve; a cubic contour curve corresponds to two control points and two endpoints; a straight contour curve corresponds to two endpoints only. Two connected contour curves correspond to three endpoints, one of which is the endpoint shared by both curves. Here, two contour curves are "connected" when the start or end point of one coincides with the start or end point of the other and serves as their shared endpoint; the term does not cover two contour curves that merely intersect. In the present invention, "multiple" means two or more (for example, multiple contour curves means two or more contour curves), and "straight line" means a line segment.
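The point bookkeeping described above can be made concrete with a small data structure. This is only an illustrative sketch: the class name `ContourCurve`, its fields, and `stored_point_count` are hypothetical and do not come from the patent or from any font format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class ContourCurve:
    """One contour curve, recorded as two endpoints plus 0, 1, or 2 control points."""
    start: Point
    end: Point
    controls: List[Point] = field(default_factory=list)

    @property
    def kind(self) -> str:
        # 0 control points: straight line; 1: quadratic curve; 2: cubic curve
        return {0: "line", 1: "quadratic", 2: "cubic"}[len(self.controls)]


def stored_point_count(chain: List[ContourCurve]) -> int:
    """Points needed to record an open chain of connected curves; shared endpoints count once."""
    if not chain:
        return 0
    # One start point for the chain, then one end point plus the control points per curve.
    return 1 + sum(1 + len(curve.controls) for curve in chain)


chain = [
    ContourCurve((0, 0), (10, 0)),               # straight line
    ContourCurve((10, 0), (20, 10), [(20, 0)]),  # quadratic curve sharing the endpoint (10, 0)
]
print([c.kind for c in chain], stored_point_count(chain))
```

For the two connected curves in the example, the count is three endpoints plus one control point, which matches the statement that two connected contour curves correspond to three endpoints.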
Embodiment 1:
As shown in Fig. 1, this embodiment provides a method for forming a personal character library, comprising the following steps:
S101. Obtain the character manuscript.
All characters required for the character library must be written in the manuscript in advance. For example, a Simplified Chinese character library that follows GB2312-1980 requires 6,763 Chinese characters to be written. The layout of the manuscript and the order in which the characters are written on it are set in advance by the designer.
S102. Scan the manuscript into a manuscript image.
Because the manuscript is written on paper or another carrier, it must be scanned into a computer.
Preferably, the manuscript image is a binary image, that is, a digital image in which each pixel has only two possible values. Black-and-white (monochrome) images are the usual example, and their advantage is that they take up little storage space. The resolution and other parameters of the manuscript image should be adjusted flexibly according to the size and clarity of the manuscript, so that the manuscript image reproduces the manuscript as a whole without distortion. Fig. 2 shows the manuscript image obtained by scanning the brush-calligraphy manuscript of the character " ".
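What "only two possible values per pixel" means can be shown with a tiny sketch that thresholds a grayscale scan. The patent does not describe how the binary image is produced, so the fixed threshold of 128 and the convention that 1 means ink are assumptions of this example.

```python
from typing import List


def binarize(gray_rows: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map each grayscale pixel (0-255) to 1 (ink) if darker than the threshold, else 0 (paper)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray_rows]


scan = [
    [250, 240, 30],
    [245, 20, 10],
]
print(binarize(scan))
```

A real scan would more likely use an adaptive threshold chosen from the image histogram, particularly for brush strokes on absorbent paper.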
S103. Digitally fit the manuscript image to form a character outline.
FontLab ScanFont or other digital fitting software automatically fits the scanned manuscript image (a binary image) to a character outline that is as close as possible to the manuscript. Fig. 3 shows the character outline formed by digitally fitting the manuscript image of Fig. 2. As Fig. 3 shows, the character outline contains an especially large number of contour curves and points, is not smooth, and has many small protrusions.
S104. According to the features of the character outline formed, automatically process the character outline to remove redundant contour curves in the character outline and redundant points on the contour curves. Fig. 4 shows the character outline of Fig. 3 after automatic processing. Compared with Fig. 3, the outline in Fig. 4 has far fewer contour curves and points and is considerably smoother.
Preferably, removing redundant contour curves in the character outline specifically comprises:
removing the contour curves in the character outline that digital fitting produced from noise in the manuscript image; the noise arises when the manuscript is scanned into the manuscript image, that is, extraneous pixels that should not be present appear in the manuscript image, usually as a result of electronic interference;
and/or presetting a first area threshold and, if the area of a closed figure formed by multiple contour curves in the character outline is less than the first area threshold, removing the contour curves that form that closed figure. When a manuscript is written with a brush, blank spots may appear inside strokes, such as the blank spot in the upper right of the left-hand "mouth" radical of the character " " in Fig. 2. After the digital fitting of step S103, such a blank spot becomes a closed figure formed by multiple contour curves, and the area of such a figure is small, as shown in Fig. 3. By presetting a first area threshold, the contour curves forming closed figures whose area is below the threshold are removed; Fig. 4 shows the character outline after the closed figures meeting this condition have been removed. The first area threshold can be set by those skilled in the art according to the stroke features of the character outline.
Preferably, removing redundant points on the contour curves in the character outline specifically comprises:
presetting a radius-of-curvature threshold and, if the minimum radius of curvature of any contour curve in the character outline is greater than the threshold, removing the control points of that contour curve, so that near-straight quadratic and cubic curves in the character outline are converted to straight lines. A near-straight quadratic or cubic curve is one whose minimum radius of curvature is greater than the threshold. Because such curves have small curvature (a large radius of curvature) and almost no bend, converting them to straight lines does not affect the appearance of the character outline as a whole, while it reduces the number of control points and hence the number of points in the character outline. The radius-of-curvature threshold can be set by those skilled in the art according to the stroke features of the character outline;
and/or, if multiple successively connected contour curves in the character outline all lie on the same straight line, removing the shared endpoint of every two connected contour curves among them. That is, in the character outline formed by digitally fitting the manuscript image, several straight lines may be connected in succession with no corner between them, and removing their shared endpoints merges them into one straight line. For example, three straight lines connected in succession without a corner have four endpoints in total, two of which are shared; after these two shared endpoints are removed, the three lines become one. This reduces both the number of contour curves (here, straight lines) and the number of points in the character outline;
and/or presetting a distance threshold and, if the distance between the two endpoints of any contour curve in the character outline is less than the distance threshold, removing either of the two endpoints, so that the contour curve merges with the contour curve connected to it at that endpoint to form a single contour curve; the removed endpoint is the endpoint shared by the two curves. Note the following cases. If the two curves are a quadratic (or cubic) curve and a straight line, then after the shared endpoint is removed, the control points of the quadratic (or cubic) curve become the control points of the merged curve. If both are quadratic curves, only one of their two control points remains after the shared endpoint is removed; which one remains may depend on the writing order of the character or may be preset by the designer. If both are cubic curves, only two of their control points remain; which two remain may likewise depend on the writing order of the character or be preset by the designer. If one is a quadratic curve and the other a cubic curve, the merged curve may have one or two control points; how many and which control points remain may depend on the writing order of the character or be preset by the designer. The distance threshold can be set by those skilled in the art according to the stroke features of the character outline;
and/or presetting a second area threshold and, if the area of a small protrusion formed by multiple successively connected contour curves in any region of the character outline is less than the second area threshold, removing the control points of each contour curve forming the protrusion and the shared endpoint of every two connected contour curves, so as to eliminate the protrusion. For example, the left half of the left-hand "mouth" radical of the character " " in Fig. 2 has many burrs; after the digital fitting of step S103, each burr becomes a small protrusion formed by multiple contour curves, as shown in Fig. 3. By presetting a second area threshold, the control points and shared endpoints of the contour curves forming protrusions whose area is below the threshold are removed; Fig. 4 shows the character outline after the protrusions meeting this condition have been removed. The second area threshold can be set by those skilled in the art according to the stroke features of the character outline.
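The second-area-threshold rule for small protrusions can be sketched for the simplest case. The text does not say how a protrusion is located or how its area is measured, so this example assumes a protrusion made of exactly two straight segments and takes its area to be the triangle spanned by the shared endpoint and its two neighbours; `remove_small_bumps` and the threshold value are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def triangle_area(a: Point, b: Point, c: Point) -> float:
    """Area of triangle abc (half the absolute cross product)."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])) / 2.0


def remove_small_bumps(endpoints: List[Point], second_area_threshold: float) -> List[Point]:
    """Drop each interior endpoint whose bump triangle is smaller than the threshold."""
    if len(endpoints) < 3:
        return list(endpoints)
    kept = [endpoints[0]]
    for i in range(1, len(endpoints) - 1):
        # The bump is the triangle between the last kept point, this point, and the next one.
        if triangle_area(kept[-1], endpoints[i], endpoints[i + 1]) >= second_area_threshold:
            kept.append(endpoints[i])
    kept.append(endpoints[-1])
    return kept


# A stroke edge with a one-unit-high burr between x = 10 and x = 12.
edge = [(0, 0), (10, 0), (11, 1), (12, 0), (20, 0), (20, 10)]
print(remove_small_bumps(edge, second_area_threshold=2.0))
```

Because a collinear point spans a triangle of zero area, this sketch also removes shared endpoints between segments on one line. Protrusions built from more segments, or from curved segments with control points, would need a polygon-area test over the whole protrusion.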
S105. Assemble the character library and test it. This step is prior art and is not described further.
It can be seen that, compared with the prior art, the method of this embodiment adds only step S104, yet it greatly reduces the number of contour curves and the number of points in each character outline, makes the character outlines smoother, and eliminates the small protrusions caused by burrs in the manuscript. The resulting personal character library therefore has a smaller data volume than one formed by the prior art, which solves the problem that existing personal character library files are too large. In actual tests, the file size of a personal character library formed by the method of this embodiment is only about two thirds that of an existing personal character library.
The method of this embodiment can also be applied to the production of conventional computer character libraries by adding manual retouching and quality inspection steps between steps S104 and S105. In the existing production method, the digitally fitted outline is retouched by hand directly; for example, the character outline of Fig. 3 is modified by hand into the form of Fig. 4, which is a very heavy workload. When the method of this embodiment is applied, the added automatic processing step means that the outline of Fig. 3 no longer has to be retouched by hand; only the automatically processed outline of Fig. 4 needs manual retouching. Because the outline of Fig. 4 has far fewer contour curves and points than that of Fig. 3, the method of this embodiment greatly reduces the workload of manual retouching and shortens the production cycle of a computer character library.
Embodiment 2:
In this embodiment, the method for forming a personal character library comprises the following steps:
S201-s203 are identical to s101-s103 in Embodiment 1 and are not described again.
S204. Set a density threshold and judge whether the density of the points on all contour curves in any region of the character outline formed in step s203 is greater than the density threshold; if so, perform step s205; if not, perform step s206.
S205. Construct an approximating function from the points on all contour curves in the region described in step s204 to form a corresponding fitted curve, substitute the fitted curve for all contour curves in the region, and then return to step s204.
S206-s207 are identical to s104-s105 in Embodiment 1 and are not described again.
It can be seen that, if the points in some region of the character outline formed by digital fitting are found to be relatively dense, an approximating function is constructed from the points on all contour curves in that region to form a corresponding fitted curve. In other words, a curve is used to approximate the points in the region, and the contour curves in the region are re-fitted in this way, thereby reducing the number of points in the region. The error of the curve approximation may be set according to the actual condition of the character outline in the region; the density threshold and the position and area of the region may be set by those skilled in the art according to the stroke characteristics of the character outline itself.
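Steps s204 and s205 can be illustrated with a small sketch. The assumptions are substantial: the region is an axis-aligned box `(x0, y0, x1, y1)`, the points inside it are contiguous in the outline, and the approximating function is a least-squares straight line, which is simply the easiest choice to show and not something the patent specifies. All function names are hypothetical.

```python
def points_in_region(points, region):
    """Indices of the outline points that fall inside the box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = region
    return [i for i, (x, y) in enumerate(points) if x0 <= x <= x1 and y0 <= y <= y1]


def fit_line(points):
    """Least-squares line y = slope * x + intercept, or None for a vertical run."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def simplify_region(points, region, density_threshold):
    """s204: compare point density with the threshold; s205: replace the
    points in the region with the two ends of the fitted line."""
    x0, y0, x1, y1 = region
    idx = points_in_region(points, region)
    density = len(idx) / ((x1 - x0) * (y1 - y0))
    if density <= density_threshold or len(idx) < 3:
        return list(points)
    inside = [points[i] for i in idx]
    line = fit_line(inside)
    if line is None:
        return list(points)
    slope, intercept = line
    fitted = [(x, slope * x + intercept) for x in (inside[0][0], inside[-1][0])]
    return points[:idx[0]] + fitted + points[idx[-1] + 1:]


outline = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9), (10, 21)]
print(simplify_region(outline, (0.5, 2, 4.5, 10), density_threshold=0.1))
```

A real implementation would presumably fit a curve of higher order and bound the approximation error, as the text notes that the error can be set per region.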
The other methods and effects of this embodiment are the same as in Embodiment 1 and are not repeated here.
Embodiment 3:
As shown in Fig. 7, this embodiment provides a device for forming a personal character library, comprising: an acquiring unit, a scanning unit, a digital fitting unit, an automatic processing unit and an integration testing unit.
The acquiring unit is configured to obtain a character manuscript and send the obtained manuscript to the scanning unit.
The scanning unit is configured to scan the character manuscript into a manuscript image and send the manuscript image to the digital fitting unit. Preferably, the manuscript image produced by the scanning unit is a binary image.
The digital fitting unit is configured to perform digital fitting on the manuscript image to form a character outline, and to send the character outline to the automatic processing unit.
The automatic processing unit is configured to automatically process the character outline so as to remove redundant contour curves from the character outline and redundant points from the contour curves, and to send the processed character outline to the integration testing unit.
The integration testing unit is configured to assemble the processed character outlines into a character library and test it.
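The way the five units hand data to one another can be pictured as a simple pipeline. The sketch below is purely illustrative: the class name `CharacterLibraryDevice`, its field names, and the stub units are all hypothetical, and each stub only records that the data passed through it.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class CharacterLibraryDevice:
    """Hypothetical wiring of the five units; each field stands for one unit."""
    acquire: Callable[[], Any]
    scan: Callable[[Any], Any]
    fit_outline: Callable[[Any], Any]
    auto_process: Callable[[Any], Any]
    assemble_and_test: Callable[[Any], Any]

    def run(self) -> Any:
        manuscript = self.acquire()              # acquiring unit
        image = self.scan(manuscript)            # scanning unit
        outline = self.fit_outline(image)        # digital fitting unit
        cleaned = self.auto_process(outline)     # automatic processing unit
        return self.assemble_and_test(cleaned)   # integration testing unit


# Stub units that only append the name of what they would produce.
device = CharacterLibraryDevice(
    acquire=lambda: ["manuscript"],
    scan=lambda data: data + ["image"],
    fit_outline=lambda data: data + ["outline"],
    auto_process=lambda data: data + ["cleaned outline"],
    assemble_and_test=lambda data: data + ["character library"],
)
print(device.run())
```

Keeping each unit as a separate callable mirrors the text, where every unit has a single input and forwards its result to the next unit.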
Preferably, the automatic processing unit removes redundant contour curves from the character outline specifically as follows:
It removes those contour curves in the character outline formed by the digital fitting unit that were produced, after digital fitting, by noise in the manuscript image;
And/or, a first area threshold is preset in the unit: if the area of a closed figure formed by several contour curves in the character outline formed by the digital fitting unit is less than the first area threshold, the contour curves forming that closed figure are removed. The first area threshold may be set by those skilled in the art according to the stroke characteristics of the character outline itself.
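The first-area-threshold rule amounts to filtering closed figures by area. A minimal sketch follows, assuming each closed figure is approximated by a polygon given as a list of `(x, y)` vertices; the function names are hypothetical.

```python
def polygon_area(vertices):
    """Absolute area of a closed polygon (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def drop_small_closed_figures(figures, first_area_threshold):
    """Keep only the closed figures whose area reaches the threshold."""
    return [f for f in figures if polygon_area(f) >= first_area_threshold]


stroke = [(0, 0), (10, 0), (10, 10), (0, 10)]   # area 100, a real stroke
speck = [(20, 20), (21, 20), (20, 21)]          # area 0.5, likely scan noise
print(drop_small_closed_figures([stroke, speck], first_area_threshold=1.0))
```

The threshold value used here is arbitrary; the text leaves it to be chosen from the stroke characteristics of the outline.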
Preferably, the automatic processing unit removes redundant points from the contour curves in the character outline specifically as follows:
A curvature radius threshold is preset in the unit: if the minimum radius of curvature of any contour curve in the character outline formed by the digital fitting unit is greater than the curvature radius threshold, the control point on that contour curve is removed. The curvature radius threshold may be set by those skilled in the art according to the stroke characteristics of the character outline itself;
And/or, if several successively connected contour curves in the character outline all lie on the same straight line, the shared endpoint of every two connected contour curves among them is removed;
And/or, a distance threshold is preset in the unit: if the distance between the two endpoints of any contour curve in the character outline formed by the digital fitting unit is less than the distance threshold, either one of the two endpoints of that contour curve is removed. The distance threshold may be set by those skilled in the art according to the stroke characteristics of the character outline itself;
And/or, a second area threshold is preset in the unit: if, in any region of the character outline formed by the digital fitting unit, the area of a small bump formed by several successively connected contour curves is less than the second area threshold, the control points on each contour curve forming the bump and the shared endpoint of every two connected contour curves are removed. The second area threshold may be set by those skilled in the art according to the stroke characteristics of the character outline itself.
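The first three point-removal rules can be sketched as below. The assumptions are mine: a contour curve is modelled as a quadratic Bezier with one control point, the minimum radius of curvature is estimated by sampling, and the collinearity and distance rules are shown on a plain list of on-curve points. Function names are hypothetical.

```python
import math


def min_curvature_radius(p0, p1, p2, samples=32):
    """Smallest sampled radius of curvature of the quadratic Bezier p0, p1, p2.
    Returns math.inf when the curve is a straight line."""
    ax, ay = p1[0] - p0[0], p1[1] - p0[1]
    bx, by = p2[0] - p1[0], p2[1] - p1[1]
    ddx, ddy = 2 * (bx - ax), 2 * (by - ay)           # second derivative (constant)
    smallest = math.inf
    for i in range(samples + 1):
        t = i / samples
        dx = 2 * ((1 - t) * ax + t * bx)              # first derivative at t
        dy = 2 * ((1 - t) * ay + t * by)
        cross = abs(dx * ddy - dy * ddx)
        if cross > 0:
            smallest = min(smallest, math.hypot(dx, dy) ** 3 / cross)
    return smallest


def drop_collinear_joints(points, eps=1e-9):
    """Remove interior points lying on the line through their kept neighbours."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for current, nxt in zip(points[1:], points[2:]):
        ax, ay = kept[-1]
        cross = (current[0] - ax) * (nxt[1] - ay) - (current[1] - ay) * (nxt[0] - ax)
        if abs(cross) > eps:
            kept.append(current)
    kept.append(points[-1])
    return kept


def drop_close_endpoints(points, distance_threshold):
    """Remove a point when it is closer than the threshold to the previous kept one."""
    kept = list(points[:1])
    for p in points[1:]:
        if math.dist(kept[-1], p) >= distance_threshold:
            kept.append(p)
    return kept


print(min_curvature_radius((0, 0), (1, 1), (2, 0)))
print(drop_collinear_joints([(0, 0), (1, 0), (2, 0), (2, 1), (2, 3)]))
print(drop_close_endpoints([(0, 0), (0.1, 0), (5, 0), (5, 0.05), (9, 9)], 0.5))
```

Under the first rule, a curve whose `min_curvature_radius` exceeds the preset threshold is nearly straight, so its control point could be dropped and the curve stored as a line segment.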
Embodiment 4:
This embodiment differs from Embodiment 3 as follows:
In this embodiment, the device for forming a personal character library further comprises a judging unit and an approximation fitting unit;
A density threshold is preset in the judging unit, which is configured to judge whether the density of the points on all contour curves in any region of the character outline formed by the digital fitting unit is greater than the density threshold;
If not, it determines that all contour curves in the region meet the requirements and sends a qualified signal to the automatic processing unit, so that the automatic processing unit automatically processes the qualifying character outline;
If so, it determines that the contour curves in the region do not meet the requirements and sends an unqualified signal to the approximation fitting unit;
The approximation fitting unit is configured, upon receiving the unqualified signal, to construct an approximating function from the points on all contour curves in the region so as to form a corresponding fitted curve, and to substitute the fitted curve for all contour curves in the region;
The judging unit then again judges whether the density of points in the region is greater than the density threshold. If so, the approximation fitting unit fits the points in the region again, that is, it re-constructs the approximating function to form a new fitted curve and substitutes the new fitted curve for the previously formed one. This is repeated until the judging unit determines that the fitted curve in the region meets the requirements (that is, the density of points in the region is less than the density threshold), thereby reducing the number of points in the region.
The density threshold and the position and area of the region may be set by those skilled in the art according to the stroke characteristics of the character outline itself.
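The interplay between the judging unit and the approximation fitting unit is a loop that repeats until the region is sparse enough. The sketch below uses the Ramer-Douglas-Peucker simplification as a stand-in for the unspecified approximating function, and doubles the tolerance on each pass so that the loop makes progress. Both choices are assumptions for illustration, as are the function names and the `max_rounds` guard.

```python
import math


def _distance_to_chord(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length


def approximate(points, tolerance):
    """Ramer-Douglas-Peucker simplification (stand-in for the fitting step)."""
    if len(points) < 3:
        return list(points)
    distances = [_distance_to_chord(p, points[0], points[-1]) for p in points[1:-1]]
    worst = max(range(len(distances)), key=distances.__getitem__)
    if distances[worst] <= tolerance:
        return [points[0], points[-1]]
    split = worst + 1
    left = approximate(points[:split + 1], tolerance)
    right = approximate(points[split:], tolerance)
    return left[:-1] + right


def reduce_until_sparse(points, region_area, density_threshold,
                        tolerance=0.5, max_rounds=20):
    """Alternate judging and re-fitting until the density is acceptable."""
    current = list(points)
    for _ in range(max_rounds):
        if len(current) / region_area <= density_threshold:   # judging unit: qualified
            break
        current = approximate(current, tolerance)             # approximation fitting unit
        tolerance *= 2                                        # assumed: loosen the error each pass
    return current


noisy = [(0, 0), (1, 0.1), (2, -0.1), (3, 0.1), (4, 0)]
print(reduce_until_sparse(noisy, region_area=10, density_threshold=0.3))
```

The `max_rounds` limit is a safeguard added here because a threshold that can never be met would otherwise loop forever; the text itself does not discuss termination.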
The other methods and effects of this embodiment are the same as in Embodiment 3 and are not repeated here.
It should be understood that the above embodiments are merely exemplary embodiments adopted to illustrate the principle of the present invention, and the present invention is not limited thereto. Those of ordinary skill in the art may make various modifications and improvements without departing from the spirit and substance of the present invention, and such modifications and improvements are also regarded as falling within the protection scope of the present invention.

Claims (10)

1. A method for forming a character library, characterized by comprising the following steps:
1) obtaining a character manuscript;
2) scanning the character manuscript into a manuscript image;
3) performing digital fitting on the manuscript image to form a character outline;
4) automatically processing the character outline to remove redundant contour curves from the character outline and redundant points from the contour curves;
5) assembling the character library and testing it.
2. The method according to claim 1, characterized in that, in step 2), the manuscript image is a binary image.
3. The method according to claim 1, characterized in that
the following steps are further included after step 3):
3A. setting a density threshold and judging whether the density of the points on all contour curves in any region of the character outline formed in step 3) is greater than the density threshold; if so, performing step 3B; if not, performing step 4);
3B. constructing an approximating function from the points on all contour curves in the region described in step 3A to form a corresponding fitted curve, substituting the fitted curve for all contour curves in the region, and then returning to step 3A.
4. The method according to claim 1, characterized in that, in step 4), removing redundant contour curves from the character outline is specifically:
removing the contour curves in the character outline that were produced, after digital fitting, by noise in the manuscript image;
and/or, presetting a first area threshold, and, if the area of a closed figure formed by several contour curves in the character outline is less than the first area threshold, removing the contour curves forming that closed figure.
5. The method according to claim 1, characterized in that, in step 4), removing redundant points from the contour curves in the character outline is specifically:
presetting a curvature radius threshold, and, if the minimum radius of curvature of any contour curve in the character outline is greater than the curvature radius threshold, removing the control point on that contour curve;
and/or, if several successively connected contour curves in the character outline all lie on the same straight line, removing the shared endpoint of every two connected contour curves among them;
and/or, presetting a distance threshold, and, if the distance between the two endpoints of any contour curve in the character outline is less than the distance threshold, removing either one of the two endpoints of that contour curve;
and/or, presetting a second area threshold, and, if, in any region of the character outline, the area of a small bump formed by several successively connected contour curves is less than the second area threshold, removing the control points on each contour curve forming the bump and the shared endpoint of every two connected contour curves.
6. A device for forming a character library, characterized by comprising: an acquiring unit, a scanning unit, a digital fitting unit, an automatic processing unit and an integration testing unit;
the acquiring unit is configured to obtain a character manuscript and send the obtained manuscript to the scanning unit;
the scanning unit is configured to scan the character manuscript into a manuscript image and send the manuscript image to the digital fitting unit;
the digital fitting unit is configured to perform digital fitting on the manuscript image to form a character outline, and to send the character outline to the automatic processing unit;
the automatic processing unit is configured to automatically process the character outline so as to remove redundant contour curves from the character outline and redundant points from the contour curves, and to send the processed character outline to the integration testing unit;
the integration testing unit is configured to assemble the processed character outlines into a character library and test it.
7. The device according to claim 5, characterized in that the manuscript image produced by the scanning unit is a binary image.
8. The device according to claim 5, characterized in that
it further comprises a judging unit and an approximation fitting unit;
a density threshold is preset in the judging unit, which is configured to judge whether the density of the points on all contour curves in any region of the character outline formed by the digital fitting unit is greater than the density threshold;
if not, it determines that all contour curves in the region meet the requirements and sends a qualified signal to the automatic processing unit, so that the automatic processing unit automatically processes the qualifying character outline;
if so, it determines that the contour curves in the region do not meet the requirements and sends an unqualified signal to the approximation fitting unit;
the approximation fitting unit is configured, upon receiving the unqualified signal, to construct an approximating function from the points on all contour curves in the region so as to form a corresponding fitted curve, and to substitute the fitted curve for all contour curves in the region, after which the judging unit again judges whether the density of points in the region is greater than the density threshold, until the judging unit determines that the fitted curve in the region meets the requirements.
9. The device according to claim 5, characterized in that the automatic processing unit removes redundant contour curves from the character outline specifically as follows:
removing the contour curves in the character outline formed by the digital fitting unit that were produced, after digital fitting, by noise in the manuscript image;
and/or, a first area threshold is preset in the unit, and, if the area of a closed figure formed by several contour curves in the character outline formed by the digital fitting unit is less than the first area threshold, the contour curves forming that closed figure are removed.
10. The device according to claim 5, characterized in that the automatic processing unit removes redundant points from the contour curves in the character outline specifically as follows:
a curvature radius threshold is preset in the unit, and, if the minimum radius of curvature of any contour curve in the character outline formed by the digital fitting unit is greater than the curvature radius threshold, the control point on that contour curve is removed;
and/or, if several successively connected contour curves in the character outline all lie on the same straight line, the shared endpoint of every two connected contour curves among them is removed;
and/or, a distance threshold is preset in the unit, and, if the distance between the two endpoints of any contour curve in the character outline formed by the digital fitting unit is less than the distance threshold, either one of the two endpoints of that contour curve is removed;
and/or, a second area threshold is preset in the unit, and, if, in any region of the character outline formed by the digital fitting unit, the area of a small bump formed by several successively connected contour curves is less than the second area threshold, the control points on each contour curve forming the bump and the shared endpoint of every two connected contour curves are removed.
CN201310332045.4A 2013-08-01 2013-08-01 A kind of method and device for forming character library Active CN104346390B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310332045.4A CN104346390B (en) 2013-08-01 2013-08-01 A kind of method and device for forming character library

Publications (2)

Publication Number Publication Date
CN104346390A true CN104346390A (en) 2015-02-11
CN104346390B CN104346390B (en) 2018-01-23

Family

ID=52502004

Country Status (1)

Country Link
CN (1) CN104346390B (en)

Also Published As

Publication number Publication date
CN104346390B (en) 2018-01-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 100871, Beijing, Haidian District Cheng Fu Road 298, founder building, 9 floor

Patentee after: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee after: PKU FOUNDER INFORMATION INDUSTRY GROUP CO.,LTD.

Patentee after: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

Address before: 100871, Beijing, Haidian District, Cheng Fu Road, No. 298, Zhongguancun Fangzheng building, 5 floor

Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before: FOUNDER INFORMATION INDUSTRY HOLDINGS Co.,Ltd.

Patentee before: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20220919

Address after: 3007, Hengqin international financial center building, No. 58, Huajin street, Hengqin new area, Zhuhai, Guangdong 519031

Patentee after: New founder holdings development Co.,Ltd.

Patentee after: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

Address before: 100871, Beijing, Haidian District Cheng Fu Road 298, founder building, 9 floor

Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before: PKU FOUNDER INFORMATION INDUSTRY GROUP CO.,LTD.

Patentee before: BEIJING FOUNDER ELECTRONICS Co.,Ltd.