Drawing text reading method and system based on cluster analysis
Technical field
The present invention relates to the field of text typesetting, and in particular to a drawing text reading method and system based on cluster analysis.
Background art
As international cooperation deepens, both Chinese and foreign companies need to exchange drawing files with their counterparts when establishing and developing international engineering and scientific research projects. Because of repeated revisions, manual typesetting and similar causes, text that should form one complete sentence is often split into several text boxes that are manually placed next to each other on the drawing. Moreover, because of the way a drawing file is stored (text boxes are appended to the file in the order in which they were written), text boxes that look adjacent on the drawing are not necessarily adjacent in the saved content, and some may be far apart. As a result, when the drawing text is extracted, continuous content is parsed into widely separated places, which makes extracting and translating the drawing text very difficult.
Summary of the invention
The technical problem to be solved by the present invention is that, when drawing text is extracted, continuous content is parsed into widely separated places, which makes it inconvenient to extract and translate the drawing text. The object of the invention is to provide a drawing text reading method and system based on cluster analysis that solves this problem.
The present invention is achieved through the following technical solutions:
A drawing text reading method based on cluster analysis comprises the following steps: S1: classifying the text boxes on a drawing by angle; S2: extracting the coordinate characteristic values of the text boxes of the same angle type; S3: performing cluster analysis on the text boxes of the same angle type so that text boxes with similar coordinate characteristic values are gathered into the same class, and sorting the text boxes according to the clustering result; S4: outputting the text of the sorted text boxes according to the angle type of the text boxes.
Because of repeated revisions, manual typesetting and similar causes, text that should form one complete sentence is often split into several text boxes that are manually placed next to each other on the drawing. Because text boxes are appended to the drawing file in the order in which they were written, text boxes that look adjacent on the drawing are not necessarily adjacent in the saved content, and some may be far apart. When the drawing text is extracted, continuous content is therefore parsed into widely separated places, which makes extracting and translating the drawing text very difficult.
When the invention is applied, the text boxes on the drawing are first classified by angle into several angle types, such as the 0°, 90°, 180° and 270° types that commonly occur in CAD drawings. The coordinate characteristic values of the text boxes of the same angle type are then extracted; a coordinate characteristic value is a coordinate that uniquely identifies the position of a text box. Next, cluster analysis is performed on the text boxes of the same angle type so that text boxes with similar coordinate characteristic values are gathered into the same class, and the text boxes are sorted according to the clustering result. Finally, the text of the sorted text boxes is output according to the angle type of the text boxes. Taking a CAD drawing as an example: if the angle of the text boxes is 0°, the text is output from left to right; if the angle is 90°, the text is output from top to bottom; if the angle is 180°, the text is output from right to left; if the angle is 270°, the text is output from top to bottom. By clustering the text boxes, the invention gathers text boxes with closely related content into the same class before the text is output, which prevents continuous content from being parsed into widely separated places and makes extracting and translating drawing text more convenient.
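The angle classification of S1 and the angle-dependent output order of S4 can be sketched as follows. This is only an illustration under stated assumptions: the `TextBox` record, the function names and the `READING_KEY` table are hypothetical, the y axis is assumed to point upward as in typical CAD coordinates, and only the ordering along the reading direction named in the text is modeled (how successive lines are ordered is left open).

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class TextBox(NamedTuple):
    """Hypothetical record for one text box read from a drawing."""
    text: str
    angle: int   # rotation in degrees
    x: float     # coordinate characteristic value, x part
    y: float     # coordinate characteristic value, y part (assumed to grow upward)


def classify_by_angle(boxes: List[TextBox]) -> Dict[int, List[TextBox]]:
    """S1: group text boxes by angle type."""
    groups: Dict[int, List[TextBox]] = defaultdict(list)
    for box in boxes:
        groups[box.angle % 360].append(box)
    return dict(groups)


# S4: sort key along the output direction stated in the text for each angle type.
READING_KEY = {
    0: lambda b: b.x,      # left to right
    90: lambda b: -b.y,    # top to bottom
    180: lambda b: -b.x,   # right to left
    270: lambda b: -b.y,   # top to bottom, as stated in the text
}


def output_text(boxes: List[TextBox]) -> Dict[int, List[str]]:
    """Return the text of each angle type in its output order."""
    result = {}
    for angle, group in classify_by_angle(boxes).items():
        ordered = sorted(group, key=READING_KEY[angle])
        result[angle] = [b.text for b in ordered]
    return result
```

Angle types other than the four listed raise a `KeyError` in this sketch; a real reader would need a rule for them.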
Further, the coordinate characteristic value is the coordinate of the upper-left corner, the lower-left corner, the upper-right corner, the lower-right corner or the center point of the text box.
When the invention is applied, the coordinate characteristic value is the coordinate of the upper-left corner, the lower-left corner, the upper-right corner, the lower-right corner or the center point of the text box. Because the coordinate characteristic value is a coordinate that uniquely identifies the position of each text box, and each of these five coordinates can express that position, this choice effectively improves the accuracy of the clustering.
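As an illustration of the five candidate coordinate characteristic values, the sketch below computes them for an axis-aligned text box. The function name and the box representation (lower-left corner plus width and height, with y growing upward) are assumptions made for this example; the text does not specify how a text box is stored.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def coordinate_characteristic_values(
    left: float, bottom: float, width: float, height: float
) -> Dict[str, Point]:
    """Return the five candidate anchor points of an axis-aligned text box."""
    right = left + width
    top = bottom + height
    return {
        "upper_left": (left, top),
        "lower_left": (left, bottom),
        "upper_right": (right, top),
        "lower_right": (right, bottom),
        "center": (left + width / 2, bottom + height / 2),
    }
```

Whichever of the five points is chosen, the same choice would need to be used for every text box so that distances between boxes are comparable.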
Further, the clustering uses the OPTICS algorithm; the OPTICS algorithm determines the relative distance between text boxes by reading the coordinate characteristic values of the text boxes in an ordered text box group; the criterion for coordinate characteristic values being similar is that the relative distance is less than or equal to a threshold.
When the invention is applied: the text boxes on a drawing follow a pattern but are not regularly arranged. Through creative work, the inventors found that, because the text boxes are irregular, accurate clustering parameters are difficult to determine, whereas with the OPTICS algorithm the clustering result changes little when the clustering parameters vary within a reasonable range. In the OPTICS algorithm as applied in the invention, the relative distance between text boxes is determined by reading the coordinate characteristic values of the text boxes in the ordered text box group, and this relative distance is used as the input data of the OPTICS algorithm. In this way it can be effectively determined whether text boxes are gathered together, and text boxes that are not gathered are assigned to different classes. In the OPTICS algorithm of the invention, the criterion for coordinate characteristic values being similar is that the relative distance is less than or equal to the threshold. In the invention this threshold is a density value, so only a reasonable density value needs to be supplied to cluster the text boxes reasonably and effectively reduce clustering error.
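A minimal pure-Python sketch of the OPTICS ordering may help make this concrete. It is not the patented implementation: the names `optics_order` and `split_by_threshold` and the parameters `max_eps` and `min_pts` are illustrative, and the class extraction is deliberately simple (a new class starts whenever the reachability distance exceeds the threshold, so an isolated text box ends up in a class of its own).

```python
import heapq
import math


def optics_order(points, max_eps=math.inf, min_pts=2):
    """Return (order, reachability) for 2-D points using the OPTICS ordering."""
    if min_pts < 2:
        raise ValueError("min_pts must be at least 2 in this sketch")
    n = len(points)
    reach = [math.inf] * n      # reachability distance of each point
    done = [False] * n
    order = []                  # points in the order OPTICS visits them

    def neighbours(i):
        pairs = ((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        return [(d, j) for d, j in pairs if d <= max_eps]

    def core_distance(nbrs):
        # The point itself counts towards min_pts.
        if len(nbrs) + 1 < min_pts:
            return None
        return sorted(d for d, _ in nbrs)[min_pts - 2]

    def expand(i, seeds):
        done[i] = True
        order.append(i)
        nbrs = neighbours(i)
        core = core_distance(nbrs)
        if core is None:
            return
        for d, j in nbrs:
            new_reach = max(core, d)
            if not done[j] and new_reach < reach[j]:
                reach[j] = new_reach
                heapq.heappush(seeds, (new_reach, j))  # stale entries are skipped later

    for start in range(n):
        if done[start]:
            continue
        seeds = []
        expand(start, seeds)
        while seeds:
            _, j = heapq.heappop(seeds)
            if not done[j]:
                expand(j, seeds)
    return order, reach


def split_by_threshold(order, reach, threshold):
    """Start a new class whenever the reachability distance exceeds the threshold."""
    labels = [0] * len(order)
    current = -1
    for i in order:
        if reach[i] > threshold:
            current += 1
        labels[i] = current
    return labels


points = [(0, 0), (1, 0), (2, 0), (0, -1), (100, 0), (101, 0)]
order, reach = optics_order(points)
labels = split_by_threshold(order, reach, threshold=5.0)
print(labels)  # [0, 0, 0, 0, 1, 1]
```

Because the ordering is computed once and the threshold is applied afterwards, different threshold values can be tried without recomputing the distances, which fits the observation above that the result is not very sensitive to the clustering parameters.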
Further, step S3 also includes the following sub-steps: after clustering is complete, judging the shape formed by the text boxes in each class according to the coordinate characteristic values of the text boxes; if the shape formed by the text boxes is a preset shape, performing S4 on that class; if the shape formed by the text boxes is not the preset shape, adjusting the clustering parameters for that class and performing S3 again.
When the invention is applied, taking a civil construction CAD drawing as an example: the text boxes in a civil construction CAD drawing have the characteristic that, if the angle of the text boxes is 0°, the horizontal coordinates of the text boxes in the first row are identical or similar and, at the same time, the longitudinal coordinates of the text boxes in the first column are identical or similar. The preset shape is therefore a shape in which the horizontal coordinates of the first-row text boxes are identical or similar and the longitudinal coordinates of the first-column text boxes are identical or similar. During cluster analysis, the text boxes of a class sometimes form a T-shape or a circle; in that case the shape is judged not to be the preset shape, and the class is clustered again with adjusted clustering parameters. By judging the shape formed by the text boxes, the invention effectively improves the accuracy of the clustering.
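One possible reading of this shape test for 0° text boxes is sketched below. It assumes, as an interpretation rather than a statement of the patented test, that the text boxes of a row lie on nearly the same horizontal line and that the first box of every row lies on nearly the same vertical line. The function names and the tolerance `tol` are hypothetical.

```python
def rows_by_y(points, tol):
    """Group (x, y) points into rows whose y values differ by at most tol."""
    rows = []
    for x, y in sorted(points, key=lambda p: -p[1]):   # top row first (y grows upward)
        if rows and abs(rows[-1][0][1] - y) <= tol:
            rows[-1].append((x, y))
        else:
            rows.append([(x, y)])
    return [sorted(row) for row in rows]               # left to right inside a row


def is_preset_shape(points, tol):
    """True if the first box of every row starts at (nearly) the same x."""
    rows = rows_by_y(points, tol)
    starts = [row[0][0] for row in rows]
    return max(starts) - min(starts) <= tol


block = [(0, 10), (5, 10), (9, 10), (0, 5), (4, 5), (0, 0)]
t_shape = [(0, 10), (5, 10), (10, 10), (5, 5), (5, 0)]
print(is_preset_shape(block, tol=1))    # True
print(is_preset_shape(t_shape, tol=1))  # False
```

In the T-shaped class the lower rows start in the middle of the top row, so the row starts are not aligned and the class would be sent back for re-clustering.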
A drawing text reading system based on cluster analysis comprises: a classification module for classifying the text boxes on a drawing by angle; an extraction module for extracting the coordinate characteristic values of the text boxes of the same angle type; a cluster module for performing cluster analysis on the text boxes of the same angle type so that text boxes with similar coordinate characteristic values are gathered into the same class, and for sorting the text boxes according to the clustering result; and an output module for outputting the text of the sorted text boxes according to the angle type of the text boxes.
When the invention is applied, the classification module first classifies the text boxes on the drawing by angle into several angle types, such as the 0°, 90°, 180° and 270° types that commonly occur in CAD drawings. The extraction module then extracts the coordinate characteristic values of the text boxes of the same angle type; a coordinate characteristic value is a coordinate that uniquely identifies the position of a text box. Next, the cluster module performs cluster analysis on the text boxes of the same angle type so that text boxes with similar coordinate characteristic values are gathered into the same class, and sorts the text boxes according to the clustering result. Finally, the output module outputs the text of the sorted text boxes according to the angle type of the text boxes. Taking a CAD drawing as an example: if the angle of the text boxes is 0°, the text is output from left to right; if the angle is 90°, the text is output from top to bottom; if the angle is 180°, the text is output from right to left; if the angle is 270°, the text is output from top to bottom. By clustering the text boxes, the invention gathers text boxes with closely related content into the same class before the text is output, which prevents continuous content from being parsed into widely separated places and makes extracting and translating drawing text more convenient.
Further, the coordinate characteristic value is the coordinate of the upper-left corner, the lower-left corner, the upper-right corner, the lower-right corner or the center point of the text box.
When the invention is applied, the coordinate characteristic value is the coordinate of the upper-left corner, the lower-left corner, the upper-right corner, the lower-right corner or the center point of the text box. Because the coordinate characteristic value is a coordinate that uniquely identifies the position of each text box, and each of these five coordinates can express that position, this choice effectively improves the accuracy of the clustering.
Further, the clustering uses the OPTICS algorithm; the OPTICS algorithm determines the relative distance between text boxes by reading the coordinate characteristic values of the text boxes in an ordered text box group; the criterion for coordinate characteristic values being similar is that the relative distance is less than or equal to a threshold.
When the invention is applied: the text boxes on a drawing follow a pattern but are not regularly arranged. Through creative work, the inventors found that, because the text boxes are irregular, accurate clustering parameters are difficult to determine, whereas with the OPTICS algorithm the clustering result changes little when the clustering parameters vary within a reasonable range. In the OPTICS algorithm as applied in the invention, the relative distance between text boxes is determined by reading the coordinate characteristic values of the text boxes in the ordered text box group, and this relative distance is used as the input data of the OPTICS algorithm. In this way it can be effectively determined whether text boxes are gathered together, and text boxes that are not gathered are assigned to different classes. In the OPTICS algorithm of the invention, the criterion for coordinate characteristic values being similar is that the relative distance is less than or equal to the threshold. In the invention this threshold is a density value, so only a reasonable density value needs to be supplied to cluster the text boxes reasonably and effectively reduce clustering error.
Further, the cluster module is also used, after clustering is complete, to judge the shape formed by the text boxes in each class according to the coordinate characteristic values of the text boxes; if the shape formed by the text boxes is a preset shape, the class is sent to the output module; if the shape formed by the text boxes is not the preset shape, the class is clustered again with adjusted clustering parameters.
When the invention is applied, taking a civil construction CAD drawing as an example: the text boxes in a civil construction CAD drawing have the characteristic that, if the angle of the text boxes is 0°, the horizontal coordinates of the text boxes in the first row are identical or similar and, at the same time, the longitudinal coordinates of the text boxes in the first column are identical or similar. The preset shape is therefore a shape in which the horizontal coordinates of the first-row text boxes are identical or similar and the longitudinal coordinates of the first-column text boxes are identical or similar. During cluster analysis, the text boxes of a class sometimes form a T-shape or a circle; in that case the shape is judged not to be the preset shape, and the part of the class that differs from the preset shape is clustered again. By judging the shape formed by the text boxes, the invention effectively improves the accuracy of the clustering.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
1. In the drawing text reading method based on cluster analysis of the invention, the text boxes are clustered so that text boxes with closely related content are gathered into the same class before the text is output, which prevents continuous content from being parsed into widely separated places and makes extracting and translating drawing text more convenient;
2. In the drawing text reading method based on cluster analysis of the invention, the coordinate characteristic value used can express the unique position of each text box, which effectively improves the accuracy of the clustering;
3. In the drawing text reading method based on cluster analysis of the invention, the OPTICS algorithm used is insensitive to the clustering parameters and can effectively reduce clustering error;
4. In the drawing text reading method based on cluster analysis of the invention, judging the shape formed by the text boxes effectively improves the accuracy of the clustering;
5. In the drawing text reading system based on cluster analysis of the invention, the text boxes are clustered so that text boxes with closely related content are gathered into the same class before the text is output, which prevents continuous content from being parsed into widely separated places and makes extracting and translating drawing text more convenient;
6. In the drawing text reading system based on cluster analysis of the invention, the coordinate characteristic value used can express the unique position of each text box, which effectively improves the accuracy of the clustering;
7. In the drawing text reading system based on cluster analysis of the invention, the OPTICS algorithm used is insensitive to the clustering parameters and can effectively reduce clustering error;
8. In the drawing text reading system based on cluster analysis of the invention, judging the shape formed by the text boxes effectively improves the accuracy of the clustering.
Brief description of the drawings
The accompanying drawings described here are provided for a further understanding of the embodiments of the present invention and form a part of this application; they do not limit the embodiments of the present invention. In the drawings:
Fig. 1 is a schematic diagram of the steps of the method of the present invention;
Fig. 2 is a schematic diagram of the structure of the system of the present invention;
Fig. 3 is a schematic diagram of Embodiment 5.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to embodiments and the accompanying drawings. The exemplary embodiments of the invention and their descriptions are only used to explain the invention and do not limit it.
Embodiment 1
As shown in Fig. 1, the drawing text reading method based on cluster analysis of the invention comprises the following steps: S1: classifying the text boxes on a drawing by angle; S2: extracting the coordinate characteristic values of the text boxes of the same angle type; S3: performing cluster analysis on the text boxes of the same angle type so that text boxes with similar coordinate characteristic values are gathered into the same class, and sorting the text boxes according to the clustering result; S4: outputting the text of the sorted text boxes according to the angle type of the text boxes.
When this embodiment is implemented, the text boxes on the drawing are first classified by angle into several angle types, such as the 0°, 90°, 180° and 270° types that commonly occur in CAD drawings. The coordinate characteristic values of the text boxes of the same angle type are then extracted; a coordinate characteristic value is a coordinate that uniquely identifies the position of a text box. Next, cluster analysis is performed on the text boxes of the same angle type so that text boxes with similar coordinate characteristic values are gathered into the same class, and the text boxes are sorted according to the clustering result. Finally, the text of the sorted text boxes is output according to the angle type of the text boxes. Taking a CAD drawing as an example: if the angle of the text boxes is 0°, the text is output from left to right; if the angle is 90°, the text is output from top to bottom; if the angle is 180°, the text is output from right to left; if the angle is 270°, the text is output from top to bottom. By clustering the text boxes, the invention gathers text boxes with closely related content into the same class before the text is output, which prevents continuous content from being parsed into widely separated places and makes extracting and translating drawing text more convenient.
Embodiment 2
This embodiment builds on Embodiment 1: the coordinate characteristic value is the coordinate of the upper-left corner, the lower-left corner, the upper-right corner, the lower-right corner or the center point of the text box.
When this embodiment is implemented, the coordinate characteristic value is the coordinate of the upper-left corner, the lower-left corner, the upper-right corner, the lower-right corner or the center point of the text box. Because the coordinate characteristic value is a coordinate that uniquely identifies the position of each text box, and each of these five coordinates can express that position, this choice effectively improves the accuracy of the clustering.
Embodiment 3
This embodiment builds on Embodiment 1: the clustering uses the OPTICS algorithm; the OPTICS algorithm determines the relative distance between text boxes by reading the coordinate characteristic values of the text boxes in an ordered text box group; the criterion for coordinate characteristic values being similar is that the relative distance is less than or equal to a threshold.
When this embodiment is implemented: the text boxes on a drawing follow a pattern but are not regularly arranged. Through creative work, the inventors found that, because the text boxes are irregular, accurate clustering parameters are difficult to determine, whereas with the OPTICS algorithm the clustering result changes little when the clustering parameters vary within a reasonable range. In the OPTICS algorithm as applied in the invention, the relative distance between text boxes is determined by reading the coordinate characteristic values of the text boxes in the ordered text box group, and this relative distance is used as the input data of the OPTICS algorithm. In this way it can be effectively determined whether text boxes are gathered together, and text boxes that are not gathered are assigned to different classes. In the OPTICS algorithm of the invention, the criterion for coordinate characteristic values being similar is that the relative distance is less than or equal to the threshold. In the invention this threshold is a density value, so only a reasonable density value needs to be supplied to cluster the text boxes reasonably and effectively reduce clustering error.
Embodiment 4
This embodiment builds on Embodiment 1: step S3 also includes the following sub-steps: after clustering is complete, judging the shape formed by the text boxes in each class according to the coordinate characteristic values of the text boxes; if the shape formed by the text boxes is a preset shape, performing S4 on that class; if the shape formed by the text boxes is not the preset shape, adjusting the clustering parameters for that class and performing S3 again.
When this embodiment is implemented, taking a civil construction CAD drawing as an example: the text boxes in a civil construction CAD drawing have the characteristic that, if the angle of the text boxes is 0°, the horizontal coordinates of the text boxes in the first row are identical or similar and, at the same time, the longitudinal coordinates of the text boxes in the first column are identical or similar. The preset shape is therefore a shape in which the horizontal coordinates of the first-row text boxes are identical or similar and the longitudinal coordinates of the first-column text boxes are identical or similar. During cluster analysis, the text boxes of a class sometimes form a T-shape or a circle; in that case the shape is judged not to be the preset shape, and the class is clustered again with adjusted clustering parameters. By judging the shape formed by the text boxes, the invention effectively improves the accuracy of the clustering.
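The control flow of this embodiment (cluster, test each class, re-cluster the classes that fail) can be sketched as a small loop. The text does not say how the clustering parameters are adjusted, so the sketch assumes, purely for illustration, that the threshold is multiplied by a hypothetical `shrink` factor on each retry and that retries are capped by `max_rounds`. The `cluster` and `is_preset_shape` arguments are caller-supplied functions, not APIs defined by the patent.

```python
def recluster_until_preset(points, cluster, is_preset_shape,
                           threshold, shrink=0.5, max_rounds=5):
    """Cluster points, then re-cluster every class that fails the shape test.

    cluster(points, threshold) must return a list of classes (lists of points);
    is_preset_shape(cls) must return True when a class has the preset shape.
    """
    accepted = []
    pending = [(list(points), threshold, 0)]
    while pending:
        pts, thr, rounds = pending.pop()
        for cls in cluster(pts, thr):
            if is_preset_shape(cls) or rounds >= max_rounds or len(cls) < 2:
                accepted.append(cls)      # ready for step S4
            else:
                # Assumed adjustment rule: tighten the threshold and repeat S3.
                pending.append((cls, thr * shrink, rounds + 1))
    return accepted
```

The cap on retries guarantees termination even when a class never reaches the preset shape; such a class is passed on unchanged in this sketch.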
Embodiment 5
As shown in Fig. 3, this embodiment builds on Embodiments 1 to 4 and processes the drawing text shown in Fig. 3.
When this embodiment is implemented, the text boxes on the drawing are first classified by angle; all text boxes on this drawing have an angle of 0°. The coordinate characteristic values of the text boxes of the same angle type are then extracted; in this embodiment the coordinate characteristic value is defined as the upper-left corner coordinate of the text box.
The extracted text boxes are as follows:
3637: One: [coordinates: x=1447657412.032166, y=926543984.4671117]
3638: 1 [coordinates: x=1447657587.564081, y=926543984.4671117]
3639: , this engineering ± [coordinates: x=1447657618.814081, y=926543984.4671117]
3640: 0.000 [coordinates: x=1447658040.20888, y=926543984.4671117]
3641: above wall [coordinates: x=1447658233.95888, y=926543984.4671117]
3642: 200 [coordinates: x=1447658585.02271, y=926543984.4671117]
3643: Thick concrete brick, [coordinates: x=1447658716.27271, y=926543984.4671117]
3644: Two: engineering way [coordinates: x=1447661758.77883, y=926544012.7807627]
3645: Strength grade is [coordinates: x=1447657709.707235, y=926543795.2022269]
3646: MU10 [coordinates: x=1447658148.537022, y=926543795.2022269]
3647: , use [coordinates: x=1447658329.787022, y=926543795.2022269]
3648: M5 [coordinates: x=1447658593.084894, y=926543795.2022269]
3649: Mixed mortar is built by laying bricks or stones, thickness of wall body and column size [coordinates: x=1447658693.084895, y=926543795.2022269]
3650: exterior wall [coordinates: x=1447661936.167565, y=926543783.6524901]
3651: 1 [coordinates: x=1447662111.69948, y=926543783.6524901]
3652: : [coordinates: x=1447662142.94948, y=926543783.6524901]
3653: 1:1 [coordinates: x=1447662373.119285, y=926543783.6524901]
3654: white cement wiping seam [coordinates: x=1447662454.369285, y=926543783.6524901]
3655: Detailed plane and profile. ± [coordinates: x=1447657703.398538, y=926543593.179681]
3656: 0.000 [coordinates: x=1447658475.857167, y=926543593.179681]
3657: Following wall in detail apply by knot. [coordinates: x=1447658669.607167, y=926543593.179681]
3658: 8-10 [coordinates: x=1447662373.119285, y=926543599.0564725]
3659: Thick light gray facing tile, concrete interface agent, strong adherence force are brushed one times on brick sticking veneer with patch with painting [coordinates: x=1447662535.619285, y=926543599.0564725]
3660: 2 [coordinates: x=1447657568.554384, y=926543382.6952552]
3661: , enclosure wall is every [coordinates: x=1447657612.304384, y=926543382.6952552]
3662: 40 [coordinates: x=1447658051.134172, y=926543382.6952552]
3663: Rice sets an expansion joint, slit width [coordinates: x=1447658144.884172, y=926543382.6952552]
3664: 30mm. [coordinates: x=1447658934.777789, y=926543382.6952552]
3665: 3 [coordinates: x=1447657558.485019, y=926543184.2143952]
3666: , wall damp-proof course: under terrace [coordinates: x=1447657602.235019, y=926543184.2143952]
3667: -0.060 [coordinates: x=1447658567.660551, y=926543184.2143952]
3668: Place does [coordinates: x=1447658805.160551, y=926543184.2143952]
3669: 20 [coordinates: x=1447658980.692466, y=926543184.2143952]
3670: It is thick [coordinates: x=1447659068.192466, y=926543184.2143952]
3671: 1:2 [coordinates: x=1447659155.958423, y=926543184.2143952]
3672: In cement mortar plus equivalent to [coordinates: x=1447659249.708423, y=926543184.2143952]
3673: cement weight [coordinates: x=1447657722.28492, y=926542961.0158562]
3674: 5% [coordinates: x=1447658073.34875, y=926542961.0158562]
3675: Polymer mortar (this absolute altitude be reinforced concrete construction when do not do). [coordinates: x=1447658185.84875, y=926542961.0158562]
3676: 4 [coordinates: x=1447657558.485019, y=926542757.3695289]
3677: , cross between water hole two one, hole inwall smears [coordinates: x=1447657608.485019, y=926542757.3695289]
3678: 20 [coordinates: x=1447658749.442466, y=926542757.3695289]
3679: It is thick [coordinates: x=1447658836.942466, y=926542757.3695289]
3680: 1:2 [coordinates: x=1447658924.708423, y=926542757.3695289]
3681: Cement mortar adds [coordinates: x=1447659018.458423, y=926542757.3695289]
3682: 5% [coordinates: x=1447659457.288211, y=926542757.3695289]
3683: Waterproofing agent. [coordinates: x=1447659569.788211, y=926542757.3695289]
3684: 5 [coordinates: x=1447657558.485019, y=926542575.1381977]
3685: , iron artistic fence by specialized factory's fabrication and installation, its color and fancy have Party A to make by oneself. [coordinates: x=1447657602.235019, y=926542575.1381977]
3686: 6 [coordinates: x=1447657558.485019, y=926542375.0185961]
3687: , have not clear part in work progress on drawing, should contact and resolve through consultation with designing unit in time. [coordinates: x=1447657602.235019, y=926542375.0185961]
3688: It all should be performed in construction by Current Building ' installing engineering acceptance specification and relevant standard, specification, regulation. [coordinates: x=1447657725.294654, y=926542188.788369]
3690: 6 [coordinates: x=1447662373.119285, y=926543404.5512771]
3691: It is thick [coordinates: x=1447662416.869285, y=926543404.5512771]
3692: 1:2.5 [coordinates: x=1447662504.635242, y=926543404.5512771]
3693: cement mortar (mixing building adhesive) [coordinates: x=1447662660.885242, y=926543404.5512771]
3694: 12 [coordinates: x=1447662357.412471, y=926543231.9392951]
3695: It is thick [coordinates: x=1447662432.412471, y=926543231.9392951]
3696: 1 [coordinates: x=1447662520.178428, y=926543231.9392951]
3697: : [coordinates: x=1447662551.428428, y=926543231.9392951]
3698: 3 [coordinates: x=1447662639.194386, y=926543231.9392951]
3699: cement mortar bottoming brooming [coordinates: x=1447662682.944385, y=926543231.9392951]
3700: base course wall [coordinates: x=1447662391.443578, y=926543056.9064493]
In each entry, the number at the front is the text box number, the middle part is the text box content, and the last part is the upper-left corner coordinate of the text box. The listing shows that, after the text boxes of the drawing are read, many text boxes that belong to the right-hand side, such as No. 3644, are interleaved with the left-hand text boxes. Cluster analysis is then performed on the text boxes so that text boxes with similar coordinate characteristic values are gathered into the same class, and the order of the text boxes is adjusted according to the clustering result. The clustering uses the OPTICS algorithm, with the density value of the OPTICS algorithm set to one percent of the drawing size.
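To illustrate the interleaving, the sketch below takes the upper-left coordinates of four consecutive-looking entries from the listing (Nos. 3643, 3644, 3645 and 3650) and groups them by distance. This is a simplified stand-in, not the OPTICS clustering of the embodiment: boxes are linked whenever their distance is at most a threshold, and the threshold of 1500 drawing units is a hypothetical value chosen for this illustration, not the density value used in the embodiment.

```python
import math

# Upper-left coordinates copied from four entries of the listing.
BOXES = {
    3643: (1447658716.27271, 926543984.4671117),
    3644: (1447661758.77883, 926544012.7807627),
    3645: (1447657709.707235, 926543795.2022269),
    3650: (1447661936.167565, 926543783.6524901),
}


def group_by_distance(boxes, threshold):
    """Link boxes whose distance is at most threshold (single-linkage grouping)."""
    unvisited = dict(boxes)          # box number -> (x, y)
    groups = []
    while unvisited:
        seed_id = next(iter(unvisited))
        frontier = [unvisited.pop(seed_id)]
        group = [seed_id]
        while frontier:
            p = frontier.pop()
            near = [i for i, q in unvisited.items() if math.dist(p, q) <= threshold]
            for i in near:
                frontier.append(unvisited.pop(i))
                group.append(i)
        groups.append(sorted(group))
    return groups


print(group_by_distance(BOXES, threshold=1500.0))  # [[3643, 3645], [3644, 3650]]
```

Although No. 3644 sits between Nos. 3643 and 3645 in file order, it is roughly 3000 units to the right of No. 3643, so grouping by position puts it with No. 3650 instead.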
The result after clustering is as follows:
Class 1:
3637:One:[coordinate is:X=1447657412.032166 y=926543984.4671117]
3638:1 [coordinate is:X=1447657587.564081 y=926543984.4671117]
3639:, this project ± [coordinate is:X=1447657618.814081 y=926543984.4671117]
3640:0.000 [coordinate is:X=1447658040.20888 y=926543984.4671117]
3641:above, wall [coordinate is:X=1447658233.95888 y=926543984.4671117]
3642:200 [coordinate is:X=1447658585.02271 y=926543984.4671117]
3643:thick concrete brick, [coordinate is:X=1447658716.27271 y=926543984.4671117]
3645:strength grade is [coordinate is:X=1447657709.707235 y=926543795.2022269]
3646:MU10 [coordinate is:X=1447658148.537022 y=926543795.2022269]
3647:, using [coordinate is:X=1447658329.787022 y=926543795.2022269]
3648:M5 [coordinate is:X=1447658593.084894 y=926543795.2022269]
3649:mixed mortar masonry; for wall thickness and column size [coordinate is:X=1447658693.084895 y=926543795.2022269]
3655:see the plan and section drawings. ± [coordinate is:X=1447657703.398538 y=926543593.179681]
3656:0.000 [coordinate is:X=1447658475.857167 y=926543593.179681]
3657:below, for walls see the structural drawings. [coordinate is:X=1447658669.607167 y=926543593.179681]
3660:2 [coordinate is:X=1447657568.554384 y=926543382.6952552]
3661:, the enclosure wall, every [coordinate is:X=1447657612.304384 y=926543382.6952552]
3662:40 [coordinate is:X=1447658051.134172 y=926543382.6952552]
3663:meters, has one expansion joint, joint width [coordinate is:X=1447658144.884172 y=926543382.6952552]
3664:30mm. [coordinate is:X=1447658934.777789 y=926543382.6952552]
3665:3 [coordinate is:X=1447657558.485019 y=926543184.2143952]
3666:, wall damp-proof course: below the floor at [coordinate is:X=1447657602.235019 y=926543184.2143952]
3667:-0.060 [coordinate is:X=1447658567.660551 y=926543184.2143952]
3668:apply [coordinate is:X=1447658805.160551 y=926543184.2143952]
3669:20 [coordinate is:X=1447658980.692466 y=926543184.2143952]
3670:thick [coordinate is:X=1447659068.192466 y=926543184.2143952]
3671:1:2 [coordinate is:X=1447659155.958423 y=926543184.2143952]
3672:cement mortar with an added amount equal to [coordinate is:X=1447659249.708423 y=926543184.2143952]
3673:of the cement weight [coordinate is:X=1447657722.28492 y=926542961.0158562]
3674:5% [coordinate is:X=1447658073.34875 y=926542961.0158562]
3675:polymer mortar (omitted where this elevation is reinforced concrete construction). [coordinate is:X=1447658185.84875 y=926542961.0158562]
3676:4 [coordinate is:X=1447657558.485019 y=926542757.3695289]
3677:, one water passage hole every two bays, the inner wall of the hole plastered with [coordinate is:X=1447657608.485019 y=926542757.3695289]
3678:20 [coordinate is:X=1447658749.442466 y=926542757.3695289]
3679:thick [coordinate is:X=1447658836.942466 y=926542757.3695289]
3680:1:2 [coordinate is:X=1447658924.708423 y=926542757.3695289]
3681:cement mortar with added [coordinate is:X=1447659018.458423 y=926542757.3695289]
3682:5% [coordinate is:X=1447659457.288211 y=926542757.3695289]
3683:waterproofing agent. [coordinate is:X=1447659569.788211 y=926542757.3695289]
3684:5 [coordinate is:X=1447657558.485019 y=926542575.1381977]
3685:, the wrought-iron fence is fabricated and installed by a specialized factory; its color and style are determined by Party A. [coordinate is:X=1447657602.235019 y=926542575.1381977]
3686:6 [coordinate is:X=1447657558.485019 y=926542375.0185961]
3687:, where the drawings are unclear during construction, contact the design unit promptly and resolve the matter through consultation. [coordinate is:X=1447657602.235019 y=926542375.0185961]
3688:Construction shall in all cases follow the current building and installation engineering acceptance codes and the relevant standards, codes and regulations. [coordinate is:X=1447657725.294654 y=926542188.788369]
Class 2:
3644:Two: engineering practice [coordinate is:X=1447661758.77883 y=926544012.7807627]
3650:exterior wall [coordinate is:X=1447661936.167565 y=926543783.6524901]
Class 3:
3651:1 [coordinate is:X=1447662111.69948 y=926543783.6524901]
3652:: [coordinate is:X=1447662142.94948 y=926543783.6524901]
3653:1:1 [coordinate is:X=1447662373.119285 y=926543783.6524901]
3654:white cement joint pointing [coordinate is:X=1447662454.369285 y=926543783.6524901]
3658:8-10 [coordinate is:X=1447662373.119285 y=926543599.0564725]
3659:thick light gray facing tile; one coat of concrete interface agent is brushed onto the tile bonding face as the tiles are laid, for strong adhesion [coordinate is:X=1447662535.619285 y=926543599.0564725]
3690:6 [coordinate is:X=1447662373.119285 y=926543404.5512771]
3691:thick [coordinate is:X=1447662416.869285 y=926543404.5512771]
3692:1:2.5 [coordinate is:X=1447662504.635242 y=926543404.5512771]
3693:cement mortar (mixed with building adhesive) [coordinate is:X=1447662660.885242 y=926543404.5512771]
3694:12 [coordinate is:X=1447662357.412471 y=926543231.9392951]
3695:thick [coordinate is:X=1447662432.412471 y=926543231.9392951]
3696:1 [coordinate is:X=1447662520.178428 y=926543231.9392951]
3697:: [coordinate is:X=1447662551.428428 y=926543231.9392951]
3698:3 [coordinate is:X=1447662639.194386 y=926543231.9392951]
3699:cement mortar base coat, broomed [coordinate is:X=1447662682.944385 y=926543231.9392951]
3700:base course wall [coordinate is:X=1447662391.443578 y=926543056.9064493]
As the clustering result shows, the text boxes on the right side of the drawing have been gathered into class 2 and class 3, while the text boxes on the left side have been gathered into class 1.
Finally, the text of the reordered text boxes is output according to the angle type of the text boxes.
The output content is as follows:
One: 1. For walls above ±0.000 in this project, 200 thick concrete brick of strength grade MU10 is laid with M5 mixed mortar; for wall thickness and column size see the plan and section drawings. For walls below ±0.000 see the structural drawings. 2. The enclosure wall has one expansion joint every 40 meters, joint width 30mm. 3. Wall damp-proof course: at -0.060 below the floor, apply 20 thick 1:2 cement mortar with polymer mortar added in an amount equal to 5% of the cement weight (omitted where this elevation is reinforced concrete construction). 4. One water passage hole every two bays; the inner wall of the hole is plastered with 20 thick 1:2 cement mortar with 5% waterproofing agent added. 5. The wrought-iron fence is fabricated and installed by a specialized factory; its color and style are determined by Party A. 6. Where the drawings are unclear during construction, contact the design unit promptly and resolve the matter through consultation. Construction shall in all cases follow the current building and installation engineering acceptance codes and the relevant standards, codes and regulations. Two: Engineering practice, exterior wall 1: 1:1 white cement joint pointing; 8-10 thick light gray facing tile, with one coat of concrete interface agent brushed onto the tile bonding face as the tiles are laid, for strong adhesion; 6 thick 1:2.5 cement mortar (mixed with building adhesive); 12 thick 1:3 cement mortar base coat, broomed; base course wall.
As can be seen from the above, the output content follows the reading logic. This avoids continuous content being parsed into widely separated places and makes extracting and translating the drawing text more convenient.
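The ordering visible in the listing above (rows from top to bottom, boxes within a row from left to right) can be sketched as follows. This is only an illustration for 0° text boxes, not the patented implementation: the function name `reading_order`, the tuple layout `(text, x, y)` and the tolerance `row_tol` are hypothetical, and the sketch assumes that a larger y value means a higher position on the drawing, which matches the coordinates listed above.

```python
def reading_order(boxes, row_tol=1.0):
    """Return the texts of 0-degree boxes in reading order.

    boxes   -- list of (text, x, y) tuples (hypothetical layout)
    row_tol -- boxes whose y differs from the row's first box by at most
               this amount are treated as one row (assumed tolerance)
    """
    # Sort from top to bottom, assuming larger y is higher on the drawing.
    ordered = sorted(boxes, key=lambda b: -b[2])
    rows = []
    for box in ordered:
        if rows and abs(rows[-1][0][2] - box[2]) <= row_tol:
            rows[-1].append(box)   # same row as the previous box
        else:
            rows.append([box])     # start a new row
    # Within each row, read from left to right (ascending x).
    return [b[0] for row in rows for b in sorted(row, key=lambda b: b[1])]


# Four boxes from class 1 above, deliberately shuffled.
sample = [
    ("40", 1447658051.134172, 926543382.6952552),
    ("3", 1447657558.485019, 926543184.2143952),
    ("2", 1447657568.554384, 926543382.6952552),
    ("30mm.", 1447658934.777789, 926543382.6952552),
]
texts = reading_order(sample)
```

Concatenating the returned texts class by class gives a continuous passage like the output shown above.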
Embodiment 6
As shown in Fig. 2, the drawing text reading system based on clustering of the present invention includes: a classification module for classifying the text boxes on a drawing by angle; an extraction module for extracting the coordinate characteristic values of the text boxes of the same angle type; a clustering module for performing cluster analysis on the text boxes of the same angle type, so that text boxes with similar coordinate characteristic values are gathered into the same class, and for sorting the text boxes according to the clustering result; and an output module for outputting the text of the sorted text boxes according to the angle type of the text boxes.
When this embodiment is implemented, the classification module, the extraction module and the output module are preferably ARM7 processors, and the clustering module is preferably a Cortex-A7 processor. The classification module first classifies the text boxes on the drawing by angle, dividing them into several angle types; those that commonly occur in CAD drawings are 0°, 90°, 180° and 270°. The extraction module then extracts the coordinate characteristic values of the text boxes of the same angle type; a coordinate characteristic value is a coordinate value that identifies the unique position of each text box. Next, the clustering module performs cluster analysis on the text boxes of the same angle type, so that text boxes with similar coordinate characteristic values are gathered into the same class, and sorts the text boxes according to the clustering result. Finally, the output module outputs the text of the sorted text boxes according to the angle type of the text boxes. Taking CAD drawings as an example: if the angle of a text box is 0°, the text is output from left to right; if the angle is 90°, the text is output from top to bottom; if the angle is 180°, the text is output from right to left; if the angle is 270°, the text is output from top to bottom. By clustering the text boxes, the present invention gathers text boxes with closely related content into the same class before the text is output. This avoids continuous content being parsed into widely separated places and makes extracting and translating the drawing text more convenient.
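The classification and output steps of this embodiment can be sketched roughly as below. The clustering step is left out to keep the sketch short, and everything named here (`ORDER_KEYS`, `classify_by_angle`, `output_text`, the dictionary fields `text`, `x`, `y`, `angle`) is a hypothetical illustration rather than part of the described system. The sort keys simply mirror the output orders stated in the paragraph above and assume that a larger y value is higher on the drawing.

```python
from collections import defaultdict

# Sort keys mirroring the output orders stated in the text (assumed y-up axes).
ORDER_KEYS = {
    0: lambda b: b["x"],     # left to right
    90: lambda b: -b["y"],   # top to bottom
    180: lambda b: -b["x"],  # right to left
    270: lambda b: -b["y"],  # top to bottom, as the text states
}


def classify_by_angle(boxes):
    """Group text boxes by their rotation angle, rounded to whole degrees."""
    groups = defaultdict(list)
    for box in boxes:
        groups[round(box["angle"]) % 360].append(box)
    return dict(groups)


def output_text(boxes):
    """Return {angle: [texts in output order]} for each angle type."""
    result = {}
    for angle, group in classify_by_angle(boxes).items():
        # Angles outside the table fall back to left-to-right (an assumption).
        key = ORDER_KEYS.get(angle, ORDER_KEYS[0])
        result[angle] = [b["text"] for b in sorted(group, key=key)]
    return result


demo = [
    {"text": "b", "x": 2.0, "y": 0.0, "angle": 0.0},
    {"text": "a", "x": 1.0, "y": 0.0, "angle": 0.0},
    {"text": "low", "x": 0.0, "y": 1.0, "angle": 90.0},
    {"text": "high", "x": 0.0, "y": 5.0, "angle": 90.0},
]
demo_output = output_text(demo)
```

Keeping the order rules in one table makes it easy to adjust a direction for a particular drawing convention without touching the rest of the pipeline.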
Embodiment 7
This embodiment builds on Embodiment 6. The coordinate characteristic value is the coordinate value of the upper left corner, the lower left corner, the upper right corner, the lower right corner or the center point of the text box.
When this embodiment is implemented, the coordinate characteristic value is the coordinate value of the upper left corner, the lower left corner, the upper right corner, the lower right corner or the center point of the text box. Because a coordinate characteristic value is a coordinate value that identifies the unique position of each text box, and each of these five coordinate values expresses the unique position of a text box, the accuracy of the clustering of the present invention is effectively improved.
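The five candidate characteristic values can be illustrated with a short sketch. It assumes an axis-aligned bounding box given as `(x_min, y_min, x_max, y_max)` with y increasing upward; the function name `feature_point` and the anchor names are hypothetical and chosen only for this example.

```python
def feature_point(box, anchor="top_left"):
    """Return one coordinate characteristic value of a text box.

    box    -- (x_min, y_min, x_max, y_max), axis-aligned (assumed layout)
    anchor -- which of the five positions to use
    """
    x_min, y_min, x_max, y_max = box
    points = {
        "top_left": (x_min, y_max),
        "bottom_left": (x_min, y_min),
        "top_right": (x_max, y_max),
        "bottom_right": (x_max, y_min),
        "center": ((x_min + x_max) / 2, (y_min + y_max) / 2),
    }
    return points[anchor]


example_box = (0.0, 0.0, 4.0, 2.0)
example_center = feature_point(example_box, "center")
```

Whichever anchor is chosen, it should be used for every text box in the same clustering run so that the distances between boxes stay comparable.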
Embodiment 8
This embodiment builds on Embodiment 6. The clustering uses the OPTICS algorithm; the OPTICS algorithm determines the relative distance between text boxes by reading the coordinate characteristic values of the text boxes in the ordered text box group; the criterion for coordinate characteristic values being similar is that the relative distance is less than or equal to a threshold.
When this embodiment is implemented, the text boxes on a drawing follow a pattern but are not strictly regular. Through creative work the inventor found that, because the text boxes are irregular, it is difficult to determine the clustering parameters precisely, and that with the OPTICS algorithm the clustering result changes little when the clustering parameters vary within a reasonable range. In the OPTICS algorithm as applied in the present invention, the relative distance between text boxes is determined by reading the coordinate characteristic values of the text boxes in the ordered text box group, and this relative distance is then used as the data required by the OPTICS algorithm. This approach effectively determines whether text boxes are gathered together and assigns text boxes that are not gathered together to different classes. In the OPTICS algorithm of the present invention, the criterion for coordinate characteristic values being similar is that the relative distance is less than a threshold; in the application of the present invention this threshold is the density value. Only a reasonable density value needs to be provided for the text boxes to be clustered reasonably, which effectively reduces clustering error.
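For orientation, the following is a minimal pure-Python sketch of the textbook OPTICS ordering followed by a simple threshold cut on the reachability distances. It is not claimed to be the inventors' variant: the function names `optics` and `cluster_boxes`, the parameter `min_pts`, and the use of `eps` in the role of the threshold (density value) are assumptions made for this illustration, and the input is taken to be a list of 2-D coordinate characteristic values.

```python
import heapq
import math


def optics(points, min_pts=2, max_eps=math.inf):
    """Return (order, reach, core) using the standard OPTICS ordering."""
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Core distance: distance to the min_pts-th nearest point (self included).
    core = []
    for i in range(n):
        ds = sorted(dist(i, j) for j in range(n))
        d = ds[min_pts - 1] if n >= min_pts else math.inf
        core.append(d if d <= max_eps else math.inf)

    reach = [math.inf] * n
    done = [False] * n
    order = []
    for start in range(n):
        if done[start]:
            continue
        heap = [(math.inf, start)]
        while heap:
            _, p = heapq.heappop(heap)
            if done[p]:
                continue  # stale heap entry
            done[p] = True
            order.append(p)
            if math.isinf(core[p]):
                continue  # not a core point, cannot extend the ordering
            for q in range(n):
                d = dist(p, q)
                if done[q] or d > max_eps:
                    continue
                new_reach = max(core[p], d)
                if new_reach < reach[q]:
                    reach[q] = new_reach
                    heapq.heappush(heap, (new_reach, q))
    return order, reach, core


def cluster_boxes(points, min_pts=2, eps=1.0):
    """Cut the OPTICS ordering at eps; returns one label per point, -1 = noise."""
    order, reach, core = optics(points, min_pts)
    labels = [-1] * len(points)
    current = -1
    for p in order:
        if reach[p] > eps:
            if core[p] <= eps:      # p starts a new class
                current += 1
                labels[p] = current
        else:
            labels[p] = current     # p is close enough to the current class
    return labels


left = [(0, 0), (1, 0), (2, 0), (0, -1), (1, -1)]
right = [(50, 0), (51, 0), (50, -1)]
labels = cluster_boxes(left + right, min_pts=2, eps=2.0)
```

In this toy layout the left and right groups end up in separate classes, and raising `eps` from 2.0 to 10.0 leaves the labels unchanged, which is in the spirit of the observation above that reasonable parameter changes alter the result little.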
Embodiment 9
This embodiment builds on Embodiment 6. After clustering is complete, the clustering module is further used to judge, from the coordinate characteristic values of the text boxes, the shape formed by the text boxes in each class. If the shape formed by the text boxes is the preset shape, the class is sent to the output module; if the shape formed by the text boxes is not the preset shape, the clustering parameters are adjusted and the class is clustered again.
When this embodiment is implemented, take civil construction CAD drawings as an example. The text boxes in civil construction CAD drawings have the following characteristic: if the angle of the text boxes is 0°, the horizontal coordinates of the text boxes in the first row are the same or similar, and the longitudinal coordinates of the text boxes in the first column are the same or similar. The preset shape is therefore a shape in which the horizontal coordinates of the first-row text boxes are the same or similar and, at the same time, the longitudinal coordinates of the first-column text boxes are the same or similar. In cluster analysis, the text boxes of a class sometimes form a T shape or a circle after clustering. In that case the shape is judged not to be the preset shape, the clustering parameters are adjusted, and the class is clustered again. By judging the shape formed by the text boxes, the present invention effectively improves the accuracy of clustering.
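One possible reading of this shape check is sketched below, purely as an illustration. It groups the boxes of a class into rows by similar y value and then tests whether the leftmost boxes of all rows line up within a tolerance; a T-shaped class fails this test because its lower rows start further to the right. The function names `row_starts` and `is_preset_shape` and the tolerances `row_tol` and `col_tol` are hypothetical, and the actual preset shape used in practice may be defined differently.

```python
def row_starts(points, row_tol):
    """Return the leftmost x of each row; rows are formed by similar y."""
    rows = []
    for x, y in sorted(points, key=lambda p: -p[1]):
        if rows and abs(rows[-1]["y"] - y) <= row_tol:
            rows[-1]["x"] = min(rows[-1]["x"], x)  # same row: keep leftmost x
        else:
            rows.append({"y": y, "x": x})          # start a new row
    return [row["x"] for row in rows]


def is_preset_shape(points, row_tol=0.5, col_tol=0.5):
    """True if every row starts at (nearly) the same x, i.e. the left edge is aligned."""
    starts = row_starts(points, row_tol)
    return max(starts) - min(starts) <= col_tol


block = [(0, 2), (1, 2), (2, 2), (0, 1), (1, 1), (0, 0)]   # aligned left edge
t_shape = [(0, 2), (1, 2), (2, 2), (1, 1), (1, 0)]         # stem under the middle
block_ok = is_preset_shape(block)
t_ok = is_preset_shape(t_shape)
```

A class for which the check fails would then be clustered again with adjusted parameters, for example with a smaller distance threshold.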
The specific embodiments above further describe the purpose, technical solutions and beneficial effects of the present invention in detail. It should be understood that the foregoing is only specific embodiments of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.