CN102541826A - Text block content reorganizing method and device - Google Patents


Info

Publication number
CN102541826A
Authority
CN
China
Prior art keywords
original block
original
blocks
adjacent
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010106218064A
Other languages
Chinese (zh)
Other versions
CN102541826B (en)
Inventor
徐剑波
黄文娟
董宁
朱兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Fangzheng Apapi Technology Co Ltd
New Founder Holdings Development Co ltd
Original Assignee
Peking University Founder Group Co Ltd
Beijing Founder Apabi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Founder Group Co Ltd, Beijing Founder Apabi Technology Co Ltd filed Critical Peking University Founder Group Co Ltd
Priority to CN201010621806.4A priority Critical patent/CN102541826B/en
Publication of CN102541826A publication Critical patent/CN102541826A/en
Application granted granted Critical
Publication of CN102541826B publication Critical patent/CN102541826B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Document Processing Apparatus (AREA)
  • Processing Or Creating Images (AREA)

Abstract

An embodiment of the invention discloses a text block content reorganizing method and device, which relate to the field of computer information processing and address the problem that the original blocks in a text block cannot be sorted to restore their reading order. The method includes: determining, according to a preset correspondence between typesetting modes and original block reorganizing modes, the reorganizing mode that corresponds to the typesetting mode of the original blocks in the text block; reorganizing the original blocks in the text block according to that reorganizing mode; and outputting and displaying the reorganized original blocks. With this method and device, the original blocks in a text block can be reorganized in a defined reorganizing mode so that their reading order within the text block is restored.

Description

Text block content reorganizing method and device
Technical field
The present invention relates to the field of computer information processing, and in particular to a text block content reorganizing method and device.
Background
A page of a layout file can contain one or more text blocks, and each text block contains one or more original blocks. As shown in Fig. 1, a passage of text on the page forms one text block, and the text enclosed by each rectangle within that text block is one original block.
At present, the original blocks in a text block can be typeset in modes such as horizontal layout and vertical layout. Each typesetting mode in turn has different writing directions: horizontal layout includes left-to-right and right-to-left writing directions, and vertical layout likewise includes left-to-right and right-to-left writing directions.
In the course of making the present invention, the inventors found the following technical problem in the prior art:
For a text block that uses a given typesetting mode, there is currently no concrete scheme for sorting the original blocks in the text block so as to restore their reading order.
Summary of the invention
Embodiments of the invention provide a text block content reorganizing method and device, to solve the problem in the prior art that the original blocks in a text block cannot be sorted to restore their reading order.
A text block content reorganizing method, comprising:
determining, according to a predefined correspondence between typesetting modes and original block sorting modes, the original block sorting mode that corresponds to the typesetting mode of the original blocks in the text block;
sorting the original blocks in the text block according to said original block sorting mode;
outputting and displaying the sorted original blocks.
A text block content reorganizing device, comprising:
a sorting mode determining unit, configured to determine, according to a predefined correspondence between typesetting modes and original block sorting modes, the original block sorting mode that corresponds to the typesetting mode of the original blocks in the text block;
an original block sorting unit, configured to sort the original blocks in the text block according to said original block sorting mode;
a content restoring unit, configured to output and display the sorted original blocks.
In the present invention, the original block sorting mode that corresponds to the typesetting mode of the original blocks in the text block is determined according to the predefined correspondence between typesetting modes and original block sorting modes; the original blocks in the text block are then sorted according to this sorting mode, and the sorted original blocks are output and displayed. The present invention therefore makes it possible to sort the original blocks in a text block that uses a given typesetting mode and thereby restore their reading order.
Description of drawings
Fig. 1 is a schematic diagram of original block division in the prior art;
Fig. 2 is a schematic flow chart of the method provided by an embodiment of the invention;
Fig. 3A to Fig. 3D are schematic diagrams of the typesetting modes in an embodiment of the invention;
Fig. 3E is a schematic diagram of the boundary lines of an original block in an embodiment of the invention;
Fig. 4A to Fig. 4J are schematic diagrams of text block content reorganization in an embodiment of the invention;
Fig. 5 is a schematic structural diagram of the device provided by an embodiment of the invention.
Detailed description
To sort the original blocks in a text block that uses a given typesetting mode and restore their reading order, an embodiment of the invention provides a text block content reorganizing method. In this method, the original block sorting mode that corresponds to the typesetting mode of the original blocks in the text block is determined according to a predefined correspondence between typesetting modes and original block sorting modes; the original blocks in the text block are then sorted according to this sorting mode, and the sorted original blocks are output and displayed.
Referring to Fig. 2, the text block content reorganizing method provided by an embodiment of the invention comprises the following steps:
Step 20: determine, according to the predefined correspondence between typesetting modes and original block sorting modes, the original block sorting mode that corresponds to the typesetting mode of the original blocks in the text block;
Step 21: sort the original blocks in the text block according to the determined original block sorting mode;
Step 22: output and display the sorted original blocks.
In steps 20 and 21, when the typesetting mode of the original blocks in the text block is horizontal layout or vertical layout, the original block sorting mode comprises the following steps a and b:
Step a: sort the original blocks in the text block by the sequence number of each original block;
Step b: for every two adjacent original blocks in the sorted result of step a, perform the following: calculate the degree of overlap of the two adjacent original blocks, and determine a new positional relationship between them from that degree of overlap; if this positional relationship differs from the positional relationship of the two adjacent original blocks in the sorted result of step a, swap the positions of the two adjacent original blocks in the sorted result of step a.
After step b has been performed, the sorted result of the original blocks in the text block is obtained.
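Steps a and b can be sketched as a sort followed by a pass over adjacent pairs. This is only an illustrative sketch: the `Block` type, the `reorder` function and the `first_stays` predicate are hypothetical names, a single left-to-right pass is assumed, and the predicate here stands in for the overlap-based decision described in the branches that follow.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Block:
    seq: int     # sequence number assigned by the layout-file creator
    left: float  # left boundary value of the original block


def reorder(blocks: List[Block],
            first_stays: Callable[[Block, Block], bool]) -> List[Block]:
    # Step a: sort by sequence number (ascending numbering is assumed here).
    ordered = sorted(blocks, key=lambda b: b.seq)
    # Step b: visit each adjacent pair once; swap the pair when the newly
    # determined relation differs from the order produced by step a.
    for i in range(len(ordered) - 1):
        if not first_stays(ordered[i], ordered[i + 1]):
            ordered[i], ordered[i + 1] = ordered[i + 1], ordered[i]
    return ordered


# Stand-in predicate (an assumption): keep the pair when the first block
# starts further left than the second.
blocks = [Block(2, 0.0), Block(1, 10.0), Block(3, 20.0)]
result = reorder(blocks, lambda a, b: a.left <= b.left)
print([b.seq for b in result])
```

Passing the decision in as a predicate keeps the pairwise pass independent of how the new positional relationship is actually computed.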
In step a, the sequence number of an original block is the number that the creator of the layout file assigned to each original block when producing the layout file; this sequence number indicates the output order of the original blocks in the layout file. When sorting the original blocks by sequence number: if the creator assigned sequence numbers in increasing order, the original blocks can be sorted in ascending order of sequence number; if the creator assigned sequence numbers in decreasing order, the original blocks can be sorted in descending order of sequence number.
In step b, the new positional relationship of the two adjacent original blocks can be determined from their degree of overlap as follows:
Branch one: determine the front-back positional relationship and the upper-lower positional relationship of the two adjacent original blocks in the text block. If the degree of overlap of the two adjacent original blocks in the horizontal direction is less than a preset first threshold, determine the new positional relationship from their upper-lower positional relationship: the upper original block is placed first in the new positional relationship, and the lower original block is placed second;
Branch two: if the degree of overlap of the two adjacent original blocks in the vertical direction is less than 0, determine the new positional relationship from their front-back positional relationship: the front original block is placed first in the new positional relationship, and the rear original block is placed second;
Branch three: if the degree of overlap of the two adjacent original blocks in the horizontal direction and their degree of overlap in the vertical direction are both greater than a preset second threshold, determine the new positional relationship from their sequence numbers: the original block with the smaller sequence number is placed first in the new positional relationship, and the original block with the larger sequence number is placed second;
Branch four: if the degree of overlap of the two adjacent original blocks in the vertical direction is less than their degree of overlap in the horizontal direction, determine the new positional relationship from the relative size of their left boundary difference and right boundary difference;
Branch five: if the degree of overlap of the two adjacent original blocks in the vertical direction is not less than their degree of overlap in the horizontal direction, determine the new positional relationship from the relative size of their upper boundary difference and lower boundary difference.
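The five branches can be read as a dispatch on the two overlap values. The sketch below only selects which rule applies; it assumes the branches are tested in the order listed and that the first matching branch wins, which the text does not state explicitly. The function name `choose_rule` and the returned labels are hypothetical.

```python
def choose_rule(o_horizontal: float, o_vertical: float,
                first_threshold: float, second_threshold: float) -> str:
    """Return a label for the rule that decides the new positional relationship."""
    if o_horizontal < first_threshold:
        return "upper_lower_relation"        # branch one
    if o_vertical < 0:
        return "front_back_relation"         # branch two
    if o_horizontal > second_threshold and o_vertical > second_threshold:
        return "sequence_number"             # branch three
    if o_vertical < o_horizontal:
        return "left_right_difference"       # branch four
    return "upper_lower_difference"          # branch five


# Example thresholds taken from the ranges given in the description.
print(choose_rule(1.0, -0.1, first_threshold=-0.05, second_threshold=0.8))
```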
In branch one above, the front-back positional relationship of the two adjacent original blocks can be determined as follows:
Calculate the left boundary difference and the right boundary difference of the two adjacent original blocks. Specifically, the difference between the left boundary of original block Block_i and the left boundary of original block Block_j gives the left boundary difference of the two adjacent original blocks, and the difference between the right boundary of original block Block_j and the right boundary of original block Block_i gives their right boundary difference, where the subscripts i and j denote sequence numbers;
If the left boundary difference is less than the right boundary difference and the typesetting mode of the original blocks in the text block is left-to-right horizontal layout, left-to-right vertical layout or non-directional vertical layout, the front-back positional relationship is: the first of the two adjacent original blocks is in front and the second is behind;
If the left boundary difference is less than the right boundary difference and the typesetting mode of the original blocks in the text block is right-to-left horizontal layout or right-to-left vertical layout, the front-back positional relationship is: the second of the two adjacent original blocks is in front and the first is behind;
If the left boundary difference is not less than the right boundary difference and the typesetting mode of the original blocks in the text block is left-to-right horizontal layout, left-to-right vertical layout or non-directional vertical layout, the front-back positional relationship is: the second of the two adjacent original blocks is in front and the first is behind;
If the left boundary difference is not less than the right boundary difference and the typesetting mode of the original blocks in the text block is right-to-left horizontal layout or right-to-left vertical layout, the front-back positional relationship is: the first of the two adjacent original blocks is in front and the second is behind.
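The four cases above reduce to one comparison combined with the writing direction. The following is a minimal sketch; the `Block` type and the `first_is_in_front` function are hypothetical names, and non-directional vertical layout is treated like left-to-right, as in the text.

```python
from dataclasses import dataclass


@dataclass
class Block:
    left: float   # left boundary value
    right: float  # right boundary value


def first_is_in_front(block_i: Block, block_j: Block,
                      right_to_left: bool = False) -> bool:
    # Left boundary difference: left of Block_i minus left of Block_j.
    left_diff = block_i.left - block_j.left
    # Right boundary difference: right of Block_j minus right of Block_i.
    right_diff = block_j.right - block_i.right
    smaller = left_diff < right_diff
    # Left-to-right: the first block is in front when left_diff < right_diff.
    # Right-to-left: the outcome is reversed in both cases.
    return smaller != right_to_left


a = Block(left=0.0, right=10.0)
b = Block(left=12.0, right=20.0)
print(first_is_in_front(a, b), first_is_in_front(a, b, right_to_left=True))
```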
In branch one above, the upper-lower positional relationship of the two adjacent original blocks in the text block can be determined as follows:
Calculate the upper boundary difference and the lower boundary difference of the two adjacent original blocks. Specifically, the difference between the upper boundary of original block Block_i and the upper boundary of original block Block_j gives the upper boundary difference of the two adjacent original blocks, and the difference between the lower boundary of original block Block_j and the lower boundary of original block Block_i gives their lower boundary difference, where the subscripts i and j denote sequence numbers;
If the upper boundary difference is less than the lower boundary difference, the upper-lower positional relationship of the two adjacent original blocks in the text block is: the first of the two adjacent original blocks is above and the second is below;
If the upper boundary difference is not less than the lower boundary difference, the upper-lower positional relationship of the two adjacent original blocks in the text block is: the second of the two adjacent original blocks is above and the first is below.
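The upper-lower test has the same shape as the front-back test but does not depend on the writing direction. A minimal sketch follows, with hypothetical names `Block` and `first_is_above`, and with y values increasing downward as in the boundary values quoted for Fig. 3E.

```python
from dataclasses import dataclass


@dataclass
class Block:
    top: float     # upper boundary value (smaller y is higher on the page)
    bottom: float  # lower boundary value


def first_is_above(block_i: Block, block_j: Block) -> bool:
    # Upper boundary difference: top of Block_i minus top of Block_j.
    top_diff = block_i.top - block_j.top
    # Lower boundary difference: bottom of Block_j minus bottom of Block_i.
    bottom_diff = block_j.bottom - block_i.bottom
    return top_diff < bottom_diff


upper = Block(top=100.0, bottom=120.0)
lower = Block(top=130.0, bottom=150.0)
print(first_is_above(upper, lower), first_is_above(lower, upper))
```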
In branch four above, the new positional relationship of the two adjacent original blocks can be determined from the relative size of their left boundary difference and right boundary difference as follows:
If the left boundary difference is less than the right boundary difference and the typesetting mode of the original blocks in said text block is left-to-right horizontal layout, left-to-right vertical layout or non-directional vertical layout, the front-back positional relationship is: the first of the two adjacent original blocks is in front and the second is behind;
If the left boundary difference is less than the right boundary difference and the typesetting mode of the original blocks in said text block is right-to-left horizontal layout or right-to-left vertical layout, the front-back positional relationship is: the second of the two adjacent original blocks is in front and the first is behind;
If the left boundary difference is not less than the right boundary difference and the typesetting mode of the original blocks in said text block is left-to-right horizontal layout, left-to-right vertical layout or non-directional vertical layout, the front-back positional relationship is: the second of the two adjacent original blocks is in front and the first is behind;
If the left boundary difference is not less than the right boundary difference and the typesetting mode of the original blocks in said text block is right-to-left horizontal layout or right-to-left vertical layout, the front-back positional relationship is: the first of the two adjacent original blocks is in front and the second is behind.
In branch five above, the new positional relationship of the two adjacent original blocks can be determined from the relative size of their upper boundary difference and lower boundary difference as follows:
If the upper boundary difference is less than the lower boundary difference, the upper-lower positional relationship of the two adjacent original blocks in the text block is: the first of the two adjacent original blocks is above and the second is below;
If the upper boundary difference is not less than the lower boundary difference, the upper-lower positional relationship of the two adjacent original blocks in the text block is: the second of the two adjacent original blocks is above and the first is below.
In branch one above, when the typesetting mode of the original blocks in the text block is horizontal layout or non-directional vertical layout and the two adjacent original blocks are in the same row of the text block, the first threshold can take a value between -0.1 and -0.07; when the typesetting mode is horizontal layout or non-directional vertical layout and the two adjacent original blocks are in different rows of the text block, the first threshold takes a value between -0.06 and -0.03; when the typesetting mode is left-to-right or right-to-left vertical layout, the first threshold can be 0. In branch three above, the second threshold can take a value between 0.5 and 1.
Of course, the ranges listed above for the first threshold and the second threshold are preferred ranges; any other range that achieves the purpose of the invention still falls within the scope of protection of the present invention.
Specifically, whether the two adjacent original blocks are in the same row of the text block can be determined as follows:
First, compare the horizontal baseline values of the two adjacent original blocks, and determine the line spacing from the font size of the original block with the larger horizontal baseline value. For example, C times the font size of that original block is taken as the line spacing, where C can take a value from 0.91 to 1; C can of course also take other values greater than 0;
Then, calculate the horizontal baseline difference of the two adjacent original blocks. If the horizontal baseline difference is greater than said line spacing, the two adjacent original blocks are in different rows of the text block; otherwise, they are in the same row.
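The same-row test can be sketched as follows. The function name `same_row` and its parameters are hypothetical, and taking the absolute value of the baseline difference is an assumption: the text only speaks of "the horizontal baseline difference" without fixing its sign.

```python
def same_row(baseline_a: float, font_size_a: float,
             baseline_b: float, font_size_b: float,
             c: float = 0.95) -> bool:
    # Line spacing: C times the font size of the block whose horizontal
    # baseline value is larger (C between 0.91 and 1 in the description).
    font_size = font_size_a if baseline_a >= baseline_b else font_size_b
    line_spacing = c * font_size
    # Assumption: compare the magnitude of the baseline difference.
    return abs(baseline_a - baseline_b) <= line_spacing


print(same_row(100.0, 10.0, 105.0, 10.0))   # baselines 5 apart, spacing 9.5
print(same_row(522.18, 12.0, 540.0, 12.0))  # baselines 17.82 apart, spacing 11.4
```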
In steps 20 and 21, when the typesetting mode of the original blocks in the text block is oblique layout, the original block sorting mode can comprise the following steps c to f:
Step c: sort the original blocks in the text block by the sequence number of each original block;
Step d: apply a rotation transformation to the coordinate system of the text block;
Step e: group the original blocks according to the position coordinates, in the rotated coordinate system, of each original block as sorted in step c, so that the original blocks in the same group lie in the same row or the same column;
Step f: sort the groups by the mean sequence number of the original blocks in each group, and, within each group, sort the original blocks according to the writing direction of said text block.
After step f has been performed, the sorted result of the original blocks in the text block is obtained.
In step c, the sequence number of an original block is the number that the creator of the layout file assigned to each original block when producing the layout file; this sequence number indicates the output order of the original blocks in the layout file. When sorting the original blocks by sequence number: if the creator assigned sequence numbers in increasing order, the original blocks can be sorted in ascending order of sequence number; if the creator assigned sequence numbers in decreasing order, the original blocks can be sorted in descending order of sequence number.
In step d, the rotation transformation of the coordinate system of the text block can be carried out as follows:
First, calculate the horizontal baseline difference and the vertical baseline difference of every two adjacent original blocks in the sorted result of step c; determine which horizontal baseline difference and which vertical baseline difference occur most often among the calculated values; from the most frequent horizontal baseline difference f_x and the most frequent vertical baseline difference f_y, calculate the slope k of the oblique line, for example k = f_y / f_x;
Then, calculate the coordinate system rotation angle α from the slope k; for example, the formula can be α = arctan(k);
Finally, rotate the coordinate system of the text block by the rotation angle α, using the coordinate rotation formulas x' = x × cos α + y × sin α and y' = -x × sin α + y × cos α.
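Step d can be sketched with the standard library. The names `slope_and_angle` and `rotate` are hypothetical; the sketch assumes the baseline differences have already been rounded so that equal values compare exactly, and that the most frequent horizontal baseline difference f_x is non-zero.

```python
import math
from collections import Counter
from typing import List, Tuple


def slope_and_angle(h_diffs: List[float],
                    v_diffs: List[float]) -> Tuple[float, float]:
    # Most frequent horizontal and vertical baseline differences.
    f_x = Counter(h_diffs).most_common(1)[0][0]
    f_y = Counter(v_diffs).most_common(1)[0][0]
    k = f_y / f_x            # slope of the oblique line (f_x assumed non-zero)
    alpha = math.atan(k)     # rotation angle, alpha = arctan(k)
    return k, alpha


def rotate(x: float, y: float, alpha: float) -> Tuple[float, float]:
    # x' = x*cos(alpha) + y*sin(alpha);  y' = -x*sin(alpha) + y*cos(alpha)
    return (x * math.cos(alpha) + y * math.sin(alpha),
            -x * math.sin(alpha) + y * math.cos(alpha))


k, alpha = slope_and_angle([2.0, 2.0, 3.0], [2.0, 1.0, 2.0])
print(k, alpha, rotate(1.0, 1.0, alpha))
```

With a slope of 1 the angle is 45 degrees, so a point on the oblique line such as (1, 1) lands on the rotated x axis, which is what lets step e compare baseline values row by row.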
In step e, the grouping of the original blocks according to their position coordinates in the rotated coordinate system can be carried out as follows:
For every two adjacent original blocks in the sorted result of step c, perform the following:
Calculate the horizontal baseline values of the preceding original block and the following original block of the two adjacent original blocks in the rotated coordinate system; determine whether the difference between the horizontal baseline value of the following original block and that of the preceding original block is less than a predefined third threshold. If so, the preceding and following original blocks are in the same row and are placed in the same group; otherwise, they are in different rows and are placed in different groups.
Here, the third threshold can be 0.91 to 1 times the height of an original block; the third threshold can of course also be any other value not less than 0.
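The grouping in step e can be sketched as a single scan over the rotated baseline values. The function name `group_rows` is hypothetical, and comparing the absolute value of the difference against the third threshold is an assumption; the text only says the difference must be less than the threshold.

```python
from typing import List


def group_rows(rotated_baselines: List[float],
               third_threshold: float) -> List[List[int]]:
    """Group block indices (already ordered as in step c) into rows."""
    groups: List[List[int]] = []
    for idx, value in enumerate(rotated_baselines):
        close = (idx > 0 and
                 abs(value - rotated_baselines[idx - 1]) < third_threshold)
        if close:
            groups[-1].append(idx)   # same row as the preceding block
        else:
            groups.append([idx])     # start a new group
    return groups


rows = group_rows([10.0, 10.5, 30.0, 30.2, 50.0], third_threshold=5.0)
print(rows)
```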
In step f, the sorting of the original blocks within a group according to the writing direction of said text block can be carried out as follows:
When the slope k is greater than 1, the typesetting mode of the original blocks in the text block is determined to be vertical-type oblique layout, and the original blocks within the group are sorted in ascending order of their upper boundary values;
When the slope k is not greater than 1, the typesetting mode of the original blocks in the text block is determined to be horizontal-type oblique layout, and the original blocks within the group are sorted in ascending order of their left boundary values.
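Step f combines two orderings: groups by mean sequence number, and blocks within a group by upper or left boundary depending on the slope k. The sketch below uses hypothetical names (`Block`, `order_groups`) and assumes groups are sorted in ascending order of mean sequence number, which the text does not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    seq: int     # sequence number
    left: float  # left boundary value
    top: float   # upper boundary value


def order_groups(groups: List[List[Block]], k: float) -> List[List[Block]]:
    # k > 1: vertical-type oblique layout, sort by upper boundary value;
    # otherwise horizontal-type oblique layout, sort by left boundary value.
    def inner_key(b: Block) -> float:
        return b.top if k > 1 else b.left

    def mean_seq(group: List[Block]) -> float:
        return sum(b.seq for b in group) / len(group)

    return [sorted(g, key=inner_key) for g in sorted(groups, key=mean_seq)]


groups = [
    [Block(3, 5.0, 0.0), Block(4, 1.0, 2.0)],
    [Block(1, 9.0, 1.0), Block(2, 2.0, 3.0)],
]
ordered = order_groups(groups, k=0.5)
print([[b.seq for b in g] for g in ordered])
```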
In the present invention, the horizontal layout typesetting mode is shown in Fig. 3A; the vertical layout typesetting mode is shown in Fig. 3B; the horizontal-type oblique layout typesetting mode is shown in Fig. 3C; and the vertical-type oblique layout typesetting mode is shown in Fig. 3D.
In the present invention, the horizontal baseline difference of two original blocks is the difference between the horizontal baseline value of one of the two original blocks and that of the other. The vertical baseline difference of two original blocks is the difference between the vertical baseline value of one and that of the other. The left boundary difference of two original blocks is the difference between the left boundary value of one and that of the other. The right boundary difference of two original blocks is the difference between the right boundary value of one and that of the other. The upper boundary difference of two original blocks is the difference between the upper boundary value of one and that of the other. The lower boundary difference of two original blocks is the difference between the lower boundary value of one and that of the other.
For the text in the rectangular region in Fig. 3E, the vertical baseline value is 97.7, the horizontal baseline value is 522.18, the left boundary value is 97.4, the upper boundary value is 506.0, the right boundary value is 117.5, and the lower boundary value is 525.5. Note that the horizontal baseline value, vertical baseline value, left boundary value, upper boundary value, right boundary value and lower boundary value of an original block are attribute values of the original block, which can be set automatically when the creator of the layout file enters the text.
In the present invention, the degree of overlap is the ratio of the overlap length of two original blocks in the measured direction to their projected length in that direction. The degree of overlap of two original blocks in the horizontal direction is the ratio of the difference between the minimum lower boundary and the maximum upper boundary of the two original blocks to the difference between their maximum lower boundary and minimum upper boundary. The degree of overlap of two original blocks in the vertical direction is the ratio of the difference between the minimum left boundary and the maximum right boundary of the two original blocks to the difference between their maximum left boundary and minimum right boundary.
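The degree of overlap can be sketched directly from the boundary values. The horizontal-direction formula follows the definition above. For the vertical direction, the wording of the definition is ambiguous, so this sketch assumes the form analogous to the horizontal one: (minimum right boundary - maximum left boundary) divided by (maximum right boundary - minimum left boundary). The `Block` type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    left: float
    top: float     # upper boundary value (y increases downward)
    right: float
    bottom: float  # lower boundary value


def overlap_horizontal(a: Block, b: Block) -> float:
    # (min lower boundary - max upper boundary) / (max lower - min upper)
    span = max(a.bottom, b.bottom) - min(a.top, b.top)
    return (min(a.bottom, b.bottom) - max(a.top, b.top)) / span


def overlap_vertical(a: Block, b: Block) -> float:
    # Assumed analogous form: (min right - max left) / (max right - min left)
    span = max(a.right, b.right) - min(a.left, b.left)
    return (min(a.right, b.right) - max(a.left, b.left)) / span


# Two blocks side by side on the same line, separated by a small gap.
a = Block(left=0.0, top=0.0, right=10.0, bottom=10.0)
b = Block(left=12.0, top=0.0, right=20.0, bottom=10.0)
print(overlap_horizontal(a, b), overlap_vertical(a, b))
```

A negative value means the two blocks are separated by a gap in that direction, which is why the thresholds discussed for branch one are small negative numbers. The sketch assumes blocks with non-zero extent, so the denominators are never zero.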
The present invention will be described below in conjunction with specific embodiment:
Embodiment one:
In the present embodiment, the type-setting mode of original block is horizontally-arranged or directionless vertical setting of types in the literal piece, to the original block in the literal piece according to horizontally-arranged from left to right or from right to left the order of horizontally-arranged sort the literal of recombinating again piece content.Concrete sort method is following:
Step 01: the sequence number based on each original block in the literal piece sorts the original block in the literal piece;
Step 02: calculate two adjacent original block Block in the literal piece successively respectively iAnd Block I+1Degree of overlapping O in the horizontal direction yDegree of overlapping O with vertical direction x
Step 03: Determine the front-back positional relationship of two adjacent original blocks from the relative magnitude of their left-boundary difference and right-boundary difference;
Here, Ret_x can be used to denote the relative magnitude of the left-boundary difference and the right-boundary difference of two adjacent original blocks. For example, if the left-boundary difference is less than the right-boundary difference and the typesetting mode of the original blocks in the text block is left-to-right horizontal layout or non-directional vertical layout, then Ret_x is -1 and the front-back positional relationship is: the first of the two adjacent original blocks is in front and the second is behind. If the left-boundary difference is less than the right-boundary difference and the typesetting mode is right-to-left horizontal layout, then Ret_x is 1 and the second original block is in front and the first is behind. If the left-boundary difference is not less than the right-boundary difference and the typesetting mode is left-to-right horizontal layout or non-directional vertical layout, then Ret_x is 1 and the second original block is in front and the first is behind. If the left-boundary difference is not less than the right-boundary difference and the typesetting mode is right-to-left horizontal layout, then Ret_x is -1 and the first original block is in front and the second is behind.
Step 04: Determine the up-down positional relationship of two adjacent original blocks from the relative magnitude of their upper-boundary difference and lower-boundary difference;
Here, Ret_y can be used to denote the relative magnitude of the upper-boundary difference and the lower-boundary difference of two adjacent original blocks. For example, if the upper-boundary difference is less than the lower-boundary difference, then Ret_y is -1 and the up-down positional relationship in the text block is: the first of the two adjacent original blocks is above and the second is below. If the upper-boundary difference is not less than the lower-boundary difference, then Ret_y is 1 and the second original block is above and the first is below.
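The two indicators described in Steps 03 and 04 can be sketched in Python as follows. This is only a minimal illustration, not the patented implementation: the function and parameter names are hypothetical, and the boundary differences are taken as precomputed inputs because this passage does not spell out how they are calculated.

```python
def ret_x(left_diff, right_diff, right_to_left=False):
    """Front-back indicator: -1 means the first block is in front, 1 the second.

    left_diff and right_diff are the left- and right-boundary differences of
    the two adjacent blocks (assumed to be precomputed by the caller).
    right_to_left=False covers left-to-right horizontal layout and
    non-directional vertical layout.
    """
    first_leads = left_diff < right_diff
    if right_to_left:
        # A right-to-left layout inverts the outcome of the comparison.
        first_leads = not first_leads
    return -1 if first_leads else 1


def ret_y(top_diff, bottom_diff):
    """Up-down indicator: -1 means the first block is above, 1 the second."""
    return -1 if top_diff < bottom_diff else 1
```

A negative value always means "keep the first block first", which is how the later steps consume these indicators.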
Step 05: Set the threshold for the horizontal overlap degree O_y according to whether the two adjacent original blocks are on the same line: the threshold is -0.08 when they are on the same line and -0.05 when they are on different lines. Whether two adjacent original blocks are on the same line is determined as follows:
Compare the horizontal baseline values of the two adjacent original blocks, and take 0.95 times the font size of the original block with the larger horizontal baseline value as the line spacing;
Calculate the horizontal baseline difference of the two adjacent original blocks and compare it with the line spacing: if the difference is greater than the line spacing, the blocks are on different lines; otherwise they are on the same line.
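The same-line test and the resulting threshold choice of Step 05 might look like the sketch below. The names are hypothetical. Two details are assumptions rather than statements of the text: the baseline difference is taken as an absolute value, and when both baselines are equal the first block's font size is used.

```python
def same_line(base_y_a, font_size_a, base_y_b, font_size_b, factor=0.95):
    """Return True if two adjacent blocks are judged to be on the same line."""
    # Line spacing: 0.95 times the font size of the block with the larger
    # horizontal baseline value (ties fall back to the first block).
    larger_font = font_size_a if base_y_a >= base_y_b else font_size_b
    line_spacing = factor * larger_font
    # Assumption: the baseline difference is compared as an absolute value.
    return abs(base_y_a - base_y_b) <= line_spacing


def horizontal_overlap_threshold(on_same_line):
    """Threshold for the horizontal overlap degree O_y, per Step 05."""
    return -0.08 if on_same_line else -0.05
```

For example, blocks with baselines 100 and 105 and font sizes 10 and 12 give a line spacing of 11.4, so they count as the same line and the threshold becomes -0.08.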
Step 06: If the two adjacent original blocks do not overlap in the horizontal direction, i.e. the horizontal overlap degree O_y is less than the threshold, sort them according to the up-down positional relationship;
Step 07: If the two original blocks do not overlap in the vertical direction, i.e. the vertical overlap degree is less than 0, sort them according to the front-back positional relationship;
Step 08: If the two original blocks overlap in both the horizontal and the vertical direction and both overlap degrees are greater than 0.5, sort them according to the sequence numbers of the original blocks;
Step 09: If none of the above conditions is satisfied, sort them according to the overlap degrees.
The method of sorting according to the overlap degrees is as follows:
If the vertical overlap degree is less than the horizontal overlap degree, sort according to Ret_x: if Ret_x is less than 0, Block_i is in front; otherwise Block_{i+1} is in front;
If the vertical overlap degree is not less than the horizontal overlap degree, sort according to Ret_y: if Ret_y is less than 0, Block_i is in front; otherwise Block_{i+1} is in front.
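Steps 06 to 09 form a cascade of checks, which the following sketch makes explicit. It is an illustration under stated assumptions: the function name is hypothetical, the overlap degrees and the Ret_x and Ret_y indicators are passed in precomputed, and "sort by sequence number" in Step 08 is read as "keep the first block first", since the blocks were already ordered by sequence number.

```python
def first_block_leads(o_x, o_y, ret_x, ret_y, threshold):
    """Decide whether Block_i should stay in front of Block_{i+1}.

    o_y is the horizontal overlap degree and o_x the vertical overlap degree,
    following the notation of the description. threshold is -0.08 for blocks
    on the same line and -0.05 otherwise.
    """
    if o_y < threshold:            # Step 06: no horizontal overlap
        return ret_y < 0           # use the up-down relationship
    if o_x < 0:                    # Step 07: no vertical overlap
        return ret_x < 0           # use the front-back relationship
    if o_x > 0.5 and o_y > 0.5:    # Step 08: strong overlap in both directions
        return True                # assumption: keep sequence-number order
    if o_x < o_y:                  # Step 09: fall back to the overlap degrees
        return ret_x < 0
    return ret_y < 0
```

Because the checks are evaluated in order, a pair that satisfies both Step 06 and Step 07 is decided by Step 06.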
Fig. 4A is a schematic diagram of a text block whose typesetting mode is horizontal layout; Fig. 4B is a schematic diagram of the content reorganization result obtained by sorting the original blocks of the text block shown in Fig. 4A according to the above method;
Fig. 4C is a schematic diagram of a text block whose typesetting mode is non-directional vertical layout; Fig. 4D is a schematic diagram of the content reorganization result obtained by sorting the original blocks of the text block shown in Fig. 4C according to the above method.
Embodiment two:
In this embodiment, the typesetting mode of the original blocks in the text block is vertical layout. The original blocks in the text block are sorted according to left-to-right vertical layout or right-to-left vertical layout, and the text block content is then reorganized. The specific sorting method is as follows:
Step 11: Sort the original blocks in the text block by the sequence number of each original block;
Step 12: For each pair of adjacent original blocks Block_i and Block_{i+1} in the text block, calculate in turn the horizontal overlap degree O_y and the vertical overlap degree O_x;
Step 13: Determine the front-back positional relationship of two adjacent original blocks from the relative magnitude of their left-boundary difference and right-boundary difference;
Here, Ret_x can be used to denote the relative magnitude of the left-boundary difference and the right-boundary difference of two adjacent original blocks. For example, if the left-boundary difference is less than the right-boundary difference and the typesetting mode of the original blocks in the text block is left-to-right vertical layout, then Ret_x is -1 and the front-back positional relationship is: the first of the two adjacent original blocks is in front and the second is behind. If the left-boundary difference is less than the right-boundary difference and the typesetting mode is right-to-left vertical layout, then Ret_x is 1 and the second original block is in front and the first is behind. If the left-boundary difference is not less than the right-boundary difference and the typesetting mode is left-to-right vertical layout, then Ret_x is 1 and the second original block is in front and the first is behind. If the left-boundary difference is not less than the right-boundary difference and the typesetting mode is right-to-left vertical layout, then Ret_x is -1 and the first original block is in front and the second is behind.
Step 14: Determine the up-down positional relationship of two adjacent original blocks from the relative magnitude of their upper-boundary difference and lower-boundary difference;
Here, Ret_y can be used to denote the relative magnitude of the upper-boundary difference and the lower-boundary difference of two adjacent original blocks. For example, if the upper-boundary difference is less than the lower-boundary difference, then Ret_y is -1 and the up-down positional relationship in the text block is: the first of the two adjacent original blocks is above and the second is below. If the upper-boundary difference is not less than the lower-boundary difference, then Ret_y is 1 and the second original block is above and the first is below.
Step 15: If the two adjacent original blocks do not overlap in the vertical direction, i.e. the vertical overlap degree is less than 0, sort them according to the front-back positional relationship;
Step 16: If the two adjacent original blocks do not overlap in the horizontal direction, i.e. the horizontal overlap degree O_y is less than 0, sort them according to the up-down positional relationship;
Step 17: If the two original blocks overlap in both the horizontal and the vertical direction and both overlap degrees are greater than 0.5, sort them according to the sequence numbers of the original blocks;
Step 18: If none of the above conditions is satisfied, sort them according to the overlap degrees.
The method of sorting according to the overlap degrees is as follows:
If the vertical overlap degree is less than the horizontal overlap degree, sort according to Ret_x: if Ret_x is less than 0, Block_i is in front; otherwise Block_{i+1} is in front;
If the vertical overlap degree is not less than the horizontal overlap degree, sort according to Ret_y: if Ret_y is less than 0, Block_i is in front; otherwise Block_{i+1} is in front.
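For vertical layout, Steps 15 to 18 use the same kinds of checks as embodiment one but in a different order and with a horizontal threshold of 0. The sketch below shows that variant; the function name is hypothetical, the inputs are assumed precomputed, and Step 17 is again read as "keep the existing sequence-number order".

```python
def first_block_leads_vertical(o_x, o_y, ret_x, ret_y):
    """Decide whether Block_i stays in front of Block_{i+1} in vertical layout.

    o_x is the vertical overlap degree and o_y the horizontal overlap degree.
    """
    if o_x < 0:                    # Step 15: no vertical overlap
        return ret_x < 0           # front-back relationship decides
    if o_y < 0:                    # Step 16: no horizontal overlap
        return ret_y < 0           # up-down relationship decides
    if o_x > 0.5 and o_y > 0.5:    # Step 17: strong overlap in both directions
        return True                # assumption: keep sequence-number order
    if o_x < o_y:                  # Step 18: compare the overlap degrees
        return ret_x < 0
    return ret_y < 0
```

The practical difference shows up when neither direction overlaps: here the front-back relationship wins, whereas in embodiment one the up-down relationship is checked first.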
Fig. 4E is a schematic diagram of a text block whose typesetting mode is left-to-right vertical layout; Fig. 4F is a schematic diagram of the content reorganization result obtained by sorting the original blocks of the text block shown in Fig. 4E according to the above method.
Embodiment three:
In this embodiment, the typesetting mode of the original blocks in the text block is horizontal-type oblique layout or vertical-type oblique layout. The coordinate system in which the text block lies is rotated, the original blocks on the same row are found from the coordinates of the original blocks in the rotated coordinate system, the original blocks on the same row are then sorted according to the specific left-to-right or right-to-left typesetting order, and the text block content is reorganized. The specific sorting method is as follows:
Step 21: Sort the original blocks in the text block by the sequence number of each original block;
Step 22: Calculate the horizontal baseline difference and the vertical baseline difference of every two adjacent original blocks after sorting; determine the horizontal baseline difference that occurs most often and the vertical baseline difference that occurs most often; from the most frequent horizontal baseline difference f_x and the most frequent vertical baseline difference f_y, calculate the slope k of the oblique line, that is: k = f_y / f_x;
Step 23: Calculate the coordinate-system rotation angle α from the slope k. The tangent of α is k, so the formula is: α = arctan(k);
Note: the mathematical symbol arctan() denotes the arctangent function.
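Steps 22 and 23 amount to taking the mode of each list of baseline differences and converting their ratio to an angle. A minimal sketch follows; the function name is hypothetical, the difference lists are assumed to be computed by the caller, and the rounding step and the zero check are additions of this sketch (the description does not say how near-equal values are counted or what happens when f_x is 0).

```python
import math
from collections import Counter


def rotation_angle(horizontal_baseline_diffs, vertical_baseline_diffs, ndigits=1):
    """Return (k, alpha) with k = f_y / f_x and alpha = arctan(k) in radians."""
    # Round before counting so that nearly equal differences share one bucket.
    f_x = Counter(round(d, ndigits) for d in horizontal_baseline_diffs).most_common(1)[0][0]
    f_y = Counter(round(d, ndigits) for d in vertical_baseline_diffs).most_common(1)[0][0]
    if f_x == 0:
        raise ValueError("most frequent horizontal baseline difference is zero")
    k = f_y / f_x
    return k, math.atan(k)
```

With horizontal baseline differences [2.0, 2.0, 3.0] and vertical baseline differences [2.0, 2.0, 1.0], both modes are 2.0, so k is 1 and the rotation angle is π/4.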
Step 24: Find the original blocks that are on the same row;
From the horizontal baseline value BaseY_old and the vertical baseline value BaseX_old of an original block, calculate with the coordinate rotation formula the horizontal baseline value BaseY_new and the vertical baseline value BaseX_new in the rotated coordinate system. Original blocks on the same row are found by checking whether the difference between the BaseY_new of a block and the BaseY_new of the preceding original block is less than 0.95 times the original block height: if so, the two blocks are on the same row; otherwise they are on different rows.
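The row search of Step 24 can be sketched as below, using the rotation formulas given in the description (x' = x cos α + y sin α, y' = -x sin α + y cos α). The function names and the tuple layout are hypothetical. Taking the absolute value of the difference, and using the current block's height, are assumptions of this sketch.

```python
import math


def rotate(x, y, alpha):
    """Map (x, y) into the coordinate system rotated by alpha radians."""
    x_new = x * math.cos(alpha) + y * math.sin(alpha)
    y_new = -x * math.sin(alpha) + y * math.cos(alpha)
    return x_new, y_new


def group_rows(blocks, alpha, factor=0.95):
    """Group blocks into rows.

    blocks is a list of (base_x, base_y, height) tuples in sequence-number
    order. Returns a list of rows, each a list of indices into blocks.
    """
    rows = []
    prev_y = None
    for index, (base_x, base_y, height) in enumerate(blocks):
        _, y_new = rotate(base_x, base_y, alpha)
        # Same row if the rotated baseline is close to the previous block's.
        if prev_y is not None and abs(y_new - prev_y) < factor * height:
            rows[-1].append(index)
        else:
            rows.append([index])
        prev_y = y_new
    return rows
```

For blocks lying on two parallel 45-degree lines, rotating by π/4 makes each line horizontal, so the blocks of each line share a rotated baseline and fall into the same row.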
Step 25: Sort the original blocks in horizontal-type oblique layout and vertical-type oblique layout according to how the absolute value of k compares with 1;
If the absolute value of k is not greater than 1, the typesetting mode is left-to-right or right-to-left horizontal-type oblique layout, and the original blocks on the same row are sorted in ascending order of left-boundary value.
Otherwise, the typesetting mode is left-to-right or right-to-left vertical-type oblique layout, and the original blocks on the same row are sorted in ascending order of upper-boundary value.
Step 26: If the text block has more than one row, sort the rows, i.e. determine the order of the rows. Specifically:
Compute the mean sequence number of the original blocks in each row, and sort the rows in ascending order of that mean.
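Steps 25 and 26 can be combined into one short routine, sketched here. The dictionary keys ("seq", "left", "top") and the function name are hypothetical. The sketch follows the text literally and sorts each row in ascending order of the chosen boundary value.

```python
from statistics import mean


def order_oblique_rows(rows, k):
    """Return all blocks in reading order for an oblique layout.

    rows is a list of rows; each row is a list of dicts with the keys
    'seq' (sequence number), 'left' (left boundary) and 'top' (upper boundary).
    """
    # Step 25: |k| <= 1 means horizontal-type oblique layout, else vertical-type.
    key = "left" if abs(k) <= 1 else "top"
    sorted_rows = [sorted(row, key=lambda block: block[key]) for row in rows]
    # Step 26: order the rows by the mean sequence number of their blocks.
    sorted_rows.sort(key=lambda row: mean(block["seq"] for block in row))
    return [block for row in sorted_rows for block in row]
```

Ordering rows by mean sequence number keeps the row order close to the original stream order even if a few blocks within a row were numbered out of sequence.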
Note: the coordinate-system rotation formulas are x' = x × cos α + y × sin α and y' = -x × sin α + y × cos α. In the present invention, x' is BaseX_new, y' is BaseY_new, x is BaseX_old, and y is BaseY_old.
Fig. 4G is a schematic diagram of a text block whose typesetting mode is horizontal-type oblique layout; Fig. 4H is a schematic diagram of the content reorganization result obtained by sorting the original blocks of the text block shown in Fig. 4G according to the above method.
Fig. 4I is a schematic diagram of a text block whose typesetting mode is vertical-type oblique layout; Fig. 4J is a schematic diagram of the content reorganization result obtained by sorting the original blocks of the text block shown in Fig. 4I according to the above method.
The effect of the embodiments of the invention is as follows: with the method provided by the invention, the reading order of the layout text can be established automatically according to the typesetting mode of the layout file, restoring the original layout content information. After a layout file has been reverse-parsed, only a simple manual confirmation of the reading order of the article content is needed, which improves the efficiency of reverse parsing and indexing.
Referring to Fig. 5, an embodiment of the invention also provides a text block content reorganizing device, which comprises:
A sorting mode determining unit 40, configured to determine, according to a predefined correspondence between typesetting modes and original block sorting modes, the original block sorting mode corresponding to the typesetting mode of the original blocks in a text block;
An original block sorting unit 41, configured to sort the original blocks in the text block according to the original block sorting mode;
A content restoring unit 42, configured to output and display the sorted original blocks.
Further, the original block sorting unit 41 specifically comprises:
A first sorting unit, configured to sort the original blocks in the text block by the sequence number of each original block when the typesetting mode of the original blocks in the text block is horizontal layout or vertical layout;
A sorting correction unit, configured to perform the following steps for every two adjacent original blocks in the sorting result: calculate the overlap degrees of the two adjacent original blocks and determine the new positional relationship of the two adjacent original blocks from the overlap degrees; if this positional relationship differs from the positional relationship of the two adjacent original blocks in the sorting result, swap the positions of the two adjacent original blocks in the sorting result.
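The behaviour of the sorting correction unit resembles a single pass over adjacent pairs with conditional swaps. The sketch below illustrates that reading; the function name and the callback are hypothetical, and the text does not say whether the pass is repeated, so only one pass is shown.

```python
def correct_order(blocks, first_leads):
    """Swap adjacent blocks whose computed relationship contradicts their order.

    blocks is the list already sorted by sequence number. first_leads(a, b)
    is a caller-supplied predicate returning True when a should stay in front
    of b (for example, a decision based on the overlap degrees).
    """
    result = list(blocks)
    for i in range(len(result) - 1):
        if not first_leads(result[i], result[i + 1]):
            result[i], result[i + 1] = result[i + 1], result[i]
    return result
```

A single pass corrects local inversions but does not fully re-sort the list; for instance, with a plain "less than or equal" predicate the input [3, 2, 1] becomes [2, 1, 3].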
Further, the sorting correction unit is specifically configured to determine the new positional relationship of two adjacent original blocks in the following manner:
Determine the front-back positional relationship and the up-down positional relationship of the two adjacent original blocks;
If the horizontal overlap degree of the two adjacent original blocks is less than a preset first threshold, determine the new positional relationship of the two adjacent original blocks according to their up-down positional relationship;
If the vertical overlap degree of the two adjacent original blocks is less than 0, determine the new positional relationship of the two adjacent original blocks according to their front-back positional relationship;
If the horizontal overlap degree and the vertical overlap degree of the two adjacent original blocks are both greater than a preset second threshold, determine the new positional relationship of the two adjacent original blocks according to their sequence numbers;
If the vertical overlap degree of the two adjacent original blocks is less than their horizontal overlap degree, determine the new positional relationship of the two adjacent original blocks according to the relative magnitude of their left-boundary difference and right-boundary difference;
If the vertical overlap degree of the two adjacent original blocks is not less than their horizontal overlap degree, determine the new positional relationship of the two adjacent original blocks according to the relative magnitude of their upper-boundary difference and lower-boundary difference.
Further, the sorting correction unit is specifically configured to determine the front-back positional relationship of two adjacent original blocks in the following manner:
Calculate the left-boundary difference and the right-boundary difference of the two adjacent original blocks;
If the left-boundary difference is less than the right-boundary difference and the typesetting mode of the original blocks in the text block is left-to-right horizontal layout, left-to-right vertical layout or non-directional vertical layout, determine that the front-back positional relationship of the two adjacent original blocks is: the first of the two adjacent original blocks is in front and the second is behind;
If the left-boundary difference is less than the right-boundary difference and the typesetting mode of the original blocks in the text block is right-to-left horizontal layout or right-to-left vertical layout, determine that the front-back positional relationship of the two adjacent original blocks is: the second of the two adjacent original blocks is in front and the first is behind;
If the left-boundary difference is not less than the right-boundary difference and the typesetting mode of the original blocks in the text block is left-to-right horizontal layout, left-to-right vertical layout or non-directional vertical layout, determine that the front-back positional relationship of the two adjacent original blocks in the text block is: the second of the two adjacent original blocks is in front and the first is behind;
If the left-boundary difference is not less than the right-boundary difference and the typesetting mode of the original blocks in the text block is right-to-left horizontal layout or right-to-left vertical layout, determine that the front-back positional relationship of the two adjacent original blocks in the text block is: the first of the two adjacent original blocks is in front and the second is behind.
Further, the sorting correction unit is specifically configured to determine the up-down positional relationship of two adjacent original blocks in the text block in the following manner:
Calculate the upper-boundary difference and the lower-boundary difference of the two adjacent original blocks;
If the upper-boundary difference is less than the lower-boundary difference, determine that the up-down positional relationship of the two adjacent original blocks in the text block is: the first of the two adjacent original blocks is above and the second is below;
If the upper-boundary difference is not less than the lower-boundary difference, determine that the up-down positional relationship of the two adjacent original blocks in the text block is: the second of the two adjacent original blocks is above and the first is below.
Further, the sorting correction unit is specifically configured to determine the new positional relationship of two adjacent original blocks from the relative magnitude of their left-boundary difference and right-boundary difference in the following manner:
If the left-boundary difference is less than the right-boundary difference and the typesetting mode of the original blocks in the text block is left-to-right horizontal layout or left-to-right vertical layout, determine that the front-back positional relationship of the two adjacent original blocks is: the first of the two adjacent original blocks is in front and the second is behind;
If the left-boundary difference is less than the right-boundary difference and the typesetting mode of the original blocks in the text block is right-to-left horizontal layout or right-to-left vertical layout, determine that the front-back positional relationship of the two adjacent original blocks is: the second of the two adjacent original blocks is in front and the first is behind;
If the left-boundary difference is not less than the right-boundary difference and the typesetting mode of the original blocks in the text block is left-to-right horizontal layout or left-to-right vertical layout, determine that the front-back positional relationship of the two adjacent original blocks in the text block is: the second of the two adjacent original blocks is in front and the first is behind;
If the left-boundary difference is not less than the right-boundary difference and the typesetting mode of the original blocks in the text block is right-to-left horizontal layout or right-to-left vertical layout, determine that the front-back positional relationship of the two adjacent original blocks in the text block is: the first of the two adjacent original blocks is in front and the second is behind.
Further, the sorting correction unit is specifically configured to determine the new positional relationship of two adjacent original blocks from the relative magnitude of their upper-boundary difference and lower-boundary difference in the following manner:
If the upper-boundary difference is less than the lower-boundary difference, determine that the up-down positional relationship of the two adjacent original blocks in the text block is: the first of the two adjacent original blocks is above and the second is below;
If the upper-boundary difference is not less than the lower-boundary difference, determine that the up-down positional relationship of the two adjacent original blocks in the text block is: the second of the two adjacent original blocks is above and the first is below.
When the typesetting mode of the original blocks in the text block is horizontal layout or non-directional vertical layout and the two adjacent original blocks are on the same line in the text block, the first threshold may take a value between -0.07 and -0.1; when the typesetting mode is horizontal layout or non-directional vertical layout and the two adjacent original blocks are on different lines in the text block, the first threshold may take a value between -0.03 and -0.06; when the typesetting mode is left-to-right or right-to-left vertical layout, the first threshold may be 0. The second threshold may take a value between 0.5 and 1.
Further, the sorting correction unit is also configured to determine whether two adjacent original blocks are on the same line in the text block in the following manner:
Compare the horizontal baseline values of the two adjacent original blocks, and determine the line spacing from the font size of the original block with the larger horizontal baseline value;
Calculate the horizontal baseline difference of the two adjacent original blocks; if the horizontal baseline difference is greater than the line spacing, determine that the two adjacent original blocks are on different lines in the text block; otherwise, determine that the two adjacent original blocks are on the same line in the text block.
Further, the original block sorting unit 41 specifically comprises:
A second sorting unit, configured to sort the original blocks in the text block by the sequence number of each original block when the typesetting mode of the original blocks in the text block is oblique layout;
A rotation transformation unit, configured to apply a rotation transformation to the coordinate system in which the text block lies;
A grouping unit, configured to group the sorted original blocks according to their position coordinates in the rotated coordinate system, such that the original blocks in the same group are on the same row or in the same column;
A third sorting unit, configured to sort the groups according to the mean sequence number of the original blocks in each group, and, for each group, to sort the original blocks within the group according to the writing direction of the text block.
Further, the rotation transformation unit is specifically configured to:
Calculate the horizontal baseline difference and the vertical baseline difference of every two adjacent original blocks after sorting; determine the horizontal baseline difference that occurs most often and the vertical baseline difference that occurs most often; and calculate the slope of the oblique line from the most frequent horizontal baseline difference and the most frequent vertical baseline difference;
Calculate the coordinate-system rotation angle from the slope;
Apply a rotation transformation to the coordinate system in which the text block lies according to the coordinate-system rotation angle.
Further, the grouping unit is specifically configured to perform the following steps for every two adjacent original blocks after sorting:
Calculate the horizontal baseline values, in the rotated coordinate system, of the former original block and the latter original block of the two adjacent original blocks; determine whether the difference between the calculated horizontal baseline value of the latter original block and that of the former original block is less than a predefined third threshold; if so, determine that the former original block and the latter original block are on the same row, and place them in the same group; otherwise, determine that the former original block and the latter original block are on different rows, and place them in different groups. The third threshold may be 0.91 to 1 times the original block height.
Further, the third sorting unit is specifically configured to sort the original blocks within a group according to the writing direction of the text block in the following manner:
When the slope is greater than 1, determine that the typesetting mode of the original blocks in the text block is vertical-type oblique layout, and sort the original blocks within the group in ascending order of their upper-boundary values;
When the slope is not greater than 1, determine that the typesetting mode of the original blocks in the text block is horizontal-type oblique layout, and sort the original blocks within the group in ascending order of their left-boundary values.
In summary, the beneficial effects of the invention include the following:
In the scheme provided by the embodiments of the invention, the original block sorting mode corresponding to the typesetting mode of the original blocks in a text block is determined according to a predefined correspondence between typesetting modes and original block sorting modes; the original blocks in the text block are then sorted according to this original block sorting mode, and the sorted original blocks are output and displayed. It can be seen that the invention makes it possible to sort the original blocks in a text block that uses a given typesetting mode, so as to restore the reading order of the original blocks in that text block.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.
Obviously, those skilled in the art can make various changes and variations to the invention without departing from its spirit and scope. Thus, if these modifications and variations of the invention fall within the scope of the claims of the invention and their technical equivalents, the invention is also intended to include them.

Claims (13)

1. A text block content reorganizing method, characterized in that the method comprises:
Determining, according to a predefined correspondence between typesetting modes and original block sorting modes, the original block sorting mode corresponding to the typesetting mode of the original blocks in a text block;
Sorting the original blocks in the text block according to the original block sorting mode;
Outputting and displaying the sorted original blocks.
2. the method for claim 1 is characterized in that, when the type-setting mode of original block was horizontally-arranged or vertical setting of types in said literal piece, said original block sortord was:
Sequence number based on each original block in the said literal piece sorts the original block in the said literal piece;
Per two adjacent original blocks in the ranking results are carried out following steps:
Calculate the degree of overlapping of two adjacent original blocks, confirm the new position relation of two adjacent original blocks according to the degree of overlapping of two adjacent original blocks; If this position relation is different with the position relation of two adjacent original blocks in ranking results, then with the location swap of two adjacent original blocks in ranking results.
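The adjacent-pair correction of claim 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `correct_order` and the callback `first_precedes` are hypothetical, the overlap-based decision is left to the caller, and a single left-to-right pass is an assumption, since the claim does not say whether pairs are re-examined after a swap.

```python
from typing import Callable, List, TypeVar

B = TypeVar("B")


def correct_order(blocks: List[B], first_precedes: Callable[[B, B], bool]) -> List[B]:
    """Walk adjacent pairs once and swap a pair when the geometric test
    (supplied by the caller) disagrees with the current order.

    `blocks` is assumed to be already sorted by sequence number.
    """
    result = list(blocks)  # work on a copy; the input list is left untouched
    for i in range(len(result) - 1):
        a, b = result[i], result[i + 1]
        if not first_precedes(a, b):
            # The new positional relationship differs from the sorted order.
            result[i], result[i + 1] = b, a
    return result


# Hypothetical usage: blocks are dicts, and the stand-in test compares left edges.
blocks = [{"seq": 1, "left": 50}, {"seq": 2, "left": 10}, {"seq": 3, "left": 90}]
reordered = correct_order(blocks, lambda a, b: a["left"] <= b["left"])
```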
3. The method of claim 2, characterized in that determining the new positional relationship of the two adjacent original blocks according to their overlap degree comprises:
determining the front-back positional relationship and the up-down positional relationship of the two adjacent original blocks;
if the overlap degree of the two adjacent original blocks in the horizontal direction is less than a preset first threshold, determining the new positional relationship of the two adjacent original blocks according to their up-down positional relationship;
if the overlap degree of the two adjacent original blocks in the vertical direction is less than 0, determining the new positional relationship of the two adjacent original blocks according to their front-back positional relationship;
if the overlap degree of the two adjacent original blocks in the horizontal direction and their overlap degree in the vertical direction are both greater than a preset second threshold, determining the new positional relationship of the two adjacent original blocks according to their sequence numbers;
if the overlap degree of the two adjacent original blocks in the vertical direction is less than their overlap degree in the horizontal direction, determining the new positional relationship of the two adjacent original blocks according to the magnitude relationship between their left boundary difference and their right boundary difference;
if the overlap degree of the two adjacent original blocks in the vertical direction is not less than their overlap degree in the horizontal direction, determining the new positional relationship of the two adjacent original blocks according to the magnitude relationship between their top boundary difference and their bottom boundary difference.
4. The method of claim 3, characterized in that determining the new positional relationship of the two adjacent original blocks according to the magnitude relationship between their left boundary difference and their right boundary difference comprises:
if the left boundary difference is less than the right boundary difference and the typesetting mode of the original blocks in the text block is left-to-right horizontal layout, left-to-right vertical layout, or directionless vertical layout, determining that the front-back positional relationship of the two adjacent original blocks is: the first of the two adjacent original blocks is in front and the second is behind;
if the left boundary difference is less than the right boundary difference and the typesetting mode of the original blocks in the text block is right-to-left horizontal layout or right-to-left vertical layout, determining that the front-back positional relationship of the two adjacent original blocks is: the second of the two adjacent original blocks is in front and the first is behind;
if the left boundary difference is not less than the right boundary difference and the typesetting mode of the original blocks in the text block is left-to-right horizontal layout, left-to-right vertical layout, or directionless vertical layout, determining that the front-back positional relationship of the two adjacent original blocks in the text block is: the second of the two adjacent original blocks is in front and the first is behind;
if the left boundary difference is not less than the right boundary difference and the typesetting mode of the original blocks in the text block is right-to-left horizontal layout or right-to-left vertical layout, determining that the front-back positional relationship of the two adjacent original blocks in the text block is: the first of the two adjacent original blocks is in front and the second is behind.
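The four cases of claim 4 reduce to a single comparison, as the sketch below shows. The mode names and the function `first_is_in_front` are hypothetical labels introduced for illustration; how the left and right boundary differences are computed is not stated in the claims, so they are taken as inputs here.

```python
# Hypothetical labels for the typesetting modes named in claim 4.
LTR_MODES = {"horizontal_ltr", "vertical_ltr", "vertical_directionless"}
RTL_MODES = {"horizontal_rtl", "vertical_rtl"}


def first_is_in_front(left_diff: float, right_diff: float, mode: str) -> bool:
    """Return True when the first of two adjacent original blocks is in front.

    Claim 4 as a truth table:
      left < right,  LTR-type mode -> first in front
      left < right,  RTL-type mode -> second in front
      left >= right, LTR-type mode -> second in front
      left >= right, RTL-type mode -> first in front
    """
    if mode not in LTR_MODES | RTL_MODES:
        raise ValueError(f"unknown typesetting mode: {mode}")
    left_is_smaller = left_diff < right_diff
    is_ltr = mode in LTR_MODES
    # The first block leads exactly when the two conditions agree.
    return left_is_smaller == is_ltr
```

Collapsing the table into `left_is_smaller == is_ltr` makes the symmetry between the left-to-right and right-to-left cases explicit.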
5. The method of claim 3, characterized in that determining the new positional relationship of the two adjacent original blocks according to the magnitude relationship between their top boundary difference and their bottom boundary difference comprises:
if the top boundary difference is less than the bottom boundary difference, determining that the up-down positional relationship of the two adjacent original blocks in the text block is: the first of the two adjacent original blocks is above and the second is below;
if the top boundary difference is not less than the bottom boundary difference, determining that the up-down positional relationship of the two adjacent original blocks in the text block is: the second of the two adjacent original blocks is above and the first is below.
6. The method of claim 3, characterized in that, when the typesetting mode of the original blocks in the text block is horizontal layout or directionless vertical layout and the two adjacent original blocks are determined to be in the same row of the text block, the first threshold takes a value between -0.07 and -0.1;
when the typesetting mode of the original blocks in the text block is horizontal layout or directionless vertical layout and the two adjacent original blocks are determined to be in different rows of the text block, the first threshold takes a value between -0.03 and -0.06;
when the typesetting mode of the original blocks in the text block is left-to-right or right-to-left vertical layout, the value of the first threshold is 0.
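The threshold selection of claim 6 can be written as a small lookup. This is a sketch under stated assumptions: the mode labels and the function `first_threshold_range` are hypothetical, and because the claim gives only a range for two of the cases, the function returns the range as a (low, high) pair rather than picking a value.

```python
from typing import Tuple


def first_threshold_range(mode: str, same_row: bool) -> Tuple[float, float]:
    """Return the (low, high) range allowed for the first threshold in claim 6.

    `mode` uses hypothetical labels; `same_row` states whether the two
    adjacent original blocks were determined to lie in the same row.
    """
    if mode in ("vertical_ltr", "vertical_rtl"):
        # Directed vertical layout: the threshold is fixed at 0.
        return (0.0, 0.0)
    if mode in ("horizontal", "vertical_directionless"):
        return (-0.1, -0.07) if same_row else (-0.06, -0.03)
    raise ValueError(f"unknown typesetting mode: {mode}")
```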
7. the method for claim 1 is characterized in that, when the type-setting mode of original block was oblique row in said literal piece, said original block sortord was:
Sequence number based on each original block in the said literal piece sorts the original block in the said literal piece;
The coordinate system at said literal piece place is rotated conversion;
Position coordinates according in the coordinate system of ordering each original block of back after rotational transform divides into groups each original block, and the original block after dividing into groups in same group is positioned at delegation or same row;
According to original block sequence number mean value of each group of back of dividing into groups, each group after dividing into groups is sorted, and, divide into groups for each, sort according to the writing direction of the said literal piece interior original block that will divide into groups.
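The last step of claim 7, ordering the groups by mean sequence number and then ordering the blocks inside each group, can be sketched as below. The dictionary keys (`seq`, `left`) and the function `order_groups` are hypothetical; the in-group sort key stands in for the writing-direction rule, which the claim leaves to claim 10.

```python
from statistics import mean
from typing import Callable, Dict, List

Block = Dict[str, float]


def order_groups(groups: List[List[Block]],
                 in_group_key: Callable[[Block], float]) -> List[List[Block]]:
    """Sort groups by the mean sequence number of their blocks, then sort
    the blocks inside each group with the supplied key."""
    ordered = sorted(groups, key=lambda g: mean(b["seq"] for b in g))
    return [sorted(g, key=in_group_key) for g in ordered]


# Hypothetical usage: two rows, each sorted left to right afterwards.
rows = [
    [{"seq": 4, "left": 5}, {"seq": 5, "left": 1}],
    [{"seq": 1, "left": 9}, {"seq": 2, "left": 3}],
]
result = order_groups(rows, in_group_key=lambda b: b["left"])
```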
8. The method of claim 7, characterized in that applying a rotation transformation to the coordinate system in which the text block lies comprises:
calculating the horizontal baseline difference and the vertical baseline difference of every two adjacent original blocks after sorting; determining the most frequently occurring horizontal baseline difference and the most frequently occurring vertical baseline difference among the calculated values; and calculating the slope of the oblique line from the most frequent horizontal baseline difference and the most frequent vertical baseline difference;
calculating the rotation angle of the coordinate system from the slope;
rotating the coordinate system in which the text block lies according to the rotation angle.
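A possible reading of claim 8 is sketched below. Several points are assumptions rather than statements from the claims: the horizontal baseline is treated as a y value and the vertical baseline as an x value, the slope is taken as the ratio of the most frequent y difference to the most frequent x difference, the angle is the arctangent of the slope, and differences are rounded before counting so that near-equal values fall into one bin. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def most_common_diff(values: Sequence[float], ndigits: int = 1) -> float:
    """Most frequent difference between consecutive values (needs >= 2 values)."""
    diffs = [round(b - a, ndigits) for a, b in zip(values, values[1:])]
    return Counter(diffs).most_common(1)[0][0]


def oblique_slope(h_baselines: Sequence[float], v_baselines: Sequence[float]) -> float:
    """Slope of the oblique line, assumed here to be dy / dx."""
    dy = most_common_diff(h_baselines)  # horizontal baselines: y values (assumption)
    dx = most_common_diff(v_baselines)  # vertical baselines: x values (assumption)
    return math.inf if dx == 0 else dy / dx


def rotate_point(x: float, y: float, angle: float) -> Tuple[float, float]:
    """Coordinates of (x, y) in a coordinate system rotated by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (x * c + y * s, -x * s + y * c)


# Blocks stepping diagonally by (10, 10), with one outlier at the end.
slope = oblique_slope([0, 10, 20, 30, 100], [0, 10, 20, 30, 0])
angle = math.atan(slope)
x_rot, y_rot = rotate_point(10, 10, angle)
```

With this rotation, points that lie on one oblique line through the origin share the same rotated y coordinate, which is what the grouping step in claim 9 relies on.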
9. The method of claim 7, characterized in that grouping the sorted original blocks according to their position coordinates in the rotated coordinate system comprises:
for every two adjacent original blocks after sorting, performing the following steps:
calculating the horizontal baseline values, in the rotated coordinate system, of the preceding original block and the following original block of the two adjacent original blocks; determining whether the difference between the calculated horizontal baseline value of the following original block and that of the preceding original block is less than a preset third threshold; if so, determining that the preceding original block and the following original block lie in the same row, and assigning them to the same group; otherwise, determining that the preceding original block and the following original block lie in different rows, and assigning them to different groups.
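The grouping rule of claim 9 can be sketched as a single scan over the rotated baselines. The function `group_by_baseline` is hypothetical, and comparing the absolute value of the difference against the third threshold is an assumption: the claim speaks only of "the difference".

```python
from typing import List, Sequence


def group_by_baseline(baselines: Sequence[float], threshold: float) -> List[List[int]]:
    """Group block indices into rows.

    `baselines[i]` is the horizontal baseline of the i-th sorted block in the
    rotated coordinate system. A block joins the preceding block's group when
    their baselines differ by less than `threshold`; otherwise it starts a
    new group.
    """
    groups: List[List[int]] = []
    for i, value in enumerate(baselines):
        if i > 0 and abs(value - baselines[i - 1]) < threshold:
            groups[-1].append(i)  # same row as the preceding block
        else:
            groups.append([i])    # different row: open a new group
    return groups


rows = group_by_baseline([0.0, 0.5, 10.0, 10.2, 20.0], threshold=2.0)
```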
10. The method of claim 7, characterized in that sorting the original blocks within a group according to the writing direction of the text block comprises:
when the slope is greater than 1, determining that the typesetting mode of the original blocks in the text block is vertical-type oblique layout, and sorting the original blocks within the group in ascending order of their top boundary values;
when the slope is not greater than 1, determining that the typesetting mode of the original blocks in the text block is horizontal-type oblique layout, and sorting the original blocks within the group in ascending order of their left boundary values.
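The in-group ordering of claim 10 is a choice of sort key driven by the slope. In the sketch below the dictionary keys `top` and `left` and the function `sort_within_group` are hypothetical names for the top and left boundary values of an original block.

```python
from typing import Dict, List

Block = Dict[str, float]


def sort_within_group(group: List[Block], slope: float) -> List[Block]:
    """Sort the blocks of one group according to claim 10.

    slope > 1  -> vertical-type oblique layout, ascending top boundary
    slope <= 1 -> horizontal-type oblique layout, ascending left boundary
    """
    key = "top" if slope > 1 else "left"
    return sorted(group, key=lambda block: block[key])


group = [{"id": 1, "top": 30, "left": 5}, {"id": 2, "top": 10, "left": 20}]
steep = sort_within_group(group, slope=2.0)    # ordered by top boundary
shallow = sort_within_group(group, slope=0.5)  # ordered by left boundary
```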
11. A text block content reorganizing device, characterized in that the device comprises:
a sorting mode determining unit, configured to determine, according to a predefined correspondence between typesetting modes and original block sorting modes, the original block sorting mode corresponding to the typesetting mode of the original blocks in a text block;
an original block sorting unit, configured to sort the original blocks in the text block according to the original block sorting mode;
a content restoring unit, configured to output the sorted original blocks for display.
12. The device of claim 11, characterized in that the original block sorting unit comprises:
a first sorting unit, configured to sort the original blocks in the text block according to the sequence number of each original block in the text block when the typesetting mode of the original blocks in the text block is horizontal layout or vertical layout;
a sorting correction unit, configured to perform the following steps on every two adjacent original blocks in the sorting result: calculating the overlap degree of the two adjacent original blocks, and determining a new positional relationship of the two adjacent original blocks according to their overlap degree; if this positional relationship differs from the positional relationship of the two adjacent original blocks in the sorting result, swapping the positions of the two adjacent original blocks in the sorting result.
13. The device of claim 11, characterized in that the original block sorting unit comprises:
a second sorting unit, configured to sort the original blocks in the text block according to the sequence number of each original block in the text block when the typesetting mode of the original blocks in the text block is oblique layout;
a rotation transformation unit, configured to apply a rotation transformation to the coordinate system in which the text block lies;
a grouping unit, configured to group the sorted original blocks according to their position coordinates in the rotated coordinate system, such that the original blocks within one group lie in the same row or the same column;
a third sorting unit, configured to sort the groups according to the mean sequence number of the original blocks in each group, and, for each group, to sort the original blocks within the group according to the writing direction of the text block.
CN201010621806.4A 2010-12-27 2010-12-27 Text block content reorganizing method and device Active CN102541826B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010621806.4A CN102541826B (en) 2010-12-27 2010-12-27 Text block content reorganizing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010621806.4A CN102541826B (en) 2010-12-27 2010-12-27 Text block content reorganizing method and device

Publications (2)

Publication Number Publication Date
CN102541826A true CN102541826A (en) 2012-07-04
CN102541826B CN102541826B (en) 2014-08-06

Family

ID=46348752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010621806.4A Active CN102541826B (en) 2010-12-27 2010-12-27 Text block content reorganizing method and device

Country Status (1)

Country Link
CN (1) CN102541826B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112906347A (en) * 2021-03-22 2021-06-04 掌阅科技股份有限公司 Character typesetting method, electronic equipment and storage medium
CN115618847A (en) * 2022-12-20 2023-01-17 浙江保融科技股份有限公司 Method and device for analyzing PDF document and readable storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1131301A (en) * 1995-03-13 1996-09-18 财团法人工业技术研究院 Word cutting method
US6175844B1 (en) * 1997-05-29 2001-01-16 Adobe Systems Incorporated Ordering groups of text in an image
CN1604075A (en) * 2004-11-22 2005-04-06 北京北大方正技术研究院有限公司 Method for conducting words reading sequence recovery for newspaper pages
CN1604074A (en) * 2004-11-22 2005-04-06 北京北大方正技术研究院有限公司 Method for determining words reading sequence for columned serial words pages with mutually exclusive pattern and characters
CN101206639A (en) * 2007-12-20 2008-06-25 北大方正集团有限公司 Method for indexing complex impression based on PDF
CN101308488A (en) * 2008-06-05 2008-11-19 北大方正集团有限公司 Document stream type information processing method based on format document and device therefor
CN101866418A (en) * 2009-04-17 2010-10-20 株式会社理光 Method and equipment for determining file reading sequences

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112906347A (en) * 2021-03-22 2021-06-04 掌阅科技股份有限公司 Character typesetting method, electronic equipment and storage medium
CN112906347B (en) * 2021-03-22 2021-10-15 掌阅科技股份有限公司 Character typesetting method, electronic equipment and storage medium
CN115618847A (en) * 2022-12-20 2023-01-17 浙江保融科技股份有限公司 Method and device for analyzing PDF document and readable storage medium

Also Published As

Publication number Publication date
CN102541826B (en) 2014-08-06

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220615

Address after: 3007, Hengqin international financial center building, No. 58, Huajin street, Hengqin new area, Zhuhai, Guangdong 519031

Patentee after: New founder holdings development Co.,Ltd.

Patentee after: Beijing Fangzheng apapi Technology Co., Ltd.

Address before: 100871, Beijing, Haidian District Cheng Fu Road 298, founder building, 9 floor

Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before: Beijing Fangzheng apapi Technology Co., Ltd.
