CN104516867A - Table reordering method and table reordering system - Google Patents

Table reordering method and table reordering system

Publication number
CN104516867A
Authority
CN
China
Prior art keywords
cell
line
information
width
logical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310462279.0A
Other languages
Chinese (zh)
Inventor
冯浩然
丁力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Founder Group Co Ltd
Beijing Founder Apabi Technology Co Ltd
Original Assignee
Peking University Founder Group Co Ltd
Beijing Founder Apabi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Founder Group Co Ltd and Beijing Founder Apabi Technology Co Ltd
Priority to CN201310462279.0A
Publication of CN104516867A
Legal status: Pending

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The invention relates to a table reordering method and a table reordering system. The method comprises the steps of: setting up table logical structure information; acquiring the total row number and the total column number of a table, calculating the table width, and calculating the initial column number and the column width of each table cell; traversing each table cell, acquiring contents of the table cells, and separately composing the table cells by taking the table cells as a composing object; adding line width and table cell separation distance, combining the table cells into logical lines, combining the logical lines into a logical table, and arranging in an objective display area to draw the table. According to the improved table reordering method and the table reordering system, the original format of the table structure is not changed, the logical structure of the table can be reordered under the display limitation, and a technical problem in the prior art that table data cannot be reasonably processed when the format document is displayed in display equipment under the limitation of display by using the traditional table reordering method is solved.

Description

Table reordering method and system
Technical field
The present invention relates to a table reordering method and system and belongs to the field of computer information processing.
Background
A fixed-layout document records, in its own coordinate system, the exact position and size at which each piece of document content is displayed. The printed result therefore matches what is seen on screen, the display is consistent in any computing environment, and the original appearance of the document is preserved. Because of this stability, fixed-layout documents are well suited as the final publication and distribution format for electronic documents, and they are widely used for electronic official documents, e-books, electronic journals and electronic newspapers. However, because a fixed-layout document describes layout in absolute terms, it lacks the logical structure of some data, and tables are an important example.
In a fixed-layout document, a table is stored no differently from an arbitrary arrangement of lines and text; it has no explicit logical structure. This representation causes problems when the display screen is of limited size: the display device does not have enough information to process the graphic primitives that make up the table and produce a result that fits the screen. Most display devices fall back on simple approaches, including:
1) Browsing the original document with scroll bars.
Because the visible area is limited, the user may have to scroll across several screens to see a complete table row. One important use of a table is to present and compare structured data, so being unable to read a whole row at once is inconvenient. In addition, some devices do not support scroll bars.
2) Scaling the table proportionally.
The weakness of this approach is that scaling severely degrades display quality.
3) Some less capable systems either do not display such tables at all or simply crop them, showing only the part that falls within the screen area. Because these simple approaches destroy the structure and content of the table, they are not advisable.
4) Formatting the table in the fixed-layout document according to the fixed layout and, during reflow layout, reflowing it according to its topological structure in the fixed layout. This approach must store the table's structure and logical information for both the fixed layout and the reflow layout. Although it solves the table reflow problem to some extent, the amount of information the document must store makes processing more complex, and placing information that is used only for reflow inside the fixed-layout description is itself logically confusing.
In summary, fixed-layout documents in the prior art lack table structure, carry incomplete logical structure, or express that structure poorly. As a result, when such a document is shown on a display device with a limited screen size, the table data in it cannot be handled sensibly.
Summary of the invention
The technical problem to be solved by the present invention is that prior-art table reordering methods cannot handle table data sensibly when a fixed-layout document is shown on a display device with a limited screen size. The invention provides an improved table reordering method and system that leave the fixed-layout description of the table unchanged while allowing the table to be reordered, with its logical structure preserved, when the display is limited.
To solve the above technical problem, the present invention is implemented through the following technical solutions:
A table reordering method, comprising the following steps:
Building table logical structure information;
Obtaining the total number of rows and the total number of columns of the table, calculating the table line widths, and calculating the starting column number and column width of each cell;
Traversing each cell, obtaining its content, and laying out each cell separately as a layout object;
After adding the line widths and cell spacing, combining the cells into logical rows, combining the logical rows into a logical table, and arranging it in the target display area to draw the table.
In the table reordering method, the process of building table logical structure information comprises:
Obtaining the table logical structure information and the corresponding layout description information directly from the table source file by layout analysis;
Obtaining the table logical structure information and the corresponding layout description information from the converted fixed-layout table file;
Obtaining the table logical structure information and the corresponding layout description information by accepting external input.
In the table reordering method, the logical structure of the table and the corresponding layout description information are associated through unique graphic primitive numbers.
The table reordering method further comprises a line width setting process in which a default line width parameter is set.
In the table reordering method, the process of calculating the table line widths comprises:
Traversing the rows and obtaining row information by row index, where the row style includes line width and color information and applies only to the first and last column lines of the current row and to its upper and lower row lines;
Refreshing the widths of the row lines and column lines: if the current row carries no line width information, the default line width parameter is used; if it does, the current value is kept. All line width information is stored in two arrays, one for horizontal lines and one for vertical lines;
Traversing all cells of the current row and further refreshing the widths of the row lines and column lines.
In the table reordering method, the process of calculating the starting column number of each cell comprises: setting up a two-dimensional array whose size is the product of the table's row count and column count, where each element corresponds to a unique position in the table and indicates whether that position can serve as the starting point of a cell; traversing all cells row by row according to the cell attributes to obtain the starting column number of each cell; and calculating the width of each column from the column proportions and the line width information.
In the table reordering method, if a maximum column width and a minimum column width are set in the style referenced by the table, the width of each column is readjusted according to these values.
In the table reordering method, while each cell is traversed, its content obtained, and the cell laid out separately as a layout object, the layout height is not limited, and the actual layout height of each cell is obtained after layout finishes.
In the table reordering method, the process of combining the cells into logical rows after adding the line widths and cell spacing, combining the logical rows into a logical table, and arranging it in the target display area to draw the table comprises: if the current page cannot show the complete table, recording the logical row information that has not been displayed and, after a page turn, calculating and laying out only the undisplayed part.
In the table reordering method, during the process of drawing the table, if a row group has the table header attribute, that row group is laid out and drawn first.
A table reordering system, comprising:
Construction module: builds table logical structure information;
Row and column acquisition module: obtains the total number of rows and the total number of columns of the table, calculates the table line widths, and calculates the starting column number and column width of each cell;
Layout module: traverses each cell, obtains its content, and lays out each cell separately as a layout object;
Drawing module: after adding the line widths and cell spacing, combines the cells into logical rows, combines the logical rows into a logical table, and arranges it in the target display area to draw the table.
In the table reordering system, the construction module comprises:
Layout analysis submodule: obtains the table logical structure information and the corresponding layout description information directly from the table source file by layout analysis;
Conversion submodule: obtains the table logical structure information and the corresponding layout description information from the converted fixed-layout table file;
Corresponding subunit: obtains the table logical structure information and the corresponding layout description information by accepting external input.
In the table reordering system, the logical structure of the table and the corresponding layout description information are associated through unique graphic primitive numbers.
The table reordering system further comprises a line width setting unit that sets a default line width parameter.
In the table reordering system, the process of calculating the table line widths comprises:
Traversing the rows and obtaining row information by row index, where the row style includes line width and color information and applies only to the first and last column lines of the current row and to its upper and lower row lines;
Refreshing the widths of the row lines and column lines: if the current row carries no line width information, the default line width parameter is used; if it does, the current value is kept. All line width information is stored in two arrays, one for horizontal lines and one for vertical lines;
Traversing all cells of the current row and further refreshing the widths of the row lines and column lines.
In the table reordering system, the process of calculating the starting column number of each cell comprises: setting up a two-dimensional array whose size is the product of the table's row count and column count, where each element corresponds to a unique position in the table and indicates whether that position can serve as the starting point of a cell; traversing all cells row by row according to the cell attributes to obtain the starting column number of each cell; and calculating the width of each column from the column proportions and the line width information.
In the table reordering system, if a maximum column width and a minimum column width are set in the style referenced by the table, the width of each column is readjusted according to these values.
In the table reordering system, the layout module does not limit the layout height and obtains the actual layout height of each cell after layout finishes.
In the table reordering system, the drawing module comprises a partial drawing submodule: if the current page cannot show the complete table, the logical row information that has not been displayed is recorded and, after a page turn, only the undisplayed part is calculated and laid out.
In the table reordering system, when the drawing module draws the table, if a row group has the table header attribute, that row group is laid out and drawn first.
Compared with the prior art, the above technical solutions of the present invention have the following advantages:
(1) The table reordering method and system of the present invention build table logical structure information; obtain the total number of rows and columns of the table, calculate the table line widths, and calculate the starting column number and column width of each cell; traverse each cell, obtain its content, and lay out each cell separately as a layout object; and, after adding the line widths and cell spacing, combine the cells into logical rows, combine the logical rows into a logical table, and arrange it in the target display area to draw the table. This provides an improved table reordering method and system in which the fixed-layout description of the table stays unchanged while the table can still be reordered, with its logical structure preserved, when the display is limited. It solves the technical problem that prior-art table reordering methods cannot handle table data sensibly when a fixed-layout document is shown on a display device with a limited screen size.
(2) In the table reordering method, the source document is first parsed and converted into the layout description used by the present invention, layout understanding is then used to confirm the number of tables and their regions, and the table logical structure information is finally completed by accepting external input. Parsing the source document improves processing efficiency, and the subsequent external adjustment ensures accuracy, so the table logical structure information is built efficiently and accurately.
(3) The table reordering method also comprises a line width setting process in which a default line width parameter is set. Row and column lines that carry line width information use the existing value, and those that do not are filled in with the default value. This keeps the information complete and makes the style adjustable and configurable.
(4) In the table reordering method, if a maximum column width and a minimum column width are set in the style referenced by the table, the width of each column is readjusted according to these values. Column widths are thus controllable within a given range, which is more user-friendly, makes it easy to adjust and configure them as needed, and preserves the integrity of the table.
(5) In the table reordering method, while each cell is laid out separately as a layout object after its content is obtained, the layout height is not limited, and the actual layout height of each cell is obtained after layout finishes. The height can be adjusted as needed, or the display mode can be user-defined (full or partial display), which is more user-friendly and flexible.
(6) In the table reordering method, if the current page cannot show the complete table, the logical row information that has not been displayed is recorded and, after a page turn, only the undisplayed part is calculated and laid out. Paged display thus preserves the integrity of the table.
(7) In the table reordering method, during the process of drawing the table, if a row group has the table header attribute, that row group is laid out and drawn first. Drawing the header first restores the table more completely and further improves its readability.
Brief description of the drawings
To make the content of the present invention easier to understand, the invention is described in further detail below with reference to the accompanying drawings, in which:
Fig. 1 is a flow diagram of the table reordering method of the present invention;
Fig. 2 is an overall structure diagram of the table of the present invention;
Fig. 3 is a row group structure diagram of the table of the present invention;
Fig. 4 is a schematic diagram of a table data description generation system;
Fig. 5 is a schematic diagram of the table data description parsing and display system in the present embodiment.
Embodiments
Embodiments of the table reordering method and system of the present invention are given below.
Embodiment 1
This embodiment provides a table reordering method which, as shown in Fig. 1, comprises the following steps:
(1) Build table logical structure information. The process is as follows:
1.1 Obtain the table logical structure information and the corresponding layout description information directly from the table source file by layout analysis. The logical structure of the table and the corresponding layout description information are associated through unique graphic primitive numbers.
1.2 Obtain the table logical structure information and the corresponding layout description information from the converted fixed-layout table file;
1.3 Obtain the table logical structure information and the corresponding layout description information by accepting external input.
In this embodiment, the source document is first parsed and converted into the layout description used by the present invention, layout understanding is then used to confirm the number of tables and their regions, and the table logical structure information is finally completed by accepting external input. Parsing the source document improves processing efficiency, and the subsequent external adjustment ensures accuracy, so the table logical structure information is built efficiently and accurately.
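The association between logical structure and layout description can be pictured with a small data model. The following is only a minimal sketch: the class names, field names and the tuple used for a layout primitive are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Layout description: primitive ID -> (x, y, width, height, text).
# This tuple shape is an assumption made for the sketch.
Primitive = Tuple[float, float, float, float, str]


@dataclass
class Cell:
    row: int
    col_span: int = 1
    row_span: int = 1
    # Unique primitive IDs that tie this logical cell to the layout description.
    primitive_ids: List[int] = field(default_factory=list)


@dataclass
class LogicalTable:
    col_count: int
    cells: List[Cell] = field(default_factory=list)


def cell_text(cell: Cell, layout: Dict[int, Primitive]) -> str:
    """Collect a cell's text by following its primitive IDs into the layout."""
    return " ".join(layout[pid][4] for pid in cell.primitive_ids)


layout = {
    101: (10.0, 10.0, 40.0, 12.0, "Name"),
    102: (50.0, 10.0, 40.0, 12.0, "Score"),
}
table = LogicalTable(
    col_count=2,
    cells=[Cell(row=0, primitive_ids=[101]), Cell(row=0, primitive_ids=[102])],
)
```

Because the logical table stores only IDs, the fixed-layout description itself is left untouched, which matches the stated goal of keeping the layout description unchanged.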
(2) Obtain the total number of rows and the total number of columns of the table, calculate the table line widths, and calculate the starting column number and column width of each cell.
2.1 The process of calculating the table line widths comprises:
Traversing the rows and obtaining row information by row index, where the row style includes line width and color information and applies only to the first and last column lines of the current row and to its upper and lower row lines;
Refreshing the widths of the row lines and column lines: if the current row carries no line width information, the default line width parameter is used; if it does, the current value is kept. All line width information is stored in two arrays, one for horizontal lines and one for vertical lines;
Traversing all cells of the current row and further refreshing the widths of the row lines and column lines.
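The row-level part of step 2.1 might be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the array lengths (rows + 1 and columns + 1), the dictionary-based row style, and the rule that a line shared by two rows keeps the larger width are all choices made here, and the per-cell refinement pass is omitted.

```python
def compute_line_widths(row_styles, n_cols, default_width=1.0):
    """Return (horizontal, vertical) line-width arrays for a table.

    row_styles: one dict per row; an optional "line_width" key is assumed.
    A row style affects only the row's upper and lower row lines and the
    first and last column lines, as described in the text.
    """
    n_rows = len(row_styles)
    horizontal = [None] * (n_rows + 1)  # row lines, top to bottom
    vertical = [None] * (n_cols + 1)    # column lines, left to right

    def refresh(lines, index, width):
        # Assumption: when two rows share a line, keep the larger width.
        if lines[index] is None or width > lines[index]:
            lines[index] = width

    for r, style in enumerate(row_styles):
        width = style.get("line_width")
        if width is None:
            continue  # no row-level info; the default is filled in below
        refresh(horizontal, r, width)
        refresh(horizontal, r + 1, width)
        refresh(vertical, 0, width)
        refresh(vertical, n_cols, width)

    # Lines that received no explicit width fall back to the default.
    horizontal = [default_width if w is None else w for w in horizontal]
    vertical = [default_width if w is None else w for w in vertical]
    return horizontal, vertical
```

For example, `compute_line_widths([{"line_width": 2.0}, {}], n_cols=3)` widens the lines around the first row and leaves the rest at the default.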
2.2 The process of calculating the starting column number of each cell comprises: setting up a two-dimensional array whose size is the product of the table's row count and column count, where each element corresponds to a unique position in the table and indicates whether that position can serve as the starting point of a cell; traversing all cells row by row according to the cell attributes to obtain the starting column number of each cell; and calculating the width of each column from the column proportions and the line width information. If a maximum column width and a minimum column width are set in the style referenced by the table, the width of each column is readjusted according to these values.
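The two-dimensional array in step 2.2 behaves like an occupancy grid: a position stops being a valid starting point once a spanning cell covers it. A minimal sketch, assuming each cell is given as a hypothetical `(col_span, row_span)` pair listed in row order:

```python
def assign_start_columns(rows, n_cols):
    """Compute the starting column of every cell.

    rows: list of rows, each a list of (col_span, row_span) pairs.
    Returns a list of lists holding each cell's starting column.
    """
    n_rows = len(rows)
    # free[r][c] is True while position (r, c) can still start a cell.
    free = [[True] * n_cols for _ in range(n_rows)]
    starts = []
    for r, row in enumerate(rows):
        row_starts = []
        c = 0
        for col_span, row_span in row:
            # Skip positions already covered by a cell spanning from above.
            while c < n_cols and not free[r][c]:
                c += 1
            if c + col_span > n_cols:
                raise ValueError(f"row {r} does not fit in {n_cols} columns")
            row_starts.append(c)
            # Mark every position the cell covers as unavailable.
            for rr in range(r, min(r + row_span, n_rows)):
                for cc in range(c, c + col_span):
                    free[rr][cc] = False
            c += col_span
        starts.append(row_starts)
    return starts
```

In a three-column table whose first cell spans two rows, the cells of the second row start at column 1 rather than column 0, because position (1, 0) is already occupied.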
(3) Traverse each cell, obtain its content, and lay out each cell separately as a layout object. The layout height is not limited, and the actual layout height of each cell is obtained after layout finishes.
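Laying out a cell with a fixed width and an unrestricted height can be illustrated with plain text wrapping. The sketch below uses Python's standard `textwrap` module as a stand-in for a real layout engine; measuring width in characters and height in lines is an assumption made for simplicity.

```python
import textwrap


def layout_cell(text, col_width_chars, line_height=1.0):
    """Wrap a cell's text to its column width and report the height it needs."""
    lines = textwrap.wrap(text, width=col_width_chars) or [""]
    return lines, len(lines) * line_height


def row_height(cell_texts, col_widths_chars, line_height=1.0):
    """A logical row is as tall as its tallest cell."""
    return max(
        layout_cell(text, width, line_height)[1]
        for text, width in zip(cell_texts, col_widths_chars)
    )
```

Because the height is measured after wrapping rather than imposed beforehand, a narrow screen simply produces taller cells instead of clipped ones.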
(4) After adding the line widths and cell spacing, combine the cells into logical rows, combine the logical rows into a logical table, and arrange it in the target display area to draw the table. If the current page cannot show the complete table, the logical row information that has not been displayed is recorded and, after a page turn, only the undisplayed part is calculated and laid out. During the process of drawing the table, if a row group has the table header attribute, that row group is laid out and drawn first.
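The paging behaviour in step (4) can be sketched as a greedy split of logical rows across pages. The function below is hypothetical, and it assumes that the header row group is drawn first on every page, which the text does not state explicitly; row heights are taken to already include line widths and cell spacing.

```python
def paginate(body_heights, page_height, header_height=0.0):
    """Split body rows into pages, reserving header space on each page.

    Returns a list of pages, each a list of body-row indices. Rows after the
    current page are the "undisplayed" rows that are laid out on a page turn.
    """
    pages = []
    current = []
    used = header_height
    for index, height in enumerate(body_heights):
        # Start a new page when the row does not fit; a row taller than the
        # page still gets a page of its own.
        if current and used + height > page_height:
            pages.append(current)
            current = []
            used = header_height
        current.append(index)
        used += height
    if current:
        pages.append(current)
    return pages
```

A viewer would draw `pages[0]` first and keep the remaining indices as the recorded undisplayed rows, computing their layout only when the user turns the page.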
In a further embodiment, the table reordering method also comprises a line width setting process in which a default line width parameter is set. Row and column lines that carry line width information use the existing value, and those that do not are filled in with the default value. This keeps the information complete and makes the style adjustable and configurable.
The table reordering method of this embodiment is an improved method in which the fixed-layout description of the table stays unchanged while the table can still be reordered, with its logical structure preserved, when the display is limited. It solves the technical problem that prior-art table reordering methods cannot handle table data sensibly when a fixed-layout document is shown on a display device with a limited screen size. Building the logical structure by parsing the source document first and adjusting it through external input afterwards makes the process both efficient and accurate, and the default line width parameter keeps the line width information complete and the style configurable.
In addition, in the table reordering method of this embodiment, if a maximum column width and a minimum column width are set in the style referenced by the table, the width of each column is readjusted according to these values. Column widths are thus controllable within a given range, which is more user-friendly, makes it easy to adjust and configure them as needed, and preserves the integrity of the table. Because the layout height is not limited when each cell is laid out and the actual height is obtained afterwards, the height can be adjusted as needed, or the display mode can be user-defined (full or partial display). If the current page cannot show the complete table, the undisplayed logical rows are recorded and only that part is calculated and laid out after a page turn, so paged display preserves the integrity of the table. Drawing a header row group first restores the table more completely and further improves its readability.
Embodiment 2:
A table reordering system, comprising:
Construction module: builds table logical structure information.
Layout analysis submodule: obtains the table logical structure information and the corresponding layout description information directly from the table source file by layout analysis. The logical structure of the table and the corresponding layout description information are associated through unique graphic primitive numbers.
Conversion submodule: obtains the table logical structure information and the corresponding layout description information from the converted fixed-layout table file.
Corresponding subunit: obtains the table logical structure information and the corresponding layout description information by accepting external input.
Row and column acquisition module: obtains the total number of rows and the total number of columns of the table, calculates the table line widths, and calculates the starting column number and column width of each cell.
Layout module: traverses each cell, obtains its content, and lays out each cell separately as a layout object. The layout module does not limit the layout height and obtains the actual layout height of each cell after layout finishes. The height can be adjusted as needed, or the display mode can be user-defined (full or partial display), which is more user-friendly and flexible.
Drawing module: after adding the line widths and cell spacing, combines the cells into logical rows, combines the logical rows into a logical table, and arranges it in the target display area to draw the table. The drawing module comprises a partial drawing submodule: if the current page cannot show the complete table, the logical row information that has not been displayed is recorded and, after a page turn, only the undisplayed part is calculated and laid out. Paged display thus preserves the integrity of the table. When the drawing module draws the table, if a row group has the table header attribute, that row group is laid out and drawn first.
Further, the table reordering system also comprises a line width setting unit that sets a default line width parameter.
In the table reordering system of this embodiment, the process of calculating the table line widths comprises: traversing the rows and obtaining row information by row index, where the row style includes line width and color information and applies only to the first and last column lines of the current row and to its upper and lower row lines; refreshing the widths of the row lines and column lines, using the default line width parameter if the current row carries no line width information and keeping the current value if it does, with all line width information stored in two arrays, one for horizontal lines and one for vertical lines; and traversing all cells of the current row to further refresh the widths of the row lines and column lines.
Preferably, in the table reordering system, the process of calculating the starting column number of each cell comprises: setting up a two-dimensional array whose size is the product of the table's row count and column count, where each element corresponds to a unique position in the table and indicates whether that position can serve as the starting point of a cell; traversing all cells row by row according to the cell attributes to obtain the starting column number of each cell; and calculating the width of each column from the column proportions and the line width information.
In other embodiments, if a maximum column width and a minimum column width are set in the style referenced by the table, the width of each column is readjusted according to these values. Column widths are thus controllable within a given range, which is more user-friendly, makes it easy to adjust and configure them as needed, and preserves the integrity of the table.
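The column width calculation and the minimum/maximum adjustment might look like the following sketch. It assumes that the space taken by the vertical lines is subtracted from the table width before the remainder is shared out by column proportion; the parameter names are hypothetical.

```python
def column_widths(table_width, proportions, v_line_widths,
                  min_width=None, max_width=None):
    """Distribute the available width by column proportion, then clamp.

    table_width:   total width of the target display area
    proportions:   relative weight of each column
    v_line_widths: widths of the vertical lines (one more than the columns)
    """
    available = table_width - sum(v_line_widths)
    total = sum(proportions)
    widths = [available * p / total for p in proportions]
    # Readjust each column to the style's limits when they are set.
    if min_width is not None:
        widths = [max(w, min_width) for w in widths]
    if max_width is not None:
        widths = [min(w, max_width) for w in widths]
    return widths
```

Note that clamping can make the columns add up to more or less than the available width; how the difference is redistributed is not described in the text and is left out of this sketch.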
The table reordering system of this embodiment is an improved system in which the fixed-layout description of the table stays unchanged while the table can still be reordered, with its logical structure preserved, when the display is limited. It solves the technical problem that prior-art table reordering methods cannot handle table data sensibly when a fixed-layout document is shown on a display device with a limited screen size. In the construction module, the source document is first parsed and converted into the layout description used by the present invention, layout understanding is then used to confirm the number of tables and their regions, and the table logical structure information is finally completed by accepting external input. Parsing the source document improves processing efficiency, and the subsequent external adjustment ensures accuracy, so the table logical structure information is built efficiently and accurately. The table reordering system also comprises a line width setting unit that sets a default line width parameter: row and column lines that carry line width information use the existing value, and those that do not are filled in with the default value, which keeps the information complete and makes the style adjustable and configurable.
In addition, in the table reordering system of this embodiment, if a maximum column width and a minimum column width are set in the style referenced by the table, the width of each column is readjusted according to these values. Column widths are thus controllable within a given range, which is more user-friendly, makes it easy to adjust and configure them as needed, and preserves the integrity of the table. Because the layout height is not limited when each cell is laid out and the actual height is obtained afterwards, the height can be adjusted as needed, or the display mode can be user-defined (full or partial display). If the current page cannot show the complete table, the undisplayed logical rows are recorded and only that part is calculated and laid out after a page turn, so paged display preserves the integrity of the table. Drawing a header row group first restores the table more completely and further improves its readability.
Embodiment 3:
This embodiment provides a specific implementation of the table reordering method based on a fixed-layout document. The process is as follows:
1. When generating the flow information, a table definition must be added. After the position of the table is identified, it is adjusted manually, which completes the table logical structure information. Existing typesetting algorithms handle only text blocks and non-text blocks; this embodiment adds a table block on top of them. When a table block is encountered during typesetting, the current state is saved and the table typesetting logic is entered.
2. After the table logical structure has been built, preparation for table reordering begins. If no line width has been set, all lines are uniformly set to a width of 1 pixel. The line-width attributes are set in the style referenced by the table; the widths of the horizontal and vertical lines are stored in the one-dimensional arrays pHLineWidth and pVLineWidth respectively.
3. Obtain the total number of rows and columns: the row counts of all RowGroups are added up to give the total number of rows, nRowCount; the total number of columns is a table attribute, nColCount.
4. Compute the table line widths:
1) Traverse the rows and obtain each row's information by row index. The row style contains line-width and color information, and applies only to the first and last vertical lines of the current row and to its upper and lower horizontal lines;
2) Refresh the widths of the horizontal and vertical lines: if the current row has no line-width information, the values are taken from pHLineWidth and pVLineWidth; if it has, the current values are kept. All line-width information is stored in two arrays, one for the horizontal and one for the vertical direction;
3) Traverse all cells of the current row and refresh the widths of the horizontal and vertical lines further, so that they are exact.
5. Compute the start column number of each cell from the cell's Span attributes:
1) Maintain a two-dimensional array, cellStartColVec, whose size is the number of rows of the table multiplied by the number of columns. Each element corresponds to one unique position in the table and indicates whether that position can serve as the starting point of a cell; all values are initialized to true;
2) Traverse all cells row by row. Let the row and column of the current iteration be i and k, and let the ColSpan and RowSpan attribute values of the current cell be m and n. If k+m′ < nColCount and i+n′ < nRowCount, where m′ ranges over {0 ~ m} and n′ over {0 ~ n}, then cellStartColVec[i+n′][k+m′] = false; otherwise (k+m′ ≥ nColCount or i+n′ ≥ nRowCount) the ColSpan or RowSpan attribute value is forced to 0 (the value in the actual document is unchanged; this applies to display only). In addition, during the traversal, if the value of cellStartColVec[i][k] is true, the current column k is the start column number of the current cell.
6. The column widths of the table are kept in a one-dimensional array, pColWidth, with nColCount elements. The width of each column is computed from the column ratios and the line-width information:
1) The column ratio array is pColWidthRatio and the sum of the ratios is dWidthRatio. If no column ratio attribute exists, every column defaults to 1;
2) The total table width is a user-defined value, which may be the screen width, denoted dTableWidth. Let dCellWidthSum be the sum of the cell widths of a row (excluding the line widths pVLineWidth and the cell spacing dCellSpace). While traversing the columns (current column i), dWidthRatio += pColWidthRatio[i] and dCellWidthSum -= (pVLineWidth[i] + dCellSpace). After the traversal, the value is adjusted by dCellWidthSum += dCellSpace. The columns are then traversed once more, with pColWidth[i] = dCellWidthSum * pColWidthRatio[i] / dWidthRatio.
7. If a maximum and a minimum column width are set in the style referenced by the table, the width of each column is readjusted according to them.
8. Traverse each table row and, within the row, each cell. After a cell's content is obtained, the cell is typeset on its own as a typesetting object, in the same way as the algorithm described earlier. The cell uses the layout width computed in the previous steps, but the typesetting height is not limited; the actual typeset height of each cell is read after typesetting ends.
9. After the line widths and the cell spacing are added, the cells are combined into logical rows, and their sizes are adjusted as appropriate.
10. The logical rows are combined into a logical table which, after appropriate adjustment (mainly aligning the table lines of cells that span rows or columns and of cells of different heights), is arranged in the target display area. If the current page cannot show the complete table, the logical rows not yet shown are recorded, and after a page turn only the unshown part is computed and typeset. If the table has a row group whose Repeat attribute is true and whose Type is Body, that row group (usually the table header) is typeset and drawn again first.
11. The table caption can be drawn at the appropriate position by the user according to the table data.
This completes the table reordering process.
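The start-column calculation of step 5 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `compute_start_columns` and the input format are hypothetical, a span value of 0 is taken to mean "covers only its own row or column" (consistent with m′ and n′ starting at 0), and the grid is treated as "true means the position is still free to start a cell".

```python
from typing import List, Tuple


def compute_start_columns(rows: List[List[Tuple[int, int]]],
                          n_col_count: int) -> List[List[int]]:
    """Return, for every row, the start column of each of its cells.

    rows[i] lists the cells of row i in document order as (col_span, row_span);
    both spans count the *extra* columns/rows covered (assumption).
    """
    n_row_count = len(rows)
    # free[i][k] is True while position (i, k) can still start a cell.
    free = [[True] * n_col_count for _ in range(n_row_count)]
    start_cols: List[List[int]] = []
    for i, row in enumerate(rows):
        k = 0
        row_starts = []
        for col_span, row_span in row:
            # Skip positions already covered by a span from an earlier cell.
            while k < n_col_count and not free[i][k]:
                k += 1
            if k >= n_col_count:
                raise ValueError(f"row {i} has more cells than free columns")
            # A span that would leave the table is forced to 0 for display only.
            if k + col_span >= n_col_count:
                col_span = 0
            if i + row_span >= n_row_count:
                row_span = 0
            # Mark every position covered by this cell as no longer free.
            for dn in range(row_span + 1):
                for dm in range(col_span + 1):
                    free[i + dn][k + dm] = False
            row_starts.append(k)
            k += col_span + 1
        start_cols.append(row_starts)
    return start_cols


if __name__ == "__main__":
    table = [
        [(0, 1), (1, 0)],  # row 0: cell spanning down one row, cell spanning right one column
        [(0, 0), (0, 0)],  # row 1: two plain cells
    ]
    print(compute_start_columns(table, n_col_count=3))
```

In the example, the first cell of row 1 starts at column 1 because column 0 is already covered by the row-spanning cell above it.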
Embodiment 4
The present invention is further described below with reference to the drawings and embodiments.
A fixed-layout file, also called a fixed-page document, displays consistently across different devices and reading software. Within a self-defined coordinate system it explicitly specifies the position, size and so on of every piece of displayed content, so that the original appearance of the document is reproduced faithfully. The term covers a family of ways of describing document page objects. Flow reordering (reflow) means typesetting the content of a fixed-layout file again according to the screen size of the display device. Once a system has a reasonably complete core flow-typesetting algorithm, the table reordering algorithm of this embodiment can be used to reorder tables. The core flow-typesetting algorithm works as follows: given a display area of a specified shape, it draws into that area a data stream that mixes various graphic primitives (mainly text and pictures), such that when the stream reaches the edge of the area during drawing and the next primitive would cross the boundary, output automatically continues on the next line. The basic idea is that the algorithm draws one minimum-granularity primitive at a time, namely one picture or one character of a text object. Before drawing, it takes the size of the object and compares it with the remaining width of the current line: if the width is insufficient it breaks the line, otherwise it outputs the primitive, and then goes on to examine the next primitive.
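The basic idea of the core flow-typesetting algorithm can be illustrated with a greedy line-filling loop. The sketch below only handles a rectangular area of fixed width and tracks primitive widths, not heights or shapes; the name `flow_primitives` and the rule that an oversized primitive is still placed alone on an empty line are assumptions made for the example.

```python
from typing import List


def flow_primitives(widths: List[float], line_width: float) -> List[List[int]]:
    """Greedily assign primitives (given by their widths) to lines.

    Returns a list of lines, each a list of primitive indices in stream order.
    """
    lines: List[List[int]] = [[]]
    remaining = line_width
    for index, width in enumerate(widths):
        # Break the line when the primitive does not fit in the remaining width,
        # unless the line is still empty (assumption: avoid an endless loop).
        if width > remaining and lines[-1]:
            lines.append([])
            remaining = line_width
        lines[-1].append(index)
        remaining -= width
    return lines


if __name__ == "__main__":
    print(flow_primitives([30, 30, 50, 20, 90], line_width=100))
```

Each primitive is examined exactly once, which matches the "take the size, compare with the remaining width, break or output" loop described above.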
In a fixed-layout file a table appears merely as an arrangement of simple primitives such as lines and text, with no explicit logical structure. A table tree structure suited to flow reordering therefore needs to be defined, so that flow reordering can be achieved without affecting the fixed-layout reading experience.
Structure of the flow table:
A flow table mainly describes a table contained in one or more pages, in order to support reordering of the table content.
Fig. 2 shows the overall structure of the flow table in this design; the data represented by the labels in Fig. 2 are listed in Table 1:
Table 1. Data represented by the labels in Fig. 2
Fig. 3 shows the row-group structure of the flow table in this embodiment; the data represented by the labels in Fig. 3 are listed in Table 2:
Table 2. Data represented by the labels in Fig. 3
Compared with the table specification in HTML, the scheme of the present invention puts more emphasis on concise table description (for example, rendering-style inheritance combined with references reduces the complexity of the description), gives the user more means of handling complex typesetting situations in detail, and leaves the way styles are rendered more open.
In the table reordering method and system of this embodiment, the process of building the table logical structure information and the modules involved are as follows. Fig. 4 is a schematic diagram of the table data description generation system.
1) Source document parsing module.
Reads the content of the table source file to be converted and parses it, by known methods such as layout understanding, to obtain the data relating to tables.
2) Fixed-layout description generation module.
If the source file in step 1) is a flow file, the table is first output by virtual printing to obtain its fixed-layout description.
If the source file in step 1) is a fixed-layout file, the fixed-layout description is obtained directly.
The original fixed-layout description is converted into the fixed-layout description of the present invention. This includes adding a unique number to each primitive description so that it can be referenced.
3) Flow description generation module.
If the source file in step 1) is a flow file, the information in it is converted, according to the syntax of that flow file, into the table flow description of the present invention.
If the source file in step 1) is a fixed-layout file, a known table-recognition method may be introduced at this step to obtain a logical description, which is then converted into the structure of the present invention. Otherwise this step is skipped and processing continues with step 4).
4) Manual adjustment module.
Because table recognition is a notoriously hard problem, an automatic recognition system inevitably makes mistakes, so the errors arising in steps 1)-3) need to be corrected manually.
Where the source file is a fixed-layout file and no recognition-based extraction has been performed, a set of logical information is added to the table manually, following the tree structure of the present invention.
5) Information association module.
The logical information and the fixed-layout description are associated through the unique primitive numbers: the content of each cell of the fixed-layout table is stored, by its unique primitive numbers, at the corresponding position in the logical table structure.
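The association performed by the information association module can be pictured as a lookup from logical cell positions to primitives through their unique numbers. The data shapes below (dictionaries with an `id` key, cells keyed by `(row, column)`) and the name `associate_cells` are purely hypothetical; the text only states that unique primitive numbers link the two descriptions.

```python
from typing import Dict, List, Tuple

Primitive = Dict[str, object]


def associate_cells(cell_refs: Dict[Tuple[int, int], List[int]],
                    primitives: List[Primitive]) -> Dict[Tuple[int, int], List[Primitive]]:
    """Resolve the primitive numbers stored in each logical cell.

    cell_refs maps a (row, column) position to the unique numbers of the
    primitives that make up the cell's content in the fixed-layout description.
    """
    by_id = {p["id"]: p for p in primitives}  # index primitives by unique number
    return {pos: [by_id[pid] for pid in ids] for pos, ids in cell_refs.items()}


if __name__ == "__main__":
    primitives = [
        {"id": 1, "text": "Name"},
        {"id": 2, "text": "Qty"},
        {"id": 3, "text": "7"},
    ]
    cell_refs = {(0, 0): [1], (0, 1): [2], (1, 1): [3]}
    print(associate_cells(cell_refs, primitives)[(1, 1)])
```

Because the logical structure stores only numbers, the fixed-layout description itself does not need to change when the table is reordered.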
With the logical structure information of the table built as above, the table reordering process is as follows:
1) Read a flow table document that supports the structure of the present invention, and obtain the table logical structure and the corresponding fixed-layout description.
2) Run the processing algorithm to compute the table rows that should be shown on the page to be displayed.
The content to be displayed comprises several rows starting from the row after the last logical table row shown on the previous page, together with the header rows and columns defined in the table semantics. Specifically:
Build the table logical structure information; obtain the total number of rows and columns of the table, compute the table line widths, and compute the start column number and the column width of each cell; traverse each cell and, after obtaining its content, typeset the cell on its own as a typesetting object; after adding the line widths and the cell spacing, combine the cells into logical rows, combine the logical rows into a logical table, and arrange it in the target display area to draw the table.
3) For the table row groups with attached semantic information generated in step 2), obtain the content to be displayed from the fixed-layout description and process it. This processing may, after weighing the display capability and computing speed of the display device, simplify primitives that do not affect the reading experience.
4) Present the final result of step 3) on the display device.
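Selecting the rows for one page, as described in step 2), might look like the sketch below. It assumes that row heights are already known from typesetting the cells, that the header row group is repeated at the top of every page, and that a body row taller than the page is shown alone so that paging always advances; the function name `paginate` and these rules are illustrative assumptions.

```python
from typing import List, Tuple


def paginate(body_heights: List[float], header_height: float,
             page_height: float, next_row: int = 0) -> Tuple[List[int], int]:
    """Return (body rows shown on this page, index of the first row not shown)."""
    used = header_height  # the repeated header is drawn first on each page
    rows: List[int] = []
    r = next_row
    while r < len(body_heights) and used + body_heights[r] <= page_height:
        rows.append(r)
        used += body_heights[r]
        r += 1
    if not rows and r < len(body_heights):
        # Assumption: an oversized row is shown alone to guarantee progress.
        rows.append(r)
        r += 1
    return rows, r


if __name__ == "__main__":
    heights = [40, 40, 40, 40, 40]
    start = 0
    while start < len(heights):
        shown, start = paginate(heights, header_height=20, page_height=100, next_row=start)
        print(shown)
```

The returned index plays the role of the recorded "logical rows not yet shown": after a page turn, only rows from that index onward are computed.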
Typesetting with the table reordering method above proceeds as follows. Fig. 5 is a schematic diagram of the table data description parsing and display system of this embodiment.
Although a flow table contains text, pictures and similar information within the content stream to be typeset, table content is not reordered in the same way as an ordinary content stream: to keep its display coherent, the table takes part in flow reordering as a single independent unit;
1) As described in step 1) above, the table is first added to the overall flow-typesetting content as one large display block, although it actually contains many display blocks (DispPiece). After parsing yields the tree structure of the table, display blocks are collected for the content of each cell, and the display blocks that need reordering are generated and stored in the cell's data structure. In line with the description of the flow table structure above, for cells with relatively complex content the collection relies on the core flow-typesetting algorithm mentioned earlier; when the cell content is relatively simple, display blocks are created from the referenced fixed-layout content and instruction characters;
2) From the structure of the table, the rendering mode of the referenced style and the size of the target reordering area, compute the width of each cell (taking into account the widths of the table lines between cells). Using all the relevant information described in the present invention, a suitable algorithm for this computation can be defined per platform. Then typeset each cell once, row by row, and cache the typesetting structure information of each cell and the row information. For example, row information that needs merging (when cells spanning rows and columns exist at the same time, the span of some cells must be shown) is stored in the cell list for further adjustment of the cell content. This completes the independent reordering of each cell in the table;
3) Adjust the cells from step 2).
4) Following step 3), combine the typesetting results of all cells (the way they are combined can likewise be defined per platform using all the relevant information described in the present invention). After combination the whole table also has an overall ordering (the point of this ordering is that when the table is displayed across pages, each cell can obtain its own flow position within the whole table);
Then, auxiliary display elements such as table lines and table background are drawn in the manner described above (how table line positions are computed is user-defined), forming the final presentation and completing the table reordering.
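One possible algorithm for the cell-width calculation is the ratio-based column-width formula given in step 6 of Embodiment 3. The sketch below follows that formula literally; the function name `column_widths` is hypothetical, and starting `cell_width_sum` from the total table width is an assumption, since the text does not state the initial value explicitly.

```python
from typing import List


def column_widths(table_width: float, ratios: List[float],
                  v_line_widths: List[float], cell_space: float) -> List[float]:
    """Distribute the table width over the columns in proportion to their ratios.

    ratios corresponds to pColWidthRatio, v_line_widths to pVLineWidth,
    cell_space to dCellSpace and table_width to dTableWidth.
    """
    ratio_sum = 0.0               # dWidthRatio
    cell_width_sum = table_width  # dCellWidthSum (assumed to start at dTableWidth)
    for i in range(len(ratios)):
        ratio_sum += ratios[i]
        cell_width_sum -= v_line_widths[i] + cell_space
    cell_width_sum += cell_space  # adjustment after the traversal
    return [cell_width_sum * ratio / ratio_sum for ratio in ratios]


if __name__ == "__main__":
    # With no ratio attribute every column defaults to 1; here the middle column is doubled.
    print(column_widths(320, [1, 2, 1], [1, 1, 1], cell_space=4))
```

The widths returned this way exclude line widths and cell spacing, which are added back when the cells are combined into logical rows.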
Obviously, the embodiments above are merely examples given for clarity of description and do not limit the implementations. A person of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to list all implementations exhaustively here, and obvious changes or variations derived from them remain within the scope of protection of the invention.
Those skilled in the art will understand that embodiments of the present invention may be provided as a method, a system or a computer program product. The present invention may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.

Claims (20)

1. A table reordering method, characterized by comprising the following steps:
building table logical structure information;
obtaining the total number of rows and the total number of columns of the table, computing the table line widths, and computing the start column number and the column width of each cell;
traversing each cell and, after obtaining the content of the cell, typesetting the cell on its own as a typesetting object;
after adding the line widths and the cell spacing, combining the cells into logical rows, combining the logical rows into a logical table, and arranging it in the target display area to draw the table.
2. The table reordering method according to claim 1, characterized in that building the table logical structure information comprises:
obtaining the table logical structure information and the corresponding fixed-layout description directly from the table source file by a layout parsing method;
obtaining the table logical structure information and the corresponding fixed-layout description from the converted fixed-layout table file;
obtaining the table logical structure information and the corresponding fixed-layout description by accepting external input.
3. The table reordering method according to claim 2, characterized in that the logical structure of the table and the corresponding fixed-layout description are associated through unique primitive numbers.
4. The table reordering method according to claim 1, 2 or 3, characterized by further comprising a line-width setting process that sets a default line-width parameter.
5. The table reordering method according to claim 4, characterized in that computing the table line widths comprises:
traversing the rows and obtaining each row's information by row index, wherein the row style contains line-width and color information and applies only to the first and last vertical lines of the current row and to its upper and lower horizontal lines;
refreshing the widths of the horizontal and vertical lines, wherein if the current row has no line-width information the values are computed from the default line-width parameter, and if it has, the current values are kept, all line-width information being stored in two arrays, one for the horizontal and one for the vertical direction;
traversing all cells of the current row and refreshing the widths of the horizontal and vertical lines further.
6. The table reordering method according to any one of claims 1-5, characterized in that computing the start column number of each cell comprises: setting up a two-dimensional array whose size is the number of rows of the table multiplied by the number of columns, each element of which corresponds to one unique position in the table and indicates whether that position can serve as the starting point of a cell; traversing all cells row by row according to the attributes of the cells to obtain the start column number of each cell; and computing the width of each column from the column ratios and the line-width information.
7. The table reordering method according to claim 6, characterized in that if a maximum and a minimum column width are set in the style referenced by the table, the width of each column is readjusted according to them.
8. The table reordering method according to claim 7, characterized in that, in traversing each cell and typesetting the cell on its own as a typesetting object after obtaining its content, the typesetting height is not limited, and the actual typeset height of each cell is read after typesetting ends.
9. The table reordering method according to claim 7 or 8, characterized in that the process of combining the cells into logical rows after adding the line widths and the cell spacing, combining the logical rows into a logical table, and arranging it in the target display area to draw the table comprises: if the current page cannot show the complete table, recording the logical rows not yet shown, and after a page turn computing and typesetting only the unshown part.
10. The table reordering method according to claim 9, characterized in that, in drawing the table, if a table attribute is table-header information, that row group is typeset and drawn again first.
11. A table reordering system, characterized by comprising:
a building module, which builds table logical structure information;
a row and column acquisition module, which obtains the total number of rows and the total number of columns of the table, computes the table line widths, and computes the start column number and the column width of each cell;
a typesetting module, which traverses each cell and, after obtaining the content of the cell, typesets the cell on its own as a typesetting object;
a drawing module, which, after adding the line widths and the cell spacing, combines the cells into logical rows, combines the logical rows into a logical table, and arranges it in the target display area to draw the table.
12. The table reordering system according to claim 1, characterized in that the building module comprises:
a layout analysis submodule, which obtains the table logical structure information and the corresponding fixed-layout description directly from the table source file by a layout parsing method;
a conversion submodule, which obtains the table logical structure information and the corresponding fixed-layout description from the converted fixed-layout table file;
a corresponding subunit, which obtains the table logical structure information and the corresponding fixed-layout description by accepting external input.
13. The table reordering system according to claim 12, characterized in that the logical structure of the table and the corresponding fixed-layout description are associated through unique primitive numbers.
14. The table reordering system according to claim 11, 12 or 13, characterized by further comprising a line-width setting unit that sets a default line-width parameter.
15. The table reordering system according to claim 14, characterized in that computing the table line widths comprises:
traversing the rows and obtaining each row's information by row index, wherein the row style contains line-width and color information and applies only to the first and last vertical lines of the current row and to its upper and lower horizontal lines;
refreshing the widths of the horizontal and vertical lines, wherein if the current row has no line-width information the values are computed from the default line-width parameter, and if it has, the current values are kept, all line-width information being stored in two arrays, one for the horizontal and one for the vertical direction;
traversing all cells of the current row and refreshing the widths of the horizontal and vertical lines further.
16. The table reordering system according to any one of claims 11-15, characterized in that computing the start column number of each cell comprises: setting up a two-dimensional array whose size is the number of rows of the table multiplied by the number of columns, each element of which corresponds to one unique position in the table and indicates whether that position can serve as the starting point of a cell; traversing all cells row by row according to the attributes of the cells to obtain the start column number of each cell; and computing the width of each column from the column ratios and the line-width information.
17. The table reordering system according to claim 16, characterized in that if a maximum and a minimum column width are set in the style referenced by the table, the width of each column is readjusted according to them.
18. The table reordering system according to claim 17, characterized in that, in the typesetting module, the typesetting height is not limited, and the actual typeset height of each cell is read after typesetting ends.
19. The table reordering system according to claim 17 or 18, characterized in that the drawing module comprises a partial drawing submodule: if the current page cannot show the complete table, the logical rows not yet shown are recorded, and after a page turn only the unshown part is computed and typeset.
20. The table reordering system according to claim 19, characterized in that, when the drawing module draws the table, if a table attribute is table-header information, that row group is typeset and drawn again first.
CN201310462279.0A 2013-09-30 2013-09-30 Table reordering method and table reordering system Pending CN104516867A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310462279.0A CN104516867A (en) 2013-09-30 2013-09-30 Table reordering method and table reordering system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310462279.0A CN104516867A (en) 2013-09-30 2013-09-30 Table reordering method and table reordering system

Publications (1)

Publication Number Publication Date
CN104516867A true CN104516867A (en) 2015-04-15

Family

ID=52792193

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310462279.0A Pending CN104516867A (en) 2013-09-30 2013-09-30 Table reordering method and table reordering system

Country Status (1)

Country Link
CN (1) CN104516867A (en)

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105279139A (en) * 2015-11-30 2016-01-27 中国建设银行股份有限公司 Form information display rule configuration and calculation method and system
CN105912516A (en) * 2016-04-01 2016-08-31 南京朗坤软件有限公司 Method for one-lick extraction of table data from AutoCAD file
CN105912516B (en) * 2016-04-01 2019-02-05 朗坤智慧科技股份有限公司 A method of the one-touch extraction list data from autocad file
CN106021454A (en) * 2016-05-17 2016-10-12 乐视控股(北京)有限公司 Mail reading method and apparatus
CN107368276A (en) * 2017-08-28 2017-11-21 郑州云海信息技术有限公司 A kind of display control method and device
CN107741924A (en) * 2017-10-26 2018-02-27 南京大学 A kind of method of computer disposal complexity list
CN110929487A (en) * 2018-09-04 2020-03-27 北大方正集团有限公司 Table typesetting method and device, computer equipment and readable storage medium
CN109359283A (en) * 2018-09-26 2019-02-19 中国平安人寿保险股份有限公司 Method of summary, terminal device and the medium of list data
CN109359283B (en) * 2018-09-26 2023-07-25 中国平安人寿保险股份有限公司 Summarizing method of form data, terminal equipment and medium
CN110968987A (en) * 2018-09-30 2020-04-07 腾讯科技(深圳)有限公司 Table display method and device, storage medium and electronic device
CN109491742A (en) * 2018-10-31 2019-03-19 天津字节跳动科技有限公司 Page tabular rendering method and device
CN109491742B (en) * 2018-10-31 2021-10-22 天津字节跳动科技有限公司 Page table rendering method and device
CN109815461A (en) * 2018-12-07 2019-05-28 北京天健源达科技股份有限公司 A method of editor's table
CN109815461B (en) * 2018-12-07 2024-02-09 北京天健源达科技股份有限公司 Method for editing form
CN110210440A (en) * 2019-06-11 2019-09-06 中国农业银行股份有限公司 A kind of form image printed page analysis method and system
CN111027294A (en) * 2019-12-12 2020-04-17 中国联合网络通信集团有限公司 Table summarizing method, device and system
CN111027294B (en) * 2019-12-12 2023-05-30 中国联合网络通信集团有限公司 Method, device and system for summarizing table
CN111353272A (en) * 2019-12-24 2020-06-30 浙江明度智控科技有限公司 Information display method and device of web form
CN111353272B (en) * 2019-12-24 2023-10-20 明度智云(浙江)科技有限公司 Information display method and device for web form
CN111723560A (en) * 2020-07-15 2020-09-29 金蝶软件(中国)有限公司 Dynamic adjustment method, system and related equipment for table parallel display area
CN111723560B (en) * 2020-07-15 2024-04-19 金蝶软件(中国)有限公司 Dynamic adjustment method, system and related equipment for parallel display area of table
CN114077466A (en) * 2020-08-12 2022-02-22 北京智邦国际软件技术有限公司 Automatic layout algorithm for multiple rows and multiple columns of fields in Web interface form
CN112149397A (en) * 2020-09-30 2020-12-29 杭州拼便宜网络科技有限公司 Method, system and related device for analyzing electronic form
CN112380819A (en) * 2020-11-17 2021-02-19 北京字跳网络技术有限公司 Document editing method and device and electronic equipment
CN112926286A (en) * 2021-04-02 2021-06-08 方正国际软件(北京)有限公司 Dynamic table generation method and system
CN112926286B (en) * 2021-04-02 2024-05-28 方正国际软件(北京)有限公司 Dynamic form generation method and system
CN113076716A (en) * 2021-04-16 2021-07-06 浙江鸿程计算机系统有限公司 Typesetting method and device for yearbook
US12067345B2 (en) 2021-05-10 2024-08-20 Beijing Zitiao Network Technology Co., Ltd. Table displaying method, device and medium
CN113391861A (en) * 2021-05-21 2021-09-14 军事科学院系统工程研究院网络信息研究所 Table dynamic drawing method based on android platform
CN113391861B (en) * 2021-05-21 2023-12-29 军事科学院系统工程研究院网络信息研究所 Android platform-based form dynamic drawing method

Similar Documents

Publication Publication Date Title
CN104516867A (en) Table reordering method and table reordering system
CN105302550B (en) Method and system for converting a page into a format data stream file
US7844896B2 (en) Layout-rule generation system, layout system, layout-rule generation program, layout program, storage medium, method of generating layout rule, and method of layout
US9785623B2 (en) Identifying a set of related visible content elements in a markup language document
EP2291010A1 (en) Structure processing method and apparatus for layout file
US20140176564A1 (en) Chinese Character Constructing Method and Device, Character Constructing Method and Device, and Font Library Building Method
US8386943B2 (en) Method for query based on layout information
CN103605502B (en) Form page display method and server
CN105808217A (en) Flow chart drawing method and system based on XML
CN108090037B (en) Automatic typesetting method and device
US10275428B2 (en) Panoptic visualization document differencing
CN115393872B (en) Method, device and equipment for training text classification model and storage medium
KR101371406B1 (en) Method and system for manufacturing e-book by source analysis of pdf document
CN104424174B (en) Document processing system and document processing method
CN104516919B (en) Citation annotation processing method and system
US8326812B2 (en) Data search device, data search method, and recording medium
JP5551986B2 (en) Information processing apparatus, information processing method, and program
CN101944081A (en) Computer generation, edition method of Guqin abbreviated character notation and system thereof
US9965446B1 (en) Formatting a content item having a scalable object
KR102299879B1 (en) Method and computer program product for automated placement and harmonization of graph onto a figure
WO2021082652A1 (en) Information display method and apparatus, and computer-readable storage medium
CN102129502A (en) Optimal electrical power line selection method and system
CN110188326B (en) Rich text generating method, rich text generating device, computer equipment and storage medium
CN112668299A (en) Automatic typesetting method and system for referee document
JP2006092462A (en) Automatic conversion system for electronic book content and construction of electronic book shared database

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20150415