CN103942185A - Method and system for storing document - Google Patents

Publication number
CN103942185A
Authority
CN
China
Prior art keywords
document
file
change
modification
incidence relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410122294.5A
Other languages
Chinese (zh)
Inventor
江潮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WUHAN TRANSN INFORMATION TECHNOLOGY Co Ltd
Original Assignee
WUHAN TRANSN INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WUHAN TRANSN INFORMATION TECHNOLOGY Co Ltd filed Critical WUHAN TRANSN INFORMATION TECHNOLOGY Co Ltd
Priority to CN201410122294.5A priority Critical patent/CN103942185A/en
Publication of CN103942185A publication Critical patent/CN103942185A/en
Pending legal-status Critical Current

Abstract

The invention discloses a method for storing a document. The method for storing the document comprises the steps that at least one piece of modification generated when the document is edited and information of the positions of all pieces of modification in the document are recorded; all the pieces of modification are scanned, the frequency of occurrence of each character appearing during all the pieces of modification is determined, and a Huffman tree is established; the coding string of each piece of modification is established according to the Huffman tree; the Huffman tree, the coding strings of all the pieces of modification and the information of the positions of all the pieces of modification are taken as a file to be stored. The invention further discloses a system for storing the document. By the adoption of the method and system for storing the document, the occupancy rate of the document in a storage space is lowered, and the operating pressure of a storage device is relieved.

Description

Method and system for storing a document
Technical field
The present invention relates to the field of document processing, and in particular to a method and system for storing a document.
Background art
At present, a text worker must back up the original document before editing it, then modify the original and store the modified version. If the same document is revised several times, several modified documents must be stored. With a fixed amount of storage space, small documents place little demand on the hard disk, but for large documents this approach sharply reduces the available space and puts heavy pressure on storage. Every modification is based on the original document, so a modified document shares much of its content with the original. Repeatedly storing this identical content across multiple modified documents lowers the utilization of the storage space.
Summary of the invention
The present invention aims to provide a method for storing a document, in order to solve the prior-art problem that repeatedly storing the identical content of multiple modified documents lowers the utilization of storage space.
In some illustrative embodiments, the method for storing a document comprises: recording at least one modification produced while the document is edited, together with the position information of each modification in the document; scanning all of the modifications, determining the frequency of occurrence of each character across all of the modifications, and building a Huffman tree; building the coded string of each modification according to the Huffman tree; and saving the Huffman tree, the coded string of each modification, and the position information of each modification as a single separate file.
Another object of the present invention is to provide a system for storing a document.
In some illustrative embodiments, the system for storing a document comprises: a recording module, for recording at least one modification produced while the document is edited, together with the position information of each modification in the document; a first building module, for scanning all of the modifications, determining the frequency of occurrence of each character across all of the modifications, and building a Huffman tree; a second building module, for building the coded string of each modification according to the Huffman tree; and a storage module, for saving the Huffman tree, the coded string of each modification, and the position information of each modification as a single separate file.
Compared with the prior art, the illustrative embodiments of the present invention have the following advantages:
Only the modifications produced in each editing session are recorded, together with their position information, and they are saved as a single separate file. The storage space therefore needs to hold only one document plus the one file generated at the end of each editing session. The different versions of the document need not be stored in full, which saves the space otherwise taken by content that is identical across versions and improves the utilization of the storage space.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the storage method according to an example embodiment of the present invention;
Fig. 2 is a structural diagram of the Huffman tree according to an example embodiment of the present invention; and
Fig. 3 is a block diagram of the system according to an example embodiment of the present invention.
Detailed description
In the following detailed description, numerous specific details are set forth to provide a thorough understanding of the present invention. Those skilled in the art will recognize, however, that the invention can be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits are not described in detail, so as not to obscure the invention.
Referring now to Fig. 1, Fig. 1 shows a flowchart of the method for storing a document according to some illustrative embodiments.
As shown in Fig. 1, some illustrative embodiments disclose a method for storing a document, comprising:
S11: record at least one modification produced while the document is edited, together with the position information of each modification in the document.
In some illustrative embodiments, each modification is one of the following: an addition and/or a deletion, or a format change. An addition means that content is added to the document, for example adding document content or adding an annotation. A deletion means that content is removed from the document, for example deleting document content or deleting an annotation. An addition together with a deletion represents a replacement of document content.
In some illustrative embodiments, no two modifications are contiguous with each other; each modification is a separate, discontinuous region.
In some illustrative embodiments, the position information of each modification is recorded in one of the following ways: the page, the line, and the character before or after which the modification lies; or the character string before or after which the modification lies. For example, if the character "I" is added after the 7th character of line 2 on page 1 of the document, the record holds "I", page 1, line 2, after the 7th character (or before the 8th character), and the attribute of the modification is marked as positive, indicating an addition. Alternatively, the position may be recorded as the page, the paragraph on that page, the line in that paragraph, and the character in that line before or after which the modification lies; or as the paragraph, the line in that paragraph, and the character in that line; or relative to a character string in the document. For example, if E is added after the B in "ABCD" to give "ABECD", the record states that "E" lies after "AB" or before "CD".
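The recording step can be illustrated with a minimal Python sketch. The names `Position`, `Change`, and `apply_to_line` are hypothetical, as is the use of +1 and -1 for the "positive" (addition) attribute and its deletion counterpart; the source only states that an addition is marked as positive. The sketch works on a single line of text and is not meant as the patented implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    page: int        # page number, 1-based
    line: int        # line number on that page, 1-based
    after_char: int  # the change lies after this many characters of the line


@dataclass(frozen=True)
class Change:
    text: str           # the characters that were added or deleted
    position: Position  # where the change lies in the document
    sign: int           # +1 = addition, -1 = deletion (assumed encoding)


def apply_to_line(line: str, change: Change) -> str:
    """Apply one recorded change to the text of the line it refers to."""
    idx = change.position.after_char
    if change.sign > 0:
        # Addition: splice the new text in after the idx-th character.
        return line[:idx] + change.text + line[idx:]
    # Deletion: the recorded text must be present right after idx.
    if line[idx:idx + len(change.text)] != change.text:
        raise ValueError("recorded deletion does not match the line content")
    return line[:idx] + line[idx + len(change.text):]


# The "ABCD" example from the text: E is added after "AB".
added = Change("E", Position(page=1, line=1, after_char=2), sign=+1)
print(apply_to_line("ABCD", added))
```

Recording only the changed text and its position is what lets the later steps store a small file instead of a full copy of the document.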
S12: scan all of the modifications, determine the frequency of occurrence of each character across all of the modifications, and build a Huffman tree.
All of the modifications produced are scanned to determine each character that occurs in them. The number of times each character occurs (its frequency) is used as the weight of that character, and the Huffman tree is built from these weights as follows:
First, the two characters with the smallest weights are taken as two leaf nodes of the Huffman tree, and a parent node is created for them whose weight is the sum of the weights of the two leaf nodes;
Then, in ascending order of weight, the next character is taken as another leaf node of the Huffman tree and placed at the same level as the parent node that currently has the largest weight, and the next parent node is built over these two nodes. This is repeated until all characters have been built into the Huffman tree;
Each parent node is connected to its two child nodes; the connection on the left side of the parent node is labeled 0, and the connection on the right side is labeled 1;
With this method, the character represented by each leaf node of the Huffman tree has a unique binary prefix code. Fig. 2 shows an example, the Huffman tree for the modification "ASCTASCTAAAAACCTTT", which is built as follows:
After scanning, it is determined that A occurs 7 times, C occurs 4 times, S occurs 2 times, and T occurs 5 times;
Characters S and C are taken as leaf nodes of the Huffman tree and labeled with their weights, 2 and 4 respectively. A parent node with weight 6 is created and connected to these child nodes; the left connection is labeled 0 and the right connection is labeled 1;
Character T is then taken as another leaf node of the Huffman tree, with weight 5. It is placed at the same level as the parent node with weight 6, and a parent node with weight 11 is created and connected to these child nodes; the left connection is labeled 0 and the right connection is labeled 1;
Character A is then taken as another leaf node of the Huffman tree, with weight 7. It is placed at the same level as the parent node with weight 11, and a parent node with weight 18 is created and connected to these child nodes; the left connection is labeled 0 and the right connection is labeled 1. The Huffman tree is now complete.
The binary code of character A is 0, the binary code of character T is 10, the binary code of character C is 110, and the binary code of character S is 111.
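The construction steps above can be sketched in Python as follows. This is a minimal sketch of the sequential procedure as the description words it, not a definitive implementation: the function name `build_codes`, the tie-breaking rule for equal weights, and the choice to put each newly taken leaf on the 0 branch (which reproduces the codes of the Fig. 2 example) are assumptions. Note that the classical Huffman algorithm merges the two lowest-weight nodes at every step, usually with a priority queue; for the example weights both approaches give the same tree, but for other frequency distributions they can differ. Parent weights (6, 11, 18) are not stored because the codes do not depend on them once the order is fixed.

```python
from collections import Counter


def build_codes(changes):
    """Return {character: bit string} for all characters in the given changes."""
    freq = Counter("".join(changes))
    # Ascending weight; ties broken by character (an assumption, the text does not say).
    ordered = sorted(freq.items(), key=lambda item: (item[1], item[0]))
    if not ordered:
        return {}
    if len(ordered) == 1:
        return {ordered[0][0]: "0"}
    # A leaf is a one-character string; a parent is a (left, right) tuple.
    tree = ordered[0][0]
    for char, _weight in ordered[1:]:
        # New leaf on the 0 (left) branch, existing subtree on the 1 (right) branch.
        tree = (char, tree)
    # Walk the tree with an explicit stack to collect the code of every leaf.
    codes = {}
    stack = [(tree, "")]
    while stack:
        node, prefix = stack.pop()
        if isinstance(node, str):
            codes[node] = prefix
        else:
            stack.append((node[0], prefix + "0"))
            stack.append((node[1], prefix + "1"))
    return codes


print(build_codes(["ASCTASCTAAAAACCTTT"]))
```

Because the frequencies are counted over all modifications together, a single code table serves every modification in the file.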
S13: build the coded string of each modification according to the Huffman tree.
The coded string of each modification is determined from the relation between characters and binary codes in the Huffman tree. For example, under the Huffman tree built above, the modification "ASCTASCTAAAAACCTTT" has the coded string "01111101001111101000000110110101010". Because each character is represented by a prefix code, the coded string cannot be misread when it is parsed.
S14: save the Huffman tree, the coded string of each modification, and the position information of each modification as a single separate file.
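As a hedged illustration of this step and of the later parsing step, the sketch below encodes the example modification with the code table from the Fig. 2 example and decodes it again. The function names `encode` and `decode` and the dictionary `CODES` are hypothetical; the bit strings are kept as text for readability, whereas a real file would presumably pack them into bytes.

```python
# Code table taken from the worked example in the description.
CODES = {"A": "0", "T": "10", "C": "110", "S": "111"}


def encode(text, codes):
    """Concatenate the prefix code of every character in text."""
    return "".join(codes[char] for char in text)


def decode(bits, codes):
    """Recover the text; prefix codes make the split points unambiguous."""
    inverse = {code: char for char, code in codes.items()}
    decoded = []
    buffer = ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:  # a complete code has been read
            decoded.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(decoded)


coded = encode("ASCTASCTAAAAACCTTT", CODES)
print(coded, len(coded))
print(decode(coded, CODES))
```

No code is a prefix of another, so the decoder can emit a character as soon as the buffered bits match a code, with no separators between characters.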
Each file has a unique identification mark. The mark may be a number, a name, or an edit time that is set manually or generated automatically, for example the file version number 1.0. It not only identifies the file but also distinguishes the order in which the files were generated.
In some illustrative embodiments, the document being edited has a backup. In detail: the document is copied into a temporary database, and the editing is carried out on that copy. After editing is complete, the document in the temporary database is removed.
In some illustrative embodiments, each modification in a file has a unique identification mark, for example a number. The number of a modification may correspond to the version number of the file. For example, if the version number of the file is 1.0, a modification in this file is numbered 1.001, denoting the first modification in this file. The position information of each modification is mapped to the number of that modification. For example, if a modification is numbered 1.001, its position information carries the same number, 1.001.
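One possible in-memory layout for such a file is sketched below. The class name `ChangeFile`, its fields, and the rule that derives "1.001" from version "1.0" by taking the part before the dot are assumptions made for illustration; the description gives only the single example pair 1.0 and 1.001 and does not define the scheme for other version numbers.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeFile:
    version: str  # identification mark of the file, e.g. "1.0"
    codes: dict   # character -> bit string (stands in for the stored Huffman tree)
    coded_changes: dict = field(default_factory=dict)  # change number -> coded string
    positions: dict = field(default_factory=dict)      # change number -> position info

    def add_change(self, coded: str, position) -> str:
        """Store one coded change and its position under a shared number."""
        major = self.version.split(".")[0]  # assumed numbering rule
        number = f"{major}.{len(self.coded_changes) + 1:03d}"
        self.coded_changes[number] = coded
        self.positions[number] = position
        return number


change_file = ChangeFile("1.0", {"A": "0", "T": "10", "C": "110", "S": "111"})
# Position given as (page, line, after which character), as in step S11.
first = change_file.add_change("0111110", (1, 2, 7))
print(first, change_file.positions[first])
```

Keying both dictionaries by the same number mirrors the mapping described above, where a modification and its position information share one number.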
S15: establish an association between the file and the document.
In some illustrative embodiments, the document may be the original document. The original document is the document that serves as the basis for modification and does not itself contain any modification.
In some illustrative embodiments, the document may be a modified document obtained by merging the original document with at least one other file in sequence. In this case, the association between the file and the document is the association between this file and the other file most recently merged into the original document;
The relation among the original document, the files, and the modified documents is shown in Table 1.
Table 1
In Table 1, file B is produced by editing original document A, so file B is associated with original document A. File C is then produced by editing modified document A+B, so file C is associated with file B. File D is produced by editing modified document A+B+C, so file D is associated with file C. File E is produced by editing modified document A+B, so file E is associated with file B.
In some illustrative embodiments, after a file is retrieved, each coded string in it must also be restored to its modification according to the Huffman tree stored in the file.
In some illustrative embodiments, when the document is a modified document, the method for storing a document further comprises:
After a first file is selected, retrieving, according to the association, the original document or a second file that is associated with the first file. If the first file is associated with a second file, the original document or a third file associated with the second file is retrieved in turn, and so on until the original document has been retrieved. For example, after file B is selected, original document A is retrieved; or, after file D is selected, file C, file B, and original document A are retrieved in turn.
The one or more retrieved files are merged with the original document. For example, after file B is selected, the modifications in file B are spliced into original document A according to the position information in file B, forming modified document A+B. Or, after file D is selected, file C, file B, and original document A are retrieved. Following the order in which the files were generated, file B is first merged with original document A; file C is then merged with the resulting modified document A+B; and finally file D is merged with the resulting modified document A+B+C, giving modified document A+B+C+D.
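The retrieval and merge steps can be sketched as follows. The association map `PARENT` follows the relations stated for files B, C, D, and E; everything else is a simplifying assumption for illustration: the names `chain_to_original` and `rebuild`, the sample text, the representation of a position as a flat character offset, and the restriction to additions only. The modifications are assumed to be already decoded from their coded strings.

```python
# Association of each file with the file or original document it was edited from.
PARENT = {"B": "A", "C": "B", "D": "C", "E": "B"}


def chain_to_original(selected, parent, original="A"):
    """Follow the associations from the selected file back to the original."""
    chain = [selected]
    while chain[-1] != original:
        chain.append(parent[chain[-1]])
    return chain


def rebuild(selected, original_text, parent, files, original="A"):
    """Merge files into the original, oldest first, to get the modified document."""
    chain = chain_to_original(selected, parent, original)
    text = original_text
    for name in reversed(chain[:-1]):  # skip the original, start with the oldest file
        # Offsets refer to the document the file was edited from, so apply
        # the highest offset first to keep the lower offsets valid.
        for offset, inserted in sorted(files[name], reverse=True):
            text = text[:offset] + inserted + text[offset:]
    return text


# Hypothetical decoded additions per file: (offset, inserted text).
FILES = {
    "B": [(5, ",")],
    "C": [(12, "!")],
    "D": [(0, ">> ")],
    "E": [(0, "Oh. ")],
}

print(chain_to_original("D", PARENT))
print(rebuild("D", "Hello world", PARENT, FILES))
print(rebuild("E", "Hello world", PARENT, FILES))
```

Files D and E share the prefix A+B, so both versions are rebuilt from one stored original and a handful of small files rather than from two full copies.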
In some illustrative embodiments, a first database is used to store the original document, and a second database is used to store the files.
In some illustrative embodiments, the method for storing a document further comprises:
After the user selects the file that is needed, retrieving, according to the association (see Table 1), either the original document alone or the other files together with the original document;
Merging them to obtain the modified document, which can then be viewed, edited, and so on.
In some illustrative embodiments, the user selects a file according to the identification mark of the file.
In some illustrative embodiments, the user can select several files associated with one original document, forming several associated modified documents, which are presented so that the user can compare them.
Referring now to Fig. 3, Fig. 3 shows a block diagram of the system for storing a document according to some illustrative embodiments.
As shown in Fig. 3, some illustrative embodiments disclose a system for storing a document.
The system for storing a document comprises: a recording module (for example recording module 101) that records at least one modification produced while the document is edited, together with the position information of each modification in the document; a first building module (for example first building module 102) that scans all of the modifications, determines the frequency of occurrence of each character across all of the modifications, and builds a Huffman tree; a second building module (for example second building module 103) that builds the coded string of each modification according to the Huffman tree; and a storage module (for example storage module 104) that saves the Huffman tree, the coded string of each modification, and the position information of each modification as a single separate file.
In some illustrative embodiments, the system further comprises an extraction module (for example extraction module 105) for retrieving the document.
In some illustrative embodiments, the document retrieved by the extraction module has a backup.
In some illustrative embodiments, the system further comprises a removal module (for example removal module 106) that deletes the edited document after editing is finished.
In some illustrative embodiments, the system further comprises an association module (for example association module 107) that establishes the association between the file and the document.
In some illustrative embodiments, the extraction module is also used, after a file is selected, to retrieve the document associated with the file according to the association.
In some illustrative embodiments, the system further comprises: a parsing module (for example parsing module 108) that restores each coded string to its modification according to the Huffman tree; and a merging module (for example merging module 109) that splices the modifications in the file into the document according to the position information in the file.
In some illustrative embodiments, the document is the document obtained by merging the original document with at least one other file in sequence; the association established by the association module is the association between the file and the other file most recently merged into the original document.
In some illustrative embodiments, the system further comprises a first database (for example first database 110) that stores the original document, and a second database (for example second database 111) that stores the files.
The above description of the embodiments is only intended to help in understanding the method of the present invention and its core idea. Those of ordinary skill in the art may, following the idea of the present invention, make changes to the specific embodiments and to the scope of application. In summary, this description should not be construed as limiting the present invention.

Claims (10)

1. A method for storing a document, characterized by comprising:
Recording at least one modification produced while the document is edited, together with the position information of each modification in the document;
Scanning all of the modifications, determining the frequency of occurrence of each character across all of the modifications, and building a Huffman tree;
Building the coded string of each modification according to the Huffman tree;
Saving the Huffman tree, the coded string of each modification, and the position information of each modification as a single separate file.
2. The method according to claim 1, characterized in that the document has a backup; further comprising:
After editing is finished, deleting the edited document.
3. The method according to claim 1, characterized by further comprising:
Establishing an association between the file and the document.
4. The method according to claim 3, characterized by further comprising:
After a file is selected, retrieving, according to the association, the document associated with the file;
Merging the file and the document, which comprises:
Restoring each coded string to its modification according to the Huffman tree;
Splicing the modifications in the file into the document according to the position information in the file.
5. The method according to claim 4, characterized in that the document is the document obtained by merging the original document with at least one other file in sequence;
The established association is the association between the file and the other file most recently merged into the original document.
6. A system for storing a document, characterized by comprising:
A recording module, for recording at least one modification produced while the document is edited, together with the position information of each modification in the document;
A first building module, for scanning all of the modifications, determining the frequency of occurrence of each character across all of the modifications, and building a Huffman tree;
A second building module, for building the coded string of each modification according to the Huffman tree; and
A storage module, for saving the Huffman tree, the coded string of each modification, and the position information of each modification as a single separate file.
7. The system according to claim 6, characterized in that the document has a backup; the system further comprises: a removal module, for deleting the edited document after editing is finished.
8. The system according to claim 6, characterized by further comprising:
An association module, for establishing an association between the file and the document.
9. The system according to claim 8, characterized by further comprising:
An extraction module, for retrieving, after a file is selected, the document associated with the file according to the association;
A parsing module, for restoring each coded string to its modification according to the Huffman tree; and
A merging module, for splicing the modifications in the file into the document according to the position information in the file.
10. The system according to claim 9, characterized in that the document is the document obtained by merging the original document with at least one other file in sequence;
The association established by the association module is the association between the file and the other file most recently merged into the original document.
CN201410122294.5A 2014-03-28 2014-03-28 Method and system for storing document Pending CN103942185A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410122294.5A CN103942185A (en) 2014-03-28 2014-03-28 Method and system for storing document

Publications (1)

Publication Number Publication Date
CN103942185A true CN103942185A (en) 2014-07-23

Family

ID=51189855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410122294.5A Pending CN103942185A (en) 2014-03-28 2014-03-28 Method and system for storing document

Country Status (1)

Country Link
CN (1) CN103942185A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5355476A (en) * 1990-12-29 1994-10-11 Casio Computer Co., Ltd. File update apparatus for generating a matrix representing a subset of files and the update correspondence between directories and files
CN102708191A (en) * 2012-05-15 2012-10-03 通唐软件技术(湖南)有限公司 Word stock coding and decoding method capable of saving memory
CN103020026A (en) * 2012-11-15 2013-04-03 无锡永中软件有限公司 Synergistic file processing system and method
CN103294658A (en) * 2012-03-02 2013-09-11 北大方正集团有限公司 Document storage method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140723