CN105677746A - Database transaction operation based duplicate files merging system and method - Google Patents

Database transaction operation based duplicate files merging system and method

Info

Publication number
CN105677746A
Authority
CN
China
Prior art keywords
document
upload
document body
database
transaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201511018294.1A
Other languages
Chinese (zh)
Inventor
莫华枫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Eisoo Information Technology Co Ltd
Original Assignee
Shanghai Eisoo Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-12-29
Filing date
2015-12-29
Publication date
2016-06-15
Application filed by Shanghai Eisoo Information Technology Co Ltd filed Critical Shanghai Eisoo Information Technology Co Ltd
Priority to CN201511018294.1A priority Critical patent/CN105677746A/en
Publication of CN105677746A publication Critical patent/CN105677746A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/113 Details of archiving

Abstract

The invention relates to a system and a method for merging duplicate files based on database transaction operations. The system comprises a client terminal, a file management server and a database connected in sequence. The file management server includes: a file upload request response module, which responds to a file upload request from the client terminal, initiates an upload transaction to the database, performs the corresponding operation according to the digital fingerprint of the uploaded file, and completes the upload of the file; a file deletion request response module, which responds to a file deletion request from the client terminal, initiates a deletion transaction to the database, checks whether the corresponding file body has other references, and marks a file body that has no other references as to be recycled; and a periodic check module, which periodically scans the file bodies in the database and deletes the file bodies that are marked as to be recycled. Compared with the prior art, the system and method of the invention use the transaction operations of a relational database to guarantee that file body objects that are no longer referenced are correctly deleted without interrupting the operation of the system.

Description

A duplicate-document merging system and method based on database transaction operations
Technical field
The present invention relates to the field of document processing, and in particular to a highly available duplicate-document merging system and method based on database transaction operations.
Background
Document management systems often store the same document more than once. Several distinct documents may have identical content, and the repeated content data wastes storage space. A document management system typically finds documents with identical content by the digital fingerprint of the document content, keeps only one copy of the document body, and makes the other identical documents reference that document body, thereby saving storage space.
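As a minimal sketch of the fingerprint-based detection described above: the patent does not name a fingerprint algorithm, so the use of SHA-256 here is an assumption, and the function name `fingerprint` is hypothetical.

```python
import hashlib

def fingerprint(content: bytes) -> str:
    """Return a hex digest used as the digital fingerprint of the content.

    SHA-256 is an assumption; the source does not specify the algorithm.
    """
    return hashlib.sha256(content).hexdigest()

# Two distinct documents with identical content share one fingerprint,
# so only one document body would need to be stored for both.
a = fingerprint(b"quarterly report")
b = fingerprint(b"quarterly report")
c = fingerprint(b"quarterly report, revised")
print(a == b, a == c)
```

Identical byte content always yields the same digest, which is what allows the server to detect a duplicate before the document body is transferred.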
In such duplicate-document merging, however, multiple documents reference the same document body data. During subsequent operation, when a document is deleted, the document body it references may still be referenced by other documents and therefore cannot be deleted immediately. Ultimately, because the reference status of a document body cannot be confirmed, document bodies are not deleted at all, in order to guarantee data reliability. Only after the whole system stops write operations and the reference relationships between all documents and document bodies have been fully tallied can the document bodies that are no longer referenced be deleted accurately. This makes deleting document bodies complex.
Summary of the invention
The purpose of the present invention is to overcome the above defects of the prior art by providing a highly available duplicate-document merging method based on database transaction operations, which uses the transaction operations of a relational database to ensure that document body objects that are no longer referenced are deleted correctly without interrupting system operation.
The purpose of the present invention can be achieved through the following technical solution:
A duplicate-document merging system based on database transaction operations comprises a client, a document management server and a database connected in sequence, wherein the document management server includes:
A document upload request response module, which responds to a document upload request from the client, initiates an upload transaction to the database, performs the corresponding operation according to the digital fingerprint of the uploaded document, and completes the document upload;
A document deletion request response module, which responds to a document deletion request from the client, initiates a deletion transaction to the database, checks whether the corresponding document body still has other references, and marks a document body that has no other references as being in the to-be-recycled state;
A periodic check module, which periodically scans the document bodies in the database and deletes those marked as being in the to-be-recycled state.
The document upload request response module includes:
A document upload request receiving unit, for receiving the document upload request sent by the client, the document upload request containing the digital fingerprint of the content of the document to be uploaded;
A first database transaction start unit, for initiating an upload transaction to the database and querying whether the database contains a document body with the same digital fingerprint;
A reference record adding unit, which responds when the query result of the first database transaction start unit is yes, for adding to the database a reference record to the document body having the same digital fingerprint, ending the upload transaction, and reporting to the client that the document upload is complete;
A document body adding unit, which responds when the query result of the database transaction start unit is no, for generating a globally unique document body ID, adding to the database a document body record containing the document body ID together with the corresponding reference record, ending the upload transaction, and instructing the client to continue by uploading the document body.
The document deletion request response module includes:
A document deletion request receiving unit, for receiving the document deletion request sent by the client;
A second database transaction start unit, for initiating a deletion transaction to the database and deleting the document record and the document body reference record corresponding to the document body to be deleted;
A reference check unit, for checking whether the document body to be deleted still has other references and, if the check result is yes, ending the deletion transaction directly;
A document body marking unit, which responds when the check result of the reference check unit is no, marks the document body as being in the to-be-recycled state, and ends the deletion transaction.
The document body adding unit further includes a document storage subunit, for receiving the document body uploaded by the client and saving the received document body data to the database using the document body ID as the file name.
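The storage subunit can be illustrated with a short sketch. It assumes that body data is kept as a file in a directory and that the globally unique ID is a random UUID; the source only states that the data is saved under the document body ID as file name, so the directory layout, the `.part` temporary suffix, and the names `new_body_id` and `store_body` are hypothetical.

```python
import tempfile
import uuid
from pathlib import Path

def new_body_id() -> str:
    """Generate a globally unique document body ID (UUID4 is an assumption)."""
    return uuid.uuid4().hex

def store_body(store: Path, body_id: str, data: bytes) -> Path:
    """Save body data under the body ID as file name.

    The data is written to a temporary name first and then renamed, so a
    partially written body never appears under its final name.
    """
    final_path = store / body_id
    part_path = store / (body_id + ".part")
    part_path.write_bytes(data)
    part_path.replace(final_path)
    return final_path

tmp_dir = tempfile.TemporaryDirectory()
store = Path(tmp_dir.name)
body_id = new_body_id()
saved = store_body(store, body_id, b"hello")
print(saved.name == body_id)
```

Using the body ID rather than the document name as the file name means that any number of documents can reference the same stored data without name collisions.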
A duplicate-document merging method based on database transaction operations comprises a document upload flow, a document deletion flow and a periodic check flow. The document upload flow includes:
A1) The client sends a document upload request containing the digital fingerprint of the content of the document to be uploaded;
A2) The document management server initiates an upload transaction to the database and queries whether the database contains a document body with the same digital fingerprint; if so, step A3) is performed, and if not, step A4) is performed;
A3) The document management server adds to the database a reference record to the document body having the same digital fingerprint, ends the upload transaction, and reports to the client that the document upload is complete;
A4) The document management server generates a globally unique document body ID, adds to the database a document body record containing the document body ID together with the corresponding reference record, ends the upload transaction, and instructs the client to continue by uploading the document body;
A5) The client uploads the document body;
The document deletion flow includes:
B1) The client sends a document deletion request;
B2) The document management server initiates a deletion transaction to the database and deletes the document record and the document body reference record corresponding to the document body to be deleted;
B3) The document management server checks whether the document body to be deleted still has other references; if the check result is yes, it ends the deletion transaction directly, and if not, it marks the document body as being in the to-be-recycled state and ends the deletion transaction.
The periodic check flow is specifically: periodically scan the document bodies in the database and delete those marked as being in the to-be-recycled state.
In step A5), after the client uploads the document body, the document management server saves the document body data to the database using the document body ID as the file name.
Compared with the prior art, the present invention has the following advantages:
(1) The present invention handles the client's document uploads and document deletions in two separate modules, which can operate simultaneously. The upload and deletion operations each execute within their own transaction, which guarantees the atomicity of both groups of operations, so that an upload operation cannot end up referencing a document body that a concurrent document deletion operation deletes.
(2) The present invention uses the transactions of a relational database to guarantee that simultaneous upload and deletion operations do not interfere with each other, solving the problem of data loss caused by mistakenly deleting a document body that is still referenced.
(3) The present invention also provides an independent process for deleting document bodies marked as being in the to-be-recycled state, which cleans out document body data that is no longer used. This ensures that document bodies that are no longer referenced can be deleted correctly without shutting the system down, so that storage space is reclaimed and the high availability of the system is ensured.
Brief description of the drawings
Fig. 1 is a flow diagram of document upload according to the present invention;
Fig. 2 is a flow diagram of document deletion according to the present invention.
Detailed description
The present invention is described in detail below with reference to the drawings and a specific embodiment. The embodiment is implemented on the premise of the technical solution of the present invention and gives a detailed implementation and a concrete operating procedure, but the protection scope of the present invention is not limited to the following embodiment.
This embodiment provides a duplicate-document merging system based on database transaction operations, comprising a client, a document management server and a database connected in sequence. The document management server includes a document upload request response module, a document deletion request response module and a periodic check module. The document upload request response module responds to a document upload request from the client, initiates an upload transaction to the database, performs the corresponding operation according to the digital fingerprint of the uploaded document, and completes the document upload. The document deletion request response module responds to a document deletion request from the client, initiates a deletion transaction to the database, checks whether the corresponding document body still has other references, and marks a document body that has no other references as being in the "to be recycled" state. The periodic check module periodically scans the document bodies in the database and deletes those marked as being in the "to be recycled" state. The system is based on the transaction operations of a relational database: when the document management system performs a document deletion, it checks the reference status of the associated document body within the same transaction and, when no other references exist, moves the document body to the to-be-recycled state, so that subsequent document upload operations no longer reference that document body and it can be reclaimed (deleted) safely.
The process by which the above duplicate-document merging system based on database transaction operations merges documents includes a document upload flow, a document deletion flow and a periodic check flow.
As shown in Fig. 1, the document upload flow is specifically:
In step A1, the client sends a document upload request containing the digital fingerprint of the content of the document to be uploaded; the client computes the digital fingerprint from that content;
In step A2, the document management server initiates an upload transaction to the database and queries whether the database contains a document body with the same digital fingerprint; if so, step A3 is performed, and if not, step A4 is performed;
In step A3, the document management server adds to the database a reference record to the document body having the same digital fingerprint, ends the upload transaction, and reports to the client that the document upload is complete;
In step A4, the document management server generates a globally unique document body ID, adds to the database a document body record containing the document body ID together with the corresponding reference record, ends the upload transaction, and instructs the client to continue by uploading the document body;
In step A5, the client uploads the document body, and the document management server saves the document body data to the database using the document body ID as the file name.
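Steps A2 to A4 can be sketched as a single database transaction. This is an illustration under stated assumptions rather than the patented implementation: SQLite stands in for the relational database, and the table names `body` and `doc`, the `state` column with values `'active'` and `'to_be_recycled'`, and the function `handle_upload_request` are all hypothetical. The lookup skips bodies in the to-be-recycled state, so that new uploads do not reference a body awaiting deletion.

```python
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE body (
    body_id     TEXT PRIMARY KEY,
    fingerprint TEXT NOT NULL,
    state       TEXT NOT NULL DEFAULT 'active'
);
CREATE TABLE doc (
    doc_id  TEXT PRIMARY KEY,
    name    TEXT NOT NULL,
    body_id TEXT NOT NULL REFERENCES body(body_id)
);
"""

def handle_upload_request(conn, name, fingerprint):
    """Run steps A2-A4 in one transaction; return (doc_id, body_id, need_body)."""
    doc_id = uuid.uuid4().hex
    conn.execute("BEGIN IMMEDIATE")  # A2: start the upload transaction
    try:
        row = conn.execute(
            "SELECT body_id FROM body WHERE fingerprint = ? AND state = 'active'",
            (fingerprint,),
        ).fetchone()
        if row is not None:
            # A3: a matching body exists, so only a reference is added.
            body_id, need_body = row[0], False
        else:
            # A4: create a body record under a new globally unique ID.
            body_id, need_body = uuid.uuid4().hex, True
            conn.execute(
                "INSERT INTO body (body_id, fingerprint) VALUES (?, ?)",
                (body_id, fingerprint),
            )
        conn.execute(
            "INSERT INTO doc (doc_id, name, body_id) VALUES (?, ?, ?)",
            (doc_id, name, body_id),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return doc_id, body_id, need_body

# isolation_level=None disables implicit transactions; BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript(SCHEMA)
first = handle_upload_request(conn, "a.txt", "fp-1")   # new content
second = handle_upload_request(conn, "b.txt", "fp-1")  # duplicate content
print(first[2], second[2], first[1] == second[1])
```

The `need_body` flag corresponds to the two possible replies to the client: upload complete (A3) or continue by uploading the document body (A4, followed by A5).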
As shown in Fig. 2, the document deletion flow is specifically:
In step B1, the client sends a document deletion request;
In step B2, the document management server initiates a deletion transaction to the database and deletes the document record and the document body reference record corresponding to the document body to be deleted;
In step B3, the document management server checks whether the document body to be deleted still has other references; if the check result is yes, it ends the deletion transaction directly, and if not, it marks the document body as being in the "to be recycled" state and ends the deletion transaction; it then reports to the client that the document has been deleted.
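Steps B2 and B3 can be sketched in the same way. The key point taken from the text is that the reference check and the state change happen inside the same transaction as the deletion of the document record. SQLite, the table and column names, the state value `'to_be_recycled'` and the function `handle_delete_request` are assumptions made for illustration.

```python
import sqlite3

def handle_delete_request(conn, doc_id):
    """Run steps B2-B3 in one transaction.

    Returns True if the body was marked to be recycled, False if other
    documents still reference it.
    """
    conn.execute("BEGIN IMMEDIATE")  # B2: start the deletion transaction
    try:
        row = conn.execute(
            "SELECT body_id FROM doc WHERE doc_id = ?", (doc_id,)
        ).fetchone()
        if row is None:
            raise KeyError(doc_id)
        body_id = row[0]
        # B2: delete the document record, which also removes its reference.
        conn.execute("DELETE FROM doc WHERE doc_id = ?", (doc_id,))
        # B3: check for other references within the same transaction.
        remaining = conn.execute(
            "SELECT COUNT(*) FROM doc WHERE body_id = ?", (body_id,)
        ).fetchone()[0]
        if remaining == 0:
            conn.execute(
                "UPDATE body SET state = 'to_be_recycled' WHERE body_id = ?",
                (body_id,),
            )
        conn.execute("COMMIT")
        return remaining == 0
    except Exception:
        conn.execute("ROLLBACK")
        raise

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
CREATE TABLE body (body_id TEXT PRIMARY KEY, fingerprint TEXT NOT NULL,
                   state TEXT NOT NULL DEFAULT 'active');
CREATE TABLE doc  (doc_id TEXT PRIMARY KEY, name TEXT NOT NULL,
                   body_id TEXT NOT NULL REFERENCES body(body_id));
INSERT INTO body VALUES ('b1', 'fp-1', 'active');
INSERT INTO doc  VALUES ('d1', 'a.txt', 'b1'), ('d2', 'b.txt', 'b1');
""")
first_delete = handle_delete_request(conn, "d1")   # d2 still references b1
second_delete = handle_delete_request(conn, "d2")  # last reference removed
state = conn.execute("SELECT state FROM body WHERE body_id = 'b1'").fetchone()[0]
print(first_delete, second_delete, state)
```

The body record itself is left in place here; only its state changes, and the actual removal is deferred to the periodic check.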
The periodic check flow is specifically: periodically scan the document bodies in the database and delete those marked as being in the to-be-recycled state, thereby finally cleaning out the document body data that is no longer used.
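One pass of the periodic check might look like the following sketch. The source does not specify the scan interval or how the stored data is removed, so this example only deletes the body records and returns their IDs, leaving removal of the stored data and scheduling to the caller. SQLite, the table layout and the function `sweep_recyclable_bodies` are assumptions.

```python
import sqlite3

def sweep_recyclable_bodies(conn):
    """Delete all body records marked to be recycled and return their IDs.

    The caller can use the returned IDs to remove the stored body data.
    """
    conn.execute("BEGIN IMMEDIATE")
    try:
        rows = conn.execute(
            "SELECT body_id FROM body WHERE state = 'to_be_recycled' "
            "ORDER BY body_id"
        ).fetchall()
        ids = [r[0] for r in rows]
        conn.executemany(
            "DELETE FROM body WHERE body_id = ?", [(i,) for i in ids]
        )
        conn.execute("COMMIT")
        return ids
    except Exception:
        conn.execute("ROLLBACK")
        raise

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
CREATE TABLE body (body_id TEXT PRIMARY KEY, fingerprint TEXT NOT NULL,
                   state TEXT NOT NULL DEFAULT 'active');
INSERT INTO body VALUES ('b1', 'fp-1', 'active');
INSERT INTO body VALUES ('b2', 'fp-2', 'to_be_recycled');
""")
deleted = sweep_recyclable_bodies(conn)
remaining = conn.execute("SELECT body_id FROM body").fetchall()
print(deleted, remaining)
```

Because bodies in the to-be-recycled state are no longer referenced by uploads, a pass like this can run while the system continues to serve upload and deletion requests.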

Claims (6)

1. A duplicate-document merging system based on database transaction operations, characterized in that it comprises a client, a document management server and a database connected in sequence, wherein the document management server includes:
A document upload request response module, which responds to a document upload request from the client, initiates an upload transaction to the database, performs the corresponding operation according to the digital fingerprint of the uploaded document, and completes the document upload;
A document deletion request response module, which responds to a document deletion request from the client, initiates a deletion transaction to the database, checks whether the corresponding document body still has other references, and marks a document body that has no other references as being in the to-be-recycled state;
A periodic check module, which periodically scans the document bodies in the database and deletes those marked as being in the to-be-recycled state.
2. The duplicate-document merging system based on database transaction operations according to claim 1, characterized in that the document upload request response module includes:
A document upload request receiving unit, for receiving the document upload request sent by the client, the document upload request containing the digital fingerprint of the content of the document to be uploaded;
A first database transaction start unit, for initiating an upload transaction to the database and querying whether the database contains a document body with the same digital fingerprint;
A reference record adding unit, which responds when the query result of the first database transaction start unit is yes, for adding to the database a reference record to the document body having the same digital fingerprint, ending the upload transaction, and reporting to the client that the document upload is complete;
A document body adding unit, which responds when the query result of the database transaction start unit is no, for generating a globally unique document body ID, adding to the database a document body record containing the document body ID together with the corresponding reference record, ending the upload transaction, and instructing the client to continue by uploading the document body.
3. The duplicate-document merging system based on database transaction operations according to claim 1, characterized in that the document body adding unit further includes a document storage subunit, for receiving the document body uploaded by the client and saving the received document body data to the database using the document body ID as the file name.
4. The duplicate-document merging system based on database transaction operations according to claim 1, characterized in that the document deletion request response module includes:
A document deletion request receiving unit, for receiving the document deletion request sent by the client;
A second database transaction start unit, for initiating a deletion transaction to the database and deleting the document record and the document body reference record corresponding to the document body to be deleted;
A reference check unit, for checking whether the document body to be deleted still has other references and, if the check result is yes, ending the deletion transaction directly;
A document body marking unit, which responds when the check result of the reference check unit is no, marks the document body as being in the to-be-recycled state, and ends the deletion transaction.
5. A duplicate-document merging method based on database transaction operations, characterized in that it comprises a document upload flow, a document deletion flow and a periodic check flow, wherein the document upload flow includes:
A1) The client sends a document upload request containing the digital fingerprint of the content of the document to be uploaded;
A2) The document management server initiates an upload transaction to the database and queries whether the database contains a document body with the same digital fingerprint; if so, step A3) is performed, and if not, step A4) is performed;
A3) The document management server adds to the database a reference record to the document body having the same digital fingerprint, ends the upload transaction, and reports to the client that the document upload is complete;
A4) The document management server generates a globally unique document body ID, adds to the database a document body record containing the document body ID together with the corresponding reference record, ends the upload transaction, and instructs the client to continue by uploading the document body;
A5) The client uploads the document body;
The document deletion flow includes:
B1) The client sends a document deletion request;
B2) The document management server initiates a deletion transaction to the database and deletes the document record and the document body reference record corresponding to the document body to be deleted;
B3) The document management server checks whether the document body to be deleted still has other references; if the check result is yes, it ends the deletion transaction directly, and if not, it marks the document body as being in the to-be-recycled state and ends the deletion transaction.
The periodic check flow is specifically: periodically scan the document bodies in the database and delete those marked as being in the to-be-recycled state.
6. The duplicate-document merging method based on database transaction operations according to claim 5, characterized in that, in step A5), after the client uploads the document body, the document management server saves the document body data to the database using the document body ID as the file name.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201511018294.1A CN105677746A (en) 2015-12-29 2015-12-29 Database transaction operation based duplicate files merging system and method

Publications (1)

Publication Number Publication Date
CN105677746A true CN105677746A (en) 2016-06-15

Family

ID=56297926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201511018294.1A Pending CN105677746A (en) 2015-12-29 2015-12-29 Database transaction operation based duplicate files merging system and method

Country Status (1)

Country Link
CN (1) CN105677746A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108287793A (en) * 2018-01-09 2018-07-17 网宿科技股份有限公司 The way to play for time and server of response message
CN110263060A (en) * 2019-06-06 2019-09-20 零搜科技(深圳)有限公司 A kind of ERP electronic accessories management method and computer equipment
CN116861455A (en) * 2023-06-25 2023-10-10 上海数禾信息科技有限公司 Event data processing method, system, electronic device and storage medium
CN116861455B (en) * 2023-06-25 2024-04-26 上海数禾信息科技有限公司 Event data processing method, system, electronic device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101667178A (en) * 2008-09-01 2010-03-10 北京数码大方科技有限公司 Transaction processing method used for C/S architecture file management system
CN101853288A (en) * 2010-05-19 2010-10-06 马晓普 Configurable full-text retrieval service system based on document real-time monitoring
CN103455631A (en) * 2013-09-22 2013-12-18 广州中国科学院软件应用技术研究所 Method, device and system for processing data
US9146930B2 (en) * 2012-07-10 2015-09-29 Tencent Technology (Shenzhen) Company, Limited Method and apparatus for file storage
CN105095300A (en) * 2014-05-16 2015-11-25 阿里巴巴集团控股有限公司 Method and system for database backup

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160615