CN109299056B - Data synchronization method and device based on a distributed file system - Google Patents
Data synchronization method and device based on a distributed file system
- Publication number
- CN109299056B (application number CN201811096362.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- server
- file
- metadata
- source
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45562—Creating, deleting, cloning virtual machine instances
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45583—Memory management, e.g. access or allocation
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a data synchronization method and device based on a distributed file system. A data server uses a dual-server operating mode to communicate with a physical server and a virtual server simultaneously; a database-based memory cluster and metadata cluster are established; data is written to a temporary data block according to the received signal difference, and the file content in the original data node is replaced with the content of the temporary data block, thereby synchronizing the data. The invention adopts an operating mode in which the physical server and the virtual server communicate interactively as a dual-server pair: each of the two servers performs its own duties, and the system switches to a single-server operating mode when necessary, which effectively ensures the operating efficiency of the system. In addition, within data storage, the content to be synchronized is divided between the memory cluster and the metadata cluster and handled separately, which effectively ensures reasonable allocation of synchronization resources and accurate data synchronization.
Description
Technical field
The present application belongs to the field of distributed processing and, in particular, relates to a data synchronization method and device based on a distributed file system.
Background
As quality of life continues to improve, Internet applications are becoming ever more widespread. These applications keep developing and evolving in order to serve users more conveniently, but the network security problems they bring are also multiplying, which steadily increases market demand for network security products. Among the many kinds of network security problems, the change or loss of important information caused by accidental file deletion is among the most serious, and research at home and abroad on protecting websites and web pages against accidental deletion is ongoing.
When web page tamper-proofing systems first appeared, their internal structure was very simple and they were divided into few functional modules. Although they could solve some basic website security problems, they had many shortcomings. If hackers launched large-scale, continuous tampering attacks against an important website, such a system could not protect the site. As people depend more and more on the Internet, the volume of web page access keeps growing, and a simple tamper-proofing system cannot cope with protecting a website under that load. Therefore, to prevent web pages from being tampered with and to protect website security, web page tamper-proofing systems have gradually developed and matured. As tamper-proofing technology has advanced, the internal structure of these systems has become more and more complex and the number of functional modules has grown. Such a system now performs its overall function through interaction and cooperation among its modules, which are closely linked and each indispensable.
Therefore, in a web page tamper-proofing system, the distributed file synchronization system implemented here needs further optimization, so that multi-machine publication of files can be completed through simpler system configuration operations, making the system easier to use. Further research is also needed on design improvements that raise the file transmission efficiency of the synchronization system, enriching and optimizing its functions so that it integrates better with the tamper-proofing system and performs its intended role.
Summary of the invention
The present invention claims a data synchronization method based on a distributed file system. Overall, it uses an operating mode in which a physical server and a virtual server communicate interactively as a dual-server pair for synchronization checking of data exchange; internally, it separates the in-memory data cluster from the metadata cluster. Each server performs its corresponding work, achieving the technical effect of timely synchronization and accurate updates.
A data synchronization method based on a distributed file system, characterized in that:
while a data server uses a dual-server operating mode to communicate with a physical server and a virtual server simultaneously, it monitors the working state of the physical server and the virtual server;
wherein the communication comprises: signal interaction with the physical server, and data interaction with the physical server and the virtual server simultaneously;
a database-based memory cluster and metadata cluster are established; the in-memory data of the distributed file system is stored in the distributed database in the cluster, and the metadata of the distributed database stored in the cluster is processed;
if the data server determines that the physical server has failed and the virtual server is working normally, it sends a single-server operating mode switching command to the virtual server;
the transition from physical server to virtual server is automated, with layered management of virtual machine resources as defined by the application's logical architecture, intelligent allocation of computing resources, and online dynamic expansion of resources;
the data server receives a first confirmation response returned by the virtual server and, in single-server operating mode, continues the signal interaction and the data interaction with the virtual server;
a data node of the virtual server creates a temporary data block, writes data to the temporary data block according to the received signal difference, and replaces the file content in the original data node with the content of the temporary data block, thereby synchronizing the data.
As the above summary shows, the invention adopts an operating mode in which the physical server and the virtual server communicate interactively as a dual-server pair: each of the two servers performs its own duties, and the system switches to a single-server operating mode when necessary, which effectively ensures the operating efficiency of the system. In addition, within data storage, the content to be synchronized is divided between the memory cluster and the metadata cluster and handled separately, which effectively ensures reasonable allocation of synchronization resources and accurate data synchronization.
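The switch from dual-server to single-server mode described above can be sketched as a small state machine. This is only an illustration under stated assumptions, not the patented implementation: the names `DataServer`, `Mode`, and `check`, the boolean health flags, and the callback that stands in for sending the switching command are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    DUAL = "dual-server"
    SINGLE = "single-server"


class DataServer:
    """Hypothetical data server that monitors both peers and switches mode."""

    def __init__(self):
        self.mode = Mode.DUAL

    def check(self, physical_ok, virtual_ok, send_switch_command):
        # Switch only when the physical server has failed and the virtual
        # server is still working normally, as the method describes.
        if self.mode is Mode.DUAL and not physical_ok and virtual_ok:
            # send_switch_command is assumed to return True once the virtual
            # server has returned its confirmation response.
            if send_switch_command():
                self.mode = Mode.SINGLE
        return self.mode


server = DataServer()
server.check(physical_ok=True, virtual_ok=True, send_switch_command=lambda: True)
server.check(physical_ok=False, virtual_ok=True, send_switch_command=lambda: True)
```

After the second call the server stays in single-server mode; how it would return to dual-server mode is not covered by the text and is left out here.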
Brief description of the drawings
To explain the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Figure 1 is a workflow diagram of the data synchronization method based on a distributed file system according to the present invention.
Figure 2 is a structural module diagram of the data synchronization device based on a distributed file system according to the present invention.
Detailed description of the embodiments
The present invention first protects a data synchronization method based on a distributed file system. Referring to Figure 1, the workflow diagram of the method, it is characterized in that:
while a data server uses a dual-server operating mode to communicate with a physical server and a virtual server simultaneously, it monitors the working state of the physical server and the virtual server, wherein the communication comprises: signal interaction with the physical server, and data interaction with the physical server and the virtual server simultaneously;
a database-based memory cluster and metadata cluster are established; the in-memory data of the distributed file system is stored in the distributed database in the cluster, and the metadata of the distributed database stored in the cluster is processed;
if the data server determines that the physical server has failed and the virtual server is working normally, it sends a single-server operating mode switching command to the virtual server;
the transition from physical server to virtual server is automated, with layered management of virtual machine resources as defined by the application's logical architecture, intelligent allocation of computing resources, and online dynamic expansion of resources;
the data server receives a first confirmation response returned by the virtual server and, in single-server operating mode, continues the signal interaction and the data interaction with the virtual server;
a data node of the virtual server creates a temporary data block, writes data to the temporary data block according to the received signal difference, and replaces the file content in the original data node with the content of the temporary data block, thereby synchronizing the data.
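The final step, writing to a temporary data block and then replacing the original content, can be sketched as follows. This is a minimal sketch that assumes a data block is an ordinary local file; the function name `sync_block` is hypothetical, and how the new bytes are derived from the received signal difference is not specified in the text, so they are simply passed in.

```python
import os
import tempfile


def sync_block(block_path, new_bytes):
    """Write new_bytes to a temporary block, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(block_path))
    # Create the temporary block next to the original so the final
    # replacement stays on the same file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="tmpblock-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_bytes)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_path, block_path)
    except BaseException:
        # Remove the temporary block if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Writing to a separate block first means a reader of the original block never sees a half-written file, which is one plausible reason for the temporary-block design.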
Preferably, the process in which the physical server and the virtual server communicate simultaneously further comprises:
establishing a heartbeat connection with the virtual server, so that the virtual server is kept informed of the connecting party's current state, including its status, load, and updates, while waiting for the virtual server to send task requests; after receiving a client connection request sent by the virtual server, establishing a connection with the client, listening for the client's operation requests, and responding promptly;
after the physical server has submitted an update from the client, the data cache server synchronizes the update to the physical server and, once that completes, submits an update message to the virtual server, which directs the other data cache servers to synchronize with the physical server.
Synchronizing a small file generally takes two steps, the first of which is to obtain from the metadata node the location of the data node that holds the small-file index, together with the stream of the data file. For a write, the output stream of the data file is obtained; for a read, its input stream. When small files are operated on, if consecutively accessed small files belong to the same directory, this information is identical for all of them. By caching on the client the index location information for a directory and the input and output streams of the data file under that directory, the system reduces interaction with the metadata node and so improves file access speed. It also avoids frequently asking the metadata node to modify the update flag, which is needed only when the update flag changes. The number of index-location entries cached on the client defaults to 20 and can be set by the user. The cache uses an LRU replacement policy by default; because the client is not multithreaded, the cache uses no locking.
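A client-side cache of this kind can be sketched with an ordered dictionary. The default capacity of 20 and the LRU policy come from the text; the class name `IndexLocationCache` and the shape of the cached value are assumptions made for illustration. No lock is used, matching the single-threaded client described above.

```python
from collections import OrderedDict


class IndexLocationCache:
    """LRU cache mapping a directory to its cached index-location info."""

    def __init__(self, capacity=20):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, directory):
        if directory not in self._entries:
            return None
        # Mark the entry as most recently used.
        self._entries.move_to_end(directory)
        return self._entries[directory]

    def put(self, directory, location):
        self._entries[directory] = location
        self._entries.move_to_end(directory)
        if len(self._entries) > self.capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)


cache = IndexLocationCache(capacity=2)
cache.put("/site/a", "datanode-1")
cache.put("/site/b", "datanode-2")
cache.get("/site/a")                # "/site/a" is now the most recent
cache.put("/site/c", "datanode-3")  # evicts "/site/b"
```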
Because the client caches the small-file index location and the update-flag information, it need not access the metadata node frequently, which greatly improves system performance but raises a consistency problem for the indexes held on the individual data nodes. Under the selection rule, the master data node in the index location mapping table is where indexes are created, so its content is the newest. For example, after client 1 queries replica data node 1, the update flag in its cache becomes N, and the corresponding entry in the metadata node's mapping table is modified. Suppose client 2 then obtains the mapping table and learns that the update flag of replica data node 1 is N. If client 1 now creates an index, so that the update flag of replica data node 1 becomes Y, client 2 has no way of learning this on its own, and it will not see the small file that client 1 has just created.
Further preferably, the synchronization control node among the data nodes obtains a copy list from the metadata node of the source file system according to the source path entered by the client, creates a thread pool, and assigns source files to each thread according to the copy list. The copy list is the list of all source files under the source path and includes the file name, size, and file path of each source file.
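The copy list and the thread pool can be sketched as below. The fields of `SourceFile` follow the text (file name, size, path); the function name `process_copy_list`, the pool size, and the sample entries are illustrative assumptions, and the worker is a placeholder for the per-file synchronization steps.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceFile:
    name: str
    size: int
    path: str


def process_copy_list(copy_list, worker, max_workers=4):
    """Run worker(source_file) for every entry; results keep list order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, copy_list))


# Hypothetical copy list for a source path "/site".
copy_list = [
    SourceFile("a.html", 120, "/site/a.html"),
    SourceFile("b.html", 300, "/site/b.html"),
]
sizes = process_copy_list(copy_list, lambda f: f.size)
```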
Each thread of the synchronization control node obtains, from the metadata node of the source file system, the metadata of the source files assigned to it and, according to that metadata, obtains from the corresponding source data nodes the checksum of every data block that each source file contains.
Each thread of the synchronization control node obtains, from the metadata node of the target file system, the metadata of the target file corresponding to each source file, compares the sizes of the source and target files and, according to the result, asks the metadata node of the target file system to create or delete data blocks of the target file so that the size of the target file matches that of the source file.
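The size comparison can be sketched as a small planning function. It assumes fixed-size data blocks, which the text does not state; the names `blocks_needed` and `plan_block_changes` are hypothetical.

```python
def blocks_needed(size, block_size):
    """Number of fixed-size blocks required to hold `size` bytes."""
    return -(-size // block_size)  # ceiling division


def plan_block_changes(source_size, target_size, block_size):
    """Return (blocks_to_create, blocks_to_delete) for the target file."""
    diff = blocks_needed(source_size, block_size) - blocks_needed(target_size, block_size)
    return (diff, 0) if diff >= 0 else (0, -diff)
```

With a block size of 100 bytes, a 250-byte source and a 100-byte target give `plan_block_changes(250, 100, 100) == (2, 0)`: two blocks must be requested for the target before block contents are compared.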
Each thread of the synchronization control node then re-obtains the metadata of each target file from the metadata node of the target file system and, according to that metadata, obtains from the corresponding target data nodes the checksums of all data blocks that each target file contains.
Each thread of the synchronization control node generates a file checksum list from the metadata of its source and target files and the checksums of all source and target data blocks. The file checksum list includes: the sequence number of the data block, the source data block ID, the source data block checksum, the source data node ID, the target data block ID, the target data block checksum, the target data node ID, and a flag indicating whether the target data block is a newly created data block.
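The file checksum list can be sketched as a list of records carrying the fields named above. Several choices here are assumptions: CRC32 stands in for the unspecified checksum algorithm, blocks are passed as `(block_id, node_id, data)` tuples, and `blocks_to_transfer` shows one plausible way the list could be used, which the text does not spell out.

```python
import zlib
from dataclasses import dataclass


@dataclass
class ChecksumEntry:
    seq: int
    source_block_id: str
    source_checksum: int
    source_node_id: str
    target_block_id: str
    target_checksum: int
    target_node_id: str
    target_is_new: bool


def build_checksum_list(source_blocks, target_blocks, new_target_ids):
    """Pair blocks by position; each block is (block_id, node_id, data)."""
    entries = []
    for seq, (src, dst) in enumerate(zip(source_blocks, target_blocks)):
        entries.append(ChecksumEntry(
            seq, src[0], zlib.crc32(src[2]), src[1],
            dst[0], zlib.crc32(dst[2]), dst[1], dst[0] in new_target_ids))
    return entries


def blocks_to_transfer(entries):
    # Assumed rule: copy a block if the target block is newly created
    # or its checksum differs from the source block's checksum.
    return [e.seq for e in entries
            if e.target_is_new or e.source_checksum != e.target_checksum]
```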
The source data space may be partitioned randomly, by evenly splitting the dimension with the longest span, by cluster-based splitting of each dimension, and so on. The cluster of a large-scale distributed file system usually spans multiple racks; communication between computers in different racks must pass through switches, so its transmission cost is high. In most cases, the bandwidth between two computers in different racks is lower than that between two computers in the same rack. The current replica strategy of the distributed file system stores replicas in two different racks, which prevents data loss when one rack fails. When data is read, the nearest-first principle can be applied, accessing the node that stores the source data closest to the client, or the bandwidth between different racks can be used to reduce read time. Computation is moved close to the data storage node: moving the computation is clearly more efficient, and cheaper, than moving the data to the compute node.
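The nearest-first read rule can be sketched with a simple distance function. The three-level distance (same node, same rack, different rack) and the `(rack, host)` tuples are assumptions chosen for illustration; the text only states that the closest replica should be preferred.

```python
def network_distance(client, replica):
    """0 = same node, 1 = same rack, 2 = different rack."""
    if client == replica:
        return 0
    return 1 if client[0] == replica[0] else 2


def nearest_replica(client, replicas):
    """Pick the replica location with the smallest distance to the client."""
    return min(replicas, key=lambda r: network_distance(client, r))
```

For a client at `("r1", "h1")` and replicas at `("r2", "h5")` and `("r1", "h3")`, the same-rack replica `("r1", "h3")` is chosen, which avoids the cross-rack switch hop.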
Further preferably, the distributed file system works with MapReduce: according to the number of directories to create and the number of small files to create under each directory, both supplied by the user, the MapReduce program creates in the distributed file system as many control files as there are directories, and these serve as the input files of the Map function.
The test Map function mainly uses the massive small-file storage system to create small files of the specified number and size, and uses the storage system's small-file read interface to read all the small files written under the created directories during the write test.
The write test and the read test use the same Reduce function, which aggregates the output of each Map, such as the total size of the test files, their total number, and the total running time of the MapReduce program, and stores these figures in the distributed file system.
The result analysis function reads the statistics produced by Reduce from the distributed file system and uses a given formula to calculate figures such as the speed at which the distributed file system or the massive small-file storage system writes and reads small files.
After each MapReduce program finishes, the memory usage of the massive small-file storage system and of the distributed file system must be recorded.
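The aggregation and analysis steps can be sketched in plain Python, without a MapReduce framework. The record layout `(bytes, file_count, seconds)` and the function names are assumptions; the speed formula in particular is a guess, since the text only refers to "a given formula".

```python
def reduce_stats(map_outputs):
    """Aggregate per-Map records of (bytes, file_count, seconds)."""
    total_bytes = sum(o[0] for o in map_outputs)
    total_files = sum(o[1] for o in map_outputs)
    total_seconds = sum(o[2] for o in map_outputs)
    return total_bytes, total_files, total_seconds


def speed(total_bytes, total_files, total_seconds):
    # Assumed formula: totals divided by total running time.
    # The source does not state the formula it actually uses.
    return total_bytes / total_seconds, total_files / total_seconds
```

Summing the per-Map running times treats the Maps as if they ran one after another; a real measurement might instead use wall-clock time for the whole job.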
Further preferably, the metadata cluster authorizes users to access the data cache server cluster and quickly establishes connections between clients and the data server. It monitors the state of every data cache server in real time and, based on that status information, assigns each user the data cache server able to provide the best service. A cache consistency strategy ensures the consistency and stability of user data in the data cache server cluster. The metadata cluster is controlled by the virtual server, is responsible for exchanging data with clients, and monitors the state of each user's data in real time. It also submits status information to the virtual server, so that the control server can trace the state of user data. A heartbeat connection is established with the control server, over which factors that affect service quality, such as available network bandwidth and CPU utilization, are passed to the virtual server.
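Choosing the data cache server from heartbeat status can be sketched as a scoring function. The text names available bandwidth and CPU utilization as the reported factors but gives no scoring rule, so the formula below, the field names, and the function names are all assumptions.

```python
def score(status):
    # Assumed scoring: more available bandwidth and lower CPU use is better.
    return status["bandwidth_mbps"] * (1.0 - status["cpu_util"])


def pick_cache_server(statuses):
    """statuses maps a server name to its latest heartbeat status dict."""
    return max(statuses, key=lambda name: score(statuses[name]))


# Hypothetical heartbeat reports from two data cache servers.
statuses = {
    "cache-1": {"bandwidth_mbps": 100, "cpu_util": 0.9},
    "cache-2": {"bandwidth_mbps": 80, "cpu_util": 0.2},
}
best = pick_cache_server(statuses)
```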
In the storage system, each physical file corresponds to one logical representation, which is made up of metadata information. To read a file, the logical file is read first; then, following the order given by its metadata, the corresponding data blocks are fetched from the storage system, and finally a copy of the physical file is reconstructed. The data file stores small-file data in a key/value structure, which not only reduces the amount of metadata that massive numbers of small files impose on the distributed file system but also speeds up access to small files (by reducing interaction with the metadata node). It also suits MapReduce-based data processing better and supports distributed computing.
All small files that a client stores under the same directory are saved in one data file under that directory; the data file is a file in the distributed file system. An index is generated at the same time, recording each small file's position in the data file and other related information. The index is handed to the data nodes for maintenance, and the data nodes provide an index service to clients. The metadata node must record which data node maintains the small-file index. When a client needs to request the index service for the small files under a directory from a data node, it first obtains the data node's location from the distributed file system. The client caching mechanism caches the location of the data node that maintains the small-file index together with the data file information, reducing the number of accesses to the metadata node and thereby greatly improving small-file access speed.
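Packing the small files of one directory into a single data file with an index can be sketched as follows. An in-memory `io.BytesIO` stands in for the file in the distributed file system, and the index is kept in a local dictionary rather than on a data node; the class name `SmallFileStore` and the `(offset, length)` index layout are assumptions.

```python
import io


class SmallFileStore:
    """Packs the small files of one directory into a single data file."""

    def __init__(self):
        self.data_file = io.BytesIO()  # stands in for a file in the DFS
        self.index = {}                # file name -> (offset, length)

    def write(self, name, content):
        # Append at the end and remember where the small file starts.
        offset = self.data_file.seek(0, io.SEEK_END)
        self.data_file.write(content)
        self.index[name] = (offset, len(content))

    def read(self, name):
        offset, length = self.index[name]
        self.data_file.seek(offset)
        return self.data_file.read(length)


store = SmallFileStore()
store.write("a.txt", b"hello")
store.write("b.txt", b"world!")
```

Only one file and one index entry per small file need to be tracked, which is how this layout cuts the metadata load compared with storing each small file separately.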
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (2)
1. A data synchronization method based on a distributed file system, characterized in that:
while a data server uses a dual-server operating mode to communicate with a physical server and a virtual server simultaneously, it monitors the working state of the physical server and the virtual server, wherein the communication comprises: signal interaction with the physical server, and data interaction with the physical server and the virtual server simultaneously;
a database-based memory cluster and metadata cluster are established, the in-memory data of the distributed file system is stored in the memory cluster, and the metadata cluster stores the metadata of the file system in a distributed manner and processes it;
if the data server determines that the physical server has failed and the virtual server is working normally, it sends a single-server operating mode switching command to the virtual server;
the transition from physical server to virtual server is automated, with layered management of virtual machine resources as defined by the application's logical architecture, intelligent allocation of computing resources, and online dynamic expansion of resources;
the data server receives a first confirmation response returned by the virtual server and, in single-server operating mode, continues the signal interaction and the data interaction with the virtual server;
a data node of the virtual server creates a temporary data block, writes data to the temporary data block according to the received signal difference, and replaces the file content in the original data node with the content of the temporary data block, thereby synchronizing the data;
the process in which the physical server and the virtual server communicate simultaneously further comprises:
establishing a heartbeat connection with the virtual server, so that the virtual server is kept informed of the connecting party's current state, including its status, load, and updates, while waiting for the virtual server to send task requests; after receiving a client connection request sent by the virtual server, establishing a connection with the client, listening for the client's operation requests, and responding promptly; after the physical server has submitted an update from the client, the data cache server synchronizes the update to the physical server and, once that completes, submits an update message to the virtual server, which directs the other data cache servers to synchronize with the physical server;
the synchronization control node among the data nodes obtains a copy list from the metadata node of the source file system according to the source path entered by the client, creates a thread pool, and assigns source files to each thread according to the copy list, the copy list being the list of all source files under the source path and including the file name, size, and file path of each source file;
each thread of the synchronization control node obtains, from the metadata node of the source file system, the metadata of the source files assigned to it and, according to that metadata, obtains from the corresponding source data nodes the checksum of every data block that each source file contains;
each thread of the synchronization control node obtains, from the metadata node of the target file system, the metadata of the target file corresponding to each source file, compares the sizes of the source and target files and, according to the result, asks the metadata node of the target file system to create or delete data blocks of the target file so that the size of the target file matches that of the source file;
each thread of the synchronization control node re-obtains the metadata of each target file from the metadata node of the target file system and, according to that metadata, obtains from the corresponding target data nodes the checksums of all data blocks that each target file contains;
each thread of the synchronization control node generates a file checksum list from the metadata of its source and target files and the checksums of all source and target data blocks, the file checksum list including: the sequence number of the data block, the source data block ID, the source data block checksum, the source data node ID, the target data block ID, the target data block checksum, the target data node ID, and a flag indicating whether the target data block is a newly created data block;
the distributed file system works with MapReduce: according to the number of directories to create and the number of small files to create under each directory, both supplied by the user, the MapReduce program creates in the distributed file system as many control files as there are directories, and these serve as the input files of the Map function;
the test Map function mainly uses the massive small-file storage system to create small files of the specified number and size, and uses the storage system's small-file read interface to read all the small files written under the created directories during the write test;
the write test and the read test use the same Reduce function, which aggregates the output of each Map, including the total size of the test files, their total number, and the total running time of the MapReduce program, and stores these figures in the distributed file system;
the result analysis function reads the statistics produced by Reduce from the distributed file system and uses a given formula to calculate the speed at which the distributed file system or the massive small-file storage system writes and reads small files;
after each MapReduce program finishes, the memory usage of the massive small-file storage system and of the distributed file system must be recorded;
the metadata cluster authorizes users to access the data cache server cluster and quickly establishes connections between clients and the data server; it monitors the state of every data cache server in real time and, based on that status information, assigns each user the data cache server able to provide the best service, while a cache consistency strategy ensures the consistency and stability of user data in the data cache server cluster; the metadata cluster is controlled by the virtual server, is responsible for exchanging data with clients, monitors the state of each user's data in real time, and submits status information to the virtual server, so that the control server can trace the state of user data; a heartbeat connection is established with the control server, over which the factors that affect service quality, namely available network bandwidth and CPU utilization, are passed to the virtual server.
2. a kind of data synchronization unit based on distributed file system, it is characterised in that:
Corresponding device includes data server, physical server, virtual server, database;Wherein,
Data server carries out communication interaction using two server operating mode and physical server and virtual server simultaneously
In the process, monitor the working condition of the physical server and the virtual server, wherein the communication interaction include: with
The physical server carries out signal interaction, carries out data interaction simultaneously with the physical server and the virtual server;
Main memory cluster and metadata cluster based on database are established, the internal storage data of distributed file system is stored to memory collection
Group, meanwhile, the metadata of metadata cluster distributed storage file system carries out processing operation;
If the data server determines that the physical server has failed and the virtual server is working normally, it sends a single-server operating mode switching command to the virtual server;
The transition from the physical server to the virtual server is automated: virtual machine resources are managed in tiers defined according to the application's logical architecture, computing resources are allocated intelligently, and resources are expanded dynamically online;
The data server receives the first confirmation response returned by the virtual server and, using the single-server operating mode, continues the signal interaction and the data interaction with the virtual server;
The data node of the virtual server creates a temporary data block, writes data to the temporary data block according to the received difference signal, and replaces the file content in the original data node with the content of the temporary data block, thereby synchronizing the data;
The simultaneous communication interaction with the physical server and the virtual server further comprises:
Establishing a heartbeat connection with the virtual server, so that the virtual server is always aware of its current state, including load and update status, while waiting for the virtual server to send task requests; after receiving a client connection request sent by the virtual server, establishing a connection with the client, listening for the client's operation requests, and responding promptly; after an update from the client has been submitted to the current physical server, the data cache server synchronizes the update to the physical server and, once this is complete, submits an update message to the virtual server, which directs the other data cache servers to synchronize with the physical server;
The synchronization control node among the data nodes obtains a copy list from the metadata node of the source file system according to the source path entered by the client, creates a thread pool, and assigns source files to each thread according to the copy list; the copy list is the list of all source files under the source path and includes the file name, size, and file path of each source file;
Each thread of the synchronization control node obtains the metadata of its assigned source file from the metadata node of the source file system and, according to that metadata, obtains from the corresponding source data nodes the check code of each data block that the source file contains;
Each thread of the synchronization control node obtains the metadata of the target file corresponding to each source file from the metadata node of the target file system, compares the sizes of the source file and the target file, and, according to the comparison result, applies to the metadata node of the target file system to create or delete data blocks of the target file, so that the size of the target file matches that of the source file;
Each thread of the synchronization control node reacquires the metadata of each target file from the metadata node of the target file system and, according to that metadata, obtains from the corresponding target data nodes the check codes of all data blocks that each target file contains;
Each thread of the synchronization control node generates a file check code list from the metadata of its source and target files and the check codes of all source and target data blocks; the file check code list includes: the sequence number of the data block, the source data block ID, the source data block check code, the source data node ID, the target data block ID, the target data block check code, the target data node ID, and a flag indicating whether the target data block is a newly created data block;
The distributed file system works using MapReduce threads; according to the number of directories to create and the number of small files to create under each directory, both provided by the user, the MapReduce program creates in the distributed file system as many control files as there are directories, to serve as the input files of the Map function;
The Map function of the test mainly uses the massive small-file storage system to create small files of the specified number and size, and uses the small-file read interface of the massive small-file storage system to read all the small files written under the created directories during the write test;
The write test and the read test use the same Reduce function, which aggregates the output data of each Map, including the total size, total number, and total running time of the files tested by the MapReduce program, and stores these data in the distributed file system;
The result analysis function reads the result data aggregated by Reduce from the distributed file system and, using a given formula, calculates the speed at which the distributed file system or the massive small-file storage system writes and reads small files;
After each MapReduce program finishes, the memory occupancy of the massive small-file storage system and of the distributed file system within the system must be recorded;
The metadata cluster authorizes users to access the data cache server cluster and quickly establishes connections between clients and data servers; it monitors the state of each data cache server in real time and, according to the status information, assigns each user the data cache server able to provide the best service, while using a cache consistency policy to ensure the consistency and stability of user data within the data cache server cluster; the metadata cluster is controlled by the virtual server and is responsible for exchanging data with clients; it monitors the state of each user's data in real time and submits the status information to the virtual server, so that the control server can trace the state of user data; it establishes a heartbeat connection with the control server and passes the factors that affect service quality, such as its own available network bandwidth and CPU utilization, to the virtual server.
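The file check code list recited in claim 2 can be illustrated with a small Python sketch. It is a simplification under stated assumptions: files are in-memory byte strings, CRC32 stands in for the unspecified check code, block and node IDs are omitted, and the final step (selecting blocks whose check codes differ or whose target block is newly created) is an assumed use of the list rather than something the claim text above spells out. The names `CheckEntry`, `build_check_list`, and `blocks_to_copy` are hypothetical.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


@dataclass
class CheckEntry:
    """One row of a simplified file check code list (hypothetical layout)."""
    index: int                   # sequence number of the data block
    source_check: int            # check code of the source block
    target_check: Optional[int]  # check code of the target block, if it existed
    target_is_new: bool          # True if the target block had to be created


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Split a byte string into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def build_check_list(source: bytes, target: bytes,
                     block_size: int = BLOCK_SIZE) -> List[CheckEntry]:
    """Pair source and target blocks by sequence number and record check codes.

    Target blocks beyond the source length are ignored, mirroring the
    earlier step that trims the target file to the source file's size.
    """
    src_blocks = split_blocks(source, block_size)
    tgt_blocks = split_blocks(target, block_size)
    entries = []
    for i, src in enumerate(src_blocks):
        if i < len(tgt_blocks):
            entries.append(CheckEntry(i, zlib.crc32(src),
                                      zlib.crc32(tgt_blocks[i]), False))
        else:
            entries.append(CheckEntry(i, zlib.crc32(src), None, True))
    return entries


def blocks_to_copy(entries: List[CheckEntry]) -> List[int]:
    """Return sequence numbers of blocks that are new or whose codes differ."""
    return [e.index for e in entries
            if e.target_is_new or e.source_check != e.target_check]


# Usage: block 1 differs and block 3 does not yet exist on the target.
entries = build_check_list(b"aaaabbbbccccdd", b"aaaaXXXXcccc")
print(blocks_to_copy(entries))  # prints [1, 3]
```

Comparing per-block check codes instead of whole files means only the changed or missing blocks need to be transferred between data nodes.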
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811096362.XA CN109299056B (en) | 2018-09-19 | 2018-09-19 | A kind of method of data synchronization and device based on distributed file system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811096362.XA CN109299056B (en) | 2018-09-19 | 2018-09-19 | A kind of method of data synchronization and device based on distributed file system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109299056A (en) | 2019-02-01 |
CN109299056B (en) | 2019-10-01 |
Family
ID=65163433
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811096362.XA Expired - Fee Related CN109299056B (en) | 2018-09-19 | 2018-09-19 | A kind of method of data synchronization and device based on distributed file system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109299056B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112187916B (en) * | 2020-09-27 | 2023-12-05 | 中国银联股份有限公司 | Cross-system data synchronization method and device |
CN113672579B (en) * | 2021-08-29 | 2023-11-14 | 中盾创新数字科技(北京)有限公司 | File synchronization method based on webservice |
CN113901141B (en) * | 2021-10-11 | 2022-08-05 | 京信数据科技有限公司 | Distributed data synchronization method and system |
CN114048269B (en) * | 2022-01-12 | 2022-04-22 | 北京奥星贝斯科技有限公司 | Method and device for synchronously updating metadata in distributed database |
CN114328421B (en) * | 2022-03-17 | 2022-06-10 | 联想凌拓科技有限公司 | Metadata service architecture management method, computer system, electronic device and medium |
CN117312264B (en) * | 2023-12-01 | 2024-02-20 | 中孚信息股份有限公司 | File synchronization method, system, equipment and medium in virtual disk system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103268252A (en) * | 2013-05-12 | 2013-08-28 | 南京载玄信息科技有限公司 | Virtualization platform system based on distributed storage and achieving method thereof |
CN103761162A (en) * | 2014-01-11 | 2014-04-30 | 深圳清华大学研究院 | Data backup method of distributed file system |
CN104113587A (en) * | 2014-06-23 | 2014-10-22 | 华中科技大学 | Client metadata buffer optimization method of distributed file system |
Also Published As
Publication number | Publication date |
---|---|
CN109299056A (en) | 2019-02-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109299056B (en) | A kind of method of data synchronization and device based on distributed file system | |
US11789925B2 (en) | System and method for conditionally updating an item with attribute granularity | |
AU2017218964B2 (en) | Cloud-based distributed persistence and cache data model | |
US9460185B2 (en) | Storage device selection for database partition replicas | |
US9489443B1 (en) | Scheduling of splits and moves of database partitions | |
US11609697B2 (en) | System and method for providing a committed throughput level in a data store | |
CN107247778B (en) | System and method for implementing an extensible data storage service | |
US8572091B1 (en) | System and method for partitioning and indexing table data using a composite primary key | |
CN106775446B (en) | Distributed file system small file access method based on solid state disk acceleration | |
US20170249246A1 (en) | Deduplication and garbage collection across logical databases | |
CN108833503A (en) | A kind of Redis cluster method based on ZooKeeper | |
CN107800808A (en) | A kind of data-storage system based on Hadoop framework | |
CN104133882A (en) | HDFS (Hadoop Distributed File System)-based old file processing method | |
US11080207B2 (en) | Caching framework for big-data engines in the cloud | |
CN111984696B (en) | Novel database and method | |
CN107343021A (en) | A kind of Log Administration System based on big data applied in state's net cloud | |
US11250019B1 (en) | Eventually consistent replication in a time-series database | |
US11256719B1 (en) | Ingestion partition auto-scaling in a time-series database | |
US11263270B1 (en) | Heat balancing in a distributed time-series database | |
CN103501319A (en) | Low-delay distributed storage system for small files | |
US10909143B1 (en) | Shared pages for database copies | |
CN103595799A (en) | Method for achieving distributed shared data bank | |
CN103365987B (en) | Clustered database system and data processing method based on shared-disk framework | |
US11409771B1 (en) | Splitting partitions across clusters in a time-series database | |
US11366598B1 (en) | Dynamic lease assignments in a time-series database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20191001 Termination date: 20200919 |