CN109614383A - Data copy method, device, electronic equipment and storage medium - Google Patents
Data copy method, device, electronic equipment and storage medium
- Publication number: CN109614383A
- Application number: CN201811387721.7A
- Authority: CN (China)
- Prior art keywords: data node, source, data, file block, source file
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
Embodiments of the present invention provide a data copy method, a data copy device, an electronic device and a storage medium, and relate to the field of big data technology. The method comprises: obtaining the position of the source data node where a source file block is located and the position of the destination data node where a destination file block is located; judging whether the position of the source data node and the position of the destination data node belong to the same data node; when they belong to the same data node, copying the source file block by hard link; and when they do not belong to the same data node, copying the source file block to the destination file block by data copy. The technical solution of the embodiments can significantly improve copy efficiency and reduce the occupation of actual hard disk storage space.
Description
Technical field
The present invention relates to the field of big data technology, and in particular to a data copy method, a data copy device, an electronic device and a computer-readable storage medium.
Background
With the development of Internet technology, distributed file systems such as HDFS (Hadoop Distributed File System) are used more and more widely. In a distributed file system such as HDFS, files stored on data nodes often need to be replicated or copied.
In one technical solution, when files are copied inside a distributed system such as HDFS, the copy can be performed with the cp command or with the distcp command. The cp approach obtains the list of all files under the directory to be copied and then copies the file metadata and the file blocks. The distcp approach likewise first obtains the list of all files under the directory to be copied, then starts distributed map tasks according to the configured parameters and copies the files concurrently.
However, both the cp command and the distcp command perform actual file reads and writes: the source file must first be read and then written to the destination address. In a distributed system, reads and writes across the network may also occur. Because the copy speed of both schemes is limited by disk hardware, network cards and process concurrency, copying a large volume of data often takes several hours. In addition, because both schemes consume actual disk space after the copy, disk space utilization is extremely low for a distributed file system that holds a large amount of duplicate data.
Accordingly, it is desirable to provide a data copy method, a data copy device, an electronic device and a computer-readable storage medium that can solve one or more of the above problems.
It should be noted that the information disclosed in the above background section is only intended to enhance understanding of the background of the present invention, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
The purpose of the present invention is to provide a data copy method, a data copy device, an electronic device and a computer-readable storage medium, thereby overcoming, at least to some extent, the problem of long copy times caused by the limits of disk hardware, network cards and process concurrency, and the problem of low disk space utilization.
According to a first aspect of the embodiments of the present invention, a data copy method is provided, applied to a distributed system with multiple data nodes, comprising: obtaining the position of the source data node where a source file block is located and the position of the destination data node where a destination file block is located; judging whether the position of the source data node and the position of the destination data node belong to the same data node; when they belong to the same data node, copying the source file block by hard link; and when they do not belong to the same data node, copying the source file block to the destination file block by data copy.
In some embodiments of the present invention, based on the foregoing scheme, the data copy method further comprises: when a data update request for the source file is received, determining the file block to be updated from the name node based on the data update request; judging whether a hard link exists for the file block to be updated; when a hard link exists, creating a temporary file block, copying the content of the source file block into it, and performing the data update operation on the temporary file block; and when no hard link exists, performing the data update operation directly on the file block to be updated.
In some embodiments of the present invention, based on the foregoing scheme, the data copy method further comprises: traversing the directory of the source file in the name node to obtain all source file block information of the source file; obtaining the position of the source data node from the source file block information, and creating the destination file block based on the source file block information and the position of the source data node; and generating a replication task based on the source file block, the position of the source data node and the destination file block.
In some embodiments of the present invention, based on the foregoing scheme, the data copy method further comprises: obtaining, from the replication task, the position of the source data node where the source file block is located; and determining the destination data node where the destination file block is located based on the position of the source data node.
In some embodiments of the present invention, based on the foregoing scheme, determining the destination data node where the destination file block is located based on the position of the source data node comprises: determining the position of the source data node as the destination data node of the destination file block.
In some embodiments of the present invention, based on the foregoing scheme, obtaining the position of the source data node where the source file block is located from the replication task comprises: obtaining, using multiple threads, the position of the source data node where the source file is located from the replication task.
In some embodiments of the present invention, based on the foregoing scheme, copying the source file block by hard link comprises: creating, on the source data node, a new link for the destination file block that points to the source file block.
According to a second aspect of the embodiments of the present invention, a data copy device is provided, applied to a distributed system with multiple data nodes, comprising: an information acquisition unit, configured to obtain the position of the source data node where a source file block is located and the position of the destination data node where a destination file block is located; a judging unit, configured to judge whether the position of the source data node and the position of the destination data node belong to the same data node; a local copy unit, configured to copy the source file block by hard link when they belong to the same data node; and a data copy unit, configured to copy the source file block to the destination file block by data copy when they do not belong to the same data node.
According to a third aspect of the embodiments of the present invention, an electronic device is provided, comprising: a processor; and a memory storing computer-readable instructions which, when executed by the processor, implement the data copy method described in the first aspect above.
According to a fourth aspect of the embodiments of the present invention, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, implements the data copy method described in the first aspect above.
In the technical solutions provided by some embodiments of the present invention, on the one hand, the position of the source data node of the source file block and the position of the destination data node of the destination file block are obtained, so that whether the source file block and the destination file block belong to the same data node can be judged from these two positions. On the other hand, when the source file block and the destination file block belong to the same data node, the source file is copied by hard link. Because a hard link does not copy the actual file, copy efficiency can be significantly improved, the occupation of actual hard disk storage space is reduced, and the utilization of hard disk storage space is improved.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory and do not limit the present invention.
Brief description of the drawings
The accompanying drawings are incorporated into and form part of this specification, show embodiments consistent with the present invention, and together with the specification serve to explain the principles of the present invention. Evidently, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
Fig. 1 shows a flow diagram of a data copy method according to some embodiments of the present invention;
Fig. 2 shows a flow diagram of a data copy method according to other embodiments of the present invention;
Fig. 3 shows a schematic block diagram of a data copy device according to an exemplary embodiment of the present invention;
Fig. 4 shows a structural schematic diagram of a computer system suitable for implementing the electronic device of the embodiments of the present invention.
Detailed description of the embodiments
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in a variety of forms and should not be understood as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present invention will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. Identical reference numerals in the drawings denote identical or similar parts, so their repeated description is omitted.
In addition, the described features, structures or characteristics can be combined in one or more embodiments in any suitable manner. In the following description, many specific details are provided to give a full understanding of the embodiments of the present invention. However, those skilled in the art will appreciate that the technical solutions of the present invention can be practiced without one or more of the specific details, or with other methods, components, devices, steps and so on. In other cases, well-known methods, devices, implementations or operations are not shown or described in detail, to avoid obscuring aspects of the present invention.
The block diagrams shown in the drawings are only functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities can be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow charts shown in the drawings are merely illustrative. They need not include all contents and operations/steps, nor must they be executed in the described order. For example, some operations/steps can be decomposed, and some can be merged or partially merged, so the order actually executed may change according to the actual situation.
Fig. 1 shows a flow diagram of a data copy method according to some embodiments of the present invention. Referring to Fig. 1, the data copy method may comprise the following steps:
Step S110: obtain the position of the source data node where the source file block is located and the position of the destination data node where the destination file block is located;
Step S120: judge whether the position of the source data node and the position of the destination data node belong to the same data node;
Step S130: when they belong to the same data node, copy the source file block by hard link;
Step S140: when they do not belong to the same data node, copy the source file block to the destination file block by data copy.
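The branch between steps S130 and S140 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `BlockLocation` and `copy_block` are hypothetical names that do not appear in the patent, both blocks are assumed to be reachable as local paths, and a local file copy stands in for the network transfer that a real cross-node copy would involve.

```python
import os
import shutil
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockLocation:
    """Hypothetical record: the data node holding a block and the block's path."""
    data_node: str
    path: str


def copy_block(src: BlockLocation, dst: BlockLocation) -> str:
    """Copy one block following steps S110-S140; return the strategy used."""
    # S120: compare the data node positions obtained in S110.
    if src.data_node == dst.data_node:
        # S130: same data node, so create a hard link instead of moving bytes.
        os.link(src.path, dst.path)
        return "hard_link"
    # S140: different data nodes, so perform a real data copy.
    # (Assumption: a local copy stands in for the cross-node transfer.)
    shutil.copyfile(src.path, dst.path)
    return "data_copy"
```

In the hard-link branch no block data is read or written, which is the source of the efficiency gain the method claims.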
According to the data copy method in the example embodiment of Fig. 1, on the one hand, the position of the source data node of the source file block and the position of the destination data node of the destination file block are obtained, so that whether the source file block and the destination file block belong to the same data node can be judged from these two positions. On the other hand, when the source file block and the destination file block belong to the same data node, the source file is copied by hard link. Because a hard link does not copy the actual file, copy efficiency can be significantly improved, the occupation of actual hard disk storage space is reduced, and the utilization of hard disk storage space is improved.
The data copy method in the example embodiment of Fig. 1 is described in detail below.
In step S110, the position of the source data node where the source file block is located and the position of the destination data node where the destination file block is located are obtained.
In the exemplary embodiment, all file block information of the source file to be copied can be queried from the name node (NameNode), and the location information of the data nodes where these source file blocks reside can be obtained. In addition, the position of the destination data node, that is, the data node where the destination file resides, can also be obtained from the name node. The name node (NameNode) is the manager of the distributed file system and is responsible for managing the file system namespace, the replication of data blocks, and so on. The data node (DataNode) is the basic unit of file storage and stores the content of files in the distributed file system in the form of data blocks, also called file blocks.
In step S120, whether the position of the source data node and the position of the destination data node belong to the same data node is judged.
In the exemplary embodiment, it is judged whether the position of the source data node, where the source file block resides, and the position of the destination data node, where the destination file block resides, belong to the same data node.
In step S130, when they belong to the same data node, the source file block is copied by hard link.
In the exemplary embodiment, when the position of the source data node, where the source file block resides, and the position of the destination data node, where the destination file block resides, are determined to belong to the same data node, the source file block is copied by hard link, that is, by local copy: using the hard link mechanism of the Linux system, a link pointing to the source file block is created.
A hard link is equivalent to an alias of a file block. It points to the file's inode (index node) rather than to a file path, as a soft link does. A modification made through a hard link therefore affects the actual file it points to. When a hard link is deleted, the underlying file is deleted only if the inode it points to is not currently referenced by any other hard link; otherwise the underlying file is not deleted.
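The hard link behavior described above can be observed directly with Python's standard library on a Linux file system. The file names are made up for illustration; `os.link` creates the hard link, and `os.stat` exposes the inode number (`st_ino`) and the link count (`st_nlink`).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "blk_original")  # illustrative names only
    alias = os.path.join(tmp, "blk_alias")
    with open(original, "w") as f:
        f.write("v1")

    os.link(original, alias)  # create the hard link (a second name)

    # Both names refer to one inode, whose link count is now 2.
    same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
    link_count = os.stat(original).st_nlink

    # A write through the alias is visible through the original name.
    with open(alias, "a") as f:
        f.write("+v2")
    with open(original) as f:
        seen_via_original = f.read()

    # Removing one name leaves the data reachable through the other.
    os.remove(original)
    with open(alias) as f:
        survives = f.read()
```

The data is released only when the last name referring to the inode is removed, which matches the deletion rule stated in the text.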
In step S140, when they do not belong to the same data node, the source file block is copied to the destination file block by data copy.
In the exemplary embodiment, when the position of the source data node, where the source file block resides, and the position of the destination data node, where the destination file block resides, are determined not to belong to the same data node, the source file block is copied to the destination file block by data copy. Data copy requires the data of the source block to be written to another data node, so actual data reads and writes take place.
In addition, in the exemplary embodiment, the directory of the source file in the name node is traversed to obtain all source file block information of the source file; the position of the source data node of each source file block, that is, the data node to which the block belongs, is obtained from the source file block information, and the destination file block is created based on the source file block information and the position of the source data node; a replication task (copy task) is generated based on the source file block, the position of the source data node and the destination file block. When the destination file block is created, the destination file block information of the name node is generated. To prevent the copy from failing because a node has insufficient usable space, information on the data node that is preferred for the copy needs to be generated. This data node information is consistent with the data node information of the data blocks of the source file; when that data node has insufficient space, the data can be copied to another data node.
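The task generation described above can be sketched as follows. Everything here is hypothetical scaffolding: the patent does not specify the shape of the name node metadata, so an in-memory dictionary from file path to block list stands in for the namespace, and `SourceBlock`, `ReplicationTask` and `build_tasks` are invented names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SourceBlock:
    block_id: str
    data_node: str  # position of the source data node


@dataclass(frozen=True)
class ReplicationTask:
    source_block: str
    source_node: str
    dest_block: str
    preferred_node: str  # data node preferred for the copy


def build_tasks(namespace: Dict[str, List[SourceBlock]],
                src_dir: str) -> List[ReplicationTask]:
    """Traverse the files under src_dir and emit one task per source block."""
    prefix = src_dir.rstrip("/") + "/"
    tasks = []
    for path in sorted(namespace):
        if not path.startswith(prefix):
            continue
        for block in namespace[path]:
            tasks.append(ReplicationTask(
                source_block=block.block_id,
                source_node=block.data_node,
                dest_block=f"{block.block_id}_copy",  # assumed naming scheme
                # Prefer the source's own node so a hard link is possible.
                preferred_node=block.data_node,
            ))
    return tasks
```

Setting the preferred node equal to the source node is what later lets the copy take the hard-link path; falling back to another node when space runs out is left out of this sketch.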
Further, in the exemplary embodiment, the information of the source file blocks and destination file blocks in the replication tasks can be read using multiple threads, and the name node is contacted to obtain the positions of the data nodes where the source file blocks reside.
In addition, in the exemplary embodiment, because hard links are used, the source file block and the destination file block point to the same data block. Updating or modifying the actual file block or data block would therefore change the content of both the source file block and the destination file block, so updates need special handling. Specifically, when a data update request for the source file, such as an append operation, is received, the file block to be updated is determined from the name node based on the data update request; whether a hard link exists for the file block to be updated is judged; when a hard link exists, a temporary file block is created, the content of the source file block is copied into it, and the data update operation is performed on the temporary file block; when no hard link exists, the data update operation is performed directly on the file block to be updated.
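The update handling above resembles copy-on-write and can be sketched on a local file system. Two assumptions are made that the patent does not state: the presence of a hard link is detected through the inode link count (`st_nlink > 1`), and after the update the temporary block replaces the original name via `os.replace`. `append_to_block` is a hypothetical helper.

```python
import os
import shutil


def append_to_block(path: str, data: bytes) -> bool:
    """Append data to a block; return True if a hard link had to be broken."""
    # Assumption: a link count above 1 means the block is shared by hard link.
    shared = os.stat(path).st_nlink > 1
    if shared:
        tmp = path + ".tmp"
        shutil.copyfile(path, tmp)   # temporary block with the source content
        with open(tmp, "ab") as f:
            f.write(data)            # update only the temporary block
        os.replace(tmp, path)        # assumption: swap it in under this name
    else:
        with open(path, "ab") as f:
            f.write(data)            # no hard link: update in place
    return shared
```

After the swap, the other hard-linked name still refers to the unmodified data, so the copy made earlier is not affected by the update.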
Fig. 2 shows a flow diagram of a data copy method according to other embodiments of the present invention.
Referring to Fig. 2, in step S210, a data copy request sent by a client is received. For example, the client initiates a copy request, communicates with the data node where the source file block resides, and sends the data copy request to the data node where the source file resides.
Further, in the exemplary embodiment, the directory of the source file in the name node is traversed to obtain all source file block information of the source file; the position of the source data node of each source file block, that is, the data node to which the block belongs, is obtained from the source file block information, and the destination file block is created based on the source file block information and the position of the source data node; a replication task (copy task) is generated based on the source file block, the position of the source data node and the destination file block. When the destination file block is created, the destination file block information of the name node is generated. To prevent the copy from failing because a node has insufficient usable space, information on the data node that is preferred for the copy needs to be generated. This data node information is consistent with the data node information of the data blocks of the source file; when that data node has insufficient space, the data can be copied to another data node.
In step S220, the data copy request is read using multiple threads, the source file block and the destination file block are obtained from the data copy request, and the name node is contacted to obtain the position of the data node where the source file block resides and the position of the data node where the destination file block resides.
In step S230, when the position of the source data node, where the source file block resides, and the position of the destination data node, where the destination file block resides, belong to the same data node, namely data node 1, the source file block is copied by hard link, that is, by local copy: using the hard link mechanism of the Linux system, a link pointing to the source file block is created. Creating a link is essentially a millisecond-level operation, so copy efficiency can be significantly improved. Further, the information of the hard-linked block can be sent to the name node.
In step S240, when the position of the source data node, where the source file block resides, and the position of the destination data node, where the destination file block resides, do not belong to the same data node, that is, when the source file block belongs to data node 1 and the destination file block belongs to data node 2, the source file block is copied by data copy from data node 1 to the destination file block on data node 2. Data copy requires the data of the source block to be written to another data node, so actual data reads and writes take place. This data copy works in the same way as a copy made with cp or distcp.
In the exemplary embodiment, because data is produced by continuously iterating on the original version, a new data version, when produced, directly overwrites part of the files, and no update operation on file blocks occurs. When files are copied with the native cp or distcp command, copying one data version generally takes several hours and also doubles the space used. By contrast, after the technical solution of the example embodiments of the present invention is used for data copy, the version iteration cycle of the data is greatly shortened: copying one data version takes only a few minutes or even a few seconds, and the actual space occupied by the data does not increase, which significantly saves labor and time costs and reduces hardware costs.
In addition, an embodiment of the present invention further provides a data copy device, which can be applied to a distributed system with multiple data nodes. Referring to Fig. 3, the data copy device 300 may include: an information acquisition unit 310, a judging unit 320, a local copy unit 330 and a data copy unit 340. The information acquisition unit 310 is configured to obtain the position of the source data node where the source file block is located and the position of the destination data node where the destination file block is located; the judging unit 320 is configured to judge whether the position of the source data node and the position of the destination data node belong to the same data node; the local copy unit 330 is configured to copy the source file block by hard link when they belong to the same data node; and the data copy unit 340 is configured to copy the source file block to the destination file block by data copy when they do not belong to the same data node.
In some embodiments of the present invention, based on the foregoing scheme, the data copy device 300 further includes: a determination unit, configured to determine, when a data update request for the source file is received, the file block to be updated from the name node based on the data update request; a hard link judging unit, configured to judge whether a hard link exists for the file block to be updated; a first update unit, configured to create a temporary file block when a hard link exists, copy the content of the source file block into it, and perform the data update operation on the temporary file block; and a second update unit, configured to perform the data update operation directly on the file block to be updated when no hard link exists.
In some embodiments of the present invention, based on the foregoing scheme, the data copy device 300 further includes: a source file block information acquisition unit, configured to traverse the directory of the source file in the name node and obtain all source file block information of the source file; a destination file block creation unit, configured to obtain the position of the source data node from the source file block information and create the destination file block based on the source file block information and the position of the source data node; and a replication task generation unit, configured to generate a replication task based on the source file block, the position of the source data node and the destination file block.
In some embodiments of the present invention, based on the foregoing scheme, the data copy device 300 further includes: a position acquisition unit, configured to obtain, from the replication task, the position of the source data node where the source file block is located; and a node determination unit, configured to determine the destination data node where the destination file block is located based on the position of the source data node.
In some embodiments of the present invention, based on the foregoing scheme, the node determination unit is configured to determine the position of the source data node as the destination data node of the destination file block.
In some embodiments of the present invention, based on the foregoing scheme, the position acquisition unit is configured to obtain, using multiple threads, the position of the source data node where the source file is located from the replication task.
In some embodiments of the present invention, based on the foregoing scheme, the local copy unit 330 is configured to create, on the source data node, a new link for the destination file block that points to the source file block.
Because each functional module of the data copy device 300 of the example embodiments of the present invention corresponds to a step of the example embodiments of the data copy method described above, the details are not repeated here.
An exemplary embodiment of the present invention further provides an electronic device capable of implementing the above method.
Referring now to Fig. 4, a structural schematic diagram is shown of a computer system 400 suitable for implementing the electronic device of the embodiments of the present invention. The computer system 400 of the electronic device shown in Fig. 4 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in Fig. 4, the computer system 400 includes a central processing unit (CPU) 401, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage section 408 into a random access memory (RAM) 403. The RAM 403 also stores various programs and data required for system operation. The CPU 401, the ROM 402 and the RAM 403 are connected to one another by a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse and the like; an output section 407 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker and the like; a storage section 408 including a hard disk and the like; and a communication section 409 including a network interface card such as a LAN card or a modem. The communication section 409 performs communication processing via a network such as the Internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411, such as a magnetic disk, an optical disc, a magneto-optical disc or a semiconductor memory, is mounted on the drive 410 as needed, so that a computer program read from it can be installed into the storage section 408 as needed.
In particular, according to embodiments of the present invention, the process described above with reference to the flow chart may be implemented as a computer software program. For example, embodiments of the present invention include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 409, and/or installed from the removable medium 411. When the computer program is executed by the central processing unit (CPU) 401, the above-described functions defined in the system of the present application are performed.
It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by, or in connection with, an instruction execution system, apparatus or device. In the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device. The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that shown in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams or flowcharts, and any combination of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented in software or in hardware, and the described units may also be provided in a processor. In some cases, the names of these units do not constitute a limitation on the units themselves.
As another aspect, the present application also provides a computer-readable medium. The computer-readable medium may be included in the electronic device described in the above embodiments, or it may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the data copy method described in the above embodiments.
For example, the electronic device may implement the steps shown in Figure 1: step S110, obtaining the position of the source data node where the source file block is located and the position of the destination data node where the destination file block is located; step S120, determining whether the position of the source data node and the position of the destination data node belong to the same data node; step S130, when they are determined to belong to the same data node, copying the source file block by means of a hard link; step S140, when they are determined not to belong to the same data node, copying the source file block to the destination file block by means of data copying.
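Steps S110 to S140 can be illustrated with a minimal sketch. It is not the implementation of the embodiments: the `BlockLocation` record, its `node_id` and `path` fields, and the `copy_block` function are hypothetical names introduced here, and a local file copy stands in for whatever transfer a real distributed system would perform between data nodes.

```python
import os
import shutil
from dataclasses import dataclass


@dataclass
class BlockLocation:
    """Hypothetical record: the data node holding a block and the block's local path."""
    node_id: str
    path: str


def copy_block(src: BlockLocation, dst: BlockLocation) -> str:
    """Copy one file block and report which strategy was used."""
    # S110 is assumed done by the caller: both locations are already known.
    # S120: compare the data nodes that hold the source and destination blocks.
    if src.node_id == dst.node_id:
        # S130: same data node, so create a second directory entry for the
        # existing on-disk data instead of duplicating the bytes.
        os.link(src.path, dst.path)
        return "hard_link"
    # S140: different data nodes, so fall back to copying the bytes.
    # Assumption: a local copy stands in for a network transfer.
    shutil.copyfile(src.path, dst.path)
    return "data_copy"
```

In this sketch the hard-link branch writes no block data at all, which is where the saving in copy time and disk space described for the method would come from.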
It should be noted that although several modules or units of the device for performing actions are mentioned in the above detailed description, this division is not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more of the modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by multiple modules or units.
From the above description of the embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present invention may be embodied in the form of a software product. The software product may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a portable hard disk) or on a network, and includes instructions that cause a computing device (such as a personal computer, a server, a touch terminal, or a network device) to execute the method according to the embodiments of the present invention.
Those skilled in the art, after considering the specification and practicing the invention disclosed here, will readily conceive of other embodiments of the present invention. This application is intended to cover any variations, uses, or adaptations of the present invention that follow its general principles and include common knowledge or conventional technical means in the art that are not disclosed by the present invention. The specification and examples are to be regarded as exemplary only, and the true scope and spirit of the present invention are indicated by the following claims.
It should be understood that the present invention is not limited to the precise structure described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
Claims (10)
1. A data copy method, applied to a distributed system having a plurality of data nodes, characterized by comprising:
obtaining the position of the source data node where a source file block is located and the position of the destination data node where a destination file block is located;
determining whether the position of the source data node and the position of the destination data node belong to the same data node;
when they are determined to belong to the same data node, copying the source file block by means of a hard link;
when they are determined not to belong to the same data node, copying the source file block to the destination file block by means of data copying.
2. The data copy method according to claim 1, characterized in that the data copy method further comprises:
upon receiving a data update request for the source file, determining the file block to be updated from the name node based on the data update request;
determining whether a hard link exists for the file block to be updated;
when a hard link is determined to exist, creating a temporary file block, copying the content of the source file block into it, and performing the data update operation on the temporary file block;
when no hard link is determined to exist, performing the data update operation directly on the file block to be updated.
3. The data copy method according to claim 1, characterized in that the data copy method further comprises:
traversing the directory of the source file in the name node to obtain all source file block information of the source file;
obtaining the position of the source data node from the source file block information, and creating the destination file block based on the source file block information and the position of the source data node;
generating a copy task based on the source file block, the position of the source data node, and the destination file block.
4. The data copy method according to claim 3, characterized in that the data copy method further comprises:
obtaining, from the copy task, the position of the source data node where the source file block is located;
determining the destination data node where the destination file block is located based on the position of the source data node.
5. The data copy method according to claim 4, characterized in that determining the destination data node where the destination file block is located based on the position of the source data node comprises:
determining the position of the source data node as the destination data node of the destination file block.
6. The data copy method according to claim 4, characterized in that obtaining, from the copy task, the position of the source data node where the source file block is located comprises:
obtaining, by means of multithreading, the position of the source data node where the source file block is located from the copy task.
7. The data copy method according to any one of claims 1 to 6, characterized in that copying the source file block by means of a hard link comprises:
creating, in the source data node, a new link for the destination file block that points to the source file block.
8. A data copy device, applied to a distributed system having a plurality of data nodes, characterized by comprising:
an information acquisition unit, configured to obtain the position of the source data node where a source file block is located and the position of the destination data node where a destination file block is located;
a judging unit, configured to determine whether the position of the source data node and the position of the destination data node belong to the same data node;
a local copy unit, configured to copy the source file block by means of a hard link when they are determined to belong to the same data node;
a data copy unit, configured to copy the source file block to the destination file block by means of data copying when they are determined not to belong to the same data node.
9. An electronic device, characterized by comprising: a processor; and a memory storing computer-readable instructions which, when executed by the processor, implement the data copy method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the data copy method according to any one of claims 1 to 7.
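The update handling recited in claim 2 can be sketched on a local file system as follows. This is an illustration under stated assumptions rather than the claimed implementation: `update_block` and the `.tmp` suffix are hypothetical, the update is modeled as an append, the link count reported by `os.stat` stands in for the hard-link check, and swapping the temporary block into place with `os.replace` is an assumed final step that the claim does not spell out.

```python
import os
import shutil


def update_block(path: str, new_data: bytes) -> str:
    """Append new_data to a block file without writing through a shared hard link."""
    # A link count above 1 means another directory entry shares this data.
    if os.stat(path).st_nlink > 1:
        tmp_path = path + ".tmp"  # hypothetical temporary block name
        # Copy the block content into a temporary block.
        shutil.copyfile(path, tmp_path)
        # Perform the update on the temporary block only.
        with open(tmp_path, "ab") as tmp:
            tmp.write(new_data)
        # Assumption: replace this directory entry with the temporary block,
        # so the other link keeps pointing at the unmodified data.
        os.replace(tmp_path, path)
        return "updated_via_temp"
    # No hard link exists: update the block directly.
    with open(path, "ab") as block:
        block.write(new_data)
    return "updated_in_place"
```

Updating through a temporary block keeps a hard-linked copy from changing when the block it shares data with is modified.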
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811387721.7A CN109614383B (en) | 2018-11-21 | 2018-11-21 | Data copying method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811387721.7A CN109614383B (en) | 2018-11-21 | 2018-11-21 | Data copying method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109614383A true CN109614383A (en) | 2019-04-12 |
CN109614383B CN109614383B (en) | 2021-01-15 |
Family
ID=66004675
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811387721.7A Active CN109614383B (en) | 2018-11-21 | 2018-11-21 | Data copying method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109614383B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112988697A (en) * | 2021-05-11 | 2021-06-18 | 北京华云安信息技术有限公司 | Target file copying method, device, equipment and computer readable storage medium |
CN115688187A (en) * | 2023-01-04 | 2023-02-03 | 中科方德软件有限公司 | Safety management method and device for hard link data, electronic equipment and computer readable storage medium |
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060256712A1 (en) * | 2003-02-21 | 2006-11-16 | Nippon Telegraph And Telephone Corporation | Device and method for correcting a path trouble in a communication network |
CN102170440A (en) * | 2011-03-24 | 2011-08-31 | 北京大学 | Method suitable for safely migrating data between storage clouds |
CN103685368A (en) * | 2012-09-10 | 2014-03-26 | 中国电信股份有限公司 | Method and system for migrating data |
CN104603774A (en) * | 2012-10-11 | 2015-05-06 | 株式会社日立制作所 | Migration-destination file server and file system migration method |
US20150347046A1 (en) * | 2012-12-14 | 2015-12-03 | Netapp, Inc. | Push-based piggyback system for source-driven logical replication in a storage environment |
US20160188232A1 (en) * | 2013-09-05 | 2016-06-30 | Nutanix, Inc. | Systems and methods for implementing stretch clusters in a virtualization environment |
CN103761162A (en) * | 2014-01-11 | 2014-04-30 | 深圳清华大学研究院 | Data backup method of distributed file system |
US20150199243A1 (en) * | 2014-01-11 | 2015-07-16 | Research Institute Of Tsinghua University In Shenzhen | Data backup method of distributed file system |
CN107239480A (en) * | 2016-03-28 | 2017-10-10 | 阿里巴巴集团控股有限公司 | The method and apparatus that renaming operation is performed for distributed file system |
CN108268542A (en) * | 2016-12-31 | 2018-07-10 | 中国移动通信集团河北有限公司 | For the method and system of data-base cluster Data Migration |
CN108845892A (en) * | 2018-04-19 | 2018-11-20 | 北京百度网讯科技有限公司 | Data processing method, device, equipment and the computer storage medium of distributed data base |
Non-Patent Citations (2)
Title |
---|
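WEIXIN_34327761: "Copying all files under multi-level folders at a specified source location to a specified target location", 《HTTPS://BLOG.CSDN.NET/WEIXIN_34327761/ARTICLE/DETAILS/85815584》 * |
Qiao Linfei: "Research on data consistency in distributed file systems in cloud computing environments", 《China Master's Theses Full-text Database, Information Science and Technology》 * |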
Also Published As
Publication number | Publication date |
---|---|
CN109614383B (en) | 2021-01-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |