CN106446159A - Method for storing files, first virtual machine and name node - Google Patents
Method for storing files, first virtual machine and name node
- Publication number
- CN106446159A (application CN201610846967.0A)
- Authority
- CN
- China
- Prior art keywords
- virtual machine
- data
- written
- memory area
- virtual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
Abstract
A method for storing files, a first virtual machine and a name node are provided to solve the problem of redundant file copies that arises when a distributed file system stores files, and to improve the availability of the distributed file system. The method includes: a client sends the name node a request message requesting to write data to be written to the distributed file system; the name node sends the client a response message corresponding to the request message, where the response message includes the address of a first virtual machine and the address of a second virtual machine, and indicates that the first virtual machine is the one virtual machine among multiple virtual machines that has permission to write data to a storage area, and that the second virtual machine is a virtual machine among them other than the first virtual machine; the client sends the data to be written and the address of the second virtual machine to the first virtual machine; the first virtual machine writes the data to be written into the storage area shared by the multiple virtual machines, and generates or updates metadata of the data to be written; and the first virtual machine sends the generated or updated metadata to the second virtual machine.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a method for storing files, a first virtual machine and a name node.
Background
A distributed file system includes a client, data nodes (datanode) and a name node (namenode). The data nodes store files, and the name node manages the files stored on the data nodes. Through the name node, the client can query the files stored on each data node and obtain the address of each data node, and can then read files from the data nodes or write files to them. A data node in a distributed file system may be a physical server or a virtual machine.
When a data node in the distributed file system is a virtual machine, the virtual hard disk of that virtual machine is provided by a distributed block storage system. Writing a file to the virtual machine actually means writing the file to its virtual hard disk, which in turn is implemented as writing the file to the physical hard disks managed by the distributed block storage system.
To ensure file reliability, a distributed file system may use a file replica mechanism when storing files on virtual hard disks, saving the same file on N virtual hard disks (N is an integer greater than 1). To ensure file reliability, the distributed block storage system may also use a file replica mechanism, saving a file held in one virtual hard disk on M physical hard disks (M is an integer greater than 1). Because both the distributed file system and the distributed block storage system use a file replica mechanism, the number of copies of a single file actually saved on physical hard disks can be N*M, which is redundant. These redundant copies waste storage space and degrade system performance.
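As a quick check of the arithmetic above, the following Python sketch computes how many physical copies of one file exist when both layers replicate. The function name and the sample values of N and M are illustrative assumptions, not values taken from the patent.

```python
def physical_copies(n_virtual_disks: int, m_physical_disks: int) -> int:
    """Copies of one file on physical disks when both layers replicate."""
    return n_virtual_disks * m_physical_disks


# Hypothetical replication factors, chosen only for illustration.
N, M = 3, 3
print(physical_copies(N, M))  # 9 physical copies of a single file

# If the file system layer keeps a single copy, only the block layer replicates.
print(physical_copies(1, M))  # 3
```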
In the prior art, two methods are generally used to address redundant file copies in a distributed file system. In the first method, a file to be stored is stored on only one virtual machine of the distributed file system. The file can then be accessed only through that virtual machine; if the virtual machine fails, file read and write service can be provided to the client again only after the virtual machine recovers, which reduces the availability of the distributed file system. The second method uses a virtual machine hot standby mechanism: a hot standby virtual machine is configured for a primary virtual machine, and files are written synchronously to both. When the primary virtual machine fails, the distributed file system switches to the hot standby virtual machine, which continues to provide file read and write service to the client. Switching to the hot standby virtual machine takes a certain waiting time, during which the distributed file system cannot provide file read and write service to the client, which reduces availability. In addition, the hot standby virtual machine provides no external service until it takes over as the primary virtual machine, which wastes resources.
In summary, existing methods for solving the problem of redundant file copies in a distributed file system reduce the availability of the distributed file system, and therefore do not solve the problem well.
Summary of the invention
Embodiments of the present invention provide a method for storing files, a first virtual machine and a name node, to solve the problem of redundant file copies that arises when a distributed file system stores files, and to improve system availability.
In a first aspect, an embodiment of the present invention provides a method for storing files in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. The method includes:
A first virtual machine receives data to be written and the address of a second virtual machine, both sent by a client. It then writes the received data to the storage area shared by the multiple virtual machines, and generates or updates metadata of the data to be written. Using the received address of the second virtual machine, the first virtual machine sends the generated or updated metadata to the second virtual machine.
Here, the first virtual machine is the one virtual machine among the multiple virtual machines that the name node designates as having permission to write data to the storage area, and the second virtual machine is a virtual machine among them other than the first virtual machine. The metadata of the data to be written includes but is not limited to: the storage location of the data to be written, the file name of the data to be written, and the file directory of the data to be written.
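The write path described above can be sketched in Python as follows. This is a minimal in-memory illustration, not the patented implementation: the class names, the dictionary used as the shared storage area, the `cluster` lookup table and the `vm1`/`vm2`/`vm3` addresses are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStorage:
    """Stands in for the storage area shared by all virtual machines."""
    blocks: dict = field(default_factory=dict)


@dataclass
class VirtualMachine:
    address: str
    storage: SharedStorage
    metadata: dict = field(default_factory=dict)

    def receive_metadata(self, meta: dict) -> None:
        # A second virtual machine records metadata sent by the first one.
        self.metadata[meta["name"]] = meta

    def handle_write(self, name, directory, data, peers, cluster):
        # Write once into the shared storage area.
        location = f"{directory}/{name}"
        self.storage.blocks[location] = data
        # Generate (or update) the metadata: location, file name, directory.
        meta = {"name": name, "directory": directory, "location": location}
        self.metadata[name] = meta
        # Send the metadata to every second virtual machine by address.
        for addr in peers:
            cluster[addr].receive_metadata(meta)
        return meta


storage = SharedStorage()
cluster = {addr: VirtualMachine(addr, storage) for addr in ("vm1", "vm2", "vm3")}
meta = cluster["vm1"].handle_write("a.txt", "/logs", b"hello", ["vm2", "vm3"], cluster)
print(len(storage.blocks))                           # 1
print(cluster["vm2"].metadata["a.txt"]["location"])  # /logs/a.txt
```

Only the metadata is sent between virtual machines; the data itself is stored once in the shared area.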
With the above method, because the multiple virtual machines in the distributed file system share the same storage area, the data that the first virtual machine writes to the storage area is saved there only once. Multiple copies of the data exist only because of the file replica mechanism used by the distributed block storage system; the redundant copies that result when both the distributed file system and the distributed block storage system use a file replica mechanism do not arise.
In addition, with this solution, the first virtual machine among the multiple virtual machines has permission to write data to the storage area, and the second virtual machine, that is, a virtual machine among them other than the first virtual machine, has permission to read the data to be written from the storage area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When a virtual machine fails, other virtual machines can provide this service to the client, which improves the availability of the distributed file system and avoids the waste of resources incurred by the virtual machine hot standby mechanism in the prior art.
In a possible implementation, before the first virtual machine writes the data to be written to the storage area, the method further includes: the first virtual machine receives a write permission identifier of the first virtual machine sent by the client. The name node sends the write permission identifier to the client when the client requests the name node to write the data to be written to the distributed file system, and the identifier specifies that the first virtual machine has permission to write the data to be written to the storage area.
This solution provides a way for the client to indicate the first virtual machine's permission to the first virtual machine.
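One way to picture this step is a first virtual machine that writes only after it has received a write permission identifier. The sketch below assumes this; the class name, the `"WRITE"` identifier value and the policy of rejecting writes until an identifier arrives are illustrative assumptions, since the text does not specify how the identifier is encoded or enforced.

```python
class FirstVirtualMachine:
    def __init__(self, shared_storage: dict):
        self.shared_storage = shared_storage
        self.write_permission = None  # no identifier received yet

    def receive_write_permission(self, identifier: str) -> None:
        # The client forwards the identifier it obtained from the name node.
        self.write_permission = identifier

    def write(self, location: str, data: bytes) -> None:
        # Assumed policy: refuse to write until an identifier has arrived.
        if self.write_permission is None:
            raise PermissionError("no write permission identifier received")
        self.shared_storage[location] = data


storage = {}
vm = FirstVirtualMachine(storage)

try:
    vm.write("/logs/a.txt", b"hello")
except PermissionError:
    rejected = True
else:
    rejected = False

vm.receive_write_permission("WRITE")  # hypothetical identifier value
vm.write("/logs/a.txt", b"hello")
```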
The multiple virtual machines may share a storage area as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and this virtual hard disk includes the storage area shared by the multiple virtual machines.
The metadata of the data to be written that the first virtual machine sends to the second virtual machine serves one of the following two purposes:
First
If the second virtual machine reads the data to be written through its own operating system, the second virtual machine uses the metadata to generate or update the file information recorded in its operating system, and the operating system uses that file information to read the data to be written from the storage area.
Second
If the second virtual machine reads the data to be written directly, the second virtual machine uses the metadata to read the data to be written from the storage area.
With this solution, the second virtual machine can read the data to be written from the storage area shared by the multiple virtual machines, according to the metadata sent by the first virtual machine.
In a possible implementation, the second virtual machine may be designated by the name node as having permission to read the data to be written from the storage area.
In a second aspect, an embodiment of the present invention provides a method for storing files in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. The method includes:
After receiving a request message in which a client requests to write data to be written to the distributed file system, the name node sends the client a response message corresponding to the request message.
The response message that the name node sends to the client includes the address of a first virtual machine and the address of a second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage area, and that the second virtual machine is a virtual machine among them other than the first virtual machine.
With this solution, because the multiple virtual machines in the distributed file system share the same storage area, and the response message sent by the name node designates one of them, the first virtual machine, as having permission to write data to the shared storage area, data written to the shared storage area is saved there only once. Multiple copies of the data exist only because of the file replica mechanism used by the distributed block storage system; the redundant copies that result when both the distributed file system and the distributed block storage system use a file replica mechanism do not arise.
In addition, the response message indicates that the first virtual machine among the multiple virtual machines has permission to write data to the storage area, and that the second virtual machine, that is, a virtual machine among them other than the first virtual machine, has permission to read the data to be written from the storage area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When a virtual machine fails, other virtual machines can provide this service to the client, which improves the availability of the distributed file system and avoids the waste of resources incurred by the virtual machine hot standby mechanism in the prior art.
In a possible implementation, the response message also indicates that the second virtual machine has permission to read the data to be written from the storage area.
In a possible implementation, the name node may use either of the following two ways to indicate the permission of the first virtual machine and the permission of the second virtual machine to the client through the response message:
First way
The response message that the name node sends to the client also includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine, which indicate the permissions of the first and second virtual machines respectively. That is, the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage area, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage area.
Second way
In the response message that the name node sends to the client, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule, and this preset rule indicates the permissions of the two virtual machines. That is, the first virtual machine has permission to write the data to be written to the storage area, and the second virtual machine has permission to read the data to be written from the storage area.
These are two ways for the name node to indicate the permission of the first virtual machine and the permission of the second virtual machine to the client through the response message.
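The two ways of indicating permissions can be contrasted in a short Python sketch. The message layout is hypothetical: the dictionary keys, the `"write"`/`"read"` identifier values, and the preset rule "the first address in the list is the writer" are assumptions chosen for illustration, because the text does not fix a concrete encoding or rule.

```python
def response_with_identifiers(first_vm, second_vms):
    """First way: explicit write/read permission identifiers per address."""
    permissions = {first_vm: "write", **{addr: "read" for addr in second_vms}}
    return {"addresses": [first_vm, *second_vms], "permissions": permissions}


def response_with_ordering(first_vm, second_vms):
    """Second way: the order of the addresses carries the permissions.

    Assumed preset rule: the first address is the writer, the rest are readers.
    """
    return {"addresses": [first_vm, *second_vms]}


def writer_of(response):
    """How a client could recover the first virtual machine from either form."""
    permissions = response.get("permissions")
    if permissions is not None:
        return next(addr for addr, perm in permissions.items() if perm == "write")
    return response["addresses"][0]


explicit = response_with_identifiers("vm1", ["vm2", "vm3"])
ordered = response_with_ordering("vm1", ["vm2", "vm3"])
print(writer_of(explicit), writer_of(ordered))  # vm1 vm1
```

The second way keeps the message smaller, but both sides must agree on the preset rule in advance.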
In a possible implementation, the multiple virtual machines may share a storage area as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage area.
Because multiple virtual machines in the distributed file system can serve the client's reads and writes of the data to be written, when a virtual machine fails, other virtual machines can provide this service to the client. In a specific implementation, the name node handles a virtual machine failure in one of the following two cases:
First case
When the first virtual machine fails, the name node sends first update information to the client. The first update information includes the address of an updated first virtual machine, and also designates another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine; that is, one of the second virtual machines is designated as the updated first virtual machine. The updated first virtual machine has permission to write data to the storage area.
Second case
When the second virtual machine fails, the name node sends second update information to the client. The second update information includes the address of an updated second virtual machine, and also designates another virtual machine, outside the multiple virtual machines, as the updated second virtual machine. The updated second virtual machine has permission to read the data to be written from the storage area.
With this solution, whether the first virtual machine or the second virtual machine fails, the name node designates another virtual machine to replace the failed one. The distributed file system can therefore still provide data read and write service to the client when the first virtual machine and/or the second virtual machine fails, which further improves the availability of the distributed file system.
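The two failure cases can be sketched as a small state update on the name node. This is an illustrative model only: the class, the update message fields, the choice of promoting the first listed second virtual machine, and the `spare_vms` pool standing in for virtual machines outside the original group are all assumptions.

```python
class NameNode:
    def __init__(self, first_vm, second_vms, spare_vms):
        self.first_vm = first_vm              # has write permission
        self.second_vms = list(second_vms)    # have read permission
        self.spare_vms = list(spare_vms)      # assumed pool outside the group

    def on_failure(self, failed):
        """Return the update information that would be sent to the client."""
        if failed == self.first_vm:
            # First case: promote one of the second virtual machines.
            # The sketch assumes at least one second virtual machine exists.
            self.first_vm = self.second_vms.pop(0)
            return {"type": "first_update", "first_vm": self.first_vm}
        if failed in self.second_vms:
            # Second case: bring in another virtual machine as a reader.
            # The sketch assumes at least one spare is available.
            self.second_vms.remove(failed)
            replacement = self.spare_vms.pop(0)
            self.second_vms.append(replacement)
            return {"type": "second_update", "second_vm": replacement}
        raise ValueError(f"unknown virtual machine: {failed}")


name_node = NameNode("vm1", ["vm2", "vm3"], ["vm4"])
first_update = name_node.on_failure("vm1")   # vm2 becomes the writer
second_update = name_node.on_failure("vm3")  # vm4 replaces vm3 as a reader
```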
In a third aspect, an embodiment of the present invention provides a method for storing files in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. The method includes:
A client sends the name node a request message requesting to write data to be written to the distributed file system, and then receives the response message corresponding to the request message sent by the name node.
The response message includes the address of a first virtual machine and the address of a second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage area, and that the second virtual machine is a virtual machine among them other than the first virtual machine.
According to the address of the first virtual machine included in the response message, the client sends the data to be written and the address of the second virtual machine to the first virtual machine, and instructs the first virtual machine to: write the data to be written, generate or update the metadata of the data to be written, and send the metadata of the data to be written to the second virtual machine according to the address of the second virtual machine.
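From the client's side, the exchange above amounts to one request to the name node followed by one message to the first virtual machine. The sketch below stubs out the network: `name_node_handle`, `send`, the message fields and the IP addresses are hypothetical stand-ins, and the `sent` list simply records what the client would transmit.

```python
sent = []  # records (address, message) pairs the client would put on the wire


def name_node_handle(request):
    # Stub name node: returns a fixed, hypothetical assignment of addresses.
    return {"first_vm": "10.0.0.1", "second_vms": ["10.0.0.2", "10.0.0.3"]}


def send(address, message):
    # Stub transport used only for this illustration.
    sent.append((address, message))


def client_write(data: bytes):
    # Step 1: ask the name node where to write.
    response = name_node_handle({"op": "write"})
    # Step 2: send the data and the second virtual machines' addresses
    # to the first virtual machine named in the response.
    send(response["first_vm"], {"data": data, "second_vms": response["second_vms"]})
    return response


response = client_write(b"payload")
```

The client never contacts the second virtual machines for the write itself; it only passes their addresses along so the first virtual machine can forward the metadata.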
With this solution, because the multiple virtual machines in the distributed file system share the same storage area, the data to be written that the client instructs the first virtual machine to write to the shared storage area is stored there only once. Multiple copies of the data exist only because of the file replica mechanism used by the distributed block storage system; the redundant copies that result when both the distributed file system and the distributed block storage system use a file replica mechanism do not arise.
In addition, the first virtual machine among the multiple virtual machines has permission to write data to the storage area, and the second virtual machine, that is, a virtual machine among them other than the first virtual machine, has permission to read the data to be written from the storage area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When a virtual machine fails, other virtual machines can provide this service to the client, which improves the availability of the distributed file system and avoids the waste of resources incurred by the virtual machine hot standby mechanism in the prior art.
In a possible implementation, the client may learn the permission of the first virtual machine and the permission of the second virtual machine from the response message sent by the name node in either of the following two ways:
First way
The response message that the client receives also includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine, which indicate the permissions of the first and second virtual machines respectively. That is, the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage area, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage area.
Second way
In the response message that the client receives, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule, and this preset rule indicates the permissions of the two virtual machines. That is, the first virtual machine has permission to write the data to be written to the storage area, and the second virtual machine has permission to read the data to be written from the storage area.
These are two ways for the client to learn the permission of the first virtual machine and the permission of the second virtual machine from the received response message.
In a possible implementation, the multiple virtual machines may share a storage area as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage area.
In a possible implementation, the response message also indicates that the second virtual machine has permission to read the data to be written from the storage area.
In a fourth aspect, an embodiment of the present invention provides a method for storing files in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. The method includes:
A second virtual machine receives metadata sent by a first virtual machine. Here, the first virtual machine is the one virtual machine among the multiple virtual machines designated by the name node as having permission to write data to the storage area, the second virtual machine is a virtual machine among them other than the first virtual machine, and the metadata is the metadata of the data to be written that the first virtual machine generates or updates after writing the data to be written to the storage area.
With this solution, because the multiple virtual machines in the distributed file system share the same storage area, and one of them, the first virtual machine, has permission to write data to the shared storage area, data written to the shared storage area is saved there only once. Multiple copies of the data exist only because of the file replica mechanism used by the distributed block storage system; the redundant copies that result when both the distributed file system and the distributed block storage system use a file replica mechanism do not arise.
In addition, the first virtual machine among the multiple virtual machines has permission to write data to the storage area, and the second virtual machine, that is, a virtual machine among them other than the first virtual machine, has permission to read the data to be written from the storage area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When a virtual machine fails, other virtual machines can provide this service to the client, which improves the availability of the distributed file system and avoids the waste of resources incurred by the virtual machine hot standby mechanism in the prior art.
In a possible implementation, the second virtual machine may learn its own permission as follows: before receiving the metadata sent by the first virtual machine, the second virtual machine receives a read permission identifier of the second virtual machine sent by the client. The name node sends the read permission identifier to the client when the client requests the name node to write the data to be written to the distributed file system, and the identifier specifies that the second virtual machine has permission to read the data to be written from the storage area.
In a possible implementation, the multiple virtual machines may share a storage area as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage area.
After receiving the metadata sent by the first virtual machine, the second virtual machine can read the data to be written from the storage area shared by the multiple virtual machines according to the received metadata, in either of the following two ways:
First
If the second virtual machine reads the data to be written through its own operating system, the second virtual machine generates or updates, according to the metadata, the file information recorded in its operating system, and the operating system can use this file information to read the data to be written from the storage area.
Second
If the second virtual machine reads the data to be written directly, the second virtual machine reads the data to be written from the storage area according to the metadata.
With this solution, the second virtual machine can read the data to be written from the storage area shared by the multiple virtual machines, according to the metadata sent by the first virtual machine.
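The two read paths can be illustrated with a small in-memory model. Everything here is an assumption made for the example: the dictionary standing in for the shared storage area, the `os_file_table` standing in for file information recorded in the guest operating system, and the interpretation of the second path as a lookup that bypasses that table.

```python
# Shared storage area as left by an earlier write from the first virtual machine.
shared_storage = {"/logs/a.txt": b"hello"}
metadata = {"name": "a.txt", "directory": "/logs", "location": "/logs/a.txt"}


class SecondVirtualMachine:
    def __init__(self, storage: dict):
        self.storage = storage
        self.os_file_table = {}  # first way: file information kept by the OS
        self.metadata = {}       # second way: metadata kept by the data node itself

    def on_metadata(self, meta: dict, via_os: bool) -> None:
        if via_os:
            # Generate or update the OS-level file information from the metadata.
            path = f'{meta["directory"]}/{meta["name"]}'
            self.os_file_table[path] = meta["location"]
        else:
            self.metadata[meta["name"]] = meta

    def read_via_os(self, path: str) -> bytes:
        # The operating system resolves the path through its file information.
        return self.storage[self.os_file_table[path]]

    def read_direct(self, name: str) -> bytes:
        # The data node resolves the location straight from the metadata.
        return self.storage[self.metadata[name]["location"]]


vm2 = SecondVirtualMachine(shared_storage)
vm2.on_metadata(metadata, via_os=True)
vm2.on_metadata(metadata, via_os=False)
print(vm2.read_via_os("/logs/a.txt"))  # b'hello'
print(vm2.read_direct("a.txt"))        # b'hello'
```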
In a possible implementation, the second virtual machine is designated by the name node as having permission to read the data to be written from the storage area.
In a fifth aspect, an embodiment of the present invention provides a first virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, the multiple virtual machines share the same storage area, and the first virtual machine is the one virtual machine among them designated by the name node as having permission to write data to the storage area. The first virtual machine includes:
A receiving module, configured to receive data to be written and the address of a second virtual machine sent by a client, where the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
A processing module, configured to write the data to be written received by the receiving module to the storage area, and to generate or update metadata of the data to be written;
A sending module, configured to send the metadata generated or updated by the processing module to the second virtual machine, according to the address of the second virtual machine received by the receiving module.
The metadata of the data to be written includes but is not limited to: the storage location of the data to be written, the file name of the data to be written, and the file directory of the data to be written.
With this solution, because the multiple virtual machines in the distributed file system share the same storage area, the data to be written that the processing module writes to the storage area is saved there only once. Multiple copies of the data exist only because of the file replica mechanism used by the distributed block storage system; the redundant copies that result when both the distributed file system and the distributed block storage system use a file replica mechanism do not arise.
In addition, with this solution, the first virtual machine among the multiple virtual machines has permission to write data to the storage area, and the second virtual machine, that is, a virtual machine among them other than the first virtual machine, has permission to read the data to be written from the storage area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When a virtual machine fails, other virtual machines can provide this service to the client, which improves the availability of the distributed file system and avoids the waste of resources incurred by the virtual machine hot standby mechanism in the prior art.
In a possible implementation, the receiving module is further configured to: before the processing module writes the data to be written to the storage area, receive a write permission identifier of the first virtual machine sent by the client. The name node sends the write permission identifier to the client when the client requests the name node to write the data to be written to the distributed file system, and the identifier specifies that the first virtual machine has permission to write the data to be written to the storage area.
This solution provides a way for the first virtual machine to learn its own permission from the client.
The multiple virtual machines may share a storage area as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage area.
The metadata of the data to be written that the sending module sends to the second virtual machine serves one of the following two purposes:
First
If the second virtual machine reads the data to be written through its own operating system, the second virtual machine uses the metadata to generate or update the file information recorded in its operating system, and the operating system uses that file information to read the data to be written from the storage area.
Second
If the second virtual machine reads the data to be written directly, the second virtual machine uses the metadata to read the data to be written from the storage area.
With this solution, the second virtual machine can read the data to be written from the storage area shared by the multiple virtual machines, according to the metadata sent by the sending module.
In a possible implementation, the second virtual machine is designated by the name node as having permission to read the data to be written from the storage area.
In a sixth aspect, an embodiment of the present invention provides a name node in a distributed file system. The distributed file system includes the name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same memory area. The name node includes:
a receiver module, configured to receive a request message in which a client requests to write data to be written to the distributed file system;
a sending module, configured to send to the client a response message corresponding to the request message received by the receiver module. The response message includes the address of the first virtual machine and the address of the second virtual machine. In addition, the response message indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has the permission to write data to the memory area, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
With this scheme, the multiple virtual machines included in the distributed file system share the same memory area, and the response message sent by the name node specifies that one of them, the first virtual machine, has the permission to write data to the shared memory area. Data written to the shared memory area is therefore stored only once in that memory area. Such data is stored as multiple copies only because of the file replica mechanism adopted by the distributed block storage system; the redundant file copies that result when both the distributed file system and the distributed block storage system adopt file replica mechanisms do not arise.
In addition, the response message indicates that the first virtual machine among the multiple virtual machines included in the distributed file system has the permission to write data to the memory area, and that the second virtual machine, a virtual machine other than the first virtual machine, has the permission to read the data to be written from the memory area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When one virtual machine fails, other virtual machines can provide this service, which improves the availability of the distributed file system and also avoids the resource waste incurred by the virtual machine hot standby mechanism in the prior art.
In one possible implementation, the response message also indicates that the second virtual machine has the permission to read the data to be written from the memory area.
In one possible implementation, the response message that the sending module sends to the client may indicate the permission of the first virtual machine and the permission of the second virtual machine in either of the following two manners:
First manner
The response message that the sending module sends to the client further includes the write permission identifier of the first virtual machine and the read permission identifier of the second virtual machine. The write permission identifier and the read permission identifier indicate the permission of the first virtual machine and the permission of the second virtual machine respectively, that is: the write permission identifier specifies that the first virtual machine has the permission to write the data to be written to the memory area, and the read permission identifier specifies that the second virtual machine has the permission to read the data to be written from the memory area.
Second manner
In the response message that the sending module sends to the client, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule. The preset rule indicates the permission of the first virtual machine and the permission of the second virtual machine, that is: the first virtual machine has the permission to write the data to be written to the memory area, and the second virtual machine has the permission to read the data to be written from the memory area.
These are the two manners in which the response message sent by the sending module can indicate to the client the permission of the first virtual machine and the permission of the second virtual machine.
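The two manners above can be sketched as follows. This is only an illustrative model: the source does not define a message format, so the class ResponseMessage, the labels "write" and "read", and the assumed preset rule (the first address in the list belongs to the first virtual machine) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical permission labels; the source specifies no wire format.
WRITE = "write"
READ = "read"


@dataclass
class ResponseMessage:
    # Addresses of the virtual machines, in the order chosen by the name node.
    addresses: List[str]
    # Used by the first manner only: explicit permission identifier per address.
    permission_ids: Dict[str, str] = field(default_factory=dict)


def build_with_identifiers(first_vm: str, second_vms: List[str]) -> ResponseMessage:
    """First manner: carry explicit write/read permission identifiers."""
    ids = {first_vm: WRITE}
    ids.update({addr: READ for addr in second_vms})
    return ResponseMessage(addresses=[first_vm, *second_vms], permission_ids=ids)


def build_with_ordering(first_vm: str, second_vms: List[str]) -> ResponseMessage:
    """Second manner: rely on an assumed preset rule (writer address first)."""
    return ResponseMessage(addresses=[first_vm, *second_vms])


def resolve_permissions(msg: ResponseMessage) -> Tuple[str, List[str]]:
    """Return (address of first VM, addresses of second VMs) as the client sees them."""
    if not msg.addresses:
        raise ValueError("response message carries no addresses")
    if msg.permission_ids:
        writers = [a for a in msg.addresses if msg.permission_ids.get(a) == WRITE]
        readers = [a for a in msg.addresses if msg.permission_ids.get(a) == READ]
        if len(writers) != 1:
            raise ValueError("exactly one virtual machine must hold write permission")
        return writers[0], readers
    # No identifiers present: fall back to the assumed ordering rule.
    return msg.addresses[0], msg.addresses[1:]
```

Either encoding lets the client recover the same roles; the ordering variant keeps the message smaller, while the identifier variant does not depend on both sides agreeing on an arrangement rule in advance.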
In one possible implementation, the multiple virtual machines may share one memory area as follows: the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the memory area.
Because multiple virtual machines in the distributed file system can serve the client's reads and writes of the data to be written, other virtual machines can provide this service when one virtual machine fails. In a specific implementation, the way the sending module handles a virtual machine failure covers the following two cases:
First case
When the first virtual machine fails, the sending module sends first update information to the client. The first update information includes the address of the updated first virtual machine and further specifies that another virtual machine among the multiple virtual machines, other than the failed first virtual machine, serves as the updated first virtual machine. The updated first virtual machine has the permission to write data to the memory area.
Second case
When the second virtual machine fails, the sending module sends second update information to the client. The second update information includes the address of the updated second virtual machine and further specifies that another virtual machine among the multiple virtual machines serves as the updated second virtual machine. The updated second virtual machine has the permission to read the data to be written from the memory area.
With this scheme, whether the first virtual machine or the second virtual machine fails, the sending module specifies another virtual machine to replace the failed one. Thus, when the first virtual machine and/or the second virtual machine fails, the distributed file system can still serve the client's reads and writes, which further improves the availability of the distributed file system.
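The two failure cases can be modelled with a small sketch. It rests on assumptions not stated in the source: the class name NameNodeFailover, the dictionary layout of the update information, and the selection policy (the first healthy virtual machine in configuration order) are all hypothetical.

```python
from typing import Dict, List, Optional, Set


class NameNodeFailover:
    """Toy model of how a name node might reassign roles after a failure."""

    def __init__(self, vms: List[str], first_vm: str, second_vms: List[str]):
        self.order = list(vms)          # all virtual machines, in configuration order
        self.healthy = set(vms)         # virtual machines currently believed healthy
        self.first_vm = first_vm        # holder of the write permission
        self.second_vms = list(second_vms)  # holders of the read permission

    def _pick(self, exclude: Set[str]) -> Optional[str]:
        # Assumed policy: take the first healthy VM that is not excluded.
        for vm in self.order:
            if vm in self.healthy and vm not in exclude:
                return vm
        return None

    def on_failure(self, failed: str) -> Optional[Dict[str, str]]:
        """Return the update information to send to the client, if any."""
        self.healthy.discard(failed)
        if failed == self.first_vm:
            # First case: another VM becomes the updated first virtual machine.
            new = self._pick(exclude={failed})
            if new is None:
                raise RuntimeError("no healthy virtual machine can take over writes")
            self.first_vm = new
            # The new writer is no longer counted among the second VMs.
            self.second_vms = [v for v in self.second_vms if v != new]
            return {"type": "first_update", "address": new, "permission": "write"}
        if failed in self.second_vms:
            # Second case: another VM becomes the updated second virtual machine.
            self.second_vms.remove(failed)
            new = self._pick(exclude={failed, self.first_vm, *self.second_vms})
            if new is None:
                return None  # no spare VM; remaining VMs keep serving
            self.second_vms.append(new)
            return {"type": "second_update", "address": new, "permission": "read"}
        return None
```

Because every virtual machine mounts the same memory area, promoting a replacement requires no data copy in this model; only the role assignment changes.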
In a seventh aspect, an embodiment of the present invention provides a client. The distributed file system in which the client is located includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same memory area. The client includes:
a sending module, configured to send to the name node a request message requesting to write data to be written to the distributed file system;
a receiver module, configured to receive the response message, corresponding to the request message, sent by the name node;
wherein the response message includes the address of the first virtual machine and the address of the second virtual machine, and further indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has the permission to write data to the memory area, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
the sending module is further configured to send, according to the address of the first virtual machine included in the response message received by the receiver module, the data to be written and the address of the second virtual machine to the first virtual machine, and to instruct the first virtual machine to: write the data to be written, generate or update the metadata of the data to be written, and send the metadata of the data to be written to the second virtual machine according to the address of the second virtual machine included in the response message received by the receiver module.
With this scheme, because the multiple virtual machines included in the distributed file system share the same memory area, the data to be written that the sending module instructs the first virtual machine to write to the shared memory area is stored only once in that memory area within the distributed file system. The data to be written is stored as multiple copies only because of the file replica mechanism adopted by the distributed block storage system; the redundant file copies that result when both the distributed file system and the distributed block storage system adopt file replica mechanisms do not arise.
In addition, the first virtual machine among the multiple virtual machines included in the distributed file system has the permission to write data to the memory area, and the second virtual machine, a virtual machine other than the first virtual machine, has the permission to read the data to be written from the memory area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When one virtual machine fails, other virtual machines can provide this service, which improves the availability of the distributed file system and also avoids the resource waste incurred by the virtual machine hot standby mechanism in the prior art.
In one possible implementation, the receiver module may learn the permission of the first virtual machine and the permission of the second virtual machine from the response message sent by the name node in either of the following two manners:
First manner
The response message received by the receiver module further includes the write permission identifier of the first virtual machine and the read permission identifier of the second virtual machine. The write permission identifier and the read permission identifier indicate the permission of the first virtual machine and the permission of the second virtual machine respectively, that is: the write permission identifier specifies that the first virtual machine has the permission to write the data to be written to the memory area, and the read permission identifier specifies that the second virtual machine has the permission to read the data to be written from the memory area.
Second manner
In the response message received by the receiver module, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule. The preset rule indicates the permission of the first virtual machine and the permission of the second virtual machine, that is: the first virtual machine has the permission to write the data to be written to the memory area, and the second virtual machine has the permission to read the data to be written from the memory area.
These are the two manners in which the receiver module can learn the permission of the first virtual machine and the permission of the second virtual machine from the received response message.
In one possible implementation, the multiple virtual machines may share one memory area as follows: the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the memory area.
In one possible implementation, the response message also indicates that the second virtual machine has the permission to read the data to be written from the memory area.
In an eighth aspect, an embodiment of the present invention provides a second virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same memory area. The second virtual machine includes:
a receiver module, configured to receive the metadata sent by the first virtual machine. The first virtual machine is the one virtual machine among the multiple virtual machines that the name node specifies as having the permission to write data to the memory area; the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine; and the metadata is the metadata of the data to be written that the first virtual machine generates or updates after writing the data to be written to the memory area.
With this scheme, the multiple virtual machines included in the distributed file system share the same memory area, and one of them, the first virtual machine, has the permission to write data to the shared memory area. Data written to the shared memory area is therefore stored only once in that memory area. Such data is stored as multiple copies only because of the file replica mechanism adopted by the distributed block storage system; the redundant file copies that result when both the distributed file system and the distributed block storage system adopt file replica mechanisms do not arise.
In addition, the first virtual machine among the multiple virtual machines included in the distributed file system has the permission to write data to the memory area, and the second virtual machine, a virtual machine other than the first virtual machine, has the permission to read the data to be written from the memory area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When one virtual machine fails, other virtual machines can provide this service, which improves the availability of the distributed file system and also avoids the resource waste incurred by the virtual machine hot standby mechanism in the prior art.
In one possible implementation, the receiver module may learn the permission of the second virtual machine as follows: before receiving the metadata sent by the first virtual machine, the receiver module receives the read permission identifier of the second virtual machine sent by the client. The read permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and it specifies that the second virtual machine has the permission to read the data to be written from the memory area.
In one possible implementation, the multiple virtual machines may share one memory area as follows: the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the memory area.
In one possible implementation, the second virtual machine further includes a processing module. After the receiver module receives the metadata sent by the first virtual machine, the processing module can read the data to be written in the memory area shared by the multiple virtual machines according to the received metadata, specifically in either of the following two ways:
The first
After the receiver module receives the metadata sent by the first virtual machine, if the second virtual machine reads the data to be written through its own operating system, the processing module generates or updates, according to the metadata, the file information recorded in its own operating system; the operating system uses this file information to read the data to be written from the memory area.
The second
If the second virtual machine reads the data to be written directly, the processing module reads the data to be written from the memory area according to the metadata.
With this scheme, the processing module can read the data to be written in the memory area shared by the multiple virtual machines according to the metadata of the data to be written sent by the first virtual machine.
In one possible implementation, the second virtual machine is specified by the name node as having the permission to read the data to be written from the memory area.
In a ninth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores computer-executable instructions. When at least one processor of a computing node executes the computer-executable instructions, the computing node performs the method provided by the first aspect or any possible design of the first aspect, or the method provided by the second aspect or any possible design of the second aspect, or the method provided by the third aspect or any possible design of the third aspect.
In a tenth aspect, a computer program product is provided. The computer program product includes computer-executable instructions stored in a computer-readable storage medium. At least one processor of a computing node can read the computer-executable instructions from the computer-readable storage medium, and execution of the instructions by the at least one processor causes the computing node to perform the method provided by the first aspect or any possible design of the first aspect, or the method provided by the second aspect or any possible design of the second aspect, or the method provided by the third aspect or any possible design of the third aspect.
Brief description of the drawings
Fig. 1 is a schematic diagram of the connection relationship among the name node, the client and multiple data nodes in a distributed file system provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the connection relationship between a distributed file system and a distributed block storage system provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a method for storing files in a distributed file system provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a distributed file system and a distributed block storage system that use the method for storing files shown in Fig. 3;
Fig. 5 is a schematic structural diagram of a first virtual machine provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of another first virtual machine provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a name node provided by an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of another name node provided by an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a client provided by an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of another client provided by an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of a second virtual machine provided by an embodiment of the present invention;
Fig. 12 is a schematic structural diagram of another second virtual machine provided by an embodiment of the present invention;
Fig. 13 is a schematic structural diagram of a distributed file system provided by an embodiment of the present invention.
Detailed description
A detailed description is provided below so that the above objects, solutions and advantages of the embodiments of the present invention can be better understood. The detailed description sets out various embodiments of devices and/or methods by means of drawings such as block diagrams and flowcharts and/or examples. These block diagrams, flowcharts and/or examples contain one or more functions and/or operations. Those skilled in the art will understand that each function and/or operation in these block diagrams, flowcharts or examples can be implemented separately or jointly by various kinds of hardware, software and firmware, or by any combination of hardware, software and firmware.
The embodiments of the present invention relate to distributed file systems, which are described in detail below.
As shown in Fig. 1, a distributed file system may comprise a name node and multiple data nodes. Depending on the application scenario of the distributed file system, the name node may also be called a master server or another name, and correspondingly a data node may also be called a data server or another name. It should be noted that Fig. 1 only illustrates a scenario in which the distributed file system comprises one client; in practice, a distributed file system may comprise multiple clients.
The name node manages the multiple data nodes and records information about the files stored on each data node (such as metadata files), the service state of each data node, and so on. The data nodes store files. When a client performs a file read or write operation, it first requests the index information of a data node from the name node, and then accesses the corresponding data node according to the obtained index information to read or write the file. Files may be synchronized between data nodes. For example, when a file needs to be written to two data nodes, it may first be written to one of them, and that data node then synchronizes the file to the other. In addition, the name node and the data nodes may also exchange information directly.
The name node, the data nodes and the client can each be implemented by configuring the corresponding functions on any of the following devices with computing capability. A device with computing capability may be a physical device or a virtual device. For example, a physical device may be a personal computer, a notebook, a mainframe, a networked computer, a handheld computer, a personal digital assistant, a workstation and so on; a virtual device may be a virtual machine or a container deployed on a physical device.
Referring to Fig. 2, when the data nodes are virtual machines, the virtual hard disks of the virtual machines are provided by a distributed block storage system. The distributed block storage system manages multiple physical hard disks, and writing a file to the virtual hard disk of a virtual machine actually writes the file to the physical hard disks managed by the distributed block storage system.
Referring to Fig. 2, to ensure its own reliability, a distributed file system typically adopts a file replica mechanism when storing files. For example, when storing a file, the file is stored on two data nodes, that is, on virtual machine 1 and virtual machine 2. To ensure its own reliability, the distributed block storage system also adopts a file replica mechanism when storing the files of the virtual machines. For example, to store the file for virtual machine 1, the file is stored on physical hard disk 1 of physical server 1, physical hard disk 3 of physical server 2 and physical hard disk 5 of physical server 3; to store the file for virtual machine 2, the file is stored on physical hard disk 2 of physical server 1, physical hard disk 4 of physical server 2 and physical hard disk 6 of physical server 3. Because both the distributed file system and the distributed block storage system adopt file replica mechanisms, the file ends up stored six times on the physical hard disks managed by the distributed block storage system. Clearly, storing redundant copies of the same file wastes storage space and degrades the processing performance of the system.
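The copy count in this example is simply the product of the two replication factors. The sketch below only restates that arithmetic; the function name physical_copies is illustrative and does not come from the source.

```python
def physical_copies(dfs_replicas: int, block_replicas: int) -> int:
    """Number of physical copies when each file-system-level replica is
    itself replicated by the distributed block storage system."""
    if dfs_replicas < 1 or block_replicas < 1:
        raise ValueError("replication factors must be at least 1")
    return dfs_replicas * block_replicas


# Prior-art layout of Fig. 2: 2 data nodes, each backed by 3 physical hard disks.
prior_art = physical_copies(dfs_replicas=2, block_replicas=3)

# Shared memory area: the file system stores the data once, and only the
# block storage system replicates it.
shared_area = physical_copies(dfs_replicas=1, block_replicas=3)
```

With the figures used in the text, the first call gives six copies and the second gives three, which is the saving that the shared memory area is meant to achieve.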
It should be noted that, to explain how a distributed file system operates when storing a file, the distributed file system of Fig. 2 shows only two virtual machines, each comprising one virtual hard disk, and the distributed block storage system of Fig. 2 shows only three physical servers, each comprising two physical hard disks. In an actual implementation, a distributed file system may store many files, so neither the number of virtual machines comprised in the distributed file system nor the number of virtual hard disks comprised in each virtual machine is limited. Likewise, neither the number of physical servers comprised in the distributed block storage system nor the number of physical hard disks comprised in each physical server is limited.
To solve the problem of redundant file copies in distributed file systems, an embodiment of the present invention provides a method for storing files in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same memory area. As shown in Fig. 3, the method includes:
S301: The client sends to the name node a request message requesting to write data to be written to the distributed file system.
The data to be written may be video data, audio data, document data or other binary data. The granularity of the data to be written may be a file, a data block or another granularity. There may be one or more pieces of data to be written: as long as one or more pieces of data have been written to the distributed file system after the method shown in Fig. 3 has been executed once, those pieces of data are regarded as the data to be written.
S302: The name node sends to the client a response message corresponding to the request message.
The response message includes the address of the first virtual machine and the address of the second virtual machine, and further indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has the permission to write data to the memory area, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
There must be exactly one first virtual machine. There may be one or more second virtual machines; the embodiments of the present invention do not limit the number of second virtual machines.
In the embodiments of the present invention, only one of the multiple virtual machines has the permission to write data to the memory area, for the following reason. If multiple virtual machines were used to write the data to be written, then when the client wanted to write the data to be written to the distributed file system, multiple virtual machines would receive the instruction to write it. Because the multiple virtual machines share the same memory area, the instructions received by the multiple virtual machines at the same moment would direct all of them to write the data to be written to the same memory area. It could then not be determined through which virtual machine the data should be written, and the write instruction could not be executed. In addition, having only one virtual machine write the data to be written ensures that only one copy is written during the write phase of the distributed file system, whereas the prior art writes multiple copies in that phase; data redundancy is thereby reduced.
In the embodiments of the present invention, the number of second virtual machines may be more than one for the following reason: when multiple clients want to read the data to be written, the reads can be served by multiple second virtual machines, which improves the efficiency with which clients read the data to be written. In addition, once a second virtual machine has obtained the metadata of the data to be written sent by the first virtual machine, it can read the data to be written directly from the distributed block storage system, which avoids the prior-art situation in which the data to be written cannot be read when the first virtual machine fails.
Given the restriction on the number of first virtual machines in the embodiments of the present invention, the response message that the name node sends to the client may indicate only that the first virtual machine has the permission to write data to the memory area, without indicating the permission of the second virtual machine. The reason is as follows: because only one of the multiple virtual machines has the permission to write data to the memory area, once the response message indicates that the first virtual machine has this permission, the second virtual machine, being a virtual machine other than the first virtual machine, has by default the permission to read the data to be written from the memory area.
Optionally, the response message also indicates that the second virtual machine has the permission to read the data to be written from the memory area.
In S302, the permission of the first virtual machine and the permission of the second virtual machine indicated by the response message apply only to the data to be written. The first virtual machine and the second virtual machine share the memory area to which the data to be written is written. For example, when data 1 to be written is stored in the distributed file system, the response message indicates that virtual machine 1 is the first virtual machine and virtual machine 2 is the second virtual machine; virtual machine 1 writes data 1 to be written to memory area 1 of virtual machine 1 and then sends the metadata of data 1 to be written to virtual machine 2, where virtual machine 2 and virtual machine 1 share memory area 1. When data 2 to be written is stored in the distributed file system, the response message indicates that virtual machine 1 is the first virtual machine and virtual machine 3 is the second virtual machine; virtual machine 1 writes data 2 to be written to memory area 2 of virtual machine 1 and then sends the metadata of data 2 to be written to virtual machine 3, where virtual machine 3 and virtual machine 1 share memory area 2.
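The per-data scope of the roles in this example can be pictured as a lookup table keyed by the data to be written. The sketch below is hypothetical: the names RoleAssignment, ASSIGNMENTS, may_write and may_read are illustrative, and the source does not say that the name node keeps such a table.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class RoleAssignment:
    memory_area: str              # shared memory area holding the data
    first_vm: str                 # the single VM allowed to write this data
    second_vms: Tuple[str, ...]   # VMs allowed to read this data


# The two assignments from the example in the text.
ASSIGNMENTS: Dict[str, RoleAssignment] = {
    "data_1": RoleAssignment("memory_area_1", "vm1", ("vm2",)),
    "data_2": RoleAssignment("memory_area_2", "vm1", ("vm3",)),
}


def may_write(vm: str, data_id: str) -> bool:
    """Only the first virtual machine assigned to this data may write it."""
    return ASSIGNMENTS[data_id].first_vm == vm


def may_read(vm: str, data_id: str) -> bool:
    """The first VM and the second VMs assigned to this data may read it."""
    role = ASSIGNMENTS[data_id]
    return vm == role.first_vm or vm in role.second_vms
```

In this model virtual machine 2 can read data 1 but has no role at all for data 2, which matches the statement that the indicated permissions apply only to the particular data to be written.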
S303: The client sends the data to be written and the address of the second virtual machine to the first virtual machine according to the address of the first virtual machine.
By sending the data to be written and the address of the second virtual machine to the first virtual machine, the client instructs the first virtual machine to write the data to be written, to generate or update the metadata of the data to be written, and to send the metadata of the data to be written to the second virtual machine according to the address of the second virtual machine.
S304: The first virtual machine writes the data to be written to the memory area shared by the multiple virtual machines, and generates or updates the metadata of the data to be written.
The first virtual machine and the second virtual machine can use the metadata of the data to be written to read the data to be written from the memory area shared by the multiple virtual machines. The metadata of the data to be written includes but is not limited to: the storage location of the data to be written, the name of the data to be written and the directory of the data to be written.
S305: The first virtual machine sends the generated or updated metadata to the second virtual machine according to the address of the second virtual machine.
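Steps S301 to S305 can be traced end to end with a small in-memory model. This is a sketch under stated assumptions, not the patented implementation: all class and function names are hypothetical, object references stand in for network addresses, the shared memory area is a plain byte buffer, the name node simply nominates the first virtual machine in its list as the writer, and the metadata is reduced to the name, directory and storage location mentioned in the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Metadata:
    name: str        # name of the data to be written
    directory: str   # directory of the data to be written
    offset: int      # storage location inside the shared memory area
    length: int


class SharedMemoryArea:
    """Stands in for the memory area on the virtual hard disk mounted by all VMs."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def append(self, data: bytes) -> int:
        offset = len(self._buf)
        self._buf += data
        return offset

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self._buf[offset:offset + length])

    def size(self) -> int:
        return len(self._buf)


class VirtualMachine:
    def __init__(self, address: str, area: SharedMemoryArea) -> None:
        self.address = address
        self.area = area
        self.metadata: Dict[str, Metadata] = {}

    def handle_write(self, name: str, directory: str, data: bytes,
                     second_vms: List["VirtualMachine"]) -> Metadata:
        # S304: write once to the shared area, then generate or update metadata.
        offset = self.area.append(data)
        meta = Metadata(name, directory, offset, len(data))
        self.metadata[name] = meta
        # S305: send the metadata (not the data) to every second VM.
        for vm in second_vms:
            vm.receive_metadata(meta)
        return meta

    def receive_metadata(self, meta: Metadata) -> None:
        self.metadata[meta.name] = meta

    def read(self, name: str) -> bytes:
        meta = self.metadata[name]
        return self.area.read(meta.offset, meta.length)


class NameNode:
    def __init__(self, vms: List[VirtualMachine]) -> None:
        self.vms = vms

    def handle_request(self) -> Tuple[VirtualMachine, List[VirtualMachine]]:
        # S302: assumed policy, the first VM is the writer and the rest are readers.
        return self.vms[0], self.vms[1:]


def client_write(name_node: NameNode, name: str, directory: str, data: bytes) -> Metadata:
    first_vm, second_vms = name_node.handle_request()                 # S301 and S302
    return first_vm.handle_write(name, directory, data, second_vms)   # S303
```

After one call to client_write, every virtual machine can resolve the same bytes through its own copy of the metadata, while the shared area holds those bytes only once.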
It should be noted that distributed file system typically may include client, also may not include client.If distributed
File system includes client, the quantity of client including but not limited to one.In the embodiment of the present invention, in order to more clearly retouch
State the interaction between client, name node, the first virtual machine and the second virtual machine, client is included in distributed field system
In system.It is actual that distributed file system also may not include client when realizing, now the embodiment of the present invention can be considered client and
The interaction of distributed file system.
Optionally, the multiple virtual machines included in the distributed file system mount the same virtual hard disk provided by a distributed block storage system, and this virtual hard disk includes the storage area shared by the multiple virtual machines.
Using the method for storage file in the distributed file system shown in Fig. 3, included due to distributed file system
Multiple virtual machines share same memory area, thus in distributed file system, this data to be written is only in this memory area
Storage is a.For data to be written, only can preserve many due to the duplicate of the document mechanism that distributed block storage system adopts
Part, and there is not the preservation all leading to due to distributed file system and distributed block storage system using duplicate of the document mechanism
File number redundancy problem.
In addition, in the method for storing files in a distributed file system shown in Fig. 3, the first virtual machine among the multiple virtual machines included in the distributed file system has the permission to write data to the storage area, and the second virtual machine, that is, a virtual machine other than the first virtual machine among the multiple virtual machines, has the permission to read the data to be written from the storage area. Multiple virtual machines in the distributed file system can therefore provide the client with the service of reading and writing the data to be written. When one virtual machine fails, the other virtual machines can provide this service to the client, which improves the availability of the distributed file system and also avoids the resource waste incurred by the virtual machine hot-standby mechanism of the prior art.
To explain more concretely how the method shown in Fig. 3 solves the problem of redundant file copies while improving system availability, the method is illustrated below as applied to a distributed file system and a distributed block storage system. A distributed file system and distributed block storage system using the method shown in Fig. 3 can be as shown in Fig. 4. The distributed file system shown in Fig. 4 comprises a first virtual machine, a second virtual machine, a client and a name node. In an actual implementation, the number of second virtual machines and the number of clients are not limited. The distributed block storage system shown in Fig. 4 comprises three physical servers, and each physical server comprises two physical hard disks.
The first virtual machine has the permission to write data to the storage area, and the second virtual machine has the permission to read the data to be written from the storage area. Because the first virtual machine and the second virtual machine share the same storage area, they can be regarded as sharing the same virtual hard disk 1: the first virtual machine can write the data to be written to virtual hard disk 1, and the second virtual machine can read the data to be written from virtual hard disk 1. The data to be written is therefore stored only once in the distributed file system, namely in virtual hard disk 1. In the distributed block storage system the data to be written can be stored three times, for example on physical hard disk 1 of physical server 1, physical hard disk 3 of physical server 2 and physical hard disk 5 of physical server 3. The data to be written thus occupies only three copies on the physical hard disks. By contrast, in the distributed file system and distributed block storage system shown in Fig. 2, the same file occupies six copies on the physical hard disks. With the method shown in Fig. 3, the distributed file system and distributed block storage system shown in Fig. 4 keep only three copies of the data to be written on the physical hard disks, which greatly reduces the number of stored copies and solves the problem of redundant file copies in the distributed file system.
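The copy counts in this comparison can be checked with a one-line calculation. The sketch below assumes that the six copies of Fig. 2 come from two file-system-level copies, each replicated three times by the block storage layer; the text only states the totals (six and three) and the three block-level replicas, so the factor of two is an assumption. The function name `physical_copies` is hypothetical.

```python
def physical_copies(fs_level_copies: int, block_level_replicas: int) -> int:
    """Number of copies of one file that end up on physical hard disks."""
    return fs_level_copies * block_level_replicas


# Assumed Fig. 2 arrangement: both layers replicate.
copies_fig2 = physical_copies(fs_level_copies=2, block_level_replicas=3)

# Fig. 4 arrangement: the file system keeps one copy in the shared storage
# area, and only the block storage layer replicates.
copies_fig4 = physical_copies(fs_level_copies=1, block_level_replicas=3)
```

Under these assumptions the sharing of one storage area removes the file-system-level factor entirely, leaving only the replicas that the block storage layer creates.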
In addition, in Fig. 4 the first virtual machine can be used both to write and to read the data to be written, and the second virtual machine can be used to read the data to be written. When one of the virtual machines fails, the other virtual machine, which has not failed, can provide the client with the service of reading and writing the data to be written, which improves the availability of the system.
Further, after the second virtual machine receives the metadata of the data to be written sent by the first virtual machine, whether it needs to generate or update the file information recorded in its own operating system falls into the following two cases:
Case 1
If the first virtual machine writes the data to be written to the storage area shared by the multiple virtual machines through its own operating system, and the client, when reading the data to be written via the second virtual machine, must likewise read through the operating system of the second virtual machine, then the second virtual machine needs to generate or update the file information recorded in its own operating system according to the metadata of the data to be written. Only then can the operating system of the second virtual machine read the data to be written from the storage area. This file information is what the operating system of the second virtual machine uses to read the data to be written from the storage area. The file information in the operating system of the second virtual machine can be updated in two ways. First, if the operating system of the second virtual machine can detect changes to the data in the storage area, it can update the file information of the data to be written on its own. Second, the operating system of the second virtual machine can update the file information of the data to be written according to the metadata sent by the first virtual machine.
Case 2
If the first virtual machine writes the data to be written directly to the storage area shared by the multiple virtual machines rather than through its own operating system, the client can read the data to be written directly from the shared storage area via the second virtual machine, without going through the operating system of the second virtual machine. In this case the second virtual machine does not need to generate or update the file information recorded in its own operating system according to the metadata of the data to be written; it can read the data to be written from the storage area using the metadata alone.
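The two cases can be contrasted in a short sketch. The class `MetadataConsumer`, the flag `reads_via_os`, the dictionary `os_file_table` and the `(offset, length)` location format are all hypothetical names introduced for illustration; the document does not specify how the operating system records file information.

```python
from typing import Dict, Tuple

Location = Tuple[int, int]  # (offset, length); an assumed metadata layout


class MetadataConsumer:
    """Second virtual machine reacting to metadata sent by the first one."""

    def __init__(self, storage: bytearray, reads_via_os: bool):
        self.storage = storage
        self.reads_via_os = reads_via_os
        self.raw_metadata: Dict[str, Location] = {}
        # Stand-in for the file information recorded in the guest OS.
        self.os_file_table: Dict[str, Location] = {}

    def on_metadata(self, path: str, offset: int, length: int) -> None:
        self.raw_metadata[path] = (offset, length)
        if self.reads_via_os:
            # Case 1: the OS must learn about the file before it can read it.
            self.os_file_table[path] = (offset, length)
        # Case 2: nothing else to do; the metadata alone is sufficient.

    def read(self, path: str) -> bytes:
        table = self.os_file_table if self.reads_via_os else self.raw_metadata
        offset, length = table[path]
        return bytes(self.storage[offset:offset + length])
```

Both modes return the same bytes; they differ only in whether the operating system's own file information has to be kept in step with the shared storage area.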
In S302, the name node needs to indicate through the response message that the first virtual machine is the one virtual machine among the multiple virtual machines that has the permission to write data to the storage area, and that the second virtual machine is a virtual machine other than the first virtual machine among the multiple virtual machines. That is, after S302 the client not only obtains the address of the first virtual machine and the address of the second virtual machine, but also learns that the first virtual machine has the permission to write data to the storage area and that the second virtual machine has the permission to read the data to be written from the storage area. The ways in which the name node notifies the client of the permissions of the first virtual machine and the second virtual machine include but are not limited to the following two:
Manner 1
When S302 is performed, the response message that the name node sends to the client also includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The write permission identifier specifies that the first virtual machine has the permission to write the data to be written to the storage area, and the read permission identifier specifies that the second virtual machine has the permission to read the data to be written from the storage area. The client learns the permissions of the first virtual machine and the second virtual machine from the write permission identifier and the read permission identifier, sends the write permission identifier to the first virtual machine and sends the read permission identifier to the second virtual machine, thereby delivering each virtual machine's permission to the corresponding virtual machine.
The client may send the write permission identifier to the first virtual machine before S303 or after S303, or at the same time as S303, in which case the write permission identifier, the data to be written and the address of the second virtual machine are sent to the first virtual machine together. The embodiments of the present invention do not limit the order in which these two steps are performed. Similarly, the client may send the read permission identifier to the second virtual machine.
Manner 2
The address of the first virtual machine and the address of the second virtual machine are arranged in the response message according to a preset rule. The preset rule specifies that the first virtual machine has the permission to write the data to be written to the storage area and that the second virtual machine has the permission to read the data to be written from the storage area.
The preset rule can be the order of the virtual machine addresses included in the response message. For example, the name node and the client can agree in advance that the first address the name node sends to the client is the address of the first virtual machine. After receiving the addresses of the multiple virtual machines included in the response message, the client can then determine that the first address is the address of the first virtual machine, which has the permission to write data to the storage area, and that the remaining addresses are the addresses of second virtual machines, which have the permission to read the data to be written from the storage area.
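The two notification manners can be sketched as two ways of decoding the same response message. The function names, the `"write"`/`"read"` identifier strings and the list-based message layout are hypothetical; the document does not define a wire format.

```python
from typing import Dict, List, Tuple

WRITE, READ = "write", "read"  # assumed encodings of the two identifiers


def roles_from_identifiers(entries: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Manner 1: each address carries an explicit permission identifier."""
    roles: Dict[str, List[str]] = {WRITE: [], READ: []}
    for address, identifier in entries:
        roles[identifier].append(address)
    if len(roles[WRITE]) != 1:
        raise ValueError("exactly one virtual machine must hold the write permission")
    return roles


def roles_from_order(addresses: List[str]) -> Dict[str, List[str]]:
    """Manner 2: by prior agreement, the first address is the writer."""
    if not addresses:
        raise ValueError("response message contains no addresses")
    return {WRITE: [addresses[0]], READ: list(addresses[1:])}
```

Manner 1 costs a few extra bytes per address but is self-describing; Manner 2 keeps the message smaller and relies on the name node and the client having agreed on the ordering rule in advance.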
As mentioned above, multiple virtual machines in the distributed file system can provide the client with the service of reading and writing the data to be written. When one virtual machine fails, the other virtual machines can provide this service to the client, which improves the availability of the distributed file system. How the distributed file system operates after a virtual machine fails is explained in detail below.
In the distributed file system, the ways of detecting that a virtual machine has failed include but are not limited to the following three. First, the name node detects that a virtual machine has failed. Second, when the client reads or writes data via a virtual machine and the read or write cannot be completed, the client determines that this virtual machine has failed and reports the failure to the name node. Third, a virtual machine can perform periodic self-checks; when a virtual machine finds that it has failed, it can report the failure to the name node directly or via the client. Therefore, when a virtual machine in the distributed file system fails, the name node can learn of the failure through any of these three channels and then take the corresponding action, so that the distributed file system does not become unable to provide the data read and write service to the client.
In the embodiments of the present invention, failures of virtual machines in the distributed file system fall into the following two cases:
Case 1: the first virtual machine fails
When the first virtual machine fails, the name node sends a first update message to the client. The first update message includes the address of an updated first virtual machine and designates another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine. That is, it indicates that the updated first virtual machine has the permission to write data to the storage area shared by the multiple virtual machines. When the client needs to write the data to be written, it can therefore write via the updated first virtual machine.
In this way, when the first virtual machine fails, the name node designates another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine, which has the permission to write data to the storage area shared by the multiple virtual machines. When the client needs to write data, it can write via the updated first virtual machine; when the client needs to read data, it can read via a second virtual machine or via the updated first virtual machine.
In summary, with this approach a failure of the first virtual machine does not affect the client's ability to write or read data, which improves the availability of the system.
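The handling of a first virtual machine failure can be sketched as follows. `NameNode`, `FirstUpdateMessage` and the rule of promoting the first remaining virtual machine are hypothetical; the text only says that another virtual machine among the multiple virtual machines is designated and that its address is carried in the first update message.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FirstUpdateMessage:
    # The text states that the message carries the updated writer's address.
    updated_first_vm_address: str


class NameNode:
    def __init__(self, first_vm: str, second_vms: List[str]):
        self.first_vm = first_vm            # holds the write permission
        self.second_vms = list(second_vms)  # hold the read permission

    def on_first_vm_failure(self) -> FirstUpdateMessage:
        if not self.second_vms:
            raise RuntimeError("no remaining virtual machine to promote")
        # Selection policy is an assumption: promote the first remaining
        # virtual machine, which already shares the storage area.
        self.first_vm = self.second_vms.pop(0)
        return FirstUpdateMessage(self.first_vm)
```

Because the promoted virtual machine already shares the storage area, the promotion in this sketch is a change of role rather than a data migration, so no copy of the stored data is involved.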
As mentioned above, the number of second virtual machines can be one or more. When there is only one second virtual machine and the first virtual machine fails, the following can additionally be performed after the above method. The name node designates one or more updated second virtual machines from outside the multiple virtual machines included in the distributed file system. The updated second virtual machine has the permission to read the data to be written from the storage area and shares the same storage area with the multiple virtual machines included in the distributed file system. The name node indicates to the client that the updated second virtual machine has the permission to read the data to be written from the storage area. After receiving this indication from the name node, the client notifies the updated first virtual machine to send the metadata of the data to be written to the updated second virtual machine. When the client needs to read the data to be written, it can then read not only via the second virtual machine but also via the updated second virtual machine.
Because the name node designates, from outside the multiple virtual machines included in the distributed file system, an updated second virtual machine that has the permission to read the data to be written from the storage area, when multiple clients need to read the data to be written a client can read via the second virtual machine or via the updated second virtual machine, which improves the efficiency with which clients read the data to be written.
Case 2: the second virtual machine fails
When the second virtual machine fails, the name node sends a second update message to the client. The second update message includes the address of an updated second virtual machine and designates another virtual machine, from outside the multiple virtual machines, as the updated second virtual machine, which has the permission to read the data to be written from the storage area. The updated second virtual machine shares the same storage area with the multiple virtual machines included in the distributed file system. After the name node indicates to the client that the updated second virtual machine has the permission to read the data to be written from the storage area, the client, following this indication, notifies the first virtual machine to send the metadata of the data to be written to the updated second virtual machine.
After receiving the notification message sent by the client, the first virtual machine can send the metadata of the data to be written to the updated second virtual machine as the notification message indicates. When the client needs to read the data to be written, it can then read not only via a second virtual machine that has not failed or via the first virtual machine, but also via the updated second virtual machine.
In this way, when the second virtual machine fails, the name node designates an updated second virtual machine that has the permission to read the data to be written from the storage area. When the client needs to write the data to be written, it can write via the first virtual machine; when the client needs to read the data to be written, it can read not only via the first virtual machine and any second virtual machine that has not failed, but also via the updated second virtual machine. In summary, with this approach a failure of the second virtual machine does not affect the client's ability to write or read the data to be written, which improves the availability of the system.
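The handling of a second virtual machine failure can be sketched in the same style. The function `handle_second_vm_failure`, the pool `spare_vms` of virtual machines outside the current set, and the `outbox` list that stands in for network sends are hypothetical; the text specifies only that the replacement comes from outside the multiple virtual machines, shares the same storage area, and receives the metadata from the first virtual machine.

```python
from typing import Dict, List, Tuple


def handle_second_vm_failure(failed: str, second_vms: List[str],
                             spare_vms: List[str]) -> Tuple[List[str], str]:
    """Return the new reader list and the address for the second update message."""
    if not spare_vms:
        raise RuntimeError("no spare virtual machine available")
    replacement = spare_vms[0]  # selection policy is an assumption
    remaining = [vm for vm in second_vms if vm != failed]
    return remaining + [replacement], replacement


class FirstVM:
    def __init__(self, metadata: Dict[str, Tuple[int, int]]):
        self.metadata = dict(metadata)
        self.outbox: List[Tuple[str, Dict[str, Tuple[int, int]]]] = []

    def forward_metadata(self, address: str) -> None:
        # Invoked when the client relays the name node's indication: the new
        # reader needs the metadata before it can serve any read request.
        self.outbox.append((address, dict(self.metadata)))
```

Only metadata travels to the replacement in this sketch; the data itself stays in the shared storage area, which is why a new reader can be brought in without copying the stored data.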
Using the method for storage file in distributed file system provided in an embodiment of the present invention, distributed literary composition can be solved
The problem of the file number redundancy of storage in part system.Additionally, when certain virtual machine in distributed file system breaks down
When, client reading is not interfered with using the method for storage file in distributed file system provided in an embodiment of the present invention or writes
Enter file, improve the availability of system.
An embodiment of the present invention provides a first virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. The first virtual machine is the one virtual machine among the multiple virtual machines that the name node designates as having the permission to write data to the storage area. As shown in Fig. 5, the first virtual machine 500 includes:
Receiving module 501, configured to receive the data to be written and the address of the second virtual machine sent by the client, where the second virtual machine is a virtual machine other than the first virtual machine among the multiple virtual machines;
Processing module 502, configured to write the data to be written received by the receiving module 501 to the storage area, and to generate or update the metadata of the data to be written;
Sending module 503, configured to send the metadata generated or updated by the processing module 502 to the second virtual machine according to the address of the second virtual machine received by the receiving module 501.
Optionally, the receiving module 501 is further configured to receive the write permission identifier of the first virtual machine sent by the client before the processing module 502 writes the data to be written to the storage area.
The write permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and it specifies that the first virtual machine has the permission to write the data to be written to the storage area.
Optionally, the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage area.
Optionally, if the second virtual machine reads the data to be written through its own operating system, the metadata is used by the second virtual machine to generate or update the file information recorded in its own operating system, and the file information is used by the operating system to read the data to be written from the storage area; or, if the second virtual machine reads the data to be written directly, the metadata is used by the second virtual machine to read the data to be written from the storage area.
Optionally, the second virtual machine is designated by the name node as having the permission to read the data to be written from the storage area.
The first virtual machine 500 provided by the embodiments of the present invention solves the problem of redundant file copies stored in the distributed file system. In addition, when a virtual machine in the distributed file system fails, the operation of the first virtual machine 500 ensures that the failure does not affect the client's reading or writing of files, which improves the availability of the system.
It should be noted that the first virtual machine 500 provided by the embodiments of the present invention can be used to perform the operations performed by the first virtual machine in the method for storing files in a distributed file system shown in Fig. 3. For implementation details of the first virtual machine 500 that are not explained or described here, refer to the related description of the method shown in Fig. 3.
It should be noted that the division into modules in the embodiments of the present invention is schematic and is only a division by logical function; other divisions are possible in an actual implementation. In addition, the functional modules in the embodiments of this application may be integrated into one processing module, may each exist physically on their own, or two or more modules may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module.
Based on the above embodiments, an embodiment of the present invention further provides a first virtual machine. This first virtual machine can perform the method provided by the embodiment corresponding to Fig. 3 and can be the same as the first virtual machine 500 shown in Fig. 5.
Referring to Fig. 6, the device on which the first virtual machine 600 resides includes at least one processor 601, a memory 602 and a communication interface 603; the at least one processor 601, the memory 602 and the communication interface 603 are all connected through a bus 604;
The memory 602 is configured to store computer-executable instructions;
The at least one processor 601 is configured to execute the computer-executable instructions stored in the memory 602, so that the first virtual machine 600 exchanges data with other devices in the distributed file system through the communication interface 603 to perform the method for storing files in a distributed file system provided by the above embodiments, or so that the first virtual machine 600 exchanges data with other devices in the distributed file system through the communication interface 603 to implement some or all of the functions of the distributed file system.
The at least one processor 601 may include processors 601 of different types or processors 601 of the same type. The processor 601 can be any of the following: a central processing unit (Central Processing Unit, CPU), an ARM processor, a field programmable gate array (Field Programmable Gate Array, FPGA), a dedicated processor, or another device with computing and processing capability. In an optional implementation, the at least one processor 601 can also be integrated as a many-core processor.
The memory 602 can be any one or any combination of the following: random access memory (Random Access Memory, RAM), read-only memory (Read-Only Memory, ROM), non-volatile memory (Non-Volatile Memory, NVM), solid state drive (Solid State Drive, SSD), mechanical hard disk, magnetic disk, disk array and other storage media.
The communication interface 603 is used by the first virtual machine 600 to exchange data with other devices (for example, other devices in the distributed file system). The communication interface 603 can be any one or any combination of the following: a network interface (for example, an Ethernet interface), a wireless network card, or another device with a network access function.
The bus 604 can include an address bus, a data bus, a control bus and so on; for ease of representation, Fig. 6 shows the bus as a single thick line. The bus 604 can be any one or any combination of the following: an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or another device for wired data transmission.
An embodiment of the present invention provides a name node in a distributed file system. The distributed file system includes the name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. As shown in Fig. 7, the name node 700 includes:
Receiving module 701, configured to receive a request message in which the client requests to write the data to be written to the distributed file system;
Sending module 702, configured to send to the client a response message corresponding to the request message received by the receiving module 701, where the response message includes the address of the first virtual machine and the address of the second virtual machine, and the response message indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has the permission to write data to the storage area and that the second virtual machine is a virtual machine other than the first virtual machine among the multiple virtual machines.
Optionally, the response message also indicates that the second virtual machine has the permission to read the data to be written from the storage area.
Optionally, the response message also includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The write permission identifier specifies that the first virtual machine has the permission to write the data to be written to the storage area, and the read permission identifier specifies that the second virtual machine has the permission to read the data to be written from the storage area.
Optionally, the address of the first virtual machine and the address of the second virtual machine are arranged in the response message according to a preset rule. The preset rule specifies that the first virtual machine has the permission to write the data to be written to the storage area and that the second virtual machine has the permission to read the data to be written from the storage area.
Optionally, the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage area.
Optionally, the sending module 702 is further configured to: when the first virtual machine fails, send a first update message to the client, where the first update message includes the address of an updated first virtual machine and designates another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine, which has the permission to write data to the storage area; and/or, when the second virtual machine fails, send a second update message to the client, where the second update message includes the address of an updated second virtual machine and designates another virtual machine, from outside the multiple virtual machines, as the updated second virtual machine, which has the permission to read the data to be written from the storage area.
The name node 700 provided by the embodiments of the present invention solves the problem of redundant file copies stored in the distributed file system. In addition, when a virtual machine in the distributed file system fails, the operation of the name node 700 ensures that the failure does not affect the client's reading or writing of files, which improves the availability of the system.
It should be noted that the name node 700 provided by the embodiments of the present invention can be used to perform the operations performed by the name node in the method for storing files in a distributed file system shown in Fig. 3. For implementation details of the name node 700 that are not explained or described here, refer to the related description of the method shown in Fig. 3.
Based on the above embodiments, an embodiment of the present invention further provides a name node. This name node can perform the method provided by the embodiment corresponding to Fig. 3 and can be the same as the name node 700 shown in Fig. 7.
Referring to Fig. 8, the name node 800 includes at least one processor 801, a memory 802 and a communication interface 803; the at least one processor 801, the memory 802 and the communication interface 803 are all connected through a bus 804;
The memory 802 is configured to store computer-executable instructions;
The at least one processor 801 is configured to execute the computer-executable instructions stored in the memory 802, so that the name node 800 exchanges data with other devices in the distributed file system through the communication interface 803 to perform the method for storing files in a distributed file system provided by the above embodiments, or so that the name node 800 exchanges data with other devices in the distributed file system through the communication interface 803 to implement some or all of the functions of the distributed file system.
The at least one processor 801 may include processors 801 of different types or processors 801 of the same type. The processor 801 can be any of the following: a CPU, an ARM processor, an FPGA, a dedicated processor, or another device with computing and processing capability. In an optional implementation, the at least one processor 801 can also be integrated as a many-core processor.
The memory 802 can be any one or any combination of the following: RAM, ROM, NVM, SSD, mechanical hard disk, magnetic disk, disk array and other storage media.
The communication interface 803 is used by the name node 800 to exchange data with other devices (for example, other devices in the distributed file system). The communication interface 803 can be any one or any combination of the following: a network interface (for example, an Ethernet interface), a wireless network card, or another device with a network access function.
The bus 804 can include an address bus, a data bus, a control bus and so on; for ease of representation, Fig. 8 shows the bus as a single thick line. The bus 804 can be any one or any combination of the following: an ISA bus, a PCI bus, an EISA bus, or another device for wired data transmission.
An embodiment of the present invention provides a client. The distributed file system in which the client is located includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage region. As shown in Fig. 9, client 900 includes:
Sending module 901, configured to send to the name node a request message requesting to write data to be written to the distributed file system;
Receiving module 902, configured to receive a response message, corresponding to the request message, sent by the name node, where the response message includes the address of a first virtual machine and the address of a second virtual machine, and the response message indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage region and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
Sending module 901 is further configured to send, according to the address of the first virtual machine included in the response message received by receiving module 902, the data to be written and the address of the second virtual machine to the first virtual machine, and to instruct the first virtual machine to write the data to be written, to generate or update the metadata of the data to be written, and to send the metadata of the data to be written to the second virtual machine according to the address of the second virtual machine included in the response message.
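As a non-limiting illustration, the cooperation of the sending module and the receiving module described above can be sketched as follows. This is a minimal sketch, not the implementation of the embodiment: the class names `ResponseMessage`, `WriteCommand`, `StubNameNode` and `Client`, the method `handle_write_request`, and the addresses are all hypothetical, and network transmission is replaced by an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class ResponseMessage:
    """Hypothetical response message returned by the name node."""
    first_vm_address: str   # the virtual machine permitted to write
    second_vm_address: str  # another virtual machine sharing the region


@dataclass
class WriteCommand:
    """Hypothetical message the client sends to the first virtual machine."""
    data: bytes
    second_vm_address: str  # where the first VM should forward the metadata


class StubNameNode:
    """Stand-in for the name node; always answers with fixed addresses."""

    def handle_write_request(self, file_name: str) -> ResponseMessage:
        return ResponseMessage("10.0.0.1:9000", "10.0.0.2:9000")


@dataclass
class Client:
    name_node: StubNameNode
    outbox: list = field(default_factory=list)  # records (address, command)

    def write(self, file_name: str, data: bytes) -> None:
        # Sending module: send the request message to the name node.
        # Receiving module: obtain the corresponding response message.
        response = self.name_node.handle_write_request(file_name)
        # Sending module: send the data and the second VM's address
        # to the address of the first virtual machine.
        command = WriteCommand(data, response.second_vm_address)
        self.outbox.append((response.first_vm_address, command))


client = Client(StubNameNode())
client.write("report.txt", b"hello")
target, command = client.outbox[0]
print(target, command.second_vm_address)
```

In this sketch the client never contacts the second virtual machine directly; it only hands the second virtual machine's address to the first virtual machine, which matches the division of work described above.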
Optionally, the response message further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region.
Optionally, the address of the first virtual machine and the address of the second virtual machine included in the response message are arranged according to a preset rule. The preset rule specifies that the first virtual machine has permission to write the data to be written to the storage region and that the second virtual machine has permission to read the data to be written from the storage region.
Optionally, the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
Optionally, the response message further indicates that the second virtual machine has permission to read the data to be written from the storage region.
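The two optional ways of conveying permissions in the response message (explicit permission identifiers, or the order of the addresses under a preset rule) can be illustrated with the following sketch. The function `roles_from_response`, the `"write"`/`"read"` identifier values, and the particular preset rule used here (first listed address is the writer, second is the reader) are assumptions made for illustration only; the embodiment does not fix a concrete encoding.

```python
from typing import Dict, List, NamedTuple, Optional


class Roles(NamedTuple):
    writer: str  # address of the first virtual machine
    reader: str  # address of the second virtual machine


def roles_from_response(addresses: List[str],
                        permissions: Optional[Dict[str, str]] = None) -> Roles:
    """Decide which address may write and which may read.

    If `permissions` is given, it maps each address to "write" or "read"
    (explicit permission identifiers). Otherwise an assumed preset rule
    applies: the first listed address is the writer, the second the reader.
    """
    if permissions is not None:
        writer = next(a for a in addresses if permissions[a] == "write")
        reader = next(a for a in addresses if permissions[a] == "read")
        return Roles(writer, reader)
    return Roles(addresses[0], addresses[1])


# Explicit identifiers: the order of the addresses does not matter.
explicit = roles_from_response(["vm-b", "vm-a"],
                               {"vm-a": "write", "vm-b": "read"})
# Preset rule: the order of the addresses carries the permissions.
by_order = roles_from_response(["vm-a", "vm-b"])
print(explicit, by_order)
```

Either variant gives the client the same information; the ordering variant simply avoids carrying separate identifier fields in the response message.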
Client 900 provided by this embodiment of the present invention can solve the problem of redundant copies of files stored in a distributed file system. In addition, when a virtual machine in the distributed file system fails, the operation of client 900 ensures that the failure does not prevent the client from reading or writing files, which improves the availability of the system.
It should be noted that client 900 provided by this embodiment of the present invention can be used to perform the operations performed by the client in the method for storing files in a distributed file system shown in Fig. 3. For implementation details of client 900 that are not explained or described here, refer to the related description of the method shown in Fig. 3.
Based on the above embodiments, an embodiment of the present invention further provides a client. This client can perform the method provided by the embodiment corresponding to Fig. 3 and may be the same as client 900 shown in Fig. 9.
Referring to Fig. 10, the device on which client 1000 is located includes at least one processor 1001, memory 1002, and communication interface 1003. The at least one processor 1001, the memory 1002, and the communication interface 1003 are all connected by bus 1004.
The memory 1002 is configured to store computer-executable instructions.
The at least one processor 1001 is configured to execute the computer-executable instructions stored in the memory 1002, so that client 1000 exchanges data with the devices in the distributed file system through the communication interface 1003 to perform the method for storing files in a distributed file system provided by the above embodiments, or so that client 1000 exchanges data with the devices in the distributed file system through the communication interface 1003 to implement some or all of the functions of the distributed file system.
The at least one processor 1001 may include processors 1001 of different types or processors 1001 of the same type. Processor 1001 may be any one of the following: a CPU, an ARM processor, an FPGA, a dedicated processor, or another device with computing and processing capability. In an optional implementation, the at least one processor 1001 may also be integrated as a many-core processor.
Memory 1002 may be any one or any combination of the following: RAM, ROM, NVM, SSD, a mechanical hard disk, a magnetic disk, a disk array, or another storage medium.
Communication interface 1003 is used by client 1000 to exchange data with other devices (for example, other devices in the distributed file system). Communication interface 1003 may be any one or any combination of the following: a network interface (for example, an Ethernet interface), a wireless network card, or another device with network access capability.
Bus 1004 may include an address bus, a data bus, a control bus, and so on; for ease of illustration, Fig. 10 represents the bus with a single thick line. Bus 1004 may be any one or any combination of the following: an ISA bus, a PCI bus, an EISA bus, or another device for wired data transmission.
An embodiment of the present invention provides a second virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage region. As shown in Fig. 11, second virtual machine 1100 includes:
Receiving module 1101, configured to receive metadata sent by a first virtual machine, where the first virtual machine is the one virtual machine among the multiple virtual machines that is specified by the name node as having permission to write data to the storage region, the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine, and the metadata is the metadata of the data to be written that the first virtual machine generates or updates after writing the data to be written to the storage region.
Optionally, receiving module 1101 is further configured to receive, before receiving the metadata sent by the first virtual machine, a read permission identifier of the second virtual machine sent by the client. The read permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and it specifies that the second virtual machine has permission to read the data to be written from the storage region.
Optionally, the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
Optionally, the second virtual machine further includes processing module 1102, configured to do the following after receiving module 1101 receives the metadata sent by the first virtual machine: if the second virtual machine reads the data to be written through its own operating system, generate or update, according to the metadata, the file information recorded in its own operating system, where the file information is used by the operating system to read the data to be written from the storage region; or, if the second virtual machine itself reads the data to be written, read the data to be written from the storage region according to the metadata.
Optionally, the second virtual machine is specified by the name node as having permission to read the data to be written from the storage region.
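The two read paths of the second virtual machine described above (through file information kept by its own operating system, or directly from the shared storage region using the metadata) can be sketched as follows. This is a simplified illustration under stated assumptions: the `Metadata` fields (`file_name`, `offset`, `length`), the `file_table` dictionary standing in for the operating system's file information, and the `bytearray` standing in for the shared storage region are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metadata:
    """Hypothetical metadata: where a file lives in the shared region."""
    file_name: str
    offset: int
    length: int


@dataclass
class SecondVirtualMachine:
    shared_region: bytearray                        # region shared by all VMs
    file_table: dict = field(default_factory=dict)  # stands in for OS file information

    def receive_metadata(self, meta: Metadata) -> None:
        # Case 1: reads go through the operating system, so the file
        # information it records is generated or updated from the metadata.
        self.file_table[meta.file_name] = meta

    def read_via_os(self, file_name: str) -> bytes:
        # The operating system looks up its file information, then reads.
        return self.read_direct(self.file_table[file_name])

    def read_direct(self, meta: Metadata) -> bytes:
        # Case 2: the VM itself reads the region, guided by the metadata.
        return bytes(self.shared_region[meta.offset:meta.offset + meta.length])


region = bytearray(64)
region[8:13] = b"hello"  # pretend the first virtual machine already wrote this
meta = Metadata("report.txt", offset=8, length=5)

vm2 = SecondVirtualMachine(region)
vm2.receive_metadata(meta)
print(vm2.read_via_os("report.txt"), vm2.read_direct(meta))
```

Note that the second virtual machine receives only the metadata; the file content is never copied to it, because it reads the single copy held in the shared storage region.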
Second virtual machine 1100 provided by this embodiment of the present invention can solve the problem of redundant copies of files stored in a distributed file system. In addition, when a virtual machine in the distributed file system fails, the operation of second virtual machine 1100 ensures that the failure does not prevent the client from reading or writing files, which improves the availability of the system.
It should be noted that second virtual machine 1100 provided by this embodiment of the present invention can be used to perform the operations performed by the second virtual machine in the method for storing files in a distributed file system shown in Fig. 3. For implementation details of second virtual machine 1100 that are not explained or described here, refer to the related description of the method shown in Fig. 3.
Based on the above embodiments, an embodiment of the present invention further provides a second virtual machine. This second virtual machine can perform the method provided by the embodiment corresponding to Fig. 3 and may be the same as second virtual machine 1100 shown in Fig. 11.
Referring to Fig. 12, the device on which second virtual machine 1200 is located includes at least one processor 1201, memory 1202, and communication interface 1203. The at least one processor 1201, the memory 1202, and the communication interface 1203 are all connected by bus 1204.
The memory 1202 is configured to store computer-executable instructions.
The at least one processor 1201 is configured to execute the computer-executable instructions stored in the memory 1202, so that second virtual machine 1200 exchanges data with other devices in the distributed file system through the communication interface 1203 to perform the method for storing files in a distributed file system provided by the above embodiments, or so that second virtual machine 1200 exchanges data with other devices in the distributed file system through the communication interface 1203 to implement some or all of the functions of the distributed file system.
The at least one processor 1201 may include processors 1201 of different types or processors 1201 of the same type. Processor 1201 may be any one of the following: a CPU, an ARM processor, an FPGA, a dedicated processor, or another device with computing and processing capability. In an optional implementation, the at least one processor 1201 may also be integrated as a many-core processor.
Memory 1202 may be any one or any combination of the following: RAM, ROM, NVM, SSD, a mechanical hard disk, a magnetic disk, a disk array, or another storage medium.
Communication interface 1203 is used by second virtual machine 1200 to exchange data with other devices (for example, other devices in the distributed file system). Communication interface 1203 may be any one or any combination of the following: a network interface (for example, an Ethernet interface), a wireless network card, or another device with network access capability.
Bus 1204 may include an address bus, a data bus, a control bus, and so on; for ease of illustration, Fig. 12 represents the bus with a single thick line. Bus 1204 may be any one or any combination of the following: an ISA bus, a PCI bus, an EISA bus, or another device for wired data transmission.
An embodiment of the present invention provides a distributed file system. As shown in Fig. 13, distributed file system 1300 includes first virtual machine 1301, name node 1302, client 1303, and second virtual machine 1304.
First virtual machine 1301 in distributed file system 1300 can be used to perform the operations performed by the first virtual machine in the method for storing files in a distributed file system shown in Fig. 3; its specific implementation form may be first virtual machine 500 shown in Fig. 5 or first virtual machine 600 shown in Fig. 6. Name node 1302 in distributed file system 1300 can be used to perform the operations performed by the name node in the method shown in Fig. 3; its specific implementation form may be name node 700 shown in Fig. 7 or name node 800 shown in Fig. 8. Client 1303 in distributed file system 1300 can be used to perform the operations performed by the client in the method shown in Fig. 3; its specific implementation form may be client 900 shown in Fig. 9 or client 1000 shown in Fig. 10. Second virtual machine 1304 in distributed file system 1300 can be used to perform the operations performed by the second virtual machine in the method shown in Fig. 3; its specific implementation form may be second virtual machine 1100 shown in Fig. 11 or second virtual machine 1200 shown in Fig. 12.
In distributed file system 1300, only one copy of the data to be written is stored in the storage region shared by the multiple virtual machines, which solves the problem of redundant copies of files stored in a distributed file system. In addition, when a virtual machine in distributed file system 1300 fails, the client can still write or read the data to be written through a virtual machine in distributed file system 1300 that has not failed, which improves the availability of the distributed file system.
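The single-copy storage and the tolerance of virtual machine failure described above can be illustrated end to end with the following toy model. It is a sketch under stated assumptions rather than the system of Fig. 13: the classes `SharedRegion`, `VirtualMachine` and `NameNode`, the `alive` flag, and the selection policy in `assign` (the first virtual machine that has not failed becomes the writer) are all hypothetical, and the metadata is reduced to a file size.

```python
class SharedRegion:
    """Single storage region mounted by every virtual machine."""

    def __init__(self):
        self.blocks = {}  # file name -> bytes, each file stored once


class VirtualMachine:
    def __init__(self, name, region):
        self.name = name
        self.region = region
        self.metadata = {}  # file name -> size, kept in sync across VMs
        self.alive = True

    def write(self, file_name, data, peers):
        # Write the only copy of the data into the shared region.
        self.region.blocks[file_name] = data
        self.metadata[file_name] = len(data)
        # Forward the metadata (not the data) to the other virtual machines.
        for peer in peers:
            peer.metadata[file_name] = len(data)

    def read(self, file_name):
        if file_name not in self.metadata:
            raise KeyError(file_name)
        return self.region.blocks[file_name]


class NameNode:
    def __init__(self, vms):
        self.vms = vms

    def assign(self):
        """Return (writer, readers) chosen among VMs that have not failed."""
        alive = [vm for vm in self.vms if vm.alive]
        return alive[0], alive[1:]


region = SharedRegion()
vm_a, vm_b, vm_c = (VirtualMachine(n, region) for n in "abc")
name_node = NameNode([vm_a, vm_b, vm_c])

writer, readers = name_node.assign()
writer.write("report.txt", b"hello", readers)

vm_a.alive = False  # the virtual machine that performed the write fails
new_writer, new_readers = name_node.assign()
print(new_writer.name, new_writer.read("report.txt"))
```

Because every virtual machine holds the metadata and mounts the same region, the file remains readable after the writer fails even though only one copy of its content was ever stored.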
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to embodiments of the present invention. It should be understood that computer program instructions may be used to implement each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus. The instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make other changes and modifications to these embodiments once they learn the basic inventive concept. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the embodiments of the present invention without departing from the spirit and scope of the embodiments of the present invention. Thus, if these modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to cover them.
Claims (22)
1. A method for storing files in a distributed file system, characterized in that the distributed file system includes a name node and a plurality of virtual machines serving as data nodes, and the plurality of virtual machines share the same storage region; the method includes:
a first virtual machine receives data to be written and the address of a second virtual machine sent by a client, where the first virtual machine is the one virtual machine among the plurality of virtual machines that is specified by the name node as having permission to write data to the storage region, and the second virtual machine is a virtual machine among the plurality of virtual machines other than the first virtual machine;
the first virtual machine writes the data to be written to the storage region, and generates or updates the metadata of the data to be written;
the first virtual machine sends the metadata to the second virtual machine according to the address of the second virtual machine.
2. The method of claim 1, characterized in that, before the first virtual machine writes the data to be written to the storage region, the method further includes:
the first virtual machine receives a write permission identifier of the first virtual machine sent by the client, where the write permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region.
3. The method of claim 1 or 2, characterized in that the plurality of virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
4. The method of any one of claims 1 to 3, characterized in that:
if the second virtual machine reads the data to be written through its own operating system, the metadata is used by the second virtual machine to generate or update the file information recorded in its own operating system, and the file information is used by the operating system to read the data to be written from the storage region; or
if the second virtual machine itself reads the data to be written, the metadata is used by the second virtual machine to read the data to be written from the storage region.
5. The method of any one of claims 1 to 4, characterized in that the second virtual machine is specified by the name node as having permission to read the data to be written from the storage region.
6. A method for storing files in a distributed file system, characterized in that the distributed file system includes a name node and a plurality of virtual machines serving as data nodes, and the plurality of virtual machines share the same storage region; the method includes:
the name node receives a request message in which a client requests to write data to be written to the distributed file system;
the name node sends a response message corresponding to the request message to the client, where the response message includes the address of a first virtual machine and the address of a second virtual machine, and the response message indicates that the first virtual machine is the one virtual machine among the plurality of virtual machines that has permission to write data to the storage region and that the second virtual machine is a virtual machine among the plurality of virtual machines other than the first virtual machine.
7. The method of claim 6, characterized in that the response message further indicates that the second virtual machine has permission to read the data to be written from the storage region.
8. The method of claim 6 or 7, characterized in that the response message further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine, where the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region.
9. The method of claim 6 or 7, characterized in that the address of the first virtual machine and the address of the second virtual machine in the response message are arranged according to a preset rule, where the preset rule specifies that the first virtual machine has permission to write the data to be written to the storage region and that the second virtual machine has permission to read the data to be written from the storage region.
10. The method of any one of claims 6 to 9, characterized in that the plurality of virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
11. The method of any one of claims 6 to 10, characterized in that the method further includes:
when the first virtual machine fails, the name node sends first update information to the client, where the first update information includes the address of an updated first virtual machine, the first update information specifies another virtual machine among the plurality of virtual machines, other than the failed first virtual machine, as the updated first virtual machine, and the updated first virtual machine has permission to write data to the storage region; and/or
when the second virtual machine fails, the name node sends second update information to the client, where the second update information includes the address of an updated second virtual machine, the second update information specifies another virtual machine among the plurality of virtual machines as the updated second virtual machine, and the updated second virtual machine has permission to read the data to be written from the storage region.
12. A first virtual machine in a distributed file system, characterized in that the distributed file system includes a name node and a plurality of virtual machines serving as data nodes, the plurality of virtual machines share the same storage region, and the first virtual machine is the one virtual machine among the plurality of virtual machines that is specified by the name node as having permission to write data to the storage region; the first virtual machine includes:
a receiving module, configured to receive data to be written and the address of a second virtual machine sent by a client, where the second virtual machine is a virtual machine among the plurality of virtual machines other than the first virtual machine;
a processing module, configured to write the data to be written received by the receiving module to the storage region, and to generate or update the metadata of the data to be written;
a sending module, configured to send the metadata generated or updated by the processing module to the second virtual machine according to the address of the second virtual machine received by the receiving module.
13. The first virtual machine of claim 12, characterized in that the receiving module is further configured to:
receive, before the processing module writes the data to be written to the storage region, a write permission identifier of the first virtual machine sent by the client, where the write permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region.
14. The first virtual machine of claim 12 or 13, characterized in that the plurality of virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
15. The first virtual machine of any one of claims 12 to 14, characterized in that:
if the second virtual machine reads the data to be written through its own operating system, the metadata is used by the second virtual machine to generate or update the file information recorded in its own operating system, and the file information is used by the operating system to read the data to be written from the storage region; or
if the second virtual machine itself reads the data to be written, the metadata is used by the second virtual machine to read the data to be written from the storage region.
16. The first virtual machine of any one of claims 12 to 15, characterized in that the second virtual machine is specified by the name node as having permission to read the data to be written from the storage region.
17. A name node in a distributed file system, characterized in that the distributed file system includes the name node and a plurality of virtual machines serving as data nodes, and the plurality of virtual machines share the same storage region; the name node includes:
a receiving module, configured to receive a request message in which a client requests to write data to be written to the distributed file system;
a sending module, configured to send to the client a response message corresponding to the request message received by the receiving module, where the response message includes the address of a first virtual machine and the address of a second virtual machine, and the response message indicates that the first virtual machine is the one virtual machine among the plurality of virtual machines that has permission to write data to the storage region and that the second virtual machine is a virtual machine among the plurality of virtual machines other than the first virtual machine.
18. The name node of claim 17, characterized in that the response message further indicates that the second virtual machine has permission to read the data to be written from the storage region.
19. The name node of claim 17 or 18, characterized in that the response message further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine, where the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region.
20. The name node of claim 17 or 18, characterized in that the address of the first virtual machine and the address of the second virtual machine in the response message are arranged according to a preset rule, where the preset rule specifies that the first virtual machine has permission to write the data to be written to the storage region and that the second virtual machine has permission to read the data to be written from the storage region.
21. The name node of any one of claims 17 to 20, characterized in that the plurality of virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
22. The name node of any one of claims 17 to 21, characterized in that the sending module is further configured to:
when the first virtual machine fails, send first update information to the client, where the first update information includes the address of an updated first virtual machine, the first update information specifies another virtual machine among the plurality of virtual machines, other than the failed first virtual machine, as the updated first virtual machine, and the updated first virtual machine has permission to write data to the storage region; and/or
when the second virtual machine fails, send second update information to the client, where the second update information includes the address of an updated second virtual machine, the second update information specifies another virtual machine among the plurality of virtual machines as the updated second virtual machine, and the updated second virtual machine has permission to read the data to be written from the storage region.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610846967.0A CN106446159B (en) | 2016-09-23 | 2016-09-23 | Method for storing files, first virtual machine and name node |
PCT/CN2017/085351 WO2018054079A1 (en) | 2016-09-23 | 2017-05-22 | Method for storing file, first virtual machine and namenode |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610846967.0A CN106446159B (en) | 2016-09-23 | 2016-09-23 | Method for storing files, first virtual machine and name node |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106446159A true CN106446159A (en) | 2017-02-22 |
CN106446159B CN106446159B (en) | 2019-11-12 |
Family
ID=58167356
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610846967.0A Active CN106446159B (en) | Method for storing files, first virtual machine and name node | 2016-09-23 | 2016-09-23 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN106446159B (en) |
WO (1) | WO2018054079A1 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107704596A (en) * | 2017-10-13 | 2018-02-16 | 郑州云海信息技术有限公司 | A kind of method, apparatus and equipment for reading file |
WO2018054079A1 (en) * | 2016-09-23 | 2018-03-29 | 华为技术有限公司 | Method for storing file, first virtual machine and namenode |
CN109753226A (en) * | 2017-11-07 | 2019-05-14 | 阿里巴巴集团控股有限公司 | Data processing system, method and electronic equipment |
CN110110003A (en) * | 2018-01-26 | 2019-08-09 | 广州中国科学院计算机网络信息中心 | The data storage control method and device of M2M platform |
CN110688194A (en) * | 2018-07-06 | 2020-01-14 | 中兴通讯股份有限公司 | Disk management method based on cloud desktop, virtual machine and storage medium |
CN113037569A (en) * | 2021-04-19 | 2021-06-25 | 杭州和利时自动化有限公司 | Redundant service method, device, equipment and medium based on double servers |
CN114138737A (en) * | 2022-02-08 | 2022-03-04 | 亿次网联(杭州)科技有限公司 | File storage method, device, equipment and storage medium |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111443872A (en) * | 2020-03-26 | 2020-07-24 | 深信服科技股份有限公司 | Distributed storage system construction method, device, equipment and medium |
CN113641467B (en) * | 2021-10-19 | 2022-02-11 | 杭州优云科技有限公司 | Distributed block storage implementation method of virtual machine |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102521063A (en) * | 2011-11-30 | 2012-06-27 | 广东电子工业研究院有限公司 | Shared storage method suitable for migration and fault tolerance of virtual machine |
US20130325812A1 (en) * | 2012-05-30 | 2013-12-05 | Spectra Logic Corporation | System and method for archive in a distributed file system |
CN103729250A (en) * | 2012-10-11 | 2014-04-16 | 国际商业机器公司 | Method and system to select data nodes configured to satisfy a set of requirements |
CN103797770A (en) * | 2012-12-31 | 2014-05-14 | 华为技术有限公司 | Method and system for sharing storage resources |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104838374A (en) * | 2012-12-06 | 2015-08-12 | 英派尔科技开发有限公司 | Decentralizing a HADOOP cluster |
US9348707B2 (en) * | 2013-12-18 | 2016-05-24 | International Business Machines Corporation | Dynamically adjusting the number of replicas of a file according to the probability that the file will be accessed within a distributed file system |
CN106446159B (en) * | 2016-09-23 | 2019-11-12 | 华为技术有限公司 | Method for storing files, first virtual machine and name node |
Priority and filing events
- 2016-09-23: CN application CN201610846967.0A filed (patent CN106446159B, status: Active)
- 2017-05-22: WO application PCT/CN2017/085351 filed (WO2018054079A1, status: Application Filing)
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018054079A1 (en) * | 2016-09-23 | 2018-03-29 | 华为技术有限公司 | Method for storing file, first virtual machine and namenode |
CN107704596A (en) * | 2017-10-13 | 2018-02-16 | 郑州云海信息技术有限公司 | A kind of method, apparatus and equipment for reading file |
CN107704596B (en) * | 2017-10-13 | 2021-06-29 | 郑州云海信息技术有限公司 | Method, device and equipment for reading file |
CN109753226A (en) * | 2017-11-07 | 2019-05-14 | 阿里巴巴集团控股有限公司 | Data processing system, method and electronic equipment |
CN110110003A (en) * | 2018-01-26 | 2019-08-09 | 广州中国科学院计算机网络信息中心 | The data storage control method and device of M2M platform |
CN110688194A (en) * | 2018-07-06 | 2020-01-14 | 中兴通讯股份有限公司 | Disk management method based on cloud desktop, virtual machine and storage medium |
CN110688194B (en) * | 2018-07-06 | 2023-03-17 | 中兴通讯股份有限公司 | Disk management method based on cloud desktop, virtual machine and storage medium |
CN113037569A (en) * | 2021-04-19 | 2021-06-25 | 杭州和利时自动化有限公司 | Redundant service method, device, equipment and medium based on double servers |
CN114138737A (en) * | 2022-02-08 | 2022-03-04 | 亿次网联(杭州)科技有限公司 | File storage method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN106446159B (en) | 2019-11-12 |
WO2018054079A1 (en) | 2018-03-29 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN106446159B (en) | Method for storing files, first virtual machine and name node | |
CN111183420B (en) | Log structured storage system | |
AU2017201918B2 (en) | Prioritizing data reconstruction in distributed storage systems | |
CN106687911B (en) | Online data movement without compromising data integrity | |
CN111566611B (en) | Log structured storage system | |
CN103929500A (en) | Method for data fragmentation of distributed storage system | |
CN108351806A (en) | Database trigger of the distribution based on stream | |
US10552089B2 (en) | Data processing for managing local and distributed storage systems by scheduling information corresponding to data write requests | |
CN107402722B (en) | Data migration method and storage device | |
CN102282544A (en) | Storage system | |
CN105630418A (en) | Data storage method and device | |
CN109582213B (en) | Data reconstruction method and device and data storage system | |
CN108319618B (en) | Data distribution control method, system and device of distributed storage system | |
CN106933747A (en) | Data-storage system and data storage method based on multithread | |
CN110147203A (en) | A kind of file management method, device, electronic equipment and storage medium | |
CN115617264A (en) | Distributed storage method and device | |
CN105760391A (en) | Data dynamic redistribution method and system, data node and name node | |
CN108536822A (en) | Data migration method, device, system and storage medium | |
CN107463638A (en) | File sharing method and equipment between offline virtual machine | |
CN115756955A (en) | Data backup and data recovery method and device and computer equipment | |
CN114785662A (en) | Storage management method, device, equipment and machine readable storage medium | |
US11531642B2 (en) | Synchronous object placement for information lifecycle management | |
US11163642B2 (en) | Methods, devices and computer readable medium for managing a redundant array of independent disks | |
CN113885798A (en) | Data operation method, device, equipment and medium | |
CN109151016B (en) | Flow forwarding method and device, service system, computing device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| TR01 | Transfer of patent right | Effective date of registration: 2022-02-15. Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province. Patentee after: Huawei Cloud Computing Technology Co.,Ltd. Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen. Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd. |