CN106446159A - Method for storing files, first virtual machine and name node - Google Patents



Publication number
CN106446159A
Authority
CN
China
Prior art keywords
virtual machine
data
written
memory area
virtual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610846967.0A
Other languages
Chinese (zh)
Other versions
CN106446159B (en)
Inventor
李亿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201610846967.0A (patent CN106446159B)
Publication of CN106446159A
Priority to PCT/CN2017/085351 (patent WO2018054079A1)
Application granted
Publication of CN106446159B
Current legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Abstract

A method for storing files, a first virtual machine, and a name node address the redundancy in the number of file copies that arises when a distributed file system stores files, and improve the availability of the distributed file system. In the method, a client sends the name node a request message asking to write data to the distributed file system. The name node returns a response message corresponding to the request message. The response message includes the address of a first virtual machine and the address of a second virtual machine, and indicates that the first virtual machine is the one virtual machine, among multiple virtual machines, that has permission to write data to a storage area, and that the second virtual machine is a virtual machine among them other than the first virtual machine. The client sends the data to be written and the address of the second virtual machine to the first virtual machine. The first virtual machine writes the data into the storage area shared by the multiple virtual machines, generates or updates the metadata of the data to be written, and sends the generated or updated metadata to the second virtual machine.

Description

Method for storing files, first virtual machine, and name node
Technical field
The present invention relates to the field of computer technology, and in particular to a method for storing files, a first virtual machine, and a name node.
Background
A distributed file system includes a client, data nodes (datanode), and a name node (namenode). The data nodes store files, and the name node manages the files stored on the data nodes. Through the name node, the client can query the files stored on each data node and obtain the address of each data node, and can then read files from, or write files to, the data nodes. A data node in a distributed file system may be a physical server or a virtual machine.
When a data node in a distributed file system is a virtual machine, the virtual hard disk of that virtual machine is provided by a distributed block storage system. Writing a file to the virtual machine actually means writing the file to the virtual machine's virtual hard disk, and writing to the virtual hard disk is in turn implemented as writing the file to physical hard disks managed by the distributed block storage system.
To ensure file reliability, a distributed file system that stores files on virtual hard disks may use a file replica mechanism, saving the same file on N virtual hard disks in the distributed file system (N is an integer greater than 1). The distributed block storage system, also to ensure file reliability, may likewise use a file replica mechanism, saving the files of one virtual hard disk on M physical hard disks (M is an integer greater than 1). Because both the distributed file system and the distributed block storage system replicate files, the number of copies of the same file actually saved on physical hard disks is N*M, which is redundant. This redundancy in the number of copies wastes storage space and degrades the processing performance of the system.
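The N*M arithmetic above can be illustrated with a short sketch. The function name and the sample values (N = 3, M = 3) are assumptions chosen for illustration; they do not come from the patent text.

```python
def physical_copies(n_fs_replicas: int, m_block_replicas: int) -> int:
    """Copies of one file on physical disks when both layers replicate."""
    if n_fs_replicas < 1 or m_block_replicas < 1:
        raise ValueError("replica counts must be positive")
    return n_fs_replicas * m_block_replicas


# Illustrative values only: N = 3 file-system replicas, M = 3 block replicas.
layered = physical_copies(3, 3)

# If the file system keeps a single copy (N = 1), only the block-storage
# layer multiplies the data.
single_fs_copy = physical_copies(1, 3)

print(layered, single_fs_copy)
```

Under these assumed values, layered replication stores nine physical copies, while keeping one file-system copy leaves only the three copies made by the block storage layer.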
The prior art generally uses one of the following two methods to address redundancy in the number of file copies in a distributed file system. In the first method, a file that needs to be stored is stored on only one virtual machine of the distributed file system. The file can then be accessed only through that virtual machine; if the virtual machine fails, file read and write service can be provided to the client only after the virtual machine has recovered, which reduces the availability of the distributed file system. The second method uses a virtual machine hot standby mechanism: a hot standby virtual machine is configured for a primary virtual machine, and files are written to both synchronously. When the primary virtual machine fails, the distributed file system switches to the hot standby virtual machine, which continues to provide file read and write service to the client. The switch takes a certain waiting time, during which the distributed file system cannot provide file read and write service to the client, which reduces its availability. Moreover, the hot standby virtual machine provides no external service before it takes over from the primary virtual machine, which wastes resources.
In summary, the existing methods for addressing redundancy in the number of file copies in a distributed file system reduce the availability of the distributed file system and therefore do not solve the problem well.
Summary of the invention
Embodiments of the present invention provide a method for storing files, a first virtual machine, and a name node, to solve the problem of redundancy in the number of file copies that arises when a distributed file system stores files, and to improve the availability of the system.
In a first aspect, an embodiment of the present invention provides a method for storing files in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. The method includes:
The first virtual machine receives data to be written and the address of a second virtual machine, both sent by the client. It then writes the received data to the storage area shared by the multiple virtual machines and generates or updates the metadata of the data to be written. Using the received address of the second virtual machine, the first virtual machine sends the generated or updated metadata to the second virtual machine.
The first virtual machine is the one virtual machine, among the multiple virtual machines, that the name node designates as having permission to write data to the storage area; the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine. The metadata of the data to be written includes, but is not limited to, the storage location, the file name, and the file directory of the data to be written.
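The write path of the first aspect can be sketched as follows. This is a minimal in-memory illustration, not the patented implementation: the class names, the dictionary standing in for the shared storage area, the `cluster` lookup that replaces network delivery, and the way the storage location is derived are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStorageArea:
    """Hypothetical stand-in for the storage area shared by all data-node VMs."""
    blocks: dict = field(default_factory=dict)


@dataclass
class Metadata:
    storage_location: str
    file_name: str
    file_directory: str


@dataclass
class VirtualMachine:
    address: str
    storage: SharedStorageArea
    known_metadata: dict = field(default_factory=dict)

    def receive_metadata(self, meta: Metadata) -> None:
        # Second-VM side: remember where the data lives in the shared area.
        self.known_metadata[(meta.file_directory, meta.file_name)] = meta

    def write(self, directory, name, data, second_vm_addresses, cluster):
        # First-VM side: one write to the shared area, so one stored copy.
        location = f"{directory}/{name}"  # assumed location scheme
        self.storage.blocks[location] = data
        meta = Metadata(location, name, directory)
        self.known_metadata[(directory, name)] = meta
        # Send the generated metadata to each second VM by address.
        for addr in second_vm_addresses:
            cluster[addr].receive_metadata(meta)
        return meta


storage = SharedStorageArea()
cluster = {a: VirtualMachine(a, storage) for a in ("vm1", "vm2", "vm3")}
meta = cluster["vm1"].write("/logs", "a.txt", b"hello", ["vm2", "vm3"], cluster)
print(len(storage.blocks), cluster["vm2"].known_metadata[("/logs", "a.txt")])
```

Only the metadata is fanned out to the other virtual machines; the data itself is written once.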
With the above method, because the multiple virtual machines in the distributed file system share the same storage area, the data that the first virtual machine writes to this storage area is saved there as a single copy. Multiple copies of the data exist only because of the file replica mechanism of the distributed block storage system; the redundancy in the number of saved copies that arises when both the distributed file system and the distributed block storage system replicate files does not occur.
In addition, with the above scheme, the first virtual machine among the multiple virtual machines has permission to write data to the storage area, and the second virtual machine, that is, a virtual machine other than the first virtual machine, has permission to read the data to be written from the storage area. The distributed file system therefore has multiple virtual machines that can provide the client with the service of reading and writing the data. When one virtual machine fails, the other virtual machines can provide that service, which improves the availability of the distributed file system and also avoids the resource waste of the prior-art virtual machine hot standby mechanism.
In a possible implementation, before the first virtual machine writes the data to the storage area, the method further includes: the first virtual machine receives, from the client, a write permission identifier of the first virtual machine. The name node sends the write permission identifier to the client when the client requests the name node to write data to the distributed file system; the identifier specifies that the first virtual machine has permission to write the data to be written to the storage area.
This scheme provides one way for the client to indicate the first virtual machine's permission to the first virtual machine.
In a specific implementation, the multiple virtual machines may share one storage area as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and this virtual hard disk includes the storage area shared by the multiple virtual machines.
The metadata of the data to be written that the first virtual machine sends to the second virtual machine serves one of the following two purposes:
First purpose
If the second virtual machine reads the data to be written through its own operating system, the second virtual machine uses the metadata to generate or update the file information recorded in its operating system, and the operating system uses that file information to read the data from the storage area.
Second purpose
If the second virtual machine reads the data to be written itself rather than through its operating system, the second virtual machine uses the metadata to read the data from the storage area.
With this scheme, the second virtual machine can read the data to be written in the storage area shared by the multiple virtual machines according to the metadata sent by the first virtual machine.
In a possible implementation, the name node may designate the second virtual machine as having permission to read the data to be written from the storage area.
In a second aspect, an embodiment of the present invention provides a method for storing files in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. The method includes:
After receiving a request message in which the client requests to write data to the distributed file system, the name node sends the client a response message corresponding to the request message.
The response message that the name node sends to the client includes the address of the first virtual machine and the address of the second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine, among the multiple virtual machines, that has permission to write data to the storage area, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
With this scheme, the multiple virtual machines in the distributed file system share the same storage area, and the response message sent by the name node designates one of them, the first virtual machine, as having permission to write data to the shared storage area. Data written to the shared storage area is therefore saved there as a single copy. Multiple copies of that data exist only because of the file replica mechanism of the distributed block storage system; the redundancy in the number of saved copies that arises when both the distributed file system and the distributed block storage system replicate files does not occur.
In addition, the response message indicates that the first virtual machine among the multiple virtual machines has permission to write data to the storage area, and the second virtual machine, that is, a virtual machine other than the first virtual machine, has permission to read the data to be written from the storage area. The distributed file system therefore has multiple virtual machines that can provide the client with the service of reading and writing the data. When one virtual machine fails, the other virtual machines can provide that service, which improves the availability of the distributed file system and also avoids the resource waste of the prior-art virtual machine hot standby mechanism.
In a possible implementation, the response message further indicates that the second virtual machine has permission to read the data to be written from the storage area.
In a possible implementation, the name node may use the response message to indicate the permission of the first virtual machine and the permission of the second virtual machine to the client in either of the following two ways:
First way
The response message that the name node sends to the client further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage area, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage area.
Second way
In the response message that the name node sends to the client, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule, and the preset rule indicates the permissions of the two virtual machines: the first virtual machine has permission to write the data to be written to the storage area, and the second virtual machine has permission to read the data to be written from the storage area.
These two ways allow the name node to indicate the permission of the first virtual machine and the permission of the second virtual machine to the client through the response message.
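The two ways of conveying permissions can be contrasted in a small sketch. The message layout, the field names, and the specific preset rule ("the first address in the list is the writer") are assumptions made for illustration; the text only says that some preset ordering rule exists.

```python
def build_response_with_flags(writer, readers):
    """First way: every address carries an explicit permission identifier."""
    nodes = [{"address": writer, "permission": "write"}]
    nodes += [{"address": r, "permission": "read"} for r in readers]
    return {"nodes": nodes}


def parse_flags(response):
    writer = next(n["address"] for n in response["nodes"]
                  if n["permission"] == "write")
    readers = [n["address"] for n in response["nodes"]
               if n["permission"] == "read"]
    return writer, readers


def build_response_ordered(writer, readers):
    """Second way: position encodes permission (assumed rule: writer first)."""
    return {"addresses": [writer, *readers]}


def parse_ordered(response):
    first, *rest = response["addresses"]
    return first, rest


flagged = build_response_with_flags("10.0.0.1", ["10.0.0.2", "10.0.0.3"])
ordered = build_response_ordered("10.0.0.1", ["10.0.0.2", "10.0.0.3"])
print(parse_flags(flagged))
print(parse_ordered(ordered))
```

The flagged form is self-describing, while the ordered form is more compact but requires the name node and the client to agree on the rule in advance.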
In a possible implementation, the multiple virtual machines may share one storage area as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage area.
Because the distributed file system has multiple virtual machines that can provide the client with the service of reading and writing the data to be written, when one virtual machine fails the other virtual machines can provide that service. In a specific implementation, the name node's handling of a virtual machine failure covers the following two cases:
First case
When the first virtual machine fails, the name node sends first update information to the client. The first update information includes the address of an updated first virtual machine and specifies that another virtual machine among the multiple virtual machines, other than the failed first virtual machine, serves as the updated first virtual machine; that is, one of the second virtual machines is designated as the updated first virtual machine. The updated first virtual machine has permission to write data to the storage area.
Second case
When the second virtual machine fails, the name node sends second update information to the client. The second update information includes the address of an updated second virtual machine and specifies that another virtual machine outside the multiple virtual machines serves as the updated second virtual machine. The updated second virtual machine has permission to read the data to be written from the storage area.
With this scheme, whether the first virtual machine or the second virtual machine fails, the name node designates another virtual machine to replace the failed one. Even when the first virtual machine and/or the second virtual machine fails, the distributed file system can still provide the client with the service of reading and writing data, which further improves the availability of the distributed file system.
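The two failure cases can be sketched as a single role-reassignment function. The function name, the choice to promote the first remaining second virtual machine, and the `standby` list that models virtual machines outside the current set are assumptions for illustration only.

```python
def handle_failure(writer, readers, standby, failed):
    """Return the updated (writer, readers, standby) after `failed` goes down."""
    readers = list(readers)
    standby = list(standby)
    if failed == writer:
        # First case: designate one of the second VMs as the updated first VM.
        if not readers:
            raise RuntimeError("no second virtual machine available to promote")
        writer = readers.pop(0)
    elif failed in readers:
        # Second case: replace the failed second VM with a VM from outside
        # the current set, if one is available.
        readers.remove(failed)
        if standby:
            readers.append(standby.pop(0))
    return writer, readers, standby


after_writer_failure = handle_failure("vm1", ["vm2", "vm3"], ["vm4"], "vm1")
after_reader_failure = handle_failure("vm1", ["vm2", "vm3"], ["vm4"], "vm3")
print(after_writer_failure)
print(after_reader_failure)
```

In both cases the name node would then send the resulting addresses to the client as the first or second update information.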
In a third aspect, an embodiment of the present invention provides a method for storing files in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. The method includes:
The client sends the name node a request message requesting to write data to the distributed file system, and then receives the response message corresponding to the request message from the name node.
The response message includes the address of the first virtual machine and the address of the second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine, among the multiple virtual machines, that has permission to write data to the storage area, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
Using the address of the first virtual machine in the response message, the client sends the data to be written and the address of the second virtual machine to the first virtual machine, and instructs the first virtual machine to write the data, to generate or update the metadata of the data to be written, and to send that metadata to the second virtual machine using the address of the second virtual machine.
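The client-side sequence of the third aspect can be sketched with two stub objects. The stub classes, their method names, and the dictionary-shaped messages are hypothetical; they only make the order of the calls visible.

```python
class StubNameNode:
    """Hypothetical name node that always designates the same first VM."""

    def __init__(self, first_vm, second_vms):
        self.first_vm = first_vm
        self.second_vms = list(second_vms)

    def handle_write_request(self, request):
        # Response carries the writer's address and the readers' addresses.
        return {"first_vm": self.first_vm, "second_vms": list(self.second_vms)}


class StubFirstVM:
    """Records what the client sent; a real VM would write and sync metadata."""

    def __init__(self):
        self.received = []

    def handle_write(self, data, second_vm_addresses):
        self.received.append((data, tuple(second_vm_addresses)))


def client_write(name_node, vm_by_address, data):
    # Step 1: ask the name node where to write.
    response = name_node.handle_write_request({"op": "write"})
    # Step 2: send the data and the second VMs' addresses to the first VM.
    first_vm = vm_by_address[response["first_vm"]]
    first_vm.handle_write(data, response["second_vms"])
    return response["first_vm"]


name_node = StubNameNode("vm1", ["vm2", "vm3"])
vms = {"vm1": StubFirstVM()}
target = client_write(name_node, vms, b"hello")
print(target, vms["vm1"].received)
```

The client never contacts the second virtual machines directly in this sketch; it only forwards their addresses so that the first virtual machine can deliver the metadata.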
With this scheme, because the multiple virtual machines in the distributed file system share the same storage area, the data that the client instructs the first virtual machine to write to the shared storage area is stored there as a single copy. Multiple copies of the data exist only because of the file replica mechanism of the distributed block storage system; the redundancy in the number of saved copies that arises when both the distributed file system and the distributed block storage system replicate files does not occur.
In addition, the first virtual machine among the multiple virtual machines has permission to write data to the storage area, and the second virtual machine, that is, a virtual machine other than the first virtual machine, has permission to read the data to be written from the storage area. The distributed file system therefore has multiple virtual machines that can provide the client with the service of reading and writing the data. When one virtual machine fails, the other virtual machines can provide that service, which improves the availability of the distributed file system and also avoids the resource waste of the prior-art virtual machine hot standby mechanism.
In a possible implementation, the client may learn the permission of the first virtual machine and the permission of the second virtual machine from the response message sent by the name node in either of the following two ways:
First way
The response message that the client receives further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage area, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage area.
Second way
In the response message that the client receives, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule, and the preset rule indicates the permissions of the two virtual machines: the first virtual machine has permission to write the data to be written to the storage area, and the second virtual machine has permission to read the data to be written from the storage area.
These two ways allow the client to learn the permission of the first virtual machine and the permission of the second virtual machine from the received response message.
In a possible implementation, the multiple virtual machines may share one storage area as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage area.
In a possible implementation, the response message further indicates that the second virtual machine has permission to read the data to be written from the storage area.
In a fourth aspect, an embodiment of the present invention provides a method for storing files in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. The method includes:
The second virtual machine receives the metadata sent by the first virtual machine. The first virtual machine is the one virtual machine, among the multiple virtual machines, that the name node designates as having permission to write data to the storage area; the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine. The metadata is the metadata of the data to be written, which the first virtual machine generates or updates after writing the data to the storage area.
With this scheme, the multiple virtual machines in the distributed file system share the same storage area, and one of them, the first virtual machine, has permission to write data to the shared storage area. Data written to the shared storage area is therefore saved there as a single copy. Multiple copies of that data exist only because of the file replica mechanism of the distributed block storage system; the redundancy in the number of saved copies that arises when both the distributed file system and the distributed block storage system replicate files does not occur.
In addition, the first virtual machine among the multiple virtual machines has permission to write data to the storage area, and the second virtual machine, that is, a virtual machine other than the first virtual machine, has permission to read the data to be written from the storage area. The distributed file system therefore has multiple virtual machines that can provide the client with the service of reading and writing the data. When one virtual machine fails, the other virtual machines can provide that service, which improves the availability of the distributed file system and also avoids the resource waste of the prior-art virtual machine hot standby mechanism.
In a possible implementation, the second virtual machine may learn its own permission as follows: before receiving the metadata sent by the first virtual machine, the second virtual machine receives, from the client, a read permission identifier of the second virtual machine. The name node sends the read permission identifier to the client when the client requests the name node to write data to the distributed file system; the identifier specifies that the second virtual machine has permission to read the data to be written from the storage area.
In a possible implementation, the multiple virtual machines may share one storage area as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage area.
After receiving the metadata sent by the first virtual machine, the second virtual machine can read the data to be written in the storage area shared by the multiple virtual machines according to the received metadata, specifically in one of the following ways:
First way
If the second virtual machine reads the data to be written through its own operating system, the second virtual machine generates or updates, according to the metadata, the file information recorded in its operating system; the operating system can use this file information to read the data from the storage area.
Second way
If the second virtual machine reads the data to be written itself rather than through its operating system, the second virtual machine reads the data from the storage area according to the metadata.
With this scheme, the second virtual machine can read the data to be written in the storage area shared by the multiple virtual machines according to the metadata sent by the first virtual machine.
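The two read paths on the second virtual machine can be sketched as follows. The dictionary standing in for the shared storage area, the `blk-0001` location key, the metadata field names, and the `os_file_table` that models the file information kept by the operating system are all illustrative assumptions.

```python
# Hypothetical shared storage area: storage location -> stored bytes.
shared_storage = {"blk-0001": b"payload"}

# Metadata as it might arrive from the first VM (field names assumed).
meta = {
    "storage_location": "blk-0001",
    "file_name": "a.txt",
    "file_directory": "/logs",
}

# First way: update the file information the operating system keeps,
# then let the operating system resolve a path to a storage location.
os_file_table = {}


def apply_metadata_to_os(table, meta):
    path = f'{meta["file_directory"]}/{meta["file_name"]}'
    table[path] = meta["storage_location"]


def read_via_os(table, storage, path):
    return storage[table[path]]


# Second way: read straight from the storage area using the metadata.
def read_direct(storage, meta):
    return storage[meta["storage_location"]]


apply_metadata_to_os(os_file_table, meta)
print(read_via_os(os_file_table, shared_storage, "/logs/a.txt"))
print(read_direct(shared_storage, meta))
```

Both paths reach the same single stored copy; they differ only in whether the operating system's file records are updated first.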
In a possible implementation, the name node designates the second virtual machine as having permission to read the data to be written from the storage area.
In a fifth aspect, an embodiment of the present invention provides a first virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. The first virtual machine is the one virtual machine, among the multiple virtual machines, that the name node designates as having permission to write data to the storage area. The first virtual machine includes:
A receiving module, configured to receive data to be written and the address of a second virtual machine, both sent by the client, where the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
A processing module, configured to write the data received by the receiving module to the storage area, and to generate or update the metadata of the data to be written;
A sending module, configured to send the metadata generated or updated by the processing module to the second virtual machine, using the address of the second virtual machine received by the receiving module.
The metadata of the data to be written includes, but is not limited to, the storage location, the file name, and the file directory of the data to be written.
With this scheme, because the multiple virtual machines in the distributed file system share the same storage area, the data that the processing module writes to this storage area is saved there as a single copy. Multiple copies of the data exist only because of the file replica mechanism of the distributed block storage system; the redundancy in the number of saved copies that arises when both the distributed file system and the distributed block storage system replicate files does not occur.
In addition, the first virtual machine among the multiple virtual machines has permission to write data to the storage area, and the second virtual machine, that is, a virtual machine other than the first virtual machine, has permission to read the data to be written from the storage area. The distributed file system therefore has multiple virtual machines that can provide the client with the service of reading and writing the data. When one virtual machine fails, the other virtual machines can provide that service, which improves the availability of the distributed file system and also avoids the resource waste of the prior-art virtual machine hot standby mechanism.
In a possible implementation, the receiving module is further configured to receive, before the processing module writes the data to the storage area, a write permission identifier of the first virtual machine sent by the client. The name node sends the write permission identifier to the client when the client requests the name node to write data to the distributed file system; the identifier specifies that the first virtual machine has permission to write the data to be written to the storage area.
This scheme provides one way for the first virtual machine to learn its own permission from the client.
In a specific implementation, the multiple virtual machines may share one storage area as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage area.
The metadata of the data to be written that the sending module sends to the second virtual machine serves one of the following two purposes:
First purpose
If the second virtual machine reads the data to be written through its own operating system, the metadata is used by the second virtual machine to generate or update the file information recorded in its operating system, and the operating system uses this file information to read the data to be written from the memory area.
Second purpose
If the second virtual machine reads the data to be written itself, rather than through its operating system, the metadata is used by the second virtual machine to read the data to be written from the memory area.
With this scheme, the second virtual machine can read the data to be written from the memory area shared by the multiple virtual machines according to the metadata sent by the sending module.
In a possible implementation, the second virtual machine is specified by the name node as having the permission to read the data to be written from the memory area.
In a sixth aspect, an embodiment of the present invention provides a name node in a distributed file system. The distributed file system includes the name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same memory area. The name node includes:
a receiver module, configured to receive a request message in which a client requests to write data to be written to the distributed file system; and
a sending module, configured to send to the client a response message corresponding to the request message received by the receiver module. The response message includes the address of a first virtual machine and the address of a second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has the permission to write data to the memory area, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
With this scheme, the multiple virtual machines included in the distributed file system share the same memory area, and the response message sent by the name node specifies that one first virtual machine among them has the permission to write data to the shared memory area. The data written to the shared memory area is therefore stored only once in that memory area. Such data is replicated only by the file replica mechanism of the distributed block storage system, which avoids the redundant copies that arise when both the distributed file system and the distributed block storage system apply a file replica mechanism.
In addition, the response message indicates that the first virtual machine among the multiple virtual machines included in the distributed file system has the permission to write data to the memory area, and the second virtual machine, that is, a virtual machine other than the first virtual machine, has the permission to read the data to be written from the memory area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When one virtual machine fails, the other virtual machines can continue to provide this service, which improves the availability of the distributed file system and also avoids the waste of resources incurred by the virtual machine hot-standby mechanism of the prior art.
In a possible implementation, the response message also indicates that the second virtual machine has the permission to read the data to be written from the memory area.
In a possible implementation, the response message sent by the sending module may indicate the permission of the first virtual machine and the permission of the second virtual machine to the client in either of the following two ways:
First way
The response message that the sending module sends to the client also includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The two identifiers indicate the permission of the first virtual machine and the permission of the second virtual machine respectively: the write permission identifier specifies that the first virtual machine has the permission to write the data to be written to the memory area, and the read permission identifier specifies that the second virtual machine has the permission to read the data to be written from the memory area.
Second way
In the response message that the sending module sends to the client, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule. The preset rule indicates the permission of the first virtual machine and the permission of the second virtual machine: the first virtual machine has the permission to write the data to be written to the memory area, and the second virtual machine has the permission to read the data to be written from the memory area.
These two ways allow the response message sent by the sending module to indicate the permission of the first virtual machine and the permission of the second virtual machine to the client.
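The two ways of indicating permissions can be illustrated with a minimal sketch. The `ResponseMessage` type, the `"write"` and `"read"` identifier values, and the preset rule "the first address belongs to the first virtual machine" are all assumptions made for illustration; the text above does not define a message layout or a particular ordering rule.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical permission identifiers; their encoding is not defined in the text.
WRITE = "write"
READ = "read"


@dataclass
class ResponseMessage:
    """Hypothetical response message sent by the name node to the client."""
    addresses: List[str]                          # addresses of the selected virtual machines
    permissions: Optional[Dict[str, str]] = None  # filled in only by the first way


def build_with_identifiers(first_vm: str, second_vms: List[str]) -> ResponseMessage:
    """First way: attach an explicit permission identifier to every address."""
    perms = {first_vm: WRITE}
    perms.update({vm: READ for vm in second_vms})
    return ResponseMessage(addresses=[first_vm, *second_vms], permissions=perms)


def build_with_ordering(first_vm: str, second_vms: List[str]) -> ResponseMessage:
    """Second way: rely on an assumed preset rule, 'the first address is the writer'."""
    return ResponseMessage(addresses=[first_vm, *second_vms])


def resolve(msg: ResponseMessage) -> Tuple[str, List[str]]:
    """Client side: recover (first VM, second VMs) from either encoding."""
    if msg.permissions is not None:
        writers = [a for a in msg.addresses if msg.permissions[a] == WRITE]
        if len(writers) != 1:
            raise ValueError("exactly one virtual machine may hold the write permission")
        readers = [a for a in msg.addresses if msg.permissions[a] == READ]
        return writers[0], readers
    # Ordering rule: position 0 is the first virtual machine, the rest are second VMs.
    return msg.addresses[0], msg.addresses[1:]
```

Under these assumptions both encodings resolve to the same pair, so a client could accept either form; the ordering variant is more compact, while the identifier variant does not depend on a rule agreed in advance.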
In a possible implementation, the multiple virtual machines may share one memory area as follows: the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the memory area.
Because multiple virtual machines in the distributed file system can serve the client's reads and writes of the data to be written, when one virtual machine fails, another virtual machine can provide the service. In a specific implementation, the sending module handles a virtual machine failure in one of the following two cases:
First case
When the first virtual machine fails, the sending module sends first update information to the client. The first update information includes the address of an updated first virtual machine and specifies that another virtual machine among the multiple virtual machines, other than the failed first virtual machine, serves as the updated first virtual machine. The updated first virtual machine has the permission to write data to the memory area.
Second case
When the second virtual machine fails, the sending module sends second update information to the client. The second update information includes the address of an updated second virtual machine and specifies that another virtual machine among the multiple virtual machines serves as the updated second virtual machine. The updated second virtual machine has the permission to read the data to be written from the memory area.
With this scheme, whether the first virtual machine or the second virtual machine fails, the sending module designates another virtual machine to replace the failed one. The distributed file system can therefore still serve the client's reads and writes when the first virtual machine and/or the second virtual machine fails, which further improves the availability of the distributed file system.
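The two failure cases can be sketched as follows. This is a simplified model, not the claimed implementation: the `NameNode` class, the `spares` pool of unassigned virtual machines, the dictionary layout of the update information, and the policy of preferring a spare over promoting a reader are all assumptions introduced for illustration.

```python
from typing import Dict, List


class NameNode:
    """Hypothetical name node that tracks one writer, some readers, and spare VMs."""

    def __init__(self, writer: str, readers: List[str], spares: List[str]) -> None:
        self.writer = writer          # address of the first virtual machine
        self.readers = list(readers)  # addresses of the second virtual machines
        self.spares = list(spares)    # other virtual machines sharing the memory area

    def on_failure(self, failed_vm: str) -> Dict[str, str]:
        """Return the update information to send to the client."""
        if failed_vm == self.writer:
            # First case: another virtual machine becomes the updated first VM.
            candidates = self.spares or self.readers
            if not candidates:
                raise RuntimeError("no virtual machine left to take over writing")
            self.writer = candidates.pop(0)
            return {"kind": "first_update_information",
                    "address": self.writer, "permission": "write"}
        if failed_vm in self.readers:
            # Second case: another virtual machine becomes the updated second VM.
            if not self.spares:
                raise RuntimeError("no spare virtual machine available")
            self.readers.remove(failed_vm)
            replacement = self.spares.pop(0)
            self.readers.append(replacement)
            return {"kind": "second_update_information",
                    "address": replacement, "permission": "read"}
        raise ValueError(f"{failed_vm} holds no permission for this data")
```

Because every virtual machine shares the same memory area, the replacement needs no data copy before it can serve requests; in this sketch only the permission assignment changes.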
In a seventh aspect, an embodiment of the present invention provides a client. The distributed file system in which the client is located includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same memory area. The client includes:
a sending module, configured to send to the name node a request message requesting to write data to be written to the distributed file system; and
a receiver module, configured to receive a response message that is sent by the name node and corresponds to the request message.
The response message includes the address of a first virtual machine and the address of a second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has the permission to write data to the memory area, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
The sending module is further configured to send the data to be written and the address of the second virtual machine to the first virtual machine, according to the address of the first virtual machine included in the response message received by the receiver module, and thereby to instruct the first virtual machine to: write the data to be written, generate or update the metadata of the data to be written, and send the metadata of the data to be written to the second virtual machine according to the address of the second virtual machine.
With this scheme, the multiple virtual machines included in the distributed file system share the same memory area, so the data to be written that the sending module instructs the first virtual machine to write to this shared memory area is stored only once within the distributed file system. The data to be written is therefore replicated only by the file replica mechanism of the distributed block storage system, which avoids the redundant copies that arise when both the distributed file system and the distributed block storage system apply a file replica mechanism.
In addition, the first virtual machine among the multiple virtual machines included in the distributed file system has the permission to write data to the memory area, and the second virtual machine, that is, a virtual machine other than the first virtual machine, has the permission to read the data to be written from the memory area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When one virtual machine fails, the other virtual machines can continue to provide this service, which improves the availability of the distributed file system and also avoids the waste of resources incurred by the virtual machine hot-standby mechanism of the prior art.
In a possible implementation, the receiver module may learn the permission of the first virtual machine and the permission of the second virtual machine from the response message sent by the name node in either of the following two ways:
First way
The response message received by the receiver module also includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The two identifiers indicate the permission of the first virtual machine and the permission of the second virtual machine respectively: the write permission identifier specifies that the first virtual machine has the permission to write the data to be written to the memory area, and the read permission identifier specifies that the second virtual machine has the permission to read the data to be written from the memory area.
Second way
In the response message received by the receiver module, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule. The preset rule indicates the permission of the first virtual machine and the permission of the second virtual machine: the first virtual machine has the permission to write the data to be written to the memory area, and the second virtual machine has the permission to read the data to be written from the memory area.
These two ways allow the receiver module to learn the permission of the first virtual machine and the permission of the second virtual machine from the received response message.
In a possible implementation, the multiple virtual machines may share one memory area as follows: the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the memory area.
In a possible implementation, the response message also indicates that the second virtual machine has the permission to read the data to be written from the memory area.
In an eighth aspect, an embodiment of the present invention provides a second virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same memory area. The second virtual machine includes:
a receiver module, configured to receive metadata sent by a first virtual machine. The first virtual machine is the one virtual machine among the multiple virtual machines that the name node specifies as having the permission to write data to the memory area, the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine, and the metadata is the metadata of the data to be written that the first virtual machine generates or updates after writing the data to be written to the memory area.
With this scheme, the multiple virtual machines included in the distributed file system share the same memory area, and one first virtual machine among them has the permission to write data to the shared memory area. The data written to the shared memory area is therefore stored only once in that memory area. Such data is replicated only by the file replica mechanism of the distributed block storage system, which avoids the redundant copies that arise when both the distributed file system and the distributed block storage system apply a file replica mechanism.
In addition, the first virtual machine among the multiple virtual machines included in the distributed file system has the permission to write data to the memory area, and the second virtual machine, that is, a virtual machine other than the first virtual machine, has the permission to read the data to be written from the memory area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When one virtual machine fails, the other virtual machines can continue to provide this service, which improves the availability of the distributed file system and also avoids the waste of resources incurred by the virtual machine hot-standby mechanism of the prior art.
In a possible implementation, the receiver module may learn the permission of the second virtual machine as follows: before receiving the metadata sent by the first virtual machine, the receiver module receives a read permission identifier of the second virtual machine sent by the client. The read permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and it specifies that the second virtual machine has the permission to read the data to be written from the memory area.
In a possible implementation, the multiple virtual machines may share one memory area as follows: the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the memory area.
In a possible implementation, the second virtual machine further includes a processing module. After the receiver module receives the metadata sent by the first virtual machine, the processing module may read the data to be written from the memory area shared by the multiple virtual machines according to the received metadata, in either of the following two ways:
First way
After the receiver module receives the metadata sent by the first virtual machine, if the second virtual machine reads the data to be written through its own operating system, the processing module generates or updates, according to the metadata, the file information recorded in its operating system, and the operating system uses this file information to read the data to be written from the memory area.
Second way
If the second virtual machine reads the data to be written itself, rather than through its operating system, the processing module reads the data to be written from the memory area according to the metadata.
With this scheme, the processing module can read the data to be written from the memory area shared by the multiple virtual machines according to the metadata of the data to be written sent by the first virtual machine.
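A minimal sketch of the two read paths follows. It models the shared memory area as a `bytearray` and assumes that the storage location carried in the metadata is an offset and a length; the `Metadata` fields, the `SecondVM` class, and the in-memory `file_table` standing in for the operating system's file information are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Metadata:
    """Hypothetical metadata of the data to be written."""
    name: str
    directory: str
    offset: int  # assumed encoding of the storage location
    length: int


class SecondVM:
    """Hypothetical second virtual machine with read permission only."""

    def __init__(self, shared_area: bytearray) -> None:
        self.shared_area = shared_area
        # Stand-in for the file information recorded by the operating system.
        self.file_table: Dict[str, Tuple[int, int]] = {}

    def update_file_info(self, meta: Metadata) -> str:
        """First way, step 1: generate or update the file information."""
        path = f"{meta.directory.rstrip('/')}/{meta.name}"
        self.file_table[path] = (meta.offset, meta.length)
        return path

    def read_via_os(self, path: str) -> bytes:
        """First way, step 2: read through the recorded file information."""
        offset, length = self.file_table[path]
        return bytes(self.shared_area[offset:offset + length])

    def read_direct(self, meta: Metadata) -> bytes:
        """Second way: read the shared memory area directly from the metadata."""
        return bytes(self.shared_area[meta.offset:meta.offset + meta.length])
```

In both paths the second virtual machine never receives the data itself from the first virtual machine; it only needs the metadata to locate bytes that are already present in the shared memory area.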
In a possible implementation, the second virtual machine is specified by the name node as having the permission to read the data to be written from the memory area.
In a ninth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores computer-executable instructions. When at least one processor of a computing node executes the computer-executable instructions, the computing node performs the method provided by the first aspect or any possible design of the first aspect, the method provided by the second aspect or any possible design of the second aspect, or the method provided by the third aspect or any possible design of the third aspect.
In a tenth aspect, a computer program product is provided. The computer program product includes computer-executable instructions stored in a computer-readable storage medium. At least one processor of a computing node can read the computer-executable instructions from the computer-readable storage medium, and execution of the instructions by the at least one processor causes the computing node to perform the method provided by the first aspect or any possible design of the first aspect, the method provided by the second aspect or any possible design of the second aspect, or the method provided by the third aspect or any possible design of the third aspect.
Brief description of the drawings
Fig. 1 is a schematic diagram of the connections among the name node, the client, and multiple data nodes in a distributed file system according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the connections between a distributed file system and a distributed block storage system according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a method for storing files in a distributed file system according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a distributed file system and a distributed block storage system that use the method for storing files shown in Fig. 3;
Fig. 5 is a schematic structural diagram of a first virtual machine according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of another first virtual machine according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a name node according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of another name node according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a client according to an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of another client according to an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of a second virtual machine according to an embodiment of the present invention;
Fig. 12 is a schematic structural diagram of another second virtual machine according to an embodiment of the present invention;
Fig. 13 is a schematic structural diagram of a distributed file system according to an embodiment of the present invention.
Detailed description of the embodiments
To make the above objectives, solutions, and advantages of the embodiments of the present invention easier to understand, a detailed description is provided below. The detailed description sets forth various embodiments of devices and/or methods by means of block diagrams, flowcharts, and/or examples. These block diagrams, flowcharts, and/or examples contain one or more functions and/or operations. Those skilled in the art will understand that each function and/or operation in these block diagrams, flowcharts, or examples can be implemented individually or jointly by various kinds of hardware, software, or firmware, or by any combination of hardware, software, and firmware.
The embodiments of the present invention relate to a distributed file system, which is described in detail below.
As shown in Fig. 1, a distributed file system may include a name node and multiple data nodes. Depending on the application scenario of the distributed file system, the name node may also be called a master server or have another name, and a data node may correspondingly be called a data server or have another name. It should be noted that Fig. 1 shows only a scenario in which the distributed file system includes one client; in practice, the distributed file system may include multiple clients.
The name node manages the multiple data nodes. It records information about the files stored in each data node (such as metadata files), the service state of each data node, and so on. The data nodes store files. When a client performs a file read or write operation, it first requests the index information of a data node from the name node and then accesses the corresponding data node according to the returned index information to read or write the file. Files may be synchronized between data nodes. For example, when a file needs to be written to two data nodes, it can first be written to one of them, and that data node then synchronizes the file to the other. In addition, the name node and the data nodes can also exchange information directly.
The name node, the data nodes, and the client may each be implemented by configuring the corresponding functions on any of the following devices with computing capability. Such a device may be a physical device or a virtual device. For example, the physical device may be a personal computer, a notebook computer, a mainframe, a networked computer, a handheld computer, a personal digital assistant, a workstation, or the like, and the virtual device may be a virtual machine or a container deployed on a physical device.
Referring to Fig. 2, when the data nodes are virtual machines, the virtual hard disk of each virtual machine is provided by a distributed block storage system. The distributed block storage system manages multiple physical hard disks, and writing a file to the virtual hard disk of a virtual machine actually writes the file to the physical hard disks managed by the distributed block storage system.
Referring to Fig. 2, to ensure its own reliability, the distributed file system generally applies a file replica mechanism when storing a file. For example, a file is stored on two data nodes, namely virtual machine 1 and virtual machine 2. To ensure its own reliability, the distributed block storage system likewise applies a file replica mechanism when storing the files of a virtual machine. For example, the file of virtual machine 1 is stored on physical hard disk 1 of physical server 1, physical hard disk 3 of physical server 2, and physical hard disk 5 of physical server 3, and the file of virtual machine 2 is stored on physical hard disk 2 of physical server 1, physical hard disk 4 of physical server 2, and physical hard disk 6 of physical server 3. Because both the distributed file system and the distributed block storage system apply a file replica mechanism, six copies of the file are stored on the physical hard disks managed by the distributed block storage system. Storing redundant copies of the same file wastes storage space and degrades the processing performance of the system.
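The copy count in the Fig. 2 example is the product of the two replication factors, which a short calculation makes explicit. The function name and parameters below are illustrative only, and the second figure simply applies the same arithmetic to a file that the file-system layer stores once.

```python
def physical_copies(fs_replicas: int, block_replicas: int) -> int:
    """Copies on physical disks when both layers replicate independently."""
    if fs_replicas < 1 or block_replicas < 1:
        raise ValueError("replication factors must be at least 1")
    return fs_replicas * block_replicas


# Fig. 2 example: 2 data nodes in the file system, 3 physical disks per virtual disk.
layered = physical_copies(fs_replicas=2, block_replicas=3)

# A file stored once at the file-system layer is replicated only by the block layer.
single = physical_copies(fs_replicas=1, block_replicas=3)
```

With the Fig. 2 numbers, `layered` evaluates to 6 and `single` to 3.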
It should be noted that, to explain how the distributed file system operates when storing a file, the distributed file system in Fig. 2 shows only two virtual machines, each with one virtual hard disk, and the distributed block storage system in Fig. 2 shows only three physical servers, each with two physical hard disks. In an actual implementation, the distributed file system may store multiple files, so neither the number of virtual machines in the distributed file system nor the number of virtual hard disks per virtual machine is limited. Likewise, neither the number of physical servers in the distributed block storage system nor the number of physical hard disks per physical server is limited.
To solve the problem of redundant file copies in a distributed file system, an embodiment of the present invention provides a method for storing files in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same memory area. As shown in Fig. 3, the method includes the following steps:
S301: The client sends to the name node a request message requesting to write data to be written to the distributed file system.
The data to be written may be video data, audio data, document data, or other binary data. Its granularity may be a file, a data block, or another granularity. There may be one or more pieces of data to be written: any one or more pieces of data that are written to the distributed file system by one execution of the method shown in Fig. 3 are regarded as the data to be written.
S302: The name node sends to the client a response message corresponding to the request message.
The response message includes the address of a first virtual machine and the address of a second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has the permission to write data to the memory area, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
There must be exactly one first virtual machine. There may be one or more second virtual machines, and the embodiments of the present invention do not limit their number.
In the embodiments of the present invention, only one of the multiple virtual machines has the permission to write data to the memory area, for the following reason. If multiple virtual machines were used to write the data to be written, then when the client wanted to write the data to be written to the distributed file system, multiple virtual machines would receive the instruction to write it. Because the multiple virtual machines share the same memory area, these instructions would, at the same moment, direct multiple virtual machines to write the data to be written to the same memory area. It would then be impossible to determine through which virtual machine the data should be written, and the write instruction could not be executed. In addition, because only one virtual machine writes the data to be written, only one copy is written during the write phase of the distributed file system, whereas the prior art writes multiple copies during that phase. Data redundancy is thereby reduced.
In the embodiments of the present invention, there may be multiple second virtual machines, for the following reason. When multiple clients need to read the data to be written, they can read it through multiple second virtual machines, which improves read efficiency. In addition, after a second virtual machine obtains the metadata of the data to be written sent by the first virtual machine, it can read the data to be written directly from the distributed block storage system, which avoids the prior-art situation in which the data cannot be read once the first virtual machine fails.
Given this limit on the number of first virtual machines, the response message that the name node sends to the client may indicate only that the first virtual machine has the permission to write data to the memory area, without indicating the permission of the second virtual machine. The reason is that only one of the multiple virtual machines has the permission to write data to the memory area, so once the response message indicates which virtual machine is the first virtual machine, every other virtual machine, that is, every second virtual machine, has by default the permission to read the data to be written from the memory area.
Optionally, the response message also indicates that the second virtual machine has the permission to read the data to be written from the memory area.
In S302, the permission of the first virtual machine and the permission of the second virtual machine indicated by the response message apply only to the data to be written. The first virtual machine and the second virtual machine share the memory area to which the data to be written is written. For example, when data 1 to be written is stored in the distributed file system, the response message indicates that virtual machine 1 is the first virtual machine and virtual machine 2 is the second virtual machine. Virtual machine 1 writes data 1 to be written to memory area 1 of virtual machine 1 and then sends the metadata of data 1 to be written to virtual machine 2, where virtual machine 2 and virtual machine 1 share memory area 1. When data 2 to be written is stored in the distributed file system, the response message indicates that virtual machine 1 is the first virtual machine and virtual machine 3 is the second virtual machine. Virtual machine 1 writes data 2 to be written to memory area 2 of virtual machine 1 and then sends the metadata of data 2 to be written to virtual machine 3, where virtual machine 3 and virtual machine 1 share memory area 2.
S303: The client sends the data to be written and the address of the second virtual machine to the first virtual machine according to the address of the first virtual machine.
By sending the data to be written and the address of the second virtual machine to the first virtual machine, the client instructs the first virtual machine to write the data to be written, to generate or update the metadata of the data to be written, and to send that metadata to the second virtual machine according to the address of the second virtual machine.
S304: The first virtual machine writes the data to be written into the memory area shared by the multiple virtual machines, and generates or updates the metadata of the data to be written.
The metadata of the data to be written allows the first virtual machine and the second virtual machine to read the data to be written from the memory area shared by the multiple virtual machines. The metadata includes but is not limited to: the storage location of the data to be written, the name of the data to be written, and the directory of the data to be written.
S305: The first virtual machine sends the generated or updated metadata to the second virtual machine according to the address of the second virtual machine.
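Steps S303 to S305 can be illustrated with a minimal sketch. It is not the implementation of the embodiment: the class and method names (`VirtualMachine`, `handle_write`, `receive_metadata`), the use of a `bytearray` as the shared memory area, the `length` field and the in-process `peers` dictionary standing in for the network are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    # Storage location, name and directory are the fields named in the text.
    name: str
    directory: str
    location: int  # offset inside the shared memory area (assumed representation)
    length: int    # assumption: kept so that a reader knows how many bytes to read


@dataclass
class VirtualMachine:
    address: str
    shared_area: bytearray                        # memory area shared by all virtual machines
    peers: dict = field(default_factory=dict)     # address -> VirtualMachine (stands in for the network)
    metadata: dict = field(default_factory=dict)  # (directory, name) -> Metadata

    def handle_write(self, name, directory, data, second_vm_address):
        """S304 and S305, performed by the first virtual machine."""
        location = len(self.shared_area)
        self.shared_area.extend(data)                          # S304: write to the shared area
        meta = Metadata(name, directory, location, len(data))
        self.metadata[(directory, name)] = meta                # S304: generate or update metadata
        self.peers[second_vm_address].receive_metadata(meta)   # S305: send metadata
        return meta

    def receive_metadata(self, meta):
        """Performed by the second virtual machine when metadata arrives."""
        self.metadata[(meta.directory, meta.name)] = meta

    def read(self, directory, name):
        """Read the data back from the shared area using the metadata."""
        meta = self.metadata[(directory, name)]
        return bytes(self.shared_area[meta.location:meta.location + meta.length])


shared = bytearray()
vm1 = VirtualMachine("10.0.0.1", shared)  # first virtual machine (write authority)
vm2 = VirtualMachine("10.0.0.2", shared)  # second virtual machine (read authority)
vm1.peers[vm2.address] = vm2

# S303: the client sends the data and the address of the second virtual machine.
vm1.handle_write("a.txt", "/docs", b"hello", vm2.address)
```

Because both objects reference the same `shared` buffer, the data exists once, and the second virtual machine can read it as soon as it holds the metadata.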
It should be noted that a distributed file system may or may not include a client. If it does, the number of clients includes but is not limited to one. In the embodiments of the present invention, the client is treated as part of the distributed file system in order to describe more clearly the interaction among the client, the name node, the first virtual machine and the second virtual machine. In an actual implementation the distributed file system may not include the client, in which case the embodiments of the present invention can be regarded as describing the interaction between the client and the distributed file system.
Optionally, the multiple virtual machines included in the distributed file system mount the same virtual hard disk provided by a distributed block storage system, and this virtual hard disk includes the memory area shared by the multiple virtual machines.
With the method for storing files in a distributed file system shown in Fig. 3, the multiple virtual machines included in the distributed file system share the same memory area, so within the distributed file system only one copy of the data to be written is stored in that memory area. Multiple copies of the data to be written are kept only because of the file replica mechanism adopted by the distributed block storage system. This avoids the redundancy in the number of stored copies that arises when both the distributed file system and the distributed block storage system apply a file replica mechanism.
Furthermore, in the method for storing files in a distributed file system shown in Fig. 3, the first virtual machine among the multiple virtual machines included in the distributed file system has the authority to write data to the memory area, and the second virtual machine, that is, a virtual machine other than the first virtual machine among the multiple virtual machines, has the authority to read the data to be written from the memory area. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When one virtual machine fails, another virtual machine can provide the read and write service for the client, which improves the availability of the distributed file system and also avoids the waste of resources incurred by the virtual machine hot-standby mechanism in the prior art.
To explain more concretely how the method shown in Fig. 3 solves the problem of redundant stored file copies while improving system availability, the method is now applied to a distributed file system and a distributed block storage system. A distributed file system and a distributed block storage system using the method shown in Fig. 3 may be as shown in Fig. 4. The distributed file system shown in Fig. 4 comprises a first virtual machine, a second virtual machine, a client and a name node. In an actual implementation, the number of second virtual machines and the number of clients are not limited. The distributed block storage system shown in Fig. 4 comprises three physical servers, each of which comprises two physical hard disks.
The first virtual machine has the authority to write data to the memory area, and the second virtual machine has the authority to read the data to be written from the memory area. Because the first virtual machine and the second virtual machine share the same memory area, they can be regarded as sharing the same virtual hard disk 1: the first virtual machine can write the data to be written to virtual hard disk 1, and the second virtual machine can read the data to be written from virtual hard disk 1. The data to be written is therefore stored only once in the distributed file system, namely in virtual hard disk 1, while the distributed block storage system may store three copies of it, for example on physical hard disk 1 of physical server 1, physical hard disk 3 of physical server 2 and physical hard disk 5 of physical server 3. The data to be written thus occupies only three copies on the physical hard disks. In the distributed file system and distributed block storage system shown in Fig. 2, the same file occupies six copies on the physical hard disks. By contrast, with the method shown in Fig. 3, the data to be written in the distributed file system and distributed block storage system shown in Fig. 4 occupies only three copies on the physical hard disks, which halves the number of stored copies and solves the problem of redundant file copies in the distributed file system.
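The copy counts above follow from multiplying the number of copies kept at the file system level by the number of replicas kept by the block storage system. The short calculation below is only a sketch: the function name is hypothetical, and the figure of two file-system-level copies for Fig. 2 is an assumption inferred from the six physical copies and the three block-level replicas stated in the text.

```python
def physical_copies(file_system_copies: int, block_replicas: int) -> int:
    """Copies on physical hard disks = file-system-level copies x block-level replicas."""
    return file_system_copies * block_replicas


BLOCK_REPLICAS = 3  # three copies in the distributed block storage system, as in the text

# Assumption: the system of Fig. 2 keeps two file-system-level copies of each file.
copies_fig2 = physical_copies(2, BLOCK_REPLICAS)

# Fig. 4: the virtual machines share one memory area, so there is one file-system-level copy.
copies_fig4 = physical_copies(1, BLOCK_REPLICAS)

print(copies_fig2, copies_fig4)
```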
In addition, in Fig. 4, the first virtual machine can be used to write and to read the data to be written, and the second virtual machine can be used to read the data to be written. When one of these virtual machines fails, the other virtual machine, which has not failed, can provide the service for the client, which improves the availability of the system.
Further, whether the second virtual machine, after receiving the metadata of the data to be written sent by the first virtual machine, needs to generate or update the file information recorded in its own operating system falls into the following two cases:
First case
If the first virtual machine writes the data to be written to the memory area shared by the multiple virtual machines through its own operating system, and the client's reads of the data to be written through the second virtual machine likewise go through the operating system of the second virtual machine, then the second virtual machine must generate or update the file information recorded in its own operating system according to the metadata of the data to be written. Only then can the operating system of the second virtual machine read the data to be written from the memory area. This file information is what the operating system of the second virtual machine uses to read the data to be written from the memory area. The file information in the operating system of the second virtual machine can be updated in two ways: first, if the operating system of the second virtual machine can detect changes to the data in the memory area, it can update the file information of the data to be written by itself; second, the operating system of the second virtual machine can update the file information of the data to be written according to the metadata sent by the first virtual machine.
Second case
If the first virtual machine writes the data to be written directly to the memory area shared by the multiple virtual machines rather than through its own operating system, the client can read the data to be written in the shared memory area directly through the second virtual machine, without going through the operating system of the second virtual machine. In this case the second virtual machine does not need to generate or update the file information recorded in its own operating system; it can read the data to be written from the memory area according to the metadata of the data to be written alone.
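The difference between the two cases can be sketched as follows. The class name, the `reads_through_os` flag and the dictionary-based metadata are hypothetical and serve only to show where the received metadata ends up in each case.

```python
class SecondVirtualMachine:
    def __init__(self, shared_area, reads_through_os):
        self.shared_area = shared_area            # memory area shared with the first virtual machine
        self.reads_through_os = reads_through_os  # True: first case, False: second case
        self.os_file_info = {}  # file information recorded in the operating system (first case)
        self.metadata = {}      # metadata used for direct reads (second case)

    def on_metadata(self, meta):
        """Handle metadata received from the first virtual machine."""
        path = meta["directory"] + "/" + meta["name"]
        if self.reads_through_os:
            # First case: generate or update the file information in the operating system.
            self.os_file_info[path] = (meta["location"], meta["length"])
        else:
            # Second case: keep the metadata itself; no operating system record is needed.
            self.metadata[path] = meta

    def read(self, path):
        if self.reads_through_os:
            location, length = self.os_file_info[path]
        else:
            location = self.metadata[path]["location"]
            length = self.metadata[path]["length"]
        return bytes(self.shared_area[location:location + length])


shared = bytearray(b"xxhello")  # "hello" was written at offset 2 by the first virtual machine
meta = {"name": "a.txt", "directory": "/docs", "location": 2, "length": 5}

via_os = SecondVirtualMachine(shared, reads_through_os=True)
via_os.on_metadata(meta)
direct = SecondVirtualMachine(shared, reads_through_os=False)
direct.on_metadata(meta)
```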
In S302, the name node needs to indicate through the response message that the first virtual machine is the virtual machine among the multiple virtual machines that has the authority to write data to the memory area, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine. That is, after S302 the client not only obtains the address of the first virtual machine and the address of the second virtual machine, but also learns that the first virtual machine has the authority to write data to the memory area and that the second virtual machine has the authority to read the data to be written from the memory area. The ways in which the name node notifies the client of the authorities of the first virtual machine and the second virtual machine include but are not limited to the following two:
First way
When S302 is executed, the response message sent by the name node to the client also includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The write permission identifier specifies that the first virtual machine has the authority to write the data to be written to the memory area, and the read permission identifier specifies that the second virtual machine has the authority to read the data to be written from the memory area. The client learns the authorities of the first virtual machine and the second virtual machine from the write permission identifier and the read permission identifier, sends the write permission identifier to the first virtual machine and the read permission identifier to the second virtual machine, and thereby delivers to each virtual machine its own authority.
The client may send the write permission identifier to the first virtual machine before S303, after S303, or at the same time as S303, in which case the write permission identifier, the data to be written and the address of the second virtual machine are sent to the first virtual machine together. The embodiments of the present invention do not limit the order of these two steps. Similarly, the client may send the read permission identifier to the second virtual machine.
Second way
The address of the first virtual machine and the address of the second virtual machine are arranged in the response message according to a preset rule. The preset rule specifies that the first virtual machine has the authority to write the data to be written to the memory area and that the second virtual machine has the authority to read the data to be written from the memory area.
The preset rule may be the order of the virtual machine addresses included in the response message. For example, the name node and the client may agree in advance that the first address the name node sends to the client is the address of the first virtual machine. After receiving the addresses of the multiple virtual machines included in the response message, the client can then determine that the first address is the address of the first virtual machine, which has the authority to write data to the memory area, and that the remaining addresses are the addresses of second virtual machines, which have the authority to read the data to be written from the memory area.
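The two ways of conveying the authorities can be compared in a small sketch of how a client might interpret the response message. The message layouts (a `permissions` mapping for the first way, an ordered `addresses` list for the second way) and the function names are assumptions; the text does not define a message format.

```python
def roles_from_identifiers(response):
    """First way: the response carries explicit write and read permission identifiers."""
    writers = [addr for addr, perm in response["permissions"].items() if perm == "write"]
    readers = [addr for addr, perm in response["permissions"].items() if perm == "read"]
    if len(writers) != 1:
        raise ValueError("exactly one virtual machine must hold the write permission")
    return writers[0], readers


def roles_from_order(response):
    """Second way: by the preset rule, the first address belongs to the first virtual machine."""
    addresses = response["addresses"]
    return addresses[0], addresses[1:]


# Hypothetical response messages for the same set of virtual machines.
response_with_identifiers = {
    "permissions": {"10.0.0.1": "write", "10.0.0.2": "read", "10.0.0.3": "read"}
}
response_with_order = {"addresses": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]}

first_a, seconds_a = roles_from_identifiers(response_with_identifiers)
first_b, seconds_b = roles_from_order(response_with_order)
```

The second way needs no extra fields in the message, but it relies on the name node and the client agreeing on the rule in advance.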
As mentioned above, multiple virtual machines in the distributed file system can serve the client's reads and writes of the data to be written. When one virtual machine fails, another virtual machine can provide the read and write service for the client, which improves the availability of the distributed file system. How the distributed file system operates after a virtual machine fails is explained in detail below.
In the distributed file system, the ways of detecting that a virtual machine has failed include but are not limited to the following three. First, the name node detects that a virtual machine has failed. Second, when the client reads or writes data through a virtual machine and the read or write cannot be completed, the client determines that this virtual machine has failed and reports the failure to the name node. Third, the virtual machines periodically perform self-checks; when a virtual machine finds that it has failed, it reports the failure to the name node directly or through the client. Therefore, when a virtual machine in the distributed file system fails, the name node can learn of the failure through any of these three ways and then take corresponding action, so that the distributed file system does not become unable to provide the read and write service for the client.
In the embodiments of the present invention, failures of virtual machines in the distributed file system fall into the following two cases:
First case: the first virtual machine fails
When the first virtual machine fails, the name node sends a first update message to the client. The first update message includes the address of the updated first virtual machine and designates another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine; that is, it indicates that the updated first virtual machine has the authority to write data to the memory area shared by the multiple virtual machines. When the client needs to write the data to be written, it can therefore write through the updated first virtual machine.
In this way, after the first virtual machine fails, the client can write data through the updated first virtual machine designated by the name node, and can read data through a second virtual machine or through the updated first virtual machine.
In summary, with this method a failure of the first virtual machine does not affect the client's writing or reading of data, which improves the availability of the system.
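The handling of a first virtual machine failure on the name node side can be sketched as follows. The `NameNode` class, the message fields and the choice of promoting the first surviving virtual machine in the list are assumptions for illustration; the text only requires that another of the multiple virtual machines be designated.

```python
class NameNode:
    def __init__(self, addresses):
        # By assumption, the first address is the first virtual machine (write authority)
        # and the remaining addresses are second virtual machines (read authority).
        self.first_vm = addresses[0]
        self.second_vms = list(addresses[1:])

    def on_first_vm_failure(self):
        """Designate an updated first virtual machine and build the first update message."""
        if not self.second_vms:
            raise RuntimeError("no surviving virtual machine can take over the write authority")
        failed = self.first_vm
        # Assumed selection policy: promote the first surviving virtual machine.
        self.first_vm = self.second_vms.pop(0)
        return {
            "type": "first_update",
            "failed_address": failed,
            "first_vm_address": self.first_vm,  # address of the updated first virtual machine
        }


name_node = NameNode(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
update = name_node.on_first_vm_failure()  # message the name node would send to the client
```

After receiving such a message, the client would direct its writes to the address in `first_vm_address`.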
As stated above, the number of second virtual machines may be one or more. When there is only one second virtual machine and the first virtual machine fails, the following method may additionally be executed after the method above: the name node designates one or more updated second virtual machines from outside the multiple virtual machines included in the distributed file system. An updated second virtual machine has the authority to read the data to be written from the memory area and shares the same memory area with the multiple virtual machines included in the distributed file system. The name node indicates to the client that the updated second virtual machine has the authority to read the data to be written from the memory area. After receiving this indication, the client notifies the updated first virtual machine to send the metadata of the data to be written to the updated second virtual machine. When the client then needs to read the data to be written, it can read not only through the second virtual machine but also through the updated second virtual machine.
Because the name node designates, from outside the multiple virtual machines included in the distributed file system, an updated second virtual machine that has the authority to read the data to be written from the memory area, multiple clients that need to read the data to be written can read not only through the second virtual machine but also through the updated second virtual machine, which improves the efficiency with which clients read the data to be written.
Second case: the second virtual machine fails
When the second virtual machine fails, the name node sends a second update message to the client. The second update message includes the address of the updated second virtual machine and designates another virtual machine, outside the multiple virtual machines, as the updated second virtual machine, which has the authority to read the data to be written from the memory area. The updated second virtual machine shares the above memory area with the multiple virtual machines included in the distributed file system. After the name node has indicated to the client that the updated second virtual machine has the authority to read the data to be written from the memory area, the client, following this indication, notifies the first virtual machine to send the metadata of the data to be written to the updated second virtual machine.
After receiving the notification message sent by the client, the first virtual machine sends the metadata of the data to be written to the updated second virtual machine as the notification message instructs. When the client then needs to read the data to be written, it can read not only through a second virtual machine that has not failed or through the first virtual machine, but also through the updated second virtual machine.
In this way, after the second virtual machine fails, the name node designates an updated second virtual machine that has the authority to read the data to be written from the memory area. The client can still write the data to be written through the first virtual machine, and can read the data to be written through the first virtual machine, through a second virtual machine that has not failed, or through the updated second virtual machine. In summary, with this method a failure of the second virtual machine does not affect the client's writing or reading of the data to be written, which improves the availability of the system.
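The replacement of a failed second virtual machine can be sketched in two parts: the name node's choice of an updated second virtual machine, and the first virtual machine's resending of metadata once the client has notified it. The function names, the `spares` list of virtual machines outside the current set and the `inboxes` dictionary standing in for the network are hypothetical.

```python
def handle_second_vm_failure(second_vms, spares, failed):
    """Name node side: replace the failed second virtual machine with one from outside the set."""
    remaining = [addr for addr in second_vms if addr != failed]
    replacement = spares.pop(0)  # assumed policy: take the first available spare
    remaining.append(replacement)
    update = {"type": "second_update", "second_vm_address": replacement}
    return remaining, update


def resend_metadata(first_vm_metadata, inboxes, update):
    """First virtual machine side: send all known metadata to the updated second virtual machine."""
    inbox = inboxes.setdefault(update["second_vm_address"], [])
    inbox.extend(first_vm_metadata)


# The second virtual machine 10.0.0.2 fails; 10.0.0.4 is available outside the current set.
second_vms, update = handle_second_vm_failure(
    ["10.0.0.2", "10.0.0.3"], ["10.0.0.4"], failed="10.0.0.2"
)

# The client forwards the update to the first virtual machine, which resends its metadata.
inboxes = {}
resend_metadata([{"name": "a.txt", "directory": "/docs"}], inboxes, update)
```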
The method for storing files in a distributed file system provided by the embodiments of the present invention solves the problem of redundant file copies stored in the distributed file system. In addition, when a virtual machine in the distributed file system fails, this method ensures that the client's reading or writing of files is not affected, which improves the availability of the system.
An embodiment of the present invention provides a first virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same memory area. The first virtual machine is the virtual machine among the multiple virtual machines that the name node has designated as having the authority to write data to the memory area. As shown in Fig. 5, the first virtual machine 500 includes:
Receiver module 501, configured to receive the data to be written and the address of the second virtual machine sent by the client, where the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
Processing module 502, configured to write the data to be written received by receiver module 501 to the memory area, and to generate or update the metadata of the data to be written;
Sending module 503, configured to send the metadata generated or updated by processing module 502 to the second virtual machine according to the address of the second virtual machine received by receiver module 501.
Optionally, receiver module 501 is further configured to receive the write permission identifier of the first virtual machine sent by the client before processing module 502 writes the data to be written to the memory area.
The write permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system; it specifies that the first virtual machine has the authority to write the data to be written to the memory area.
Optionally, the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the memory area.
Optionally, if the second virtual machine reads the data to be written through its own operating system, the metadata is used by the second virtual machine to generate or update the file information recorded in its own operating system, and the file information is used by the operating system to read the data to be written from the memory area; or, if the second virtual machine reads the data to be written directly, the metadata is used by the second virtual machine to read the data to be written from the memory area.
Optionally, the second virtual machine is designated by the name node as having the authority to read the data to be written from the memory area.
The first virtual machine 500 provided by the embodiment of the present invention solves the problem of redundant file copies stored in the distributed file system. In addition, when a virtual machine in the distributed file system fails, the operation of the first virtual machine 500 ensures that the failure does not affect the client's reading or writing of files, which improves the availability of the system.
It should be noted that the first virtual machine 500 provided by the embodiment of the present invention can be used to perform the operations of the first virtual machine in the method for storing files in a distributed file system shown in Fig. 3. For implementation details of the first virtual machine 500 that are not explained here, refer to the related description of the method shown in Fig. 3.
It should be noted that the division into modules in the embodiments of the present invention is schematic and is merely a division by logical function; other divisions are possible in an actual implementation. In addition, the functional modules in the embodiments of this application may be integrated into one processing module, may each exist physically on their own, or two or more of them may be integrated into one module. An integrated module may be implemented in the form of hardware or in the form of a software functional module.
Based on the above embodiments, an embodiment of the present invention further provides a first virtual machine that can perform the method provided by the embodiment corresponding to Fig. 3 and may be the same as the first virtual machine 500 shown in Fig. 5.
Referring to Fig. 6, the device on which the first virtual machine 600 resides includes at least one processor 601, a memory 602 and a communication interface 603; the at least one processor 601, the memory 602 and the communication interface 603 are all connected by a bus 604;
The memory 602 is configured to store computer-executable instructions;
The at least one processor 601 is configured to execute the computer-executable instructions stored in the memory 602, so that the first virtual machine 600 exchanges data with other devices in the distributed file system through the communication interface 603 to perform the method for storing files in a distributed file system provided by the above embodiments, or so that the first virtual machine 600 exchanges data with other devices in the distributed file system through the communication interface 603 to implement some or all of the functions of the distributed file system.
The at least one processor 601 may include processors 601 of different types or processors 601 of the same type. A processor 601 may be any one of the following: a central processing unit (Central Processing Unit, CPU), an ARM processor, a field programmable gate array (Field Programmable Gate Array, FPGA), a dedicated processor, or another device with computing and processing capability. In an optional implementation, the at least one processor 601 may also be integrated as a many-core processor.
The memory 602 may be any one or any combination of the following: random access memory (Random Access Memory, RAM), read-only memory (Read Only Memory, ROM), non-volatile memory (Non-Volatile Memory, NVM), solid state drive (Solid State Drive, SSD), mechanical hard disk, magnetic disk, disk array, or another storage medium.
The communication interface 603 is used by the first virtual machine 600 to exchange data with other devices (for example, other devices in the distributed file system). The communication interface 603 may be any one or any combination of the following: a network interface (for example, an Ethernet interface), a wireless network card, or another device with a network access function.
The bus 604 may include an address bus, a data bus, a control bus and so on; for ease of representation, Fig. 6 shows the bus as one thick line. The bus 604 may be any one or any combination of the following: an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or another device for wired data transmission.
An embodiment of the present invention provides a name node in a distributed file system. The distributed file system includes the name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same memory area. As shown in Fig. 7, the name node 700 includes:
Receiver module 701, configured to receive a request message in which the client requests to write the data to be written to the distributed file system;
Sending module 702, configured to send to the client the response message corresponding to the request message received by receiver module 701, where the response message includes the address of the first virtual machine and the address of the second virtual machine and indicates that the first virtual machine is the virtual machine among the multiple virtual machines that has the authority to write data to the memory area, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
Optionally, the response message also indicates that the second virtual machine has the authority to read the data to be written from the memory area.
Optionally, the response message also includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The write permission identifier specifies that the first virtual machine has the authority to write the data to be written to the memory area, and the read permission identifier specifies that the second virtual machine has the authority to read the data to be written from the memory area.
Optionally, the address of the first virtual machine and the address of the second virtual machine are arranged in the response message according to a preset rule, which specifies that the first virtual machine has the authority to write the data to be written to the memory area and that the second virtual machine has the authority to read the data to be written from the memory area.
Optionally, the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the memory area.
Optionally, sending module 702 is further configured to: when the first virtual machine fails, send a first update message to the client, where the first update message includes the address of the updated first virtual machine and designates another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine, which has the authority to write data to the memory area; and/or, when the second virtual machine fails, send a second update message to the client, where the second update message includes the address of the updated second virtual machine and designates another virtual machine, outside the multiple virtual machines, as the updated second virtual machine, which has the authority to read the data to be written from the memory area.
The name node 700 provided by the embodiment of the present invention solves the problem of redundant file copies stored in the distributed file system. In addition, when a virtual machine in the distributed file system fails, the operation of the name node 700 ensures that the failure does not affect the client's reading or writing of files, which improves the availability of the system.
It should be noted that the name node 700 provided by the embodiment of the present invention can be used to perform the operations of the name node in the method for storing files in a distributed file system shown in Fig. 3. For implementation details of the name node 700 that are not explained here, refer to the related description of the method shown in Fig. 3.
Based on the above embodiments, an embodiment of the present invention further provides a name node that can perform the method provided by the embodiment corresponding to Fig. 3 and may be the same as the name node 700 shown in Fig. 7.
Referring to Fig. 8, the name node 800 includes at least one processor 801, a memory 802 and a communication interface 803; the at least one processor 801, the memory 802 and the communication interface 803 are all connected by a bus 804;
The memory 802 is configured to store computer-executable instructions;
The at least one processor 801 is configured to execute the computer-executable instructions stored in the memory 802, so that the name node 800 exchanges data with other devices in the distributed file system through the communication interface 803 to perform the method for storing files in a distributed file system provided by the above embodiments, or so that the name node 800 exchanges data with other devices in the distributed file system through the communication interface 803 to implement some or all of the functions of the distributed file system.
The at least one processor 801 may include processors 801 of different types or processors 801 of the same type. A processor 801 may be any one of the following: a CPU, an ARM processor, an FPGA, a dedicated processor, or another device with computing and processing capability. In an optional implementation, the at least one processor 801 may also be integrated as a many-core processor.
The memory 802 may be any one or any combination of the following: RAM, ROM, NVM, SSD, mechanical hard disk, magnetic disk, disk array, or another storage medium.
The communication interface 803 is used by the name node 800 to exchange data with other devices (for example, other devices in the distributed file system). The communication interface 803 may be any one or any combination of the following: a network interface (for example, an Ethernet interface), a wireless network card, or another device with a network access function.
The bus 804 may include an address bus, a data bus, a control bus and so on; for ease of representation, Fig. 8 shows the bus as one thick line. The bus 804 may be any one or any combination of the following: an ISA bus, a PCI bus, an EISA bus, or another device for wired data transmission.
The embodiment of the present invention provide a kind of client, this client be located distributed file system include name node, Multiple virtual machines as back end, multiple virtual machines share same memory area;As shown in figure 9, client 900 includes:
A sending module 901, configured to send to the name node a request message requesting to write data to be written into the distributed file system;
A receiving module 902, configured to receive the response message that the name node sends in reply to the request message, where the response message includes the address of a first virtual machine and the address of a second virtual machine, and indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage area, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
The sending module 901 is further configured to send, according to the address of the first virtual machine included in the response message received by the receiving module 902, the data to be written and the address of the second virtual machine to the first virtual machine, and to instruct the first virtual machine to write the data to be written, to generate or update the metadata of the data to be written, and to send the metadata to the second virtual machine according to the address of the second virtual machine included in the response message.
Optionally, the response message further includes a write-permission identifier of the first virtual machine and a read-permission identifier of the second virtual machine. The write-permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage area, and the read-permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage area.
Optionally, the address of the first virtual machine and the address of the second virtual machine included in the response message are arranged according to a preset rule. The preset rule specifies that the first virtual machine has permission to write the data to be written to the storage area and that the second virtual machine has permission to read the data to be written from the storage area.
Optionally, the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage area.
Optionally, the response message further indicates that the second virtual machine has permission to read the data to be written from the storage area.
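The client-side steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `WriteResponse` layout, the rule that the first listed address belongs to the first virtual machine, and the names `split_roles`, `FakeFirstVM`, and `client_write` are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class WriteResponse:
    # Hypothetical response layout. Assumed preset rule: the first address
    # is the first VM (writer); the remaining addresses are second VMs (readers).
    addresses: list


def split_roles(response):
    """Apply the assumed ordering rule to a response message."""
    if not response.addresses:
        raise ValueError("response carries no virtual machine addresses")
    return response.addresses[0], response.addresses[1:]


class FakeFirstVM:
    """Stand-in for the first VM; it only records what the client sent."""

    def __init__(self):
        self.received = []

    def receive(self, data, reader_addresses):
        self.received.append((data, list(reader_addresses)))


def client_write(response, vms, data):
    """Send the data and the second-VM addresses to the first VM.

    `vms` maps an address to an object with a `receive` method and stands
    in for the network connection to that virtual machine.
    """
    writer_addr, reader_addrs = split_roles(response)
    vms[writer_addr].receive(data, reader_addrs)
    return writer_addr, reader_addrs
```

In this sketch the client never contacts the second virtual machines itself; it only forwards their addresses so that the first virtual machine can send them the metadata after writing.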
The client 900 provided in this embodiment of the present invention addresses the problem of redundant file copies stored in a distributed file system. In addition, when a virtual machine in the distributed file system fails, the operation of the client 900 ensures that the failure does not affect the client's reading or writing of files, which improves the availability of the system.
It should be noted that the client 900 provided in this embodiment of the present invention can be used to perform the operations performed by the client in the method for storing a file in a distributed file system shown in Fig. 3. For implementations of the client 900 that are not explained or described in detail, refer to the related description of the method shown in Fig. 3.
Based on the above embodiments, an embodiment of the present invention further provides a client. The client can perform the method provided by the embodiment corresponding to Fig. 3, and may be the same as the client 900 shown in Fig. 9.
Referring to Fig. 10, the device on which the client 1000 is located includes at least one processor 1001, a memory 1002, and a communication interface 1003; the at least one processor 1001, the memory 1002, and the communication interface 1003 are all connected through a bus 1004.
The memory 1002 is configured to store computer-executable instructions.
The at least one processor 1001 is configured to execute the computer-executable instructions stored in the memory 1002, so that the client 1000 exchanges data with devices in the distributed file system through the communication interface 1003 to perform the method for storing a file in a distributed file system provided in the foregoing embodiments, or so that the client 1000 exchanges data with devices in the distributed file system through the communication interface 1003 to implement some or all of the functions of the distributed file system.
The at least one processor 1001 may include processors of different types or processors of the same type. The processor 1001 may be any device with computing and processing capability, such as a CPU, an ARM processor, an FPGA, or a dedicated processor. In an optional implementation, the at least one processor 1001 may also be integrated into a many-core processor.
The memory 1002 may be any one or any combination of the following storage media: RAM, ROM, NVM, SSD, mechanical hard disk, magnetic disk, or disk array.
The communication interface 1003 is used by the client 1000 to exchange data with other devices (for example, other devices in the distributed file system). The communication interface 1003 may be any one or any combination of devices with a network access function, such as a network interface (for example, an Ethernet interface) or a wireless network card.
The bus 1004 may include an address bus, a data bus, a control bus, and so on; for ease of illustration, Fig. 10 represents the bus with a single thick line. The bus 1004 may be any one or any combination of wired data transmission devices such as an ISA bus, a PCI bus, or an EISA bus.
An embodiment of the present invention provides a second virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area. As shown in Fig. 11, the second virtual machine 1100 includes:
A receiving module 1101, configured to receive metadata sent by the first virtual machine, where the first virtual machine is the one virtual machine among the multiple virtual machines that the name node has designated as having permission to write data to the storage area, the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine, and the metadata is the metadata of the data to be written that the first virtual machine generates or updates after writing the data to be written to the storage area.
Optionally, the receiving module 1101 is further configured to: before receiving the metadata sent by the first virtual machine, receive a read-permission identifier of the second virtual machine sent by the client. The name node sends the read-permission identifier to the client when the client requests the name node to write the data to be written into the distributed file system, and the read-permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage area.
Optionally, the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage area.
Optionally, the second virtual machine further includes a processing module 1102, configured to do the following after the receiving module 1101 receives the metadata sent by the first virtual machine: if the second virtual machine reads the data to be written through its own operating system, generate or update, according to the metadata, the file information recorded in its own operating system, where the operating system uses the file information to read the data to be written from the storage area; or, if the second virtual machine reads the data to be written, read the data to be written from the storage area according to the metadata.
Optionally, the second virtual machine is designated by the name node as having permission to read the data to be written from the storage area.
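The way the second virtual machine uses the received metadata can be sketched as follows. This is an illustrative model only: the `Metadata` fields (`name`, `offset`, `length`), the `file_table` dictionary standing in for the file information kept by the guest operating system, and the `bytearray` standing in for the shared storage area are assumptions, since the text does not specify the metadata format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    # Hypothetical metadata fields: a file name plus the location of the
    # data inside the shared storage area.
    name: str
    offset: int
    length: int


class SecondVM:
    def __init__(self, shared_area: bytearray):
        # The same storage area that the first VM writes to.
        self.shared_area = shared_area
        # Stands in for the file information recorded by the guest OS.
        self.file_table = {}

    def on_metadata(self, meta: Metadata) -> None:
        """Generate or update the file record; no file data is copied."""
        self.file_table[meta.name] = meta

    def read(self, name: str) -> bytes:
        """Read the file from the shared area using the recorded metadata."""
        meta = self.file_table[name]
        return bytes(self.shared_area[meta.offset:meta.offset + meta.length])
```

The point of the sketch is that only metadata travels between virtual machines; the file content stays in one place and every reader resolves it through its own record.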
The second virtual machine 1100 provided in this embodiment of the present invention addresses the problem of redundant file copies stored in a distributed file system. In addition, when a virtual machine in the distributed file system fails, the operation of the second virtual machine 1100 ensures that the failure does not affect the client's reading or writing of files, which improves the availability of the system.
It should be noted that the second virtual machine 1100 provided in this embodiment of the present invention can be used to perform the operations performed by the second virtual machine in the method for storing a file in a distributed file system shown in Fig. 3. For implementations of the second virtual machine 1100 that are not explained or described in detail, refer to the related description of the method shown in Fig. 3.
Based on the above embodiments, an embodiment of the present invention further provides a second virtual machine. The second virtual machine can perform the method provided by the embodiment corresponding to Fig. 3, and may be the same as the second virtual machine 1100 shown in Fig. 11.
Referring to Fig. 12, the device on which the second virtual machine 1200 is located includes at least one processor 1201, a memory 1202, and a communication interface 1203; the at least one processor 1201, the memory 1202, and the communication interface 1203 are all connected through a bus 1204.
The memory 1202 is configured to store computer-executable instructions.
The at least one processor 1201 is configured to execute the computer-executable instructions stored in the memory 1202, so that the second virtual machine 1200 exchanges data with other devices in the distributed file system through the communication interface 1203 to perform the method for storing a file in a distributed file system provided in the foregoing embodiments, or so that the second virtual machine 1200 exchanges data with other devices in the distributed file system through the communication interface 1203 to implement some or all of the functions of the distributed file system.
The at least one processor 1201 may include processors of different types or processors of the same type. The processor 1201 may be any device with computing and processing capability, such as a CPU, an ARM processor, an FPGA, or a dedicated processor. In an optional implementation, the at least one processor 1201 may also be integrated into a many-core processor.
The memory 1202 may be any one or any combination of the following storage media: RAM, ROM, NVM, SSD, mechanical hard disk, magnetic disk, or disk array.
The communication interface 1203 is used by the second virtual machine 1200 to exchange data with other devices (for example, other devices in the distributed file system). The communication interface 1203 may be any one or any combination of devices with a network access function, such as a network interface (for example, an Ethernet interface) or a wireless network card.
The bus 1204 may include an address bus, a data bus, a control bus, and so on; for ease of illustration, Fig. 12 represents the bus with a single thick line. The bus 1204 may be any one or any combination of wired data transmission devices such as an ISA bus, a PCI bus, or an EISA bus.
An embodiment of the present invention provides a distributed file system. As shown in Fig. 13, the distributed file system 1300 includes a first virtual machine 1301, a name node 1302, a client 1303, and a second virtual machine 1304.
The first virtual machine 1301 in the distributed file system 1300 can be used to perform the operations performed by the first virtual machine in the method for storing a file in a distributed file system shown in Fig. 3; its specific form may be the first virtual machine 500 shown in Fig. 5 or the first virtual machine 600 shown in Fig. 6. The name node 1302 can be used to perform the operations performed by the name node in the method shown in Fig. 3; its specific form may be the name node 700 shown in Fig. 7 or the name node 800 shown in Fig. 8. The client 1303 can be used to perform the operations performed by the client in the method shown in Fig. 3; its specific form may be the client 900 shown in Fig. 9 or the client 1000 shown in Fig. 10. The second virtual machine 1304 can be used to perform the operations performed by the second virtual machine in the method shown in Fig. 3; its specific form may be the second virtual machine 1100 shown in Fig. 11 or the second virtual machine 1200 shown in Fig. 12.
In the distributed file system 1300, only one copy of the data to be written is stored, in the storage area shared by the multiple virtual machines, which addresses the problem of redundant file copies stored in a distributed file system. In addition, when a virtual machine in the distributed file system 1300 fails, the client can still write or read the data to be written through the virtual machines in the distributed file system 1300 that have not failed, which improves the availability of the distributed file system.
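The availability behavior described above can be sketched as a small model of role reassignment at the name node. It is a sketch under stated assumptions: the class and method names (`NameNode`, `roles`, `report_failure`) are hypothetical, and choosing the first remaining healthy virtual machine as the new writer is an arbitrary policy picked for the example; the text does not say how the replacement is selected.

```python
class NameNode:
    """Toy model: tracks which VM may write and which VMs may read."""

    def __init__(self, vm_addresses):
        self.healthy = list(vm_addresses)
        # Assumed initial policy: the first VM in the list is the writer.
        self.writer = self.healthy[0]

    def roles(self):
        """Return (writer address, reader addresses) for healthy VMs."""
        readers = [a for a in self.healthy if a != self.writer]
        return self.writer, readers

    def report_failure(self, address):
        """Drop a failed VM and return the updated roles for the client."""
        self.healthy.remove(address)
        if not self.healthy:
            raise RuntimeError("no healthy virtual machine left")
        if address == self.writer:
            # Assumed policy: promote the first remaining healthy VM.
            self.writer = self.healthy[0]
        return self.roles()
```

Because the data lives once in the shared storage area, promoting another virtual machine to writer requires no data copy in this model; only the role assignment sent to the client changes.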
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus, where the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the embodiments of the present invention without departing from the spirit and scope of the embodiments. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include them.

Claims (22)

1. A method for storing a file in a distributed file system, characterized in that the distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area; the method includes:
A first virtual machine receives data to be written and the address of a second virtual machine, both sent by a client, where the first virtual machine is the one virtual machine among the multiple virtual machines that the name node has designated as having permission to write data to the storage area, and the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
The first virtual machine writes the data to be written to the storage area, and generates or updates the metadata of the data to be written;
The first virtual machine sends the metadata to the second virtual machine according to the address of the second virtual machine.
2. The method of claim 1, characterized in that, before the first virtual machine writes the data to be written to the storage area, the method further includes:
The first virtual machine receives a write-permission identifier of the first virtual machine sent by the client, where the name node sends the write-permission identifier to the client when the client requests the name node to write the data to be written into the distributed file system, and the write-permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage area.
3. The method of claim 1 or 2, characterized in that the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage area.
4. The method of any one of claims 1 to 3, characterized in that:
If the second virtual machine reads the data to be written through its own operating system, the metadata is used by the second virtual machine to generate or update the file information recorded in its own operating system, and the operating system uses the file information to read the data to be written from the storage area; or
If the second virtual machine reads the data to be written, the metadata is used by the second virtual machine to read the data to be written from the storage area.
5. The method of any one of claims 1 to 4, characterized in that the second virtual machine is designated by the name node as having permission to read the data to be written from the storage area.
6. A method for storing a file in a distributed file system, characterized in that the distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area; the method includes:
The name node receives a request message in which a client requests to write data to be written into the distributed file system;
The name node sends to the client a response message corresponding to the request message, where the response message includes the address of a first virtual machine and the address of a second virtual machine, and indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage area, and the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
7. The method of claim 6, characterized in that the response message further indicates that the second virtual machine has permission to read the data to be written from the storage area.
8. The method of claim 6 or 7, characterized in that the response message further includes a write-permission identifier of the first virtual machine and a read-permission identifier of the second virtual machine, where the write-permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage area, and the read-permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage area.
9. The method of claim 6 or 7, characterized in that the address of the first virtual machine and the address of the second virtual machine in the response message are arranged according to a preset rule, where the preset rule specifies that the first virtual machine has permission to write the data to be written to the storage area and that the second virtual machine has permission to read the data to be written from the storage area.
10. The method of any one of claims 6 to 9, characterized in that the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage area.
11. The method of any one of claims 6 to 10, characterized in that the method further includes:
When the first virtual machine fails, the name node sends first update information to the client, where the first update information includes the address of an updated first virtual machine and designates another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine, and the updated first virtual machine has permission to write data to the storage area; and/or, when the second virtual machine fails, the name node sends second update information to the client, where the second update information includes the address of an updated second virtual machine and designates another virtual machine of the multiple virtual machines as the updated second virtual machine, and the updated second virtual machine has permission to read the data to be written from the storage area.
12. A first virtual machine in a distributed file system, characterized in that the distributed file system includes a name node and multiple virtual machines serving as data nodes, the multiple virtual machines share the same storage area, and the first virtual machine is the one virtual machine among the multiple virtual machines that the name node has designated as having permission to write data to the storage area; the first virtual machine includes:
A receiving module, configured to receive data to be written and the address of a second virtual machine, both sent by a client, where the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
A processing module, configured to write to the storage area the data to be written received by the receiving module, and to generate or update the metadata of the data to be written;
A sending module, configured to send the metadata generated or updated by the processing module to the second virtual machine according to the address of the second virtual machine received by the receiving module.
13. The first virtual machine of claim 12, characterized in that the receiving module is further configured to:
Before the processing module writes the data to be written to the storage area, receive a write-permission identifier of the first virtual machine sent by the client, where the name node sends the write-permission identifier to the client when the client requests the name node to write the data to be written into the distributed file system, and the write-permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage area.
14. The first virtual machine of claim 12 or 13, characterized in that the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage area.
15. The first virtual machine of any one of claims 12 to 14, characterized in that:
If the second virtual machine reads the data to be written through its own operating system, the metadata is used by the second virtual machine to generate or update the file information recorded in its own operating system, and the operating system uses the file information to read the data to be written from the storage area; or
If the second virtual machine reads the data to be written, the metadata is used by the second virtual machine to read the data to be written from the storage area.
16. The first virtual machine of any one of claims 12 to 15, characterized in that the second virtual machine is designated by the name node as having permission to read the data to be written from the storage area.
17. A name node in a distributed file system, characterized in that the distributed file system includes the name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage area; the name node includes:
A receiving module, configured to receive a request message in which a client requests to write data to be written into the distributed file system;
A sending module, configured to send to the client a response message corresponding to the request message received by the receiving module, where the response message includes the address of a first virtual machine and the address of a second virtual machine, and indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage area, and the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
18. The name node of claim 17, characterized in that the response message further indicates that the second virtual machine has permission to read the data to be written from the storage area.
19. The name node of claim 17 or 18, characterized in that the response message further includes a write-permission identifier of the first virtual machine and a read-permission identifier of the second virtual machine, where the write-permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage area, and the read-permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage area.
20. The name node of claim 17 or 18, characterized in that the address of the first virtual machine and the address of the second virtual machine in the response message are arranged according to a preset rule, where the preset rule specifies that the first virtual machine has permission to write the data to be written to the storage area and that the second virtual machine has permission to read the data to be written from the storage area.
21. The name node of any one of claims 17 to 20, characterized in that the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage area.
22. The name node of any one of claims 17 to 21, characterized in that the sending module is further configured to:
When the first virtual machine fails, send first update information to the client, where the first update information includes the address of an updated first virtual machine and designates another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine, and the updated first virtual machine has permission to write data to the storage area; and/or
When the second virtual machine fails, send second update information to the client, where the second update information includes the address of an updated second virtual machine and designates another virtual machine of the multiple virtual machines as the updated second virtual machine, and the updated second virtual machine has permission to read the data to be written from the storage area.
CN201610846967.0A 2016-09-23 2016-09-23 Method for storing files, first virtual machine and name node Active CN106446159B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610846967.0A CN106446159B (en) 2016-09-23 2016-09-23 Method for storing files, first virtual machine and name node
PCT/CN2017/085351 WO2018054079A1 (en) 2016-09-23 2017-05-22 Method for storing file, first virtual machine and namenode

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610846967.0A CN106446159B (en) 2016-09-23 2016-09-23 Method for storing files, first virtual machine and name node

Publications (2)

Publication Number Publication Date
CN106446159A true CN106446159A (en) 2017-02-22
CN106446159B CN106446159B (en) 2019-11-12

Family

ID=58167356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610846967.0A Active CN106446159B (en) 2016-09-23 2016-09-23 Method for storing files, first virtual machine and name node

Country Status (2)

Country Link
CN (1) CN106446159B (en)
WO (1) WO2018054079A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107704596A (en) * 2017-10-13 2018-02-16 郑州云海信息技术有限公司 Method, apparatus and equipment for reading a file
WO2018054079A1 (en) * 2016-09-23 2018-03-29 华为技术有限公司 Method for storing file, first virtual machine and namenode
CN109753226A (en) * 2017-11-07 2019-05-14 阿里巴巴集团控股有限公司 Data processing system, method and electronic equipment
CN110110003A (en) * 2018-01-26 2019-08-09 广州中国科学院计算机网络信息中心 Data storage control method and device for an M2M platform
CN110688194A (en) * 2018-07-06 2020-01-14 中兴通讯股份有限公司 Disk management method based on cloud desktop, virtual machine and storage medium
CN113037569A (en) * 2021-04-19 2021-06-25 杭州和利时自动化有限公司 Redundant service method, device, equipment and medium based on double servers
CN114138737A (en) * 2022-02-08 2022-03-04 亿次网联(杭州)科技有限公司 File storage method, device, equipment and storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111443872A (en) * 2020-03-26 2020-07-24 深信服科技股份有限公司 Distributed storage system construction method, device, equipment and medium
CN113641467B (en) * 2021-10-19 2022-02-11 杭州优云科技有限公司 Distributed block storage implementation method of virtual machine

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521063A (en) * 2011-11-30 2012-06-27 广东电子工业研究院有限公司 Shared storage method suitable for migration and fault tolerance of virtual machine
US20130325812A1 (en) * 2012-05-30 2013-12-05 Spectra Logic Corporation System and method for archive in a distributed file system
CN103729250A (en) * 2012-10-11 2014-04-16 国际商业机器公司 Method and system to select data nodes configured to satisfy a set of requirements
CN103797770A (en) * 2012-12-31 2014-05-14 华为技术有限公司 Method and system for sharing storage resources

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104838374A (en) * 2012-12-06 2015-08-12 英派尔科技开发有限公司 Decentralizing a HADOOP cluster
US9348707B2 (en) * 2013-12-18 2016-05-24 International Business Machines Corporation Dynamically adjusting the number of replicas of a file according to the probability that the file will be accessed within a distributed file system
CN106446159B (en) * 2016-09-23 2019-11-12 华为技术有限公司 Method for storing files, first virtual machine and name node

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018054079A1 (en) * 2016-09-23 2018-03-29 华为技术有限公司 Method for storing file, first virtual machine and namenode
CN107704596A (en) * 2017-10-13 2018-02-16 郑州云海信息技术有限公司 Method, apparatus and equipment for reading a file
CN107704596B (en) * 2017-10-13 2021-06-29 郑州云海信息技术有限公司 Method, device and equipment for reading file
CN109753226A (en) * 2017-11-07 2019-05-14 阿里巴巴集团控股有限公司 Data processing system, method and electronic equipment
CN110110003A (en) * 2018-01-26 2019-08-09 广州中国科学院计算机网络信息中心 Data storage control method and device for an M2M platform
CN110688194A (en) * 2018-07-06 2020-01-14 中兴通讯股份有限公司 Disk management method based on cloud desktop, virtual machine and storage medium
CN110688194B (en) * 2018-07-06 2023-03-17 中兴通讯股份有限公司 Disk management method based on cloud desktop, virtual machine and storage medium
CN113037569A (en) * 2021-04-19 2021-06-25 杭州和利时自动化有限公司 Redundant service method, device, equipment and medium based on double servers
CN114138737A (en) * 2022-02-08 2022-03-04 亿次网联(杭州)科技有限公司 File storage method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN106446159B (en) 2019-11-12
WO2018054079A1 (en) 2018-03-29

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 2022-02-15

Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province

Patentee after: Huawei Cloud Computing Technology Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Shenzhen, Guangdong

Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.