CN106446159B - Method for storing a file, first virtual machine, and name node - Google Patents

Publication number
CN106446159B
Authority
CN
China
Prior art keywords
virtual machine
written
data
storage region
permission
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610846967.0A
Other languages
Chinese (zh)
Other versions
CN106446159A (en)
Inventor
李亿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201610846967.0A (CN106446159B)
Publication of CN106446159A
Priority to PCT/CN2017/085351 (WO2018054079A1)
Application granted
Publication of CN106446159B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Abstract

A method for storing a file, a first virtual machine, and a name node, which solve the problem of redundant file copies that arises when a distributed file system stores files, and which improve the availability of the system. The method includes: a client sends a name node a request message requesting to write data to be written into the distributed file system; the name node sends the client a response message corresponding to the request message, where the response message includes the address of a first virtual machine and the address of a second virtual machine, and indicates that the first virtual machine is the one virtual machine among multiple virtual machines that has permission to write data to a storage region, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine; the client sends the data to be written and the address of the second virtual machine to the first virtual machine; the first virtual machine writes the data to be written into the storage region shared by the multiple virtual machines, and generates or updates the metadata of the data to be written; and the first virtual machine sends the generated or updated metadata to the second virtual machine.

Description

Method for storing a file, first virtual machine, and name node
Technical field
The present invention relates to the field of computer technologies, and in particular to a method for storing a file, a first virtual machine, and a name node.
Background
A distributed file system includes a client, data nodes (datanode), and a name node (namenode). The data nodes store files, and the name node manages the files stored on the data nodes. Through the name node, the client can query the files stored on each data node and obtain the address of each data node, and can thereby read a file from a data node or write a file to a data node. A data node in a distributed file system can be a physical server or a virtual machine.
When a data node in the distributed file system is a virtual machine, the virtual hard disk of that virtual machine is provided by a distributed block storage system. Writing a file to the virtual machine actually means writing the file to the virtual machine's virtual hard disk, and writing a file to the virtual hard disk is implemented by writing the file to physical hard disks managed by the distributed block storage system.
When storing files on virtual hard disks, the distributed file system can use a file replica mechanism to guarantee file reliability, that is, the same file is stored on N virtual hard disks (N is an integer greater than 1). The distributed block storage system can also use a file replica mechanism to guarantee file reliability, saving a file on one virtual hard disk to M physical hard disks (M is an integer greater than 1). Because both the distributed file system and the distributed block storage system use a file replica mechanism, the number of copies of the same file actually saved on physical hard disks is N*M, which makes the number of copies redundant. Redundant copies of the same file waste storage space and degrade the processing performance of the system.
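The N*M figure above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name and the sample values N = 3 and M = 3 are illustrative assumptions, not values taken from the patent.

```python
def physical_copies(n_file_system_replicas: int, m_block_storage_replicas: int) -> int:
    """Copies of one file on physical disks when both layers replicate."""
    if n_file_system_replicas < 1 or m_block_storage_replicas < 1:
        raise ValueError("replica counts must be positive")
    return n_file_system_replicas * m_block_storage_replicas


# Both layers replicate: N = 3 virtual hard disks, each backed by M = 3 physical disks.
layered = physical_copies(3, 3)

# Only the block storage layer replicates: a single copy at the file-system layer.
single_layer = physical_copies(1, 3)

print(layered, single_layer)  # prints "9 3"
```

With a single copy at the file-system layer, the physical copy count falls back to M, which is the situation the remainder of this document aims for.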
To solve the problem of redundant file copies in a distributed file system, the prior art generally uses one of the following two methods. In the first method, a file that needs to be stored is stored in only one virtual machine of the distributed file system. With the first method, the file can be accessed only through that virtual machine; if the virtual machine fails, file read/write service can be provided to the client again only after the virtual machine recovers, which reduces the availability of the distributed file system. In the second method, a virtual machine hot standby mechanism is used, that is, a hot standby virtual machine is configured for a primary virtual machine, and the hot standby virtual machine and the primary virtual machine write files synchronously. When the primary virtual machine fails, the distributed file system switches to the hot standby virtual machine, which continues to provide file read/write service to the client. With the second method, switching to the hot standby virtual machine takes a certain waiting time, during which the distributed file system cannot provide file read/write service to the client, which reduces the availability of the distributed file system. In addition, the hot standby virtual machine provides no external service before it is switched to primary, which wastes resources.
In summary, the existing methods for solving the redundant-copy problem in a distributed file system lower the availability of the distributed file system and do not solve the redundant-copy problem well.
Summary of the invention
Embodiments of the present invention provide a method for storing a file, a first virtual machine, and a name node, to solve the problem of redundant file copies that arises when a distributed file system stores files, and to improve the availability of the system.
According to a first aspect, an embodiment of the present invention provides a method for storing a file in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage region. The method includes:
The first virtual machine receives the data to be written and the address of the second virtual machine, both sent by the client, writes the received data into the storage region shared by the multiple virtual machines, and generates or updates the metadata of the data to be written. The first virtual machine then sends the generated or updated metadata to the second virtual machine according to the received address of the second virtual machine.
The first virtual machine is the one virtual machine among the multiple virtual machines that the name node designates as having permission to write data to the storage region, and the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine. The metadata of the data to be written includes but is not limited to: the storage location of the data to be written, the file name of the data to be written, and the file directory of the data to be written.
With the above method, because the multiple virtual machines included in the distributed file system share the same storage region, the data that the first virtual machine writes to the storage region is saved only once in that region. The data to be written is saved in multiple copies only because the distributed block storage system uses a file replica mechanism; the redundancy caused by both the distributed file system and the distributed block storage system using a file replica mechanism does not arise.
In addition, with the above solution, the first virtual machine among the multiple virtual machines has permission to write data to the storage region, and the second virtual machine, being any of the multiple virtual machines other than the first virtual machine, has permission to read the data to be written from the storage region. Therefore, multiple virtual machines in the distributed file system can provide the client with the service of reading and writing the data to be written. When a virtual machine fails, the other virtual machines can provide that service to the client, which improves the availability of the distributed file system and also avoids the resource waste of the virtual machine hot standby mechanism in the prior art.
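The write path of the first aspect can be sketched as follows. This is an illustrative model only: the shared storage region is represented by a plain dictionary, and the class name, field names, and addresses are assumptions introduced for the example, not part of the patent.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metadata:
    storage_location: str  # where the data sits in the shared storage region
    file_name: str
    file_directory: str


@dataclass
class VirtualMachine:
    address: str
    shared_region: dict          # the same dict object is handed to every VM
    can_write: bool = False      # assumed to be granted by the name node
    file_info: dict = field(default_factory=dict)  # path -> Metadata

    def receive_metadata(self, meta: Metadata) -> None:
        """Record metadata so this VM can later locate the data."""
        self.file_info[f"{meta.file_directory}/{meta.file_name}"] = meta

    def write(self, directory: str, name: str, data: bytes, second_vms: list) -> Metadata:
        """First-VM behaviour: write once, then send metadata to the second VMs."""
        if not self.can_write:
            raise PermissionError(f"{self.address} has no write permission")
        location = f"{directory}/{name}"
        self.shared_region[location] = data         # a single copy in the region
        meta = Metadata(location, name, directory)  # generate or update metadata
        self.receive_metadata(meta)
        for vm in second_vms:                       # addresses supplied by the client
            vm.receive_metadata(meta)
        return meta


region = {}
vm1 = VirtualMachine("10.0.0.1", region, can_write=True)
vm2 = VirtualMachine("10.0.0.2", region)
vm3 = VirtualMachine("10.0.0.3", region)
vm1.write("/logs", "a.txt", b"hello", [vm2, vm3])
print(len(region), region[vm2.file_info["/logs/a.txt"].storage_location])
```

Only one entry exists in the region after the write, yet every virtual machine holds the metadata needed to find it.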
In a possible implementation, before the first virtual machine writes the data to be written to the storage region, the method further includes: the first virtual machine receives a write permission identifier of the first virtual machine sent by the client. The write permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written into the distributed file system, and it specifies that the first virtual machine has permission to write the data to be written to the storage region.
The above solution provides a way for the client to indicate the first virtual machine's permission to the first virtual machine.
In a specific implementation, the multiple virtual machines can share one storage region as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage region shared by the multiple virtual machines.
The metadata of the data to be written that the first virtual machine sends to the second virtual machine serves one of the following two purposes:
First
If the second virtual machine reads the data to be written through its own operating system, the second virtual machine uses the metadata to generate or update the file information recorded in its operating system, and the operating system uses that file information to read the data to be written from the storage region.
Second
If the second virtual machine itself reads the data to be written, the second virtual machine uses the metadata to read the data to be written from the storage region.
With the above solution, the second virtual machine can read the data to be written in the storage region shared by the multiple virtual machines according to the metadata sent by the first virtual machine.
In a possible implementation, the name node may designate the second virtual machine as having permission to read the data to be written from the storage region.
According to a second aspect, an embodiment of the present invention provides a method for storing a file in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage region. The method includes:
After receiving a request message in which the client requests to write data to be written into the distributed file system, the name node sends the client a response message corresponding to the request message.
The response message that the name node sends to the client includes the address of the first virtual machine and the address of the second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage region, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
With the above solution, because the multiple virtual machines included in the distributed file system share the same storage region, and the response message sent by the name node designates the first virtual machine among the multiple virtual machines as having permission to write data into the shared storage region, data written into the shared storage region is saved only once in that region. Data written to the shared storage region is saved in multiple copies only because the distributed block storage system uses a file replica mechanism; the redundancy caused by both the distributed file system and the distributed block storage system using a file replica mechanism does not arise.
In addition, the response message indicates that the first virtual machine among the multiple virtual machines has permission to write data to the storage region, and the second virtual machine, being any of the multiple virtual machines other than the first virtual machine, has permission to read the data to be written from the storage region. Therefore, multiple virtual machines in the distributed file system can provide the client with the service of reading and writing the data to be written. When a virtual machine fails, the other virtual machines can provide that service to the client, which improves the availability of the distributed file system and also avoids the resource waste of the virtual machine hot standby mechanism in the prior art.
In a possible implementation, the response message also indicates that the second virtual machine has permission to read the data to be written from the storage region.
In a possible implementation, the name node can use either of the following two ways to indicate the permission of the first virtual machine and the permission of the second virtual machine to the client through the response message:
First way
The response message that the name node sends to the client further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine, which indicate the permission of the first virtual machine and the permission of the second virtual machine respectively. That is, the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region.
Second way
In the response message that the name node sends to the client, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule, and the preset rule indicates the permission of the first virtual machine and the permission of the second virtual machine. That is, the first virtual machine has permission to write the data to be written to the storage region, and the second virtual machine has permission to read the data to be written from the storage region.
These two ways give the name node two options for indicating the permission of the first virtual machine and the permission of the second virtual machine to the client through the response message.
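The two ways of conveying permissions can be contrasted in a short sketch. The message layout, the key names, and the concrete preset rule used here (the first address listed is the writer) are all assumptions made for illustration; the patent does not fix a wire format or a particular rule.

```python
def response_with_identifiers(first_vm: str, second_vms: list) -> dict:
    """First way: every address carries an explicit permission identifier."""
    entries = [{"address": first_vm, "permission": "write"}]
    entries += [{"address": addr, "permission": "read"} for addr in second_vms]
    return {"style": "identifiers", "entries": entries}


def response_with_ordering(first_vm: str, second_vms: list) -> dict:
    """Second way: position encodes permission (assumed rule: writer listed first)."""
    return {"style": "ordering", "addresses": [first_vm, *second_vms]}


def parse_response(response: dict) -> tuple:
    """Client side: recover (writer address, reader addresses) from either style."""
    if response["style"] == "identifiers":
        entries = response["entries"]
        writers = [e["address"] for e in entries if e["permission"] == "write"]
        readers = [e["address"] for e in entries if e["permission"] == "read"]
        if len(writers) != 1:
            raise ValueError("exactly one virtual machine may hold write permission")
        return writers[0], readers
    first, *rest = response["addresses"]
    return first, rest


by_id = parse_response(response_with_identifiers("10.0.0.1", ["10.0.0.2", "10.0.0.3"]))
by_order = parse_response(response_with_ordering("10.0.0.1", ["10.0.0.2", "10.0.0.3"]))
print(by_id == by_order)
```

The ordering style keeps the message smaller, while the identifier style is self-describing; either lets the client tell the writer apart from the readers.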
In a possible implementation, the multiple virtual machines can share one storage region as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage region.
Because multiple virtual machines in the distributed file system can provide the client with the service of reading and writing the data to be written, when a virtual machine fails, the other virtual machines can provide that service to the client. In a specific implementation, the way the name node handles a virtual machine failure falls into the following two cases:
First case
When the first virtual machine fails, the name node sends first update information to the client. The first update information includes the address of an updated first virtual machine. It also designates, as the updated first virtual machine, either another virtual machine among the multiple virtual machines other than the failed first virtual machine, or one of the second virtual machines. The updated first virtual machine has permission to write data to the storage region.
Second case
When the second virtual machine fails, the name node sends second update information to the client. The second update information includes the address of an updated second virtual machine. It also designates another virtual machine among the multiple virtual machines, other than the failed second virtual machine, as the updated second virtual machine. The updated second virtual machine has permission to read the data to be written from the storage region.
With the above solution, whether the first virtual machine or the second virtual machine fails, the name node designates another virtual machine to replace the failed one. As a result, the distributed file system can still provide the client with the service of reading and writing data when the first virtual machine and/or the second virtual machine fails, which further improves the availability of the distributed file system.
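The two failure cases can be sketched as a small state machine on the name node. This is a hedged illustration: the class and method names, the shape of the update information, the choice of promoting the first listed second virtual machine, and the `idle_vms` pool of virtual machines that share the storage region but were not yet handed to the client are all assumptions introduced here.

```python
class NameNode:
    """Tracks which VM may write and which VMs may read (illustrative only)."""

    def __init__(self, first_vm: str, second_vms: list, idle_vms: list):
        self.first_vm = first_vm
        self.second_vms = list(second_vms)
        self.idle_vms = list(idle_vms)  # hypothetical pool sharing the same region

    def handle_failure(self, failed: str) -> dict:
        """Return the update information that would be sent to the client."""
        if failed == self.first_vm:
            # First case: here a second VM is promoted to updated first VM.
            if not self.second_vms:
                raise RuntimeError("no virtual machine left to take over writing")
            self.first_vm = self.second_vms.pop(0)
            return {"kind": "first_update", "address": self.first_vm,
                    "permission": "write"}
        if failed in self.second_vms:
            # Second case: another VM becomes the updated second VM.
            if not self.idle_vms:
                raise RuntimeError("no virtual machine left to take over reading")
            self.second_vms.remove(failed)
            replacement = self.idle_vms.pop(0)
            self.second_vms.append(replacement)
            return {"kind": "second_update", "address": replacement,
                    "permission": "read"}
        raise ValueError(f"unknown virtual machine: {failed}")


name_node = NameNode("vm1", ["vm2", "vm3"], ["vm4"])
first_update = name_node.handle_failure("vm1")   # writer fails, vm2 is promoted
second_update = name_node.handle_failure("vm3")  # reader fails, vm4 replaces it
print(first_update, second_update)
```

Because every candidate already shares the storage region, a replacement needs only the permission change and the metadata, not a copy of the data.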
According to a third aspect, an embodiment of the present invention provides a method for storing a file in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage region. The method includes:
The client sends the name node a request message requesting to write data to be written into the distributed file system, and then receives the response message, corresponding to the request message, sent by the name node.
The response message includes the address of the first virtual machine and the address of the second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage region, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
According to the address of the first virtual machine included in the response message, the client sends the data to be written and the address of the second virtual machine to the first virtual machine, and instructs the first virtual machine to write the data to be written, to generate or update the metadata of the data to be written, and to send that metadata to the second virtual machine according to the address of the second virtual machine.
With the above solution, because the multiple virtual machines included in the distributed file system share the same storage region, the data that the client instructs the first virtual machine to write to the shared storage region is saved only once in that region. The data to be written is saved in multiple copies only because the distributed block storage system uses a file replica mechanism; the redundancy caused by both the distributed file system and the distributed block storage system using a file replica mechanism does not arise.
In addition, the first virtual machine among the multiple virtual machines has permission to write data to the storage region, and the second virtual machine, being any of the multiple virtual machines other than the first virtual machine, has permission to read the data to be written from the storage region. Therefore, multiple virtual machines in the distributed file system can provide the client with the service of reading and writing the data to be written. When a virtual machine fails, the other virtual machines can provide that service to the client, which improves the availability of the distributed file system and also avoids the resource waste of the virtual machine hot standby mechanism in the prior art.
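The client-side steps of the third aspect can be sketched with the network replaced by two injected callables. The function name, the message keys, and the fake transport below are hypothetical stand-ins; the patent does not specify message formats.

```python
def store_file(data: bytes, ask_name_node, send_to_vm) -> str:
    """Client flow: ask the name node, then hand data and peer addresses to the writer."""
    # Step 1: request to write data into the distributed file system.
    response = ask_name_node({"op": "write", "size": len(data)})

    # Step 2: the response names the VM with write permission and the other VMs.
    first_vm = response["first_vm"]
    second_vms = response["second_vms"]

    # Step 3: send the data plus the second VMs' addresses to the first VM, which
    # is expected to write, build metadata, and send the metadata to those VMs.
    send_to_vm(first_vm, {"data": data, "second_vms": second_vms})
    return first_vm


# Fake transport used only to make the sketch runnable.
sent = []


def fake_name_node(request: dict) -> dict:
    return {"first_vm": "10.0.0.1", "second_vms": ["10.0.0.2", "10.0.0.3"]}


def fake_send(address: str, payload: dict) -> None:
    sent.append((address, payload))


writer = store_file(b"hello", fake_name_node, fake_send)
print(writer, len(sent))
```

Note that the client sends the data exactly once; distributing the metadata to the second virtual machines is left to the first virtual machine.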
In a possible implementation, the client can learn the permission of the first virtual machine and the permission of the second virtual machine from the response message received from the name node in either of the following two ways:
First way
The response message that the client receives further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine, which indicate the permission of the first virtual machine and the permission of the second virtual machine respectively. That is, the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region.
Second way
In the response message that the client receives, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule, and the preset rule indicates the permission of the first virtual machine and the permission of the second virtual machine. That is, the first virtual machine has permission to write the data to be written to the storage region, and the second virtual machine has permission to read the data to be written from the storage region.
These two ways give the client two options for learning the permission of the first virtual machine and the permission of the second virtual machine from the received response message.
In a possible implementation, the multiple virtual machines can share one storage region as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage region.
In a possible implementation, the response message also indicates that the second virtual machine has permission to read the data to be written from the storage region.
According to a fourth aspect, an embodiment of the present invention provides a method for storing a file in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage region. The method includes:
The second virtual machine receives the metadata sent by the first virtual machine. The first virtual machine is the one virtual machine among the multiple virtual machines that the name node designates as having permission to write data to the storage region, and the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine. The metadata is the metadata of the data to be written that the first virtual machine generates or updates after writing the data to be written to the storage region.
With the above solution, because the multiple virtual machines included in the distributed file system share the same storage region, and the first virtual machine among the multiple virtual machines has permission to write data into the shared storage region, data written into the shared storage region is saved only once in that region. Data written to the shared storage region is saved in multiple copies only because the distributed block storage system uses a file replica mechanism; the redundancy caused by both the distributed file system and the distributed block storage system using a file replica mechanism does not arise.
In addition, the first virtual machine among the multiple virtual machines has permission to write data to the storage region, and the second virtual machine, being any of the multiple virtual machines other than the first virtual machine, has permission to read the data to be written from the storage region. Therefore, multiple virtual machines in the distributed file system can provide the client with the service of reading and writing the data to be written. When a virtual machine fails, the other virtual machines can provide that service to the client, which improves the availability of the distributed file system and also avoids the resource waste of the virtual machine hot standby mechanism in the prior art.
In a possible implementation, the second virtual machine can learn its own permission as follows: before receiving the metadata sent by the first virtual machine, the second virtual machine receives a read permission identifier of the second virtual machine sent by the client. The read permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written into the distributed file system, and it specifies that the second virtual machine has permission to read the data to be written from the storage region.
In a possible implementation, the multiple virtual machines can share one storage region as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage region.
After receiving the metadata sent by the first virtual machine, the second virtual machine can read the data to be written in the storage region shared by the multiple virtual machines according to the received metadata, specifically in either of the following ways:
First
If the second virtual machine reads the data to be written through its own operating system, the second virtual machine generates or updates, according to the metadata, the file information recorded in its operating system. The operating system can use this file information to read the data to be written from the storage region.
Second
If the second virtual machine itself reads the data to be written, the second virtual machine reads the data to be written from the storage region according to the metadata.
With the above solution, the second virtual machine can read the data to be written in the storage region shared by the multiple virtual machines according to the metadata sent by the first virtual machine.
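The two read paths on the second virtual machine can be sketched as follows. In this illustration the shared storage region is a byte buffer and the storage location in the metadata is modelled as an offset and a length; these representations, and all names used, are assumptions for the example rather than details from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    path: str    # file directory plus file name
    offset: int  # assumed form of "storage location": start offset in the region
    length: int  # number of bytes written


class SecondVirtualMachine:
    def __init__(self, shared_region: bytearray):
        self.shared_region = shared_region  # the region shared by all VMs
        self.os_file_table = {}             # stand-in for OS-level file information

    def on_metadata(self, meta: Metadata) -> None:
        """Generate or update the file information kept by the operating system."""
        self.os_file_table[meta.path] = meta

    def read_via_os(self, path: str) -> bytes:
        """First path: look the file up in the OS file information, then read."""
        return self.read_direct(self.os_file_table[path])

    def read_direct(self, meta: Metadata) -> bytes:
        """Second path: read straight from the region using the metadata."""
        return bytes(self.shared_region[meta.offset:meta.offset + meta.length])


region = bytearray(b"....hello....")  # the first VM is assumed to have written "hello"
meta = Metadata("/logs/a.txt", offset=4, length=5)
second_vm = SecondVirtualMachine(region)
second_vm.on_metadata(meta)
print(second_vm.read_via_os("/logs/a.txt"), second_vm.read_direct(meta))
```

Both paths return the same bytes; they differ only in whether the lookup goes through file information maintained by the operating system.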
In a possible implementation, the name node designates the second virtual machine as having permission to read the data to be written from the storage region.
According to a fifth aspect, an embodiment of the present invention provides a first virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage region. The first virtual machine is the one virtual machine among the multiple virtual machines that the name node designates as having permission to write data to the storage region. The first virtual machine includes:
A receiving module, configured to receive the data to be written and the address of the second virtual machine sent by the client, where the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
A processing module, configured to write the data to be written received by the receiving module to the storage region, and to generate or update the metadata of the data to be written;
A sending module, configured to send the metadata generated or updated by the processing module to the second virtual machine according to the address of the second virtual machine received by the receiving module.
The metadata of the data to be written includes but is not limited to: the storage location of the data to be written, the file name of the data to be written, and the file directory of the data to be written.
With the above solution, because the multiple virtual machines included in the distributed file system share the same storage region, the data that the processing module writes to the storage region is saved only once in that region. The data to be written is saved in multiple copies only because the distributed block storage system uses a file replica mechanism; the redundancy caused by both the distributed file system and the distributed block storage system using a file replica mechanism does not arise.
In addition, with the above solution, the first virtual machine among the multiple virtual machines has permission to write data to the storage region, and the second virtual machine, being any of the multiple virtual machines other than the first virtual machine, has permission to read the data to be written from the storage region. Therefore, multiple virtual machines in the distributed file system can provide the client with the service of reading and writing the data to be written. When a virtual machine fails, the other virtual machines can provide that service to the client, which improves the availability of the distributed file system and also avoids the resource waste of the virtual machine hot standby mechanism in the prior art.
In one possible implementation, the receiving module is further configured to: before the processing module writes the data to be written to the storage region, receive a write permission identifier of the first virtual machine sent by the client. The write permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and it specifies that the first virtual machine has permission to write the data to be written to the storage region.
The above scheme provides one way for the first virtual machine to learn its own permission from the client.
In a specific implementation, the multiple virtual machines may share a storage region as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage region.
The metadata of the data to be written that the sending module sends to the second virtual machine serves one of the following two purposes:
First purpose
If the second virtual machine reads the data to be written through its own operating system, the second virtual machine uses the metadata to generate or update the file information recorded in its own operating system, and the operating system uses the file information to read the data to be written from the storage region.
Second purpose
If the second virtual machine reads the data to be written, the second virtual machine uses the metadata to read the data to be written from the storage region.
With the above scheme, the second virtual machine can read the data to be written in the storage region shared by the multiple virtual machines according to the metadata of the data to be written sent by the sending module.
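The second purpose can be illustrated with a minimal sketch. It assumes that the shared storage region can be modelled as a byte buffer and that the metadata carries an offset and a length as the storage location. The class `Metadata`, its field names and the function `read_with_metadata` are hypothetical names chosen for illustration and are not defined by the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    """Hypothetical metadata record for one piece of written data."""
    name: str        # name of the data
    directory: str   # directory of the data
    offset: int      # assumed storage location: start offset in the shared region
    length: int      # assumed storage location: number of bytes


def read_with_metadata(shared_region: bytearray, meta: Metadata) -> bytes:
    """Read the data described by `meta` from the shared storage region."""
    return bytes(shared_region[meta.offset:meta.offset + meta.length])


# The first virtual machine is assumed to have written b"hello" at offset 4
# and to have sent the matching metadata to the second virtual machine.
region = bytearray(16)
region[4:9] = b"hello"
meta = Metadata(name="a.txt", directory="/docs", offset=4, length=5)
data = read_with_metadata(region, meta)
```

In this sketch the second virtual machine never receives the data itself; it only needs the metadata and access to the shared region.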
In one possible implementation, the name node specifies that the second virtual machine has permission to read the data to be written from the storage region.
In a sixth aspect, an embodiment of the present invention provides a name node in a distributed file system. The distributed file system includes the name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage region. The name node includes:
A receiving module, configured to receive a request message in which a client requests to write data to be written to the distributed file system;
A sending module, configured to send to the client a response message corresponding to the request message received by the receiving module. The response message includes the address of the first virtual machine and the address of the second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage region, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
With the above scheme, because the multiple virtual machines included in the distributed file system share the same storage region, and the response message sent by the name node specifies that one first virtual machine among the multiple virtual machines has permission to write data to the shared storage region, the data that the processing module writes to the shared storage region is saved only once in that storage region. The data written to the shared storage region is saved in multiple copies only because of the file replica mechanism used by the distributed block storage system. The redundancy in the number of saved file copies that arises when both the distributed file system and the distributed block storage system use a file replica mechanism does not occur.
In addition, the response message indicates that the first virtual machine among the multiple virtual machines included in the distributed file system has permission to write data to the storage region, and that the second virtual machine, which is a virtual machine among the multiple virtual machines other than the first virtual machine, has permission to read the data to be written from the storage region. Multiple virtual machines in the distributed file system can therefore provide the client with the service of reading and writing the data to be written. When a virtual machine fails, other virtual machines can provide that service to the client, which improves the availability of the distributed file system and also avoids the resource waste incurred by the virtual machine hot-standby mechanism in the prior art.
In one possible implementation, the response message also indicates that the second virtual machine has permission to read the data to be written from the storage region.
In one possible implementation, the response message sent by the sending module may indicate the permission of the first virtual machine and the permission of the second virtual machine to the client in either of the following two ways:
First way
The response message that the sending module sends to the client further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The write permission identifier and the read permission identifier indicate the permission of the first virtual machine and the permission of the second virtual machine respectively. That is, the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region.
Second way
In the response message that the sending module sends to the client, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule. The preset rule indicates the permission of the first virtual machine and the permission of the second virtual machine. That is, the first virtual machine has permission to write the data to be written to the storage region, and the second virtual machine has permission to read the data to be written from the storage region.
These two ways give the sending module two means of indicating to the client, through the response message, the permission of the first virtual machine and the permission of the second virtual machine.
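The two ways can be contrasted in a short sketch. The message layout (a `nodes` list with `address` and `permission` keys) and the specific preset rule used here (the first address in the list is the writer) are assumptions made for illustration; the embodiment does not fix either of them.

```python
from typing import Dict, List, Tuple


def roles_from_identifiers(response: Dict) -> Tuple[str, List[str]]:
    """First way: each address carries an explicit permission identifier."""
    writers = [n["address"] for n in response["nodes"] if n["permission"] == "write"]
    readers = [n["address"] for n in response["nodes"] if n["permission"] == "read"]
    if len(writers) != 1:
        raise ValueError("exactly one virtual machine must hold the write permission")
    return writers[0], readers


def roles_from_order(addresses: List[str]) -> Tuple[str, List[str]]:
    """Second way: assumed preset rule, the first address is the writer."""
    if not addresses:
        raise ValueError("response contains no addresses")
    return addresses[0], addresses[1:]


# Hypothetical response messages in each form.
explicit = {"nodes": [
    {"address": "10.0.0.2", "permission": "read"},
    {"address": "10.0.0.1", "permission": "write"},
]}
by_identifier = roles_from_identifiers(explicit)
by_order = roles_from_order(["10.0.0.1", "10.0.0.2"])
```

With explicit identifiers the order of the addresses carries no meaning, whereas the ordering approach keeps the message smaller but requires the client and the name node to agree on the rule in advance.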
In one possible implementation, the multiple virtual machines may share a storage region as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage region.
Because multiple virtual machines in the distributed file system can provide the client with the service of reading and writing the data to be written, when a virtual machine fails, other virtual machines can provide that service to the client. In a specific implementation, the sending module handles a virtual machine failure in one of the following two cases:
First case
When the first virtual machine fails, a first update message is sent to the client. The first update message includes the address of the updated first virtual machine and specifies another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine. The updated first virtual machine has permission to write data to the storage region.
Second case
When the second virtual machine fails, a second update message is sent to the client. The second update message includes the address of the updated second virtual machine and specifies another virtual machine among the multiple virtual machines, other than the failed second virtual machine, as the updated second virtual machine. The updated second virtual machine has permission to read the data to be written from the storage region.
With the above scheme, whether the first virtual machine or the second virtual machine fails, the sending module specifies another virtual machine to replace the failed one. The distributed file system can therefore still provide the client with the service of reading and writing data when the first virtual machine and/or the second virtual machine fails, which further improves the availability of the distributed file system.
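The two failure cases can be sketched as a single selection function. The message fields (`type`, `address`, `permission`), the function name `build_update_message` and the policy of picking the first eligible virtual machine are all assumptions for illustration; the embodiment only requires that some other virtual machine is specified.

```python
from typing import Dict, List


def build_update_message(vms: List[str], writer: str,
                         readers: List[str], failed: str) -> Dict[str, str]:
    """Build a hypothetical update message after the virtual machine `failed` goes down."""
    if failed == writer:
        # First case: any surviving virtual machine may take over the write permission.
        candidates = [vm for vm in vms if vm != failed]
        kind, permission = "first_update", "write"
    elif failed in readers:
        # Second case: pick a virtual machine that is neither the writer
        # nor already a reader (this also excludes the failed reader).
        candidates = [vm for vm in vms if vm != writer and vm not in readers]
        kind, permission = "second_update", "read"
    else:
        raise ValueError(f"{failed} holds no permission")
    if not candidates:
        raise RuntimeError("no healthy virtual machine available")
    return {"type": kind, "address": candidates[0], "permission": permission}


vms = ["vm1", "vm2", "vm3"]
msg_writer_down = build_update_message(vms, writer="vm1", readers=["vm2"], failed="vm1")
msg_reader_down = build_update_message(vms, writer="vm1", readers=["vm2"], failed="vm2")
```

Because all virtual machines share the same storage region, the replacement needs no data copy before it can serve requests; in this sketch only the permission assignment changes.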
In a seventh aspect, an embodiment of the present invention provides a client. The distributed file system in which the client is located includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage region. The client includes:
A sending module, configured to send to the name node a request message requesting to write data to be written to the distributed file system;
A receiving module, configured to receive the response message, sent by the name node, that corresponds to the request message;
The response message includes the address of the first virtual machine and the address of the second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage region, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
The sending module is further configured to send the data to be written and the address of the second virtual machine to the first virtual machine, according to the address of the first virtual machine included in the response message received by the receiving module, and to instruct the first virtual machine to: write the data to be written, generate or update the metadata of the data to be written, and send the metadata of the data to be written to the second virtual machine according to the address of the second virtual machine included in the response message.
With the above scheme, because the multiple virtual machines included in the distributed file system share the same storage region, the data to be written that the sending module instructs the first virtual machine to write to the shared storage region is saved only once in that storage region within the distributed file system. The data to be written is saved in multiple copies only because of the file replica mechanism used by the distributed block storage system. The redundancy in the number of saved file copies that arises when both the distributed file system and the distributed block storage system use a file replica mechanism does not occur.
In addition, the first virtual machine among the multiple virtual machines included in the distributed file system has permission to write data to the storage region, and the second virtual machine, which is a virtual machine among the multiple virtual machines other than the first virtual machine, has permission to read the data to be written from the storage region. Multiple virtual machines in the distributed file system can therefore provide the client with the service of reading and writing the data to be written. When a virtual machine fails, other virtual machines can provide that service to the client, which improves the availability of the distributed file system and also avoids the resource waste incurred by the virtual machine hot-standby mechanism in the prior art.
In one possible implementation, the receiving module may learn the permission of the first virtual machine and the permission of the second virtual machine from the response message sent by the name node in either of the following two ways:
First way
The response message that the receiving module receives further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The write permission identifier and the read permission identifier indicate the permission of the first virtual machine and the permission of the second virtual machine respectively. That is, the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region.
Second way
In the response message that the receiving module receives, the address of the first virtual machine and the address of the second virtual machine are arranged according to a preset rule. The preset rule indicates the permission of the first virtual machine and the permission of the second virtual machine. That is, the first virtual machine has permission to write the data to be written to the storage region, and the second virtual machine has permission to read the data to be written from the storage region.
These two ways give the receiving module two means of learning, from the received response message, the permission of the first virtual machine and the permission of the second virtual machine.
In one possible implementation, the multiple virtual machines may share a storage region as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage region.
In one possible implementation, the response message also indicates that the second virtual machine has permission to read the data to be written from the storage region.
In an eighth aspect, an embodiment of the present invention provides a second virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage region. The second virtual machine includes:
A receiving module, configured to receive metadata sent by the first virtual machine. The first virtual machine is the one virtual machine among the multiple virtual machines that the name node specifies as having permission to write data to the storage region, the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine, and the metadata is the metadata of the data to be written that the first virtual machine generates or updates after writing the data to be written to the storage region.
With the above scheme, because the multiple virtual machines included in the distributed file system share the same storage region, and one first virtual machine among the multiple virtual machines has permission to write data to the shared storage region, the data written to the shared storage region is saved only once in that storage region. The data written to the shared storage region is saved in multiple copies only because of the file replica mechanism used by the distributed block storage system. The redundancy in the number of saved file copies that arises when both the distributed file system and the distributed block storage system use a file replica mechanism does not occur.
In addition, the first virtual machine among the multiple virtual machines included in the distributed file system has permission to write data to the storage region, and the second virtual machine, which is a virtual machine among the multiple virtual machines other than the first virtual machine, has permission to read the data to be written from the storage region. Multiple virtual machines in the distributed file system can therefore provide the client with the service of reading and writing the data to be written. When a virtual machine fails, other virtual machines can provide that service to the client, which improves the availability of the distributed file system and also avoids the resource waste incurred by the virtual machine hot-standby mechanism in the prior art.
In one possible implementation, the receiving module may learn the permission of the second virtual machine as follows: before receiving the metadata sent by the first virtual machine, the receiving module receives a read permission identifier of the second virtual machine sent by the client. The read permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and it specifies that the second virtual machine has permission to read the data to be written from the storage region.
In one possible implementation, the multiple virtual machines may share a storage region as follows: the multiple virtual machines mount the same virtual hard disk provided by the distributed block storage system, and the virtual hard disk includes the storage region.
In one possible implementation, the second virtual machine further includes a processing module. After the receiving module receives the metadata sent by the first virtual machine, the processing module can read, according to the received metadata, the data to be written in the storage region shared by the multiple virtual machines. Specifically, the processing module may do so in either of the following ways:
First way
After the receiving module receives the metadata sent by the first virtual machine, if the second virtual machine reads the data to be written through its own operating system, the processing module generates or updates, according to the metadata, the file information recorded in its own operating system, and the operating system uses the file information to read the data to be written from the storage region.
Second way
If the second virtual machine reads the data to be written, the processing module reads the data to be written from the storage region according to the metadata.
With the above scheme, the processing module can read the data to be written in the storage region shared by the multiple virtual machines according to the metadata of the data to be written sent by the first virtual machine.
In one possible implementation, the name node specifies that the second virtual machine has permission to read the data to be written from the storage region.
In a ninth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores computer-executable instructions. When at least one processor of a computing node executes the computer-executable instructions, the computing node performs the method provided by the first aspect or any possible design of the first aspect, or the method provided by the second aspect or any possible design of the second aspect, or the method provided by the third aspect or any possible design of the third aspect.
In a tenth aspect, a computer program product is provided. The computer program product includes computer-executable instructions, and the computer-executable instructions are stored in a computer-readable storage medium. At least one processor of a computing node can read the computer-executable instructions from the computer-readable storage medium, and by executing them the at least one processor causes the computing node to perform the method provided by the first aspect or any possible design of the first aspect, or the method provided by the second aspect or any possible design of the second aspect, or the method provided by the third aspect or any possible design of the third aspect.
Brief description of the drawings
Fig. 1 is a schematic diagram of the connections among the name node, the client and multiple data nodes in a distributed file system provided in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the connection between a distributed file system and a distributed block storage system provided in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a method for storing a file in a distributed file system provided in an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a distributed file system and a distributed block storage system that use the file storage method shown in Fig. 3;
Fig. 5 is a schematic structural diagram of a first virtual machine provided in an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of another first virtual machine provided in an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a name node provided in an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of another name node provided in an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a client provided in an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of another client provided in an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of a second virtual machine provided in an embodiment of the present invention;
Fig. 12 is a schematic structural diagram of another second virtual machine provided in an embodiment of the present invention;
Fig. 13 is a schematic structural diagram of a distributed file system provided in an embodiment of the present invention.
Specific embodiment
To provide a better understanding of the above purposes, schemes and advantages of the embodiments of the present invention, a detailed description is given below. The detailed description illustrates various embodiments of devices and/or methods by means of drawings such as block diagrams and flowcharts, and/or examples. These block diagrams, flowcharts and/or examples include one or more functions and/or operations. Those skilled in the art will understand that each function and/or operation in these block diagrams, flowcharts or examples can be implemented individually or jointly by various kinds of hardware, software and firmware, or by any combination of hardware, software and firmware.
The embodiments of the present invention relate to distributed file systems, which are described in detail below.
As shown in Fig. 1, a distributed file system may include a name node and multiple data nodes. Depending on the application of the distributed file system, the name node may also be called a master server or by another name; correspondingly, a data node may also be called a data server or by another name. Note that Fig. 1 only shows a scenario in which the distributed file system includes one client; in practice, a distributed file system may include multiple clients.
The name node manages the multiple data nodes and records information (such as metadata files) about the files stored on each data node, the service state of each data node, and so on. A data node stores files. When a client performs a file read or write operation, the client first requests the index information of a data node from the name node, and then accesses the corresponding data node according to that index information to read or write the file. Files may be synchronized between data nodes. For example, when a file needs to be written to two data nodes, it can first be written to one of them, and that data node then synchronizes the file to the other. In addition, the name node and a data node can also exchange information directly.
The name node, the data nodes and the client can each be configured on any device with computing capability to implement the corresponding function. Such a device may be a physical device or a virtual device. For example, a physical device may be a personal computer, notebook computer, mainframe, networked computer, handheld computer, personal digital assistant or workstation, and a virtual device may be a virtual machine or a container deployed on a physical device.
Referring to Fig. 2, when a data node is a virtual machine, the virtual hard disk of the virtual machine is provided by a distributed block storage system. The distributed block storage system manages multiple physical hard disks, so writing a file to the virtual hard disk of a virtual machine actually writes the file to physical hard disks managed by the distributed block storage system.
Referring to Fig. 2, to ensure its own reliability, a distributed file system generally uses a file replica mechanism when storing files. For example, when a file is stored, it is stored on two data nodes, that is, on virtual machine 1 and on virtual machine 2. To ensure its own reliability, the distributed block storage system also uses a file replica mechanism when storing the files of a virtual machine. For example, to store the file for virtual machine 1, it stores the file on physical hard disk 1 of physical server 1, physical hard disk 3 of physical server 2 and physical hard disk 5 of physical server 3; to store the file for virtual machine 2, it stores the file on physical hard disk 2 of physical server 1, physical hard disk 4 of physical server 2 and physical hard disk 6 of physical server 3. Because both the distributed file system and the distributed block storage system use a file replica mechanism, six copies of the file are saved on the physical hard disks managed by the distributed block storage system. Clearly, saving redundant copies of the same file wastes storage space and degrades the processing performance of the system.
Note that, in order to explain how the distributed file system operates when storing a single file, the distributed file system in Fig. 2 shows only two virtual machines, each with one virtual hard disk, and the distributed block storage system in Fig. 2 shows only three physical servers, each with two physical hard disks. In an actual implementation, a distributed file system can store multiple files, so no limit is placed on the number of virtual machines in the distributed file system or on the number of virtual hard disks per virtual machine. Likewise, no limit is placed on the number of physical servers in the distributed block storage system or on the number of physical hard disks per physical server.
To solve the file copy redundancy problem of distributed file systems, an embodiment of the present invention provides a method for storing a file in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, where the multiple virtual machines share the same storage region. As shown in Fig. 3, the method includes:
S301: The client sends to the name node a request message requesting to write data to be written to the distributed file system.
The data to be written may be video data, audio data, document data or other binary data. Its granularity may be a file, a data block or another granularity. There may be one or more pieces of data to be written: any one or more pieces of data written to the distributed file system by one execution of the method shown in Fig. 3 are regarded as the data to be written.
S302: The name node sends to the client a response message corresponding to the request message.
The response message includes the address of the first virtual machine and the address of the second virtual machine. The response message also indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage region, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
There must be exactly one first virtual machine. There may be one or more second virtual machines; the embodiments of the present invention place no limit on the number of second virtual machines.
In the embodiments of the present invention, only one virtual machine among the multiple virtual machines has permission to write data to the storage region, for the following reason. If multiple virtual machines were used to write the data to be written, then when the client writes the data to be written to the distributed file system, multiple virtual machines would receive the instruction to write it. Because the multiple virtual machines share the same storage region, those instructions would direct multiple virtual machines to write the data to be written to the same storage region at the same time. It would then be impossible to determine which virtual machine should write the data, and the instruction could not be executed. In addition, having only one virtual machine write the data to be written ensures that the distributed file system writes only one copy during the write phase, whereas in the prior art multiple copies are written during that phase; this reduces data redundancy.
In the embodiments of the present invention, there may be multiple second virtual machines for the following reasons. When multiple clients read the data to be written, they can read it through multiple second virtual machines, which improves read efficiency. In addition, after a second virtual machine obtains the metadata of the data to be written sent by the first virtual machine, it can read the data directly from the distributed block storage system, which avoids the prior-art situation in which the data cannot be read once the first virtual machine fails.
Given the restriction on the number of first virtual machines in the embodiments of the present invention, the response message that the name node sends to the client may indicate only that the first virtual machine has permission to write data to the storage region, without indicating the permission of the second virtual machine. The reason is that only one virtual machine among the multiple virtual machines has permission to write data to the storage region, so once the response message indicates that the first virtual machine has that permission, the second virtual machines, which are the virtual machines other than the first virtual machine, have by default permission to read the data to be written from the storage region.
Optionally, the response message also indicates that the second virtual machine has permission to read the data to be written from the storage region.
In S302, the permissions of the first virtual machine and the second virtual machine indicated by the response message apply only to the data to be written. The first virtual machine and the second virtual machine share the storage region to which the data to be written is written. For example, when data to be written 1 is stored in the distributed file system, the response message indicates that virtual machine 1 is the first virtual machine and virtual machine 2 is the second virtual machine; virtual machine 1 writes data to be written 1 to storage region 1 of virtual machine 1 and then sends the metadata of data to be written 1 to virtual machine 2, where virtual machine 2 and virtual machine 1 share storage region 1. When data to be written 2 is stored in the distributed file system, the response message indicates that virtual machine 1 is the first virtual machine and virtual machine 3 is the second virtual machine; virtual machine 1 writes data to be written 2 to storage region 2 of virtual machine 1 and then sends the metadata of data to be written 2 to virtual machine 3, where virtual machine 3 and virtual machine 1 share storage region 2.
S303: The client sends the data to be written and the address of the second virtual machine to the first virtual machine according to the address of the first virtual machine.
By sending the data to be written and the address of the second virtual machine to the first virtual machine, the client instructs the first virtual machine to write the data to be written, to generate or update the metadata of the data to be written, and to send that metadata to the second virtual machine according to the address of the second virtual machine.
S304: The first virtual machine writes the data to be written to the storage region shared by the multiple virtual machines, and generates or updates the metadata of the data to be written.
The metadata of the data to be written allows the first virtual machine and the second virtual machine to read the data to be written from the storage region shared by the multiple virtual machines. The metadata includes, but is not limited to, the storage location of the data to be written, the name of the data to be written, and the directory of the data to be written.
S305: The first virtual machine sends the generated or updated metadata to the second virtual machine according to the address of the second virtual machine.
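The write flow through S305 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names `SharedRegion`, `VirtualMachine`, and `NameNode`, the dictionary-based response message, and the fixed directory `/` are all hypothetical. It shows the data being written once into the shared storage region while only the metadata is forwarded to the second virtual machines.

```python
from dataclasses import dataclass, field


@dataclass
class SharedRegion:
    """Storage region shared by every virtual machine (hypothetical model)."""
    blocks: dict = field(default_factory=dict)


@dataclass
class VirtualMachine:
    address: str
    region: SharedRegion
    metadata: dict = field(default_factory=dict)

    def write(self, name, payload, second_vm_addresses, cluster):
        # S304: write the data once into the shared region and record its metadata.
        location = len(self.region.blocks)
        self.region.blocks[location] = payload
        meta = {"name": name, "location": location, "directory": "/"}
        self.metadata[name] = meta
        # S305: forward only the metadata to each second virtual machine.
        for addr in second_vm_addresses:
            cluster[addr].metadata[name] = meta

    def read(self, name):
        # Any virtual machine that holds the metadata can read the shared region.
        return self.region.blocks[self.metadata[name]["location"]]


class NameNode:
    def __init__(self, writer, readers):
        self.writer, self.readers = writer, readers

    def handle_write_request(self):
        # S302: the response carries the first VM's address and the second VMs' addresses.
        return {"first_vm": self.writer, "second_vms": list(self.readers)}


def client_write(name_node, cluster, name, payload):
    response = name_node.handle_write_request()  # request and response (S302)
    first_vm = cluster[response["first_vm"]]
    # S303: hand the data and the second VMs' addresses to the first VM.
    first_vm.write(name, payload, response["second_vms"], cluster)


region = SharedRegion()
cluster = {a: VirtualMachine(a, region) for a in ("vm1", "vm2", "vm3")}
client_write(NameNode("vm1", ["vm2", "vm3"]), cluster, "file-a", b"hello")
```

After the call, the shared region holds a single copy of `file-a`, and both `vm2` and `vm3` can read it through the metadata they received.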
It should be noted that a distributed file system may or may not be regarded as including the client. If the distributed file system includes clients, the number of clients includes, but is not limited to, one. In the embodiments of the present invention, the client is treated as part of the distributed file system in order to describe the interaction among the client, the name node, the first virtual machine, and the second virtual machine more clearly. In an actual implementation, the distributed file system may not include the client, in which case the embodiments of the present invention can be regarded as describing the interaction between the client and the distributed file system.
Optionally, the multiple virtual machines included in the distributed file system mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region shared by the multiple virtual machines.
With the method for storing a file in a distributed file system shown in Fig. 3, the multiple virtual machines included in the distributed file system share the same storage region, so within the distributed file system only one copy of the data to be written is stored in the storage region. Multiple copies of the data to be written are kept only because the distributed block storage system uses a file replica mechanism; the redundancy in the number of stored file copies that arises when both the distributed file system and the distributed block storage system use a file replica mechanism does not occur.
In addition, in the method for storing a file in a distributed file system shown in Fig. 3, the first virtual machine among the multiple virtual machines included in the distributed file system has permission to write data to the storage region, and the second virtual machine, that is, a virtual machine other than the first virtual machine among the multiple virtual machines, has permission to read the data to be written from the storage region. Multiple virtual machines in the distributed file system can therefore serve the client's reads and writes of the data to be written. When one virtual machine fails, other virtual machines can provide the read and write service to the client, which improves the availability of the distributed file system and also avoids the waste of resources that occurs when the virtual machine hot standby mechanism of the prior art is used.
To illustrate more concretely how the method shown in Fig. 3 solves the problem of redundant stored file copies while improving system availability, the method is now applied to a distributed file system and a distributed block storage system, which may be as shown in Fig. 4. The distributed file system shown in Fig. 4 includes a first virtual machine, a second virtual machine, a client, and a name node. In an actual implementation, the number of second virtual machines and the number of clients are not limited. The distributed block storage system shown in Fig. 4 includes three physical servers, and each physical server includes two physical hard disks.
The first virtual machine has permission to write data to the storage region, and the second virtual machine has permission to read the data to be written from the storage region. Because the first virtual machine and the second virtual machine share the same storage region, they can be regarded as sharing the same virtual hard disk 1: the first virtual machine can write the data to be written to virtual hard disk 1, and the second virtual machine can read the data to be written from virtual hard disk 1. The data to be written is therefore stored only once in the distributed file system, namely in virtual hard disk 1. In the distributed block storage system the data to be written may be stored in three copies, for example on physical hard disk 1 of physical server 1, physical hard disk 3 of physical server 2, and physical hard disk 5 of physical server 3. In this way, only three copies of the data to be written are kept on physical hard disks. In the distributed file system and distributed block storage system shown in Fig. 2, the same file is kept in six copies on physical hard disks; by contrast, after the method shown in Fig. 3 is applied, the distributed file system and distributed block storage system shown in Fig. 4 keep only three copies of the data to be written on physical hard disks. This greatly reduces the number of stored file copies and solves the problem of redundant file copies stored in the distributed file system.
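The copy counts above can be checked with a short calculation. The function `physical_copies` is a hypothetical helper, and the figure of two file-system-level copies for Fig. 2 is an assumption inferred from the six physical copies stated above divided by the three block-storage replicas; the text itself states only the totals.

```python
def physical_copies(fs_copies: int, block_replicas: int) -> int:
    """Copies on physical disks = file-system-level copies x block-storage replicas."""
    return fs_copies * block_replicas


BLOCK_REPLICAS = 3  # stated above for the distributed block storage system

# Fig. 2: assumed two file-system-level copies, each replicated by block storage.
fig2_copies = physical_copies(fs_copies=2, block_replicas=BLOCK_REPLICAS)

# Fig. 4: one copy in the shared storage region, replicated by block storage.
fig4_copies = physical_copies(fs_copies=1, block_replicas=BLOCK_REPLICAS)
```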
In addition, in Fig. 4 the first virtual machine can both write and read the data to be written, and the second virtual machine can read the data to be written. When one of the virtual machines fails, the other virtual machine, which has not failed, can provide the client with the service of reading and writing the data to be written, which improves the availability of the system.
Further, after the second virtual machine receives the metadata of the data to be written sent by the first virtual machine, whether it needs to generate or update the file information recorded in its own operating system depends on which of the following two cases applies:
First case
If the first virtual machine writes the data to be written to the storage region shared by the multiple virtual machines through the operating system of the first virtual machine, and the client's read of the data to be written through the second virtual machine must likewise go through the operating system of the second virtual machine, then the second virtual machine needs to generate or update the file information recorded in its own operating system according to the metadata of the data to be written. Only then can the operating system of the second virtual machine read the data to be written from the storage region. The file information is used by the operating system of the second virtual machine to read the data to be written from the storage region. There are two ways to update the file information in the operating system of the second virtual machine. First, if the operating system of the second virtual machine can detect changes to the data in the storage region, it can update the file information of the data to be written by itself. Second, the operating system of the second virtual machine can update the file information of the data to be written according to the metadata sent by the first virtual machine.
Second case
If the first virtual machine writes the data to be written directly to the storage region shared by the multiple virtual machines, rather than through the operating system of the first virtual machine, the client can read the data to be written in the shared storage region directly through the second virtual machine, without going through the operating system of the second virtual machine. In this case, the second virtual machine does not need to generate or update the file information recorded in its own operating system according to the metadata of the data to be written; it can read the data to be written from the storage region using the metadata alone.
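The difference between the two cases can be sketched as below. The class `SecondVirtualMachine`, the flag `reads_via_os`, and the dictionary standing in for the shared storage region are hypothetical names chosen for illustration; the sketch shows only that the operating-system file information is maintained in the first case and skipped in the second.

```python
class SecondVirtualMachine:
    def __init__(self, region, reads_via_os):
        self.region = region              # storage shared with the first virtual machine
        self.reads_via_os = reads_via_os  # True: first case, False: second case
        self.os_file_table = {}           # file information recorded in the guest OS
        self.metadata = {}

    def on_metadata(self, meta):
        """Handle metadata received from the first virtual machine."""
        self.metadata[meta["name"]] = meta
        if self.reads_via_os:
            # First case: refresh the OS file information so the OS can locate the data.
            self.os_file_table[meta["name"]] = meta["location"]

    def read(self, name):
        if self.reads_via_os:
            # First case: the read goes through the OS file information.
            return self.region[self.os_file_table[name]]
        # Second case: read directly from the region using the metadata alone.
        return self.region[self.metadata[name]["location"]]
```

Both variants return the same bytes for the same metadata; they differ only in whether the guest operating system's file information has to be kept in step with the first virtual machine's writes.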
In S302, the name node needs to indicate through the response message that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage region, and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine. That is, after S302 the client not only obtains the address of the first virtual machine and the address of the second virtual machine, but also learns that the first virtual machine has permission to write data to the storage region and that the second virtual machine has permission to read the data to be written from the storage region. The ways in which the name node notifies the client of the permissions of the first virtual machine and the second virtual machine include, but are not limited to, the following two:
First way
In S302, the response message that the name node sends to the client further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region. The client learns the permissions of the first virtual machine and the second virtual machine from the write permission identifier and the read permission identifier, sends the write permission identifier to the first virtual machine, and sends the read permission identifier to the second virtual machine; that is, it delivers each virtual machine's permission to the corresponding virtual machine.
The client may send the write permission identifier to the first virtual machine before S303, after S303, or at the same time as S303, in which case the write permission identifier, the data to be written, and the address of the second virtual machine are sent to the first virtual machine together. The embodiments of the present invention do not limit the order in which these two steps are performed. Similarly, the client may send the read permission identifier to the second virtual machine.
Second way
The address of the first virtual machine and the address of the second virtual machine are arranged in the response message according to a preset rule. The preset rule specifies that the first virtual machine has permission to write the data to be written to the storage region, and that the second virtual machine has permission to read the data to be written from the storage region.
The preset rule may be the order of the virtual machine addresses included in the response message. For example, the name node and the client may agree in advance that the first address the name node sends to the client is the address of the first virtual machine. After receiving the addresses of the multiple virtual machines included in the response message, the client can then determine that the first address is the address of the first virtual machine, which has permission to write data to the storage region, and that the remaining addresses are the addresses of second virtual machines, which have permission to read the data to be written from the storage region.
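The two notification ways can be sketched from the client's side as follows. The field names `write_id`, `read_ids`, and `addresses`, and the function `permissions_from_response`, are hypothetical; the text does not define a message format. The sketch only shows that both ways let the client derive the same permission table.

```python
def permissions_from_response(response):
    """Return {address: "write" | "read"} for one response message (hypothetical format)."""
    if "write_id" in response:
        # First way: explicit write and read permission identifiers.
        perms = {response["write_id"]: "write"}
        perms.update({addr: "read" for addr in response["read_ids"]})
        return perms
    # Second way: preset ordering rule, where the first address is the first virtual machine.
    first, *rest = response["addresses"]
    return {first: "write", **{addr: "read" for addr in rest}}


explicit = permissions_from_response({"write_id": "vm1", "read_ids": ["vm2", "vm3"]})
ordered = permissions_from_response({"addresses": ["vm1", "vm2", "vm3"]})
```

The ordering rule keeps the message smaller, at the cost of requiring the name node and the client to agree on the convention in advance.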
As mentioned above, multiple virtual machines in the distributed file system can provide the client with the service of reading and writing the data to be written. When one virtual machine fails, other virtual machines can provide that service to the client, which improves the availability of the distributed file system. How the distributed file system operates after one of its virtual machines fails is explained in detail below.
In the distributed file system, the ways of detecting that a virtual machine has failed include, but are not limited to, the following three. First, the name node detects that a virtual machine has failed. Second, when the client reads or writes data through a virtual machine and cannot complete the read or write, it determines that the virtual machine has failed and reports the failure to the name node. Third, the virtual machines may perform periodic self-checks; when a virtual machine finds that it has failed, it may report the failure to the name node directly or through the client. Therefore, when a virtual machine in the distributed file system fails, the name node can learn of the failure through any of these three ways and can then take the corresponding action, so that the distributed file system does not become unable to provide the data read and write service to the client.
In the embodiments of the present invention, a failure of a virtual machine of the distributed file system falls into one of the following two cases:
First case: the first virtual machine fails
When the first virtual machine fails, the name node sends a first update message to the client. The first update message includes the address of the updated first virtual machine and designates another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine; it indicates that the updated first virtual machine has permission to write data to the storage region shared by the multiple virtual machines. When the client needs to write the data to be written, it can therefore write through the updated first virtual machine.
In this way, when the first virtual machine fails, the name node designates another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine and indicates that the updated first virtual machine has permission to write data to the storage region shared by the multiple virtual machines. When the client needs to write data, it can write through the updated first virtual machine; when the client needs to read data, it can read through the second virtual machine or through the updated first virtual machine.
In summary, with this approach a failure of the first virtual machine does not affect the client's writing or reading of data, which improves the availability of the system.
As mentioned above, there may be one or more second virtual machines. When there is only one second virtual machine and the first virtual machine fails, the following may additionally be performed after the above method: the name node designates one or more updated second virtual machines from outside the multiple virtual machines included in the distributed file system; the updated second virtual machine has permission to read the data to be written from the storage region, and shares the same storage region with the multiple virtual machines included in the distributed file system. The name node indicates to the client that the updated second virtual machine has permission to read the data to be written from the storage region. After receiving the indication from the name node, the client notifies the updated first virtual machine to send the metadata of the data to be written to the updated second virtual machine. In this way, when the client needs to read the data to be written, it can read it not only through the second virtual machine but also through the updated second virtual machine.
Because the name node designates, from outside the multiple virtual machines included in the distributed file system, an updated second virtual machine that has permission to read the data to be written from the storage region, when multiple clients need to read the data to be written they can read it not only through the second virtual machine but also through the updated second virtual machine, which improves the efficiency with which clients read the data to be written.
Second case: the second virtual machine fails
When the second virtual machine fails, the name node sends a second update message to the client. The second update message includes the address of the updated second virtual machine and designates another virtual machine, from outside the multiple virtual machines, as the updated second virtual machine; the updated second virtual machine has permission to read the data to be written from the storage region. The updated second virtual machine shares the same storage region with the multiple virtual machines included in the distributed file system. After the name node indicates to the client that the updated second virtual machine has permission to read the data to be written from the storage region, the client, following the indication of the name node, notifies the first virtual machine to send the metadata of the data to be written to the updated second virtual machine.
After receiving the notification message sent by the client, the first virtual machine can send the metadata of the data to be written to the updated second virtual machine as the notification message indicates. In this way, when the client needs to read the data to be written, it can read it through a second virtual machine that has not failed, through the first virtual machine, or through the updated second virtual machine.
In this way, when the second virtual machine fails, the name node designates an updated second virtual machine that has permission to read the data to be written from the storage region. When the client needs to write the data to be written, it can write through the first virtual machine; when the client needs to read the data to be written, it can read through the first virtual machine, through a second virtual machine that has not failed, or through the updated second virtual machine. In summary, with this approach a failure of the second virtual machine does not affect the client's writing or reading of the data to be written, which improves the availability of the system.
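The two failure cases can be sketched as a name node handler. This is a simplified illustration under stated assumptions: the class layout, the `spare_vms` pool of virtual machines outside the current set, and the dictionary-shaped update messages are hypothetical, and the sketch assumes that at least one surviving or spare virtual machine is available. It does not cover the additional step in which a new second virtual machine is designated after the only second virtual machine has been promoted.

```python
class NameNode:
    def __init__(self, first_vm, second_vms, spare_vms):
        self.first_vm = first_vm
        self.second_vms = list(second_vms)
        # Hypothetical pool of VMs outside the current set that share the storage region.
        self.spare_vms = list(spare_vms)

    def on_failure(self, failed):
        """Return the update message to send to the client for a failed virtual machine."""
        if failed == self.first_vm:
            # First case: promote a surviving virtual machine to be the updated first VM.
            self.first_vm = self.second_vms.pop(0)
            return {"type": "first_update", "first_vm": self.first_vm}
        # Second case: replace the failed reader with a VM from outside the current set.
        self.second_vms.remove(failed)
        replacement = self.spare_vms.pop(0)
        self.second_vms.append(replacement)
        return {"type": "second_update", "second_vm": replacement}
```

On receiving a `second_update` message, the client would then notify the first virtual machine to send the metadata of the data to be written to the replacement, as described above.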
The method for storing a file in a distributed file system provided by the embodiments of the present invention solves the problem of redundant file copies stored in the distributed file system. In addition, when a virtual machine in the distributed file system fails, the method ensures that the client's reading or writing of files is not affected, which improves the availability of the system.
An embodiment of the present invention provides a first virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes; the multiple virtual machines share the same storage region, and the first virtual machine is the virtual machine among the multiple virtual machines that the name node designates as having permission to write data to the storage region. As shown in Fig. 5, the first virtual machine 500 includes:
a receiving module 501, configured to receive the data to be written and the address of the second virtual machine sent by the client, where the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
a processing module 502, configured to write the data to be written received by the receiving module 501 to the storage region, and to generate or update the metadata of the data to be written;
a sending module 503, configured to send the metadata generated or updated by the processing module 502 to the second virtual machine according to the address of the second virtual machine received by the receiving module 501.
Optionally, the receiving module 501 is further configured to receive the write permission identifier of the first virtual machine sent by the client before the processing module 502 writes the data to be written to the storage region.
The write permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and it specifies that the first virtual machine has permission to write the data to be written to the storage region.
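One way to read this optional step is that the first virtual machine holds the write until the identifier has arrived. The sketch below illustrates that reading only; the class `FirstVirtualMachine`, its method names, and the decision to raise `PermissionError` are assumptions, since the text says when the identifier is received but not how it is enforced.

```python
class FirstVirtualMachine:
    def __init__(self, region):
        self.region = region              # storage region shared with the other VMs
        self.write_permission_id = None

    def receive_write_permission(self, identifier):
        # Receiving module: store the identifier forwarded by the client.
        self.write_permission_id = identifier

    def write(self, name, payload):
        # Assumption: the write is refused until the identifier has been received.
        if self.write_permission_id is None:
            raise PermissionError("write permission identifier not received")
        self.region[name] = payload
```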
Optionally, the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
Optionally, if the second virtual machine reads the data to be written through its own operating system, the metadata is used by the second virtual machine to generate or update the file information recorded in its own operating system, and the file information is used by the operating system to read the data to be written from the storage region; if the second virtual machine reads the data to be written directly, the metadata is used by the second virtual machine to read the data to be written from the storage region.
Optionally, the second virtual machine is designated by the name node as having permission to read the data to be written from the storage region.
The first virtual machine 500 provided by the embodiments of the present invention solves the problem of redundant file copies stored in the distributed file system. In addition, when a virtual machine in the distributed file system fails, the operation of the first virtual machine 500 ensures that the failure does not affect the client's reading or writing of files, which improves the availability of the system.
It should be noted that the first virtual machine 500 provided by the embodiments of the present invention can be used to perform the operations performed by the first virtual machine in the method for storing a file in a distributed file system shown in Fig. 3. For implementation details of the first virtual machine 500 that are not explained or described here, refer to the related description of the method shown in Fig. 3.
It should be noted that the division into modules in the embodiments of the present invention is schematic and is only a division by logical function; other divisions are possible in an actual implementation. In addition, the functional modules in the embodiments of this application may be integrated into one processing module, may each exist physically on their own, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
Based on the above embodiments, an embodiment of the present invention further provides a first virtual machine. The first virtual machine can perform the method provided by the embodiment corresponding to Fig. 3 and may be the same as the first virtual machine 500 shown in Fig. 5.
Referring to Fig. 6, the device hosting the first virtual machine 600 includes at least one processor 601, a memory 602, and a communication interface 603; the at least one processor 601, the memory 602, and the communication interface 603 are connected by a bus 604;
the memory 602 is configured to store computer-executable instructions;
the at least one processor 601 is configured to execute the computer-executable instructions stored in the memory 602, so that the first virtual machine 600 exchanges data with other devices in the distributed file system through the communication interface 603 to perform the method for storing a file in a distributed file system provided by the above embodiments, or so that the first virtual machine 600 exchanges data with other devices in the distributed file system through the communication interface 603 to implement some or all of the functions of the distributed file system.
The at least one processor 601 may include processors 601 of different types or processors 601 of the same type. The processor 601 may be any of the following: a central processing unit (CPU), an ARM processor, a field programmable gate array (FPGA), a dedicated processor, or another device with computing capability. In an optional implementation, the at least one processor 601 may also be integrated as a many-core processor.
The memory 602 may be any one or any combination of the following storage media: random access memory (RAM), read-only memory (ROM), non-volatile memory (NVM), solid state drive (SSD), mechanical hard disk, magnetic disk, and disk array.
The communication interface 603 is used by the first virtual machine 600 to exchange data with other devices (for example, other devices in the distributed file system). The communication interface 603 may be any one or any combination of the following devices with network access capability: a network interface (for example, an Ethernet interface) and a wireless network card.
The bus 604 may include an address bus, a data bus, a control bus, and so on; for ease of illustration, Fig. 6 shows the bus as a single thick line. The bus 604 may be any one or any combination of the following devices for wired data transmission: an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, and an Extended Industry Standard Architecture (EISA) bus.
An embodiment of the present invention provides a name node in a distributed file system. The distributed file system includes the name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share the same storage region. As shown in Fig. 7, the name node 700 includes:
a receiving module 701, configured to receive a request message in which the client requests to write the data to be written to the distributed file system;
a sending module 702, configured to send to the client the response message corresponding to the request message received by the receiving module 701, where the response message includes the address of the first virtual machine and the address of the second virtual machine, and indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage region and that the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
Optionally, the response message also indicates that the second virtual machine has permission to read the data to be written from the storage region.
Optionally, the response message further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine; the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region.
Optionally, the address of the first virtual machine and the address of the second virtual machine are arranged in the response message according to a preset rule; the preset rule specifies that the first virtual machine has permission to write the data to be written to the storage region, and that the second virtual machine has permission to read the data to be written from the storage region.
Optionally, the multiple virtual machines mount the same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
Optionally, the sending module 702 is further configured to: when the first virtual machine fails, send a first update message to the client, where the first update message includes the address of the updated first virtual machine and designates another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine, and the updated first virtual machine has permission to write data to the storage region; and/or, when the second virtual machine fails, send a second update message to the client, where the second update message includes the address of the updated second virtual machine and designates another virtual machine, from outside the multiple virtual machines, as the updated second virtual machine, and the updated second virtual machine has permission to read the data to be written from the storage region.
The name node 700 provided by the embodiments of the present invention solves the problem of redundant file copies stored in the distributed file system. In addition, when a virtual machine in the distributed file system fails, the operation of the name node 700 ensures that the failure does not affect the client's reading or writing of files, which improves the availability of the system.
It should be noted that the name node 700 provided by the embodiments of the present invention can be used to perform the operations performed by the name node in the method for storing a file in a distributed file system shown in Fig. 3. For implementation details of the name node 700 that are not explained or described here, refer to the related description of the method shown in Fig. 3.
Based on the above embodiments, an embodiment of the present invention further provides a name node. The name node can perform the method provided by the embodiment corresponding to Fig. 3 and may be the same as the name node 700 shown in Fig. 7.
Referring to Fig. 8, the name node 800 includes at least one processor 801, a memory 802, and a communication interface 803; the at least one processor 801, the memory 802, and the communication interface 803 are connected by a bus 804;
the memory 802 is configured to store computer-executable instructions;
the at least one processor 801 is configured to execute the computer-executable instructions stored in the memory 802, so that the name node 800 exchanges data with other devices in the distributed file system through the communication interface 803 to perform the method for storing a file in a distributed file system provided by the above embodiments, or so that the name node 800 exchanges data with other devices in the distributed file system through the communication interface 803 to implement some or all of the functions of the distributed file system.
The at least one processor 801 may include processors 801 of different types or processors 801 of the same type. The processor 801 may be any of the following: a CPU, an ARM processor, an FPGA, a dedicated processor, or another device with computing capability. In an optional implementation, the at least one processor 801 may also be integrated as a many-core processor.
The memory 802 may be any one or any combination of the following storage media: RAM, ROM, NVM, SSD, mechanical hard disk, magnetic disk, and disk array.
The communication interface 803 is used by the name node 800 to exchange data with other devices (for example, other devices in the distributed file system). The communication interface 803 may be any one or any combination of the following devices with network access capability: a network interface (for example, an Ethernet interface) and a wireless network card.
The bus 804 may include an address bus, a data bus, a control bus, and so on; for ease of illustration, Fig. 8 shows the bus as a single thick line. The bus 804 may be any one or any combination of the following devices for wired data transmission: an ISA bus, a PCI bus, and an EISA bus.
An embodiment of the present invention provides a client. The distributed file system in which the client is located includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share a same storage region. As shown in Fig. 9, the client 900 includes:
A sending module 901, configured to send, to the name node, a request message that requests writing data to be written to the distributed file system;
A receiving module 902, configured to receive a response message that corresponds to the request message and is sent by the name node, where the response message includes the address of a first virtual machine and the address of a second virtual machine, the response message indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage region, and the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
The sending module 901 is further configured to send the data to be written and the address of the second virtual machine to the first virtual machine according to the address of the first virtual machine included in the response message received by the receiving module 902, and to instruct the first virtual machine to write the data to be written, generate or update the metadata of the data to be written, and send the metadata of the data to be written to the second virtual machine according to the address of the second virtual machine included in the response message.
Optionally, the response message further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine. The write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region.
Optionally, the address of the first virtual machine and the address of the second virtual machine included in the response message are arranged according to a preset rule. The preset rule is used to specify that the first virtual machine has permission to write the data to be written to the storage region and that the second virtual machine has permission to read the data to be written from the storage region.
Optionally, the multiple virtual machines mount a same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
Optionally, the response message further indicates that the second virtual machine has permission to read the data to be written from the storage region.
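The client-side handling of the response message can be illustrated with a minimal sketch. The source does not state what the preset rule is, so the sketch assumes one possible rule: the first address in the list belongs to the first virtual machine (the writer) and the remaining addresses belong to second virtual machines (readers). The names `ResponseMessage`, `split_by_preset_rule`, and `build_write_message` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ResponseMessage:
    """Response from the name node (hypothetical field layout)."""
    addresses: Tuple[str, ...]  # ordered according to the preset rule


def split_by_preset_rule(response: ResponseMessage) -> Tuple[str, List[str]]:
    """Assumed rule: first address is the writer, the rest are readers."""
    if not response.addresses:
        raise ValueError("response message contains no virtual machine address")
    writer, *readers = response.addresses
    return writer, readers


def build_write_message(response: ResponseMessage, data: bytes) -> dict:
    """Build what the client sends to the first virtual machine:
    the data to be written plus the addresses of the second virtual machines."""
    writer, readers = split_by_preset_rule(response)
    return {"to": writer, "data": data, "second_vm_addresses": readers}


if __name__ == "__main__":
    resp = ResponseMessage(addresses=("10.0.0.1", "10.0.0.2", "10.0.0.3"))
    print(build_write_message(resp, b"payload"))
```

With an ordering rule of this kind, no separate permission identifiers need to be carried in the response; the variant with explicit write and read permission identifiers would add those fields to the message instead.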
The client 900 provided in this embodiment of the present invention addresses the problem of redundant file copies stored in the distributed file system. In addition, when a virtual machine in the distributed file system fails, the operations of the client 900 ensure that the failure does not prevent the client from reading or writing files, which improves the availability of the system.
It should be noted that the client 900 provided in this embodiment of the present invention can be used to perform the operations performed by the client in the method for storing a file in a distributed file system shown in Fig. 3. For implementation details of the client 900 that are not explained or described here, refer to the related description of the method shown in Fig. 3.
Based on the above embodiments, an embodiment of the present invention further provides a client. The client can perform the method provided in the embodiment corresponding to Fig. 3 and may be the same as the client 900 shown in Fig. 9.
Referring to Fig. 10, the device on which the client 1000 is located includes at least one processor 1001, a memory 1002, and a communication interface 1003. The at least one processor 1001, the memory 1002, and the communication interface 1003 are connected by a bus 1004.
The memory 1002 is configured to store computer-executable instructions.
The at least one processor 1001 is configured to execute the computer-executable instructions stored in the memory 1002, so that the client 1000 exchanges data with devices in the distributed file system through the communication interface 1003 to perform the method for storing a file in a distributed file system provided in the above embodiments, or so that the client 1000 exchanges data with devices in the distributed file system through the communication interface 1003 to implement some or all of the functions of the distributed file system.
The at least one processor 1001 may include processors of different types or processors of the same type. The processor 1001 may be any one of the following: a CPU, an ARM processor, an FPGA, a dedicated processor, or another device with computing capability. In an optional implementation, the at least one processor 1001 may also be integrated into a many-core processor.
The memory 1002 may be any one or any combination of the following: a RAM, a ROM, an NVM, an SSD, a mechanical hard disk, a magnetic disk, a disk array, or another storage medium.
The communication interface 1003 is used by the client 1000 to exchange data with other devices (for example, other devices in the distributed file system). The communication interface 1003 may be any one or any combination of the following: a network interface (for example, an Ethernet interface), a wireless network card, or another device with a network access function.
The bus 1004 may include an address bus, a data bus, a control bus, and the like. For ease of illustration, Fig. 10 represents the bus with a single thick line. The bus 1004 may be any one or any combination of the following: an ISA bus, a PCI bus, an EISA bus, or another device for wired data transmission.
An embodiment of the present invention provides a second virtual machine in a distributed file system. The distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share a same storage region. As shown in Fig. 11, the second virtual machine 1100 includes:
A receiving module 1101, configured to receive metadata sent by a first virtual machine, where the first virtual machine is the one virtual machine among the multiple virtual machines that is specified by the name node as having permission to write data to the storage region, the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine, and the metadata is the metadata of the data to be written that the first virtual machine generates or updates after writing the data to be written to the storage region.
Optionally, the receiving module 1101 is further configured to: before receiving the metadata sent by the first virtual machine, receive a read permission identifier of the second virtual machine sent by the client. The read permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and it specifies that the second virtual machine has permission to read the data to be written from the storage region.
Optionally, the multiple virtual machines mount a same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
Optionally, the second virtual machine further includes a processing module 1102, configured to do the following after the receiving module 1101 receives the metadata sent by the first virtual machine: if the second virtual machine reads the data to be written through its own operating system, generate or update, according to the metadata, the file information recorded in that operating system, where the file information is used by the operating system to read the data to be written from the storage region; if the second virtual machine reads the data to be written directly, read the data to be written from the storage region according to the metadata.
Optionally, the second virtual machine is specified by the name node as having permission to read the data to be written from the storage region.
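The interplay between the first virtual machine, the shared storage region, and the second virtual machine can be sketched as follows. This is a simplified in-memory model under stated assumptions, not the patented implementation: the storage region is modeled as a byte buffer, the metadata is assumed to consist of a file name, an offset, and a length, and all class and method names (`SharedStorageRegion`, `FirstVM`, `SecondVM`, `Metadata`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Metadata:
    """Assumed metadata layout: where the data sits in the shared region."""
    file_name: str
    offset: int
    length: int


class SharedStorageRegion:
    """Stand-in for the storage region shared by all virtual machines."""

    def __init__(self, size: int) -> None:
        self._buf = bytearray(size)

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > len(self._buf):
            raise ValueError("write exceeds the storage region")
        self._buf[offset:offset + len(data)] = data

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self._buf[offset:offset + length])


class FirstVM:
    """The only virtual machine with write permission."""

    def __init__(self, region: SharedStorageRegion) -> None:
        self.region = region
        self.next_offset = 0

    def write(self, file_name: str, data: bytes) -> Metadata:
        meta = Metadata(file_name, self.next_offset, len(data))
        self.region.write(meta.offset, data)  # single copy in the shared region
        self.next_offset += len(data)
        return meta  # to be sent to the second virtual machines


class SecondVM:
    """A virtual machine with read permission only."""

    def __init__(self, region: SharedStorageRegion) -> None:
        self.region = region
        self.file_info: Dict[str, Metadata] = {}

    def receive_metadata(self, meta: Metadata) -> None:
        self.file_info[meta.file_name] = meta  # generate or update file info

    def read(self, file_name: str) -> bytes:
        meta = self.file_info[file_name]
        return self.region.read(meta.offset, meta.length)


if __name__ == "__main__":
    region = SharedStorageRegion(64)
    writer, reader = FirstVM(region), SecondVM(region)
    reader.receive_metadata(writer.write("a.txt", b"hello"))
    print(reader.read("a.txt"))
```

The point of the sketch is that only the metadata travels between virtual machines; the data itself is written once and read in place from the shared region.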
The second virtual machine 1100 provided in this embodiment of the present invention addresses the problem of redundant file copies stored in the distributed file system. In addition, when a virtual machine in the distributed file system fails, the operations of the second virtual machine 1100 ensure that the failure does not prevent the client from reading or writing files, which improves the availability of the system.
It should be noted that the second virtual machine 1100 provided in this embodiment of the present invention can be used to perform the operations performed by the second virtual machine in the method for storing a file in a distributed file system shown in Fig. 3. For implementation details of the second virtual machine 1100 that are not explained or described here, refer to the related description of the method shown in Fig. 3.
Based on the above embodiments, an embodiment of the present invention further provides a second virtual machine. The second virtual machine can perform the method provided in the embodiment corresponding to Fig. 3 and may be the same as the second virtual machine 1100 shown in Fig. 11.
Referring to Fig. 12, the device on which the second virtual machine 1200 is located includes at least one processor 1201, a memory 1202, and a communication interface 1203. The at least one processor 1201, the memory 1202, and the communication interface 1203 are connected by a bus 1204.
The memory 1202 is configured to store computer-executable instructions.
The at least one processor 1201 is configured to execute the computer-executable instructions stored in the memory 1202, so that the second virtual machine 1200 exchanges data with other devices in the distributed file system through the communication interface 1203 to perform the method for storing a file in a distributed file system provided in the above embodiments, or so that the second virtual machine 1200 exchanges data with other devices in the distributed file system through the communication interface 1203 to implement some or all of the functions of the distributed file system.
The at least one processor 1201 may include processors of different types or processors of the same type. The processor 1201 may be any one of the following: a CPU, an ARM processor, an FPGA, a dedicated processor, or another device with computing capability. In an optional implementation, the at least one processor 1201 may also be integrated into a many-core processor.
The memory 1202 may be any one or any combination of the following: a RAM, a ROM, an NVM, an SSD, a mechanical hard disk, a magnetic disk, a disk array, or another storage medium.
The communication interface 1203 is used by the second virtual machine 1200 to exchange data with other devices (for example, other devices in the distributed file system). The communication interface 1203 may be any one or any combination of the following: a network interface (for example, an Ethernet interface), a wireless network card, or another device with a network access function.
The bus 1204 may include an address bus, a data bus, a control bus, and the like. For ease of illustration, Fig. 12 represents the bus with a single thick line. The bus 1204 may be any one or any combination of the following: an ISA bus, a PCI bus, an EISA bus, or another device for wired data transmission.
An embodiment of the present invention provides a distributed file system. As shown in Fig. 13, the distributed file system 1300 includes a first virtual machine 1301, a name node 1302, a client 1303, and a second virtual machine 1304.
The first virtual machine 1301 in the distributed file system 1300 can be used to perform the operations performed by the first virtual machine in the method for storing a file in a distributed file system shown in Fig. 3, and its specific implementation may be the first virtual machine 500 shown in Fig. 5 or the first virtual machine 600 shown in Fig. 6. The name node 1302 can be used to perform the operations performed by the name node in the method shown in Fig. 3, and its specific implementation may be the name node 700 shown in Fig. 7 or the name node 800 shown in Fig. 8. The client 1303 can be used to perform the operations performed by the client in the method shown in Fig. 3, and its specific implementation may be the client 900 shown in Fig. 9 or the client 1000 shown in Fig. 10. The second virtual machine 1304 can be used to perform the operations performed by the second virtual machine in the method shown in Fig. 3, and its specific implementation may be the second virtual machine 1100 shown in Fig. 11 or the second virtual machine 1200 shown in Fig. 12.
In the distributed file system 1300, only one copy of the data to be written is kept in the storage region shared by the multiple virtual machines, which solves the problem of redundant file copies stored in the distributed file system. In addition, when a virtual machine in the distributed file system 1300 fails, the client can still perform write or read operations on the data to be written through the virtual machines in the distributed file system 1300 that have not failed, which improves the availability of the distributed file system.
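The failure handling described above can be sketched as a role reassignment performed by the name node. The source states only that another virtual machine takes over the role of a failed one; the selection policy used here (pick the first healthy virtual machine in list order) is an assumption, and the function name `reassign_roles` is hypothetical.

```python
from typing import List, Set, Tuple


def reassign_roles(vms: List[str], failed: Set[str], writer: str) -> Tuple[str, List[str]]:
    """Return (writer, readers) after excluding failed virtual machines.

    Assumed policy: keep the current writer if it is healthy; otherwise
    promote the first healthy virtual machine in list order.
    """
    healthy = [vm for vm in vms if vm not in failed]
    if not healthy:
        raise RuntimeError("no healthy virtual machine is available")
    new_writer = writer if writer in healthy else healthy[0]
    readers = [vm for vm in healthy if vm != new_writer]
    return new_writer, readers


if __name__ == "__main__":
    vms = ["vm1", "vm2", "vm3"]
    # The writer vm1 fails: another virtual machine receives write permission.
    print(reassign_roles(vms, {"vm1"}, "vm1"))
    # A reader vm2 fails: the writer stays, vm2 is dropped from the readers.
    print(reassign_roles(vms, {"vm2"}, "vm1"))
```

Because the data lives in the shared storage region rather than on the failed virtual machine, the reassignment does not require copying any data; the name node only needs to send the updated addresses to the client.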
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical memory) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various modifications and variations to the embodiments of the present invention without departing from the spirit and scope of the embodiments of the present invention. If these modifications and variations fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include them.

Claims (22)

1. A method for storing a file in a distributed file system, characterized in that the distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share a same storage region; the method includes:
A first virtual machine receives data to be written and the address of a second virtual machine, both sent by a client, where the first virtual machine is the one virtual machine among the multiple virtual machines that is specified by the name node as having permission to write data to the storage region, and the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
The first virtual machine writes the data to be written to the storage region, and generates or updates the metadata of the data to be written;
The first virtual machine sends the metadata to the second virtual machine according to the address of the second virtual machine.
2. The method according to claim 1, characterized in that, before the first virtual machine writes the data to be written to the storage region, the method further includes:
The first virtual machine receives a write permission identifier of the first virtual machine sent by the client, where the write permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region.
3. The method according to claim 1 or 2, characterized in that the multiple virtual machines mount a same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
4. The method according to claim 1 or 2, characterized in that:
If the second virtual machine reads the data to be written through its own operating system, the metadata is used by the second virtual machine to generate or update the file information recorded in that operating system, and the file information is used by the operating system to read the data to be written from the storage region; or
If the second virtual machine reads the data to be written directly, the metadata is used by the second virtual machine to read the data to be written from the storage region.
5. The method according to claim 1 or 2, characterized in that the second virtual machine is specified by the name node as having permission to read the data to be written from the storage region.
6. A method for storing a file in a distributed file system, characterized in that the distributed file system includes a name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share a same storage region; the method includes:
The name node receives a request message in which a client requests writing data to be written to the distributed file system;
The name node sends a response message corresponding to the request message to the client, where the response message includes the address of a first virtual machine and the address of a second virtual machine, the response message indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage region, and the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
7. The method according to claim 6, characterized in that the response message further indicates that the second virtual machine has permission to read the data to be written from the storage region.
8. The method according to claim 6 or 7, characterized in that the response message further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine, where the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region.
9. The method according to claim 6 or 7, characterized in that the address of the first virtual machine and the address of the second virtual machine in the response message are arranged according to a preset rule, where the preset rule is used to specify that the first virtual machine has permission to write the data to be written to the storage region and that the second virtual machine has permission to read the data to be written from the storage region.
10. The method according to claim 6 or 7, characterized in that the multiple virtual machines mount a same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
11. The method according to claim 6 or 7, characterized in that the method further includes:
When the first virtual machine fails, the name node sends a first update message to the client, where the first update message includes the address of an updated first virtual machine, the first update message specifies another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine, and the updated first virtual machine has permission to write data to the storage region; and/or, when the second virtual machine fails, the name node sends a second update message to the client, where the second update message includes the address of an updated second virtual machine, the second update message specifies, as the updated second virtual machine, another virtual machine among the multiple virtual machines other than the first virtual machine, the failed second virtual machine, and the second virtual machines that have not failed, and the updated second virtual machine has permission to read the data to be written from the storage region.
12. A first virtual machine in a distributed file system, characterized in that the distributed file system includes a name node and multiple virtual machines serving as data nodes, the multiple virtual machines share a same storage region, and the first virtual machine is the one virtual machine among the multiple virtual machines that is specified by the name node as having permission to write data to the storage region; the first virtual machine includes:
A receiving module, configured to receive data to be written and the address of a second virtual machine, both sent by a client, where the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine;
A processing module, configured to write the data to be written received by the receiving module to the storage region, and to generate or update the metadata of the data to be written;
A sending module, configured to send the metadata generated or updated by the processing module to the second virtual machine according to the address of the second virtual machine received by the receiving module.
13. The first virtual machine according to claim 12, characterized in that the receiving module is further configured to:
Before the processing module writes the data to be written to the storage region, receive a write permission identifier of the first virtual machine sent by the client, where the write permission identifier is sent by the name node to the client when the client requests the name node to write the data to be written to the distributed file system, and the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region.
14. The first virtual machine according to claim 12 or 13, characterized in that the multiple virtual machines mount a same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
15. The first virtual machine according to claim 12 or 13, characterized in that:
If the second virtual machine reads the data to be written through its own operating system, the metadata is used by the second virtual machine to generate or update the file information recorded in that operating system, and the file information is used by the operating system to read the data to be written from the storage region; or
If the second virtual machine reads the data to be written directly, the metadata is used by the second virtual machine to read the data to be written from the storage region.
16. The first virtual machine according to claim 12 or 13, characterized in that the second virtual machine is specified by the name node as having permission to read the data to be written from the storage region.
17. A name node in a distributed file system, characterized in that the distributed file system includes the name node and multiple virtual machines serving as data nodes, and the multiple virtual machines share a same storage region; the name node includes:
A receiving module, configured to receive a request message in which a client requests writing data to be written to the distributed file system;
A sending module, configured to send, to the client, a response message corresponding to the request message received by the receiving module, where the response message includes the address of a first virtual machine and the address of a second virtual machine, the response message indicates that the first virtual machine is the one virtual machine among the multiple virtual machines that has permission to write data to the storage region, and the second virtual machine is a virtual machine among the multiple virtual machines other than the first virtual machine.
18. The name node according to claim 17, characterized in that the response message further indicates that the second virtual machine has permission to read the data to be written from the storage region.
19. The name node according to claim 17 or 18, characterized in that the response message further includes a write permission identifier of the first virtual machine and a read permission identifier of the second virtual machine, where the write permission identifier specifies that the first virtual machine has permission to write the data to be written to the storage region, and the read permission identifier specifies that the second virtual machine has permission to read the data to be written from the storage region.
20. The name node according to claim 17 or 18, characterized in that the address of the first virtual machine and the address of the second virtual machine in the response message are arranged according to a preset rule, where the preset rule is used to specify that the first virtual machine has permission to write the data to be written to the storage region and that the second virtual machine has permission to read the data to be written from the storage region.
21. The name node according to claim 17 or 18, characterized in that the multiple virtual machines mount a same virtual hard disk provided by a distributed block storage system, and the virtual hard disk includes the storage region.
22. The name node according to claim 17 or 18, characterized in that the sending module is further configured to:
When the first virtual machine fails, send a first update message to the client, where the first update message includes the address of an updated first virtual machine, the first update message specifies another virtual machine among the multiple virtual machines, other than the failed first virtual machine, as the updated first virtual machine, and the updated first virtual machine has permission to write data to the storage region; and/or
When the second virtual machine fails, send a second update message to the client, where the second update message includes the address of an updated second virtual machine, the second update message specifies, as the updated second virtual machine, another virtual machine among the multiple virtual machines other than the first virtual machine, the failed second virtual machine, and the second virtual machines that have not failed, and the updated second virtual machine has permission to read the data to be written from the storage region.

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610846967.0A CN106446159B (en) 2016-09-23 2016-09-23 A kind of method of storage file, the first virtual machine and name node
PCT/CN2017/085351 WO2018054079A1 (en) 2016-09-23 2017-05-22 Method for storing file, first virtual machine and namenode

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610846967.0A CN106446159B (en) 2016-09-23 2016-09-23 A kind of method of storage file, the first virtual machine and name node

Publications (2)

Publication Number Publication Date
CN106446159A CN106446159A (en) 2017-02-22
CN106446159B true CN106446159B (en) 2019-11-12

Family

ID=58167356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610846967.0A Active CN106446159B (en) 2016-09-23 2016-09-23 A kind of method of storage file, the first virtual machine and name node

Country Status (2)

Country Link
CN (1) CN106446159B (en)
WO (1) WO2018054079A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446159B (en) * 2016-09-23 2019-11-12 华为技术有限公司 A kind of method of storage file, the first virtual machine and name node
CN107704596B (en) * 2017-10-13 2021-06-29 郑州云海信息技术有限公司 Method, device and equipment for reading file
CN109753226A (en) * 2017-11-07 2019-05-14 阿里巴巴集团控股有限公司 Data processing system, method and electronic equipment
CN110110003A (en) * 2018-01-26 2019-08-09 广州中国科学院计算机网络信息中心 The data storage control method and device of M2M platform
CN110688194B (en) * 2018-07-06 2023-03-17 中兴通讯股份有限公司 Disk management method based on cloud desktop, virtual machine and storage medium
CN111443872A (en) * 2020-03-26 2020-07-24 深信服科技股份有限公司 Distributed storage system construction method, device, equipment and medium
CN113037569A (en) * 2021-04-19 2021-06-25 杭州和利时自动化有限公司 Redundant service method, device, equipment and medium based on double servers
CN113641467B (en) * 2021-10-19 2022-02-11 杭州优云科技有限公司 Distributed block storage implementation method of virtual machine
CN114138737B (en) * 2022-02-08 2022-07-12 亿次网联(杭州)科技有限公司 File storage method, device, equipment and storage medium

Citations (3)

Publication number Priority date Publication date Assignee Title
CN102521063A (en) * 2011-11-30 2012-06-27 广东电子工业研究院有限公司 Shared storage method suitable for migration and fault tolerance of virtual machine
CN103729250A (en) * 2012-10-11 2014-04-16 国际商业机器公司 Method and system to select data nodes configured to satisfy a set of requirements
CN103797770A (en) * 2012-12-31 2014-05-14 华为技术有限公司 Method and system for sharing storage resources

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US20130325812A1 (en) * 2012-05-30 2013-12-05 Spectra Logic Corporation System and method for archive in a distributed file system
US9588984B2 (en) * 2012-12-06 2017-03-07 Empire Technology Development Llc Peer-to-peer data management for a distributed file system
US9348707B2 (en) * 2013-12-18 2016-05-24 International Business Machines Corporation Dynamically adjusting the number of replicas of a file according to the probability that the file will be accessed within a distributed file system
CN106446159B (en) * 2016-09-23 2019-11-12 华为技术有限公司 A kind of method of storage file, the first virtual machine and name node

Also Published As

Publication number Publication date
WO2018054079A1 (en) 2018-03-29
CN106446159A (en) 2017-02-22

Similar Documents

Publication Publication Date Title
CN106446159B (en) A kind of method of storage file, the first virtual machine and name node
US10956601B2 (en) Fully managed account level blob data encryption in a distributed storage environment
CN106407040B (en) A kind of duplicating remote data method and system
CN106687911B (en) Online data movement without compromising data integrity
CN103929500A (en) Method for data fragmentation of distributed storage system
US10552089B2 (en) Data processing for managing local and distributed storage systems by scheduling information corresponding to data write requests
US20190007208A1 (en) Encrypting existing live unencrypted data using age-based garbage collection
CN107851122B (en) Large scale storage and retrieval of data with well-bounded life
CN108351806A (en) Database trigger of the distribution based on stream
CN105549905A (en) Method for multiple virtual machines to access distributed object storage system
CN105027070A (en) Safety for volume operations
CN103890729A (en) Collaborative management of shared resources
CN106933747B (en) Data-storage system and date storage method based on multithread
US20130031221A1 (en) Distributed data storage system and method
US9110820B1 (en) Hybrid data storage system in an HPC exascale environment
CN104937564B (en) The data flushing of group form
CN110134338B (en) Distributed storage system and data redundancy protection method and related equipment thereof
CN110147203A (en) A kind of file management method, device, electronic equipment and storage medium
CN109582213A (en) Data reconstruction method and device, data-storage system
CN104965835B (en) A kind of file read/write method and device of distributed file system
CN105468296A (en) No-sharing storage management method based on virtualization platform
CN104517067B (en) Access the method, apparatus and system of data
JP2016144169A (en) Communication system, queue management server, and communication method
CN102282545A (en) Storage system
KR101601877B1 (en) Apparatus and method for client's participating in data storage of distributed file system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220215

Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province

Patentee after: Huawei Cloud Computing Technology Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.