CN107463577B - Data storage system and data searching method - Google Patents

Data storage system and data searching method

Publication number
CN107463577B
CN107463577B CN201610393783.3A CN201610393783A CN107463577B CN 107463577 B CN107463577 B CN 107463577B CN 201610393783 A CN201610393783 A CN 201610393783A CN 107463577 B CN107463577 B CN 107463577B
Authority
CN
China
Prior art keywords
metadata
host
target
routing table
target file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610393783.3A
Other languages
Chinese (zh)
Other versions
CN107463577A (en
Inventor
金中良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201610393783.3A
Publication of CN107463577A
Application granted
Publication of CN107463577B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Abstract

The embodiment of the invention discloses a data storage system and a data searching method, which are used for storing file metadata in a distributed mode. The system provided by the embodiment of the invention comprises: a metadata host, configured to store file metadata and return target file metadata; and a query host, configured to receive a metadata search request for the target file metadata, determine the target metadata host in which the target file metadata is stored, and acquire the target file metadata from the target metadata host. As the system is scaled up, more hosts are available for storing the file metadata, that is, more hosts are available for receiving queries for the file metadata, and thus the expansion of the system is not limited by the service capability of a single metadata host.

Description

Data storage system and data searching method
Technical Field
The present invention relates to the field of electricity, and in particular, to a data storage system and a data searching method.
Background
Distributed storage is a key technology of big data systems. It pools the local storage of a plurality of hosts and presents them as a unified storage system, in which any host may transparently access any other host of the storage system.
The most widely used distributed storage technology at present is the distributed file system HDFS. In HDFS, a host that actually stores a data file is called a Data Node, and one HDFS system includes a plurality of Data Nodes. Any Data Node in the system generates file metadata according to the data file and then stores the file metadata on another specific host, which is called the Name Node. Generally, an HDFS system has two Name Nodes, which store the same content and act as active and standby for each other. When an application program on Data Node1 needs to read a data file, it first requests the Name Node to return the file metadata of the data file, then uses the file metadata returned by the Name Node to find the host Data Node2 where the data is actually stored, and finally obtains the data file from Data Node2.
The Name Node is a single host, so its processing capacity, and therefore its service capability, is limited. However, practical applications often require a large-scale system in which the number of Data Nodes is quite large and the amount of file metadata to be stored is also quite large. Meanwhile, a large number of hosts need to search the stored file metadata. Because the service capability of the Name Node is often insufficient to handle such a large number of searches, the system is prone to crashing, which makes it difficult to continue expanding the scale of the system.
Disclosure of Invention
The embodiment of the invention provides a data storage system and a data searching method, which are used for storing file metadata in a distributed mode.
In view of the above, a first aspect of the present application provides a data storage system, comprising: the metadata host is used for storing the file metadata and returning the target file metadata; and the query host is used for receiving a metadata search request for the target file metadata, determining the target metadata host in which the target file metadata is stored, and acquiring the target file metadata from the target metadata host.
The data storage system consists of a plurality of hosts, and the number scale of the hosts can be continuously expanded. The host may act as a metadata host for storing file metadata, and returning file metadata. The host can also be used as a query host, and is used for determining the metadata host storing the file metadata according to the metadata search request and acquiring the file metadata from the metadata host when receiving the metadata search request for the metadata of the target file.
It should be noted that the querying host and the metadata host may be the same host, that is, when the querying host receives a metadata search request for metadata of a target file, and determines that the metadata host storing the metadata of the file is a local host according to the metadata search request, the metadata of the file may be obtained from the local host.
Because all hosts in the system can be used as metadata hosts for storing the file metadata, rather than storing all the file metadata on a specific host in a centralized manner, as the scale of the system is larger and larger, more hosts can be used for storing the file metadata, that is, more hosts can receive queries for the file metadata, and therefore, the system is not limited by the service capability of the metadata hosts when being expanded.
In combination with the first aspect of the present application, in a first implementation of the first aspect of the present application, the system includes: and the routing table host is used for storing routing table entries and returning the target routing table entries, and the target routing table entries are used for indicating that the target file metadata is stored in the target metadata host.
With reference to the first implementation manner of the first aspect of the present application, in a second implementation manner of the first aspect of the present application, the query host includes a determining unit, where the determining unit is configured to determine a target metadata host storing the target file metadata; the determination unit includes: a first determining subunit, configured to determine, according to the metadata lookup request, a target routing table host in which a target routing table entry is stored; an obtaining subunit, configured to obtain the target routing table entry from the target routing table host determined by the first determining subunit; and the second determining subunit is configured to determine, according to the target routing table entry acquired by the acquiring subunit, the target metadata host in which the target file metadata is stored.
The routing table entry is used to indicate that the target file metadata is stored in the target metadata host.
Optionally, the querying host may not be able to determine the target metadata host storing the target file metadata directly through the metadata lookup request, in which case, the host in the data storage system may also serve as a routing table host, where the routing table host is configured to store a routing table entry, and the routing table entry is configured to indicate the target metadata host storing the target file metadata.
Specifically, the correspondence between the file metadata and the metadata host storing the file metadata may be stored as a routing table constructed in the form of a file directory, and each routing table entry in the routing table may indicate the correspondence between a file metadata and the metadata host storing the file metadata. In the embodiment of the present invention, the routing table is divided into a plurality of routing table entries, which are stored in each host of the system, and from another perspective, it can also be said that the stored routing table entries in the system form a complete routing table configured in the form of a file directory.
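As an illustration of the paragraph above, the following minimal Python sketch shows a routing table that is split into entries held by different hosts and that, taken together, forms one complete directory-to-host table. The host names, the directory paths, and the `RoutingTableEntry` type are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingTableEntry:
    directory: str      # key: the file directory the metadata belongs to
    metadata_host: str  # host that currently stores that file metadata


# Hypothetical partition: each host stores only some routing table entries.
entries_by_host = {
    "host-a": [RoutingTableEntry("/logs/2016", "host-c")],
    "host-b": [
        RoutingTableEntry("/images/raw", "host-a"),
        RoutingTableEntry("/images/thumbs", "host-b"),
    ],
    "host-c": [],
}


def complete_routing_table(partitions):
    """Merge the per-host entries into one directory -> metadata host table."""
    table = {}
    for entries in partitions.values():
        for entry in entries:
            table[entry.directory] = entry.metadata_host
    return table


routing_table = complete_routing_table(entries_by_host)
```

In a running system no host would need to build the merged table; the merge only shows that the distributed entries are equivalent to one routing table organized by file directory.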
The target routing table host storing the target routing table entry may be obtained by using a consistent Hash algorithm, so that the same host is determined regardless of whether the querying host needs to find the target routing table host in order to obtain the target routing table entry, or the metadata host needs to determine the target routing table host in order to store the target routing table entry. In addition, in general, the Hash algorithm distributes the routing table entries almost randomly and uniformly across a plurality of hosts, while the storage location of each individual routing table entry remains fixed.
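The embodiment does not specify how the consistent Hash algorithm is realized. One common realization is a hash ring with virtual nodes, sketched below in Python under that assumption; the class name `HashRing`, the use of SHA-256, and the number of virtual nodes are illustrative choices only.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Hash that gives the same result on every host and in every process."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring; each host is placed at several virtual points."""

    def __init__(self, hosts, virtual_nodes=64):
        self._ring = sorted(
            (stable_hash(f"{host}#{i}"), host)
            for host in hosts
            for i in range(virtual_nodes)
        )
        self._points = [point for point, _ in self._ring]

    def host_for(self, key: str) -> str:
        # First ring point clockwise of the key's hash, wrapping at the end.
        index = bisect.bisect_right(self._points, stable_hash(key))
        return self._ring[index % len(self._ring)][1]


hosts = ["host-a", "host-b", "host-c"]
query_side = HashRing(hosts)                  # built by a querying host
store_side = HashRing(list(reversed(hosts)))  # built by a storing host
```

Both rings resolve any key to the same routing table host even though they were built independently, which is the property the paragraph above relies on. With a ring, adding a host only moves the keys that now fall to the new host; the rest keep their routing table host.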
With reference to the second implementation manner of the first aspect of the present application, in a third implementation manner of the first aspect of the present application, the query host further includes: and the storage unit is used for storing the target file metadata after the query host acquires the target file metadata from the target metadata host.
With reference to the third implementation manner of the first aspect of the present application, in a fourth implementation manner of the first aspect of the present application, the target routing table host further includes: and the updating unit is used for updating the target routing table entry to obtain an updated target routing table entry after returning the target routing table entry, wherein the updated target routing table entry is used for indicating that the target file metadata is stored in the query host.
With reference to the third implementation manner of the first aspect of the present application and the fourth implementation manner of the first aspect of the present application, in a fifth implementation manner of the first aspect of the present application, the target metadata host further includes: and the deleting unit is used for deleting the metadata of the target file after returning the metadata of the target file.
With reference to the first aspect of the present application, the first implementation manner of the first aspect of the present application, the second implementation manner of the first aspect of the present application, the third implementation manner of the first aspect of the present application, and the fourth implementation manner of the first aspect of the present application, the query host further includes: a searching unit, configured to search for the target file metadata locally after the query host receives the metadata search request, and to trigger the query host to determine the target metadata host storing the target file metadata if the target file metadata cannot be found.
A second aspect of the present application provides a data search method for use in a system as in the first aspect of the present application, the method comprising: the query host receives a metadata search request for metadata of a target file; the query host determines a target metadata host which stores the target file metadata; the query host obtains the target file metadata from the target metadata host.
The metadata search request received by the query host may originate from a user who inputs, at a terminal, a search request for some data or files; the target file metadata is determined according to that search request, and the metadata search request for the target file metadata is generated accordingly. If the target routing table host and the query host are not the same host, the query host may send an acquisition request for the target routing table entry to the target routing table host, and the target routing table host returns the target routing table entry when receiving the acquisition request for the target routing table entry.
If the target metadata host and the query host are not the same host, the query host may send an acquisition request for the target file metadata to the target metadata host, and the target metadata host returns the target file metadata when receiving the acquisition request for the target file metadata. After the query host obtains the target file metadata from the target metadata host, optionally, the target file metadata may be moved to the query host, the target file metadata in the target metadata host is deleted, and the target routing table entry in the target routing table host is updated.
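The optional move of the target file metadata to the query host can be sketched as follows. This is a minimal in-memory Python illustration, assuming each host keeps its file metadata and its routing table entries in dictionaries; the `Host` class, the function name `fetch_and_migrate`, and the sample data are hypothetical.

```python
class Host:
    def __init__(self, name):
        self.name = name
        self.metadata = {}  # directory -> file metadata stored on this host
        self.routes = {}    # directory -> name of the metadata host


def fetch_and_migrate(query_host, routing_host, hosts_by_name, key):
    """Fetch the target file metadata and move it to the query host."""
    owner = hosts_by_name[routing_host.routes[key]]  # target metadata host
    metadata = owner.metadata[key]
    if owner is not query_host:
        query_host.metadata[key] = metadata       # store locally
        del owner.metadata[key]                   # delete on the old host
        routing_host.routes[key] = query_host.name  # update the entry
    return metadata


a, b, c = Host("host-a"), Host("host-b"), Host("host-c")
cluster = {host.name: host for host in (a, b, c)}
b.metadata["/logs/2016"] = {"data_host": "host-c"}
c.routes["/logs/2016"] = "host-b"

result = fetch_and_migrate(a, c, cluster, "/logs/2016")
```

After the call, host-a holds the metadata, host-b no longer does, and the routing table entry on host-c points at host-a, so a repeated request from host-a is served without touching host-b.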
Because all hosts in the system can be used as metadata hosts for storing the file metadata, rather than storing all the file metadata on a specific host in a centralized manner, as the scale of the system is larger and larger, more hosts can be used for storing the file metadata, that is, more hosts can receive queries for the file metadata, and therefore, the system is not limited by the service capability of the metadata hosts when being expanded.
In combination with the second aspect of the present application, in a first embodiment of the second aspect of the present application, the method comprises: the inquiring host determines a target routing table host storing a target routing table entry according to the metadata searching request; the inquiring host obtains the target routing table entry from the target routing table host; the querying host determines the target metadata host storing the target file metadata according to the target routing table entry.
In combination with the second aspect of the present application, in a second embodiment of the second aspect of the present application, the method further comprises: the querying host stores the target file metadata.
With reference to the second aspect of the present application, the first embodiment of the second aspect of the present application, and the second embodiment of the second aspect of the present application, the method further includes: the query host searches the target file metadata from the local; if the target file metadata is searched, returning the target file metadata; if the target file metadata cannot be searched, triggering the inquiry host to determine the target metadata host storing the target file metadata.
According to the technical scheme, the embodiment of the invention has the following advantages:
because all hosts in the system can be used as metadata hosts for storing the file metadata, rather than storing all the file metadata on a specific host in a centralized manner, as the scale of the system is larger and larger, more hosts can be used for storing the file metadata, that is, more hosts can receive queries for the file metadata, and therefore, the system is not limited by the service capability of the metadata hosts when being expanded.
Drawings
FIG. 1 is a schematic diagram of an architecture of a data storage system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an embodiment of a query host in an embodiment of the present application;
FIG. 3 is a diagram of an embodiment of a metadata host in an embodiment of the present application;
FIG. 4 is a schematic diagram of an embodiment of a routing table host in an embodiment of the present application;
FIG. 5 is a schematic diagram of an embodiment of a data lookup method in an embodiment of the present application;
FIG. 6 is a schematic diagram of another embodiment of a data lookup method in an embodiment of the present application;
FIG. 7 is a schematic diagram of another embodiment of a data lookup method in an embodiment of the present application;
FIG. 8 is a schematic diagram of an embodiment of a host in the system of the embodiment of the present application;
FIG. 9 is a schematic diagram of another embodiment of the host in the system in the embodiment of the present application.
Detailed Description
The embodiment of the invention provides a data storage system and a data searching method, which are used for storing file metadata in a distributed mode.
In order to make the technical solutions of the embodiments of the present invention better understood, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
With the development of communication technology, the era of big data has gradually arrived, and a single independent storage server can hardly meet the demand for large-scale data storage. There are many technologies for big data, and distributed storage technology is a key technology of big data systems. The distributed storage technology can share the local storage of a plurality of hosts and use the plurality of hosts as a unified storage system; when any host needs certain data or files in the system, it can access any other host in the storage system to acquire the data or files.
The distributed file system HDFS is by far the most widely used distributed storage technology. In HDFS, a host that actually stores a data file is referred to as a data host (Data Node). A Data Node may not only store data or files but also access other hosts to acquire the data or files it needs. Specifically, any Data Node in HDFS generates file metadata according to the data or file and then stores the file metadata on another specific host dedicated to storing file metadata, which is called a metadata host (Name Node). It should be noted that, in general, HDFS has two Name Nodes that store the same content and serve as active and standby for each other, so as to guard against uncontrollable situations such as failure or intrusion. When an application on Data Node1 needs to read data or a file, or a user needs to search for data or a file using Data Node1, Data Node1 generally accesses the Name Node and requests it to return the file metadata of the data or file, then determines, according to the file metadata returned by the Name Node, the host Data Node2 where the data or file is actually stored, and then obtains the data or file from Data Node2.
However, the Name Node in HDFS is a single host. As the system expands, more and more Data Nodes need to search for data or files, and a single host has limited service capability in the face of such a large search volume. Because the service capability of the Name Node is often insufficient to handle so many searches, searches by the Data Nodes may become slow, and in severe cases the system may even crash, so that the scale of the system is difficult to continue expanding.
Therefore, in the embodiment of the present invention, a data storage system and a data search method are provided, which are used for storing file metadata in a distributed manner, and since all hosts in the system can store the file metadata without storing all the file metadata on one or two specific hosts in a centralized manner, when the system is increased in scale, more and more hosts can be used for storing the file metadata, that is, the service capability of receiving queries on the file metadata is increased, and therefore, the expansion of the system is not affected by the storage amount of the file metadata and the search amount of the file metadata.
In view of this, as shown in fig. 1, there is a schematic structural diagram of a data storage system according to an embodiment of the present invention, the data storage system is composed of a plurality of hosts, and the hosts can be used as metadata hosts for storing file metadata and returning the file metadata. The host can also be used as a query host, and is used for determining the metadata host storing the file metadata according to the metadata search request and acquiring the file metadata from the metadata host when receiving the metadata search request for the metadata of the target file. It should be noted that the querying host and the metadata host may be the same host, that is, when the querying host receives a metadata search request for metadata of a target file, and determines that the metadata host storing the metadata of the file is a local host according to the metadata search request, the metadata of the file may be obtained from the local host. In other possible embodiments, when determining the metadata host storing the file metadata, it may also first query whether the host stores the file metadata, which is not limited herein.
Alternatively, in some possible embodiments, the querying host may not be able to determine the target metadata host storing the target file metadata directly from the metadata lookup request, in which case the host in the data storage system may also act as a routing table host for storing a routing table entry indicating the target metadata host storing the target file metadata. Specifically, in the embodiment of the present invention, the correspondence between the file metadata and the metadata host storing the file metadata may be stored as a routing table constructed in the form of a file directory, and each routing table entry in the routing table may indicate the correspondence between one file metadata and the metadata host storing the file metadata. In the embodiment of the present invention, the routing table is divided into a plurality of routing table entries, which are stored in each host of the system, and from another perspective, it can also be said that the stored routing table entries in the system form a complete routing table configured in the form of a file directory.
In some possible embodiments, the target routing table host storing the target routing table entry may be derived using a consistent Hash algorithm, and the same determined host may be obtained whether the querying host needs to query the target routing table host for obtaining the target routing table entry or whether the metadata storing host needs to determine the target routing table host for storing the target routing table entry.
Specifically, FIG. 2 is a schematic diagram of an embodiment in which a host in the system is used as a query host 200. The query host 200 includes:
a receiving unit 201, configured to receive a metadata search request for the target file metadata.
A determining unit 203, configured to determine the target metadata host storing the target file metadata indicated by the metadata search request received by the receiving unit 201.
An obtaining unit 204, configured to obtain the target file metadata from the target metadata host determined by the determining unit 203.
Specifically, the determining unit 203 may include:
a first determining subunit 2031, configured to determine, according to the metadata lookup request received by the receiving unit 201, a target routing table host storing a target routing table entry.
An obtaining sub-unit 2032, configured to obtain the target routing table entry from the target routing table host determined by the first determining sub-unit 2031.
A second determining subunit 2033, configured to determine, according to the target routing table entry acquired by the acquiring subunit 2032, the target metadata host storing the target file metadata.
Optionally, the querying host 200 may further include:
a first storage unit 205, configured to store the target file metadata acquired by the acquisition unit 204 after the acquisition unit 204 acquires the target file metadata from the target metadata host.
Optionally, the querying host 200 may further include:
a searching unit 202, configured to search for the target file metadata locally after the receiving unit 201 receives a metadata search request for the target file metadata, return the target file metadata if it is found, and trigger the determining unit 203 to execute its function if it is not found.
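The cooperation of the units of the query host 200 can be sketched as one lookup routine. The Python sketch below is an illustration only: the class `QueryHost`, its three injected callables, and the sample data are hypothetical, and network communication is replaced by dictionary lookups.

```python
class QueryHost:
    def __init__(self, name, find_routing_host, fetch_entry, fetch_metadata):
        self.name = name
        self.local = {}                                # first storage unit 205
        self._find_routing_host = find_routing_host    # first determining subunit 2031
        self._fetch_entry = fetch_entry                # obtaining subunit 2032
        self._fetch_metadata = fetch_metadata          # obtaining unit 204

    def handle(self, key):
        """Receiving unit 201: handle a metadata search request for `key`."""
        if key in self.local:                          # searching unit 202
            return self.local[key]
        routing_host = self._find_routing_host(key)
        entry = self._fetch_entry(routing_host, key)
        metadata_host = entry                          # second determining subunit 2033
        metadata = self._fetch_metadata(metadata_host, key)
        self.local[key] = metadata                     # store for later requests
        return metadata


# Hypothetical in-memory stand-ins for the routing table and metadata hosts.
routes = {"rt-1": {"/logs/2016": "meta-7"}}
stores = {"meta-7": {"/logs/2016": {"data_host": "data-3"}}}
remote_calls = []


def fetch_metadata(host, key):
    remote_calls.append(host)
    return stores[host][key]


query_host = QueryHost(
    "query-1",
    find_routing_host=lambda key: "rt-1",
    fetch_entry=lambda host, key: routes[host][key],
    fetch_metadata=fetch_metadata,
)
first = query_host.handle("/logs/2016")
second = query_host.handle("/logs/2016")
```

The second request is answered from local storage, so the metadata host is contacted only once.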
FIG. 3 is a schematic diagram of an embodiment in which a host in the system is used as a metadata host 300. The metadata host 300 includes:
a second storage unit 301, configured to store file metadata.
A first returning unit 302, configured to return the target file metadata.
Optionally, the metadata host 300 may further include:
a deleting unit 303, configured to delete the target file metadata after returning the target file metadata.
FIG. 4 is a schematic diagram of an embodiment in which a host in the system is used as a routing table host 400. The routing table host 400 includes:
a third storage unit 401, configured to store routing table entries.
A second returning unit 402, configured to return, when receiving a route query request for a target routing table entry, the target routing table entry stored by the third storage unit 401, the target routing table entry being used to indicate that the target file metadata is stored in the target metadata host.
Optionally, the routing table host 400 may further include:
an updating unit 403, configured to update the target routing table entry after the second returning unit 402 returns the target routing table entry, so that the updated target routing table entry indicates that the target file metadata is stored in the query host.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described units may refer to corresponding processes of step 501 to step 510 in the following method embodiments, and are not described herein again.
For understanding, a specific interactive flow of the data searching method in the embodiment of the present invention is described below, and referring to fig. 5, an embodiment of the data searching method in the embodiment of the present invention includes:
501. The querying host receives a metadata lookup request for metadata of a target file.
In the embodiment of the invention, the metadata search request received by the query host may originate from a user who inputs, at a terminal, a search request for some data or files; the target file metadata is determined according to that search request, and the metadata search request for the target file metadata is generated accordingly. In other possible embodiments, a metadata search request for the target file metadata may be received from the user directly, for example when the user types a link or code into the download window of a download program, so that a metadata search request for the target file metadata is generated and received by the query host. Alternatively, when the query host is a server or a terminal and runs an application program that needs a target data file, a requirement for the target file metadata arises, and a metadata search request for the target file metadata can be received.
In some possible embodiments, the target file metadata or file metadata may be a link, or a seed file, or other indication that the target data file is stored, and is not limited herein. In other possible embodiments, the querying host may also receive a metadata lookup request for metadata of the target file sent by another host, which is not limited herein.
It should be noted that, in the embodiment of the present invention, when the query host receives a metadata search request for the target file metadata, the target metadata host storing the target file metadata may be determined according to the metadata search request. In some possible embodiments, the target metadata host storing the metadata of the target file may be determined directly through the metadata lookup request, and in other possible embodiments, the following specific manner from step 504 to step 506 may also be implemented.
502. The query host searches the target file metadata locally.
Optionally, in some possible embodiments, when the querying host receives a metadata lookup request for the target file metadata, it may first search for the target file metadata locally. In particular, the querying host may perform the search by traversing a local file directory. Typically, most applications divide the storage of files into multiple directories: the data files operated on by different hosts are stored in different directories, and an application traverses files on the basis of the file directory. It should be noted that most applications use and access the file system in accordance with this directory-based storage form; if an application does not access files in this way, it may be modified to adapt to this mode so as to achieve optimal performance.
In another possible embodiment, the query host determines that the target metadata host storing the target file metadata is local according to a metadata lookup request for the target file metadata, which is not limited herein.
It should be noted that, if the target file metadata is found by the local search, step 503 in the following embodiment is executed, and if the target file metadata is not found by the local search, steps 504 to 506 in the following embodiment are executed.
503. The querying host returns the target file metadata.
In some possible embodiments, when the querying host finds the target file metadata locally, it may return the file metadata, so that the target data file can be queried through the target file metadata.
504. The query host determines the target routing table host storing the target routing table entry according to the metadata search request.
Optionally, in some possible embodiments, when the querying host receives a metadata lookup request for the target file metadata, it may determine the target routing table host storing the target routing table entry from the metadata lookup request. Specifically, the metadata search request may include a key (a file directory), and a given hash function is applied to the key to obtain h(key). It should be noted that any host applying the given hash function to the same key at any time obtains the same hash value. This ensures that, whether a host is querying the target routing table entry or storing it, the corresponding node, that is, the target routing table host, can always be found from the hash value. In addition, the hash algorithm generally distributes routing table entries across a plurality of hosts almost randomly and uniformly, while the storage location of each individual routing table entry remains fixed for a given key.
In some feasible embodiments, routing table entries are backed up to prevent data loss in unexpected situations, so two or more copies of a routing table entry are usually kept. Preferably, two hosts store the target routing table entry and act as active and standby for each other. Consequently, when the query host determines the target routing table host according to the metadata lookup request, there may be two or more target routing table hosts; preferably, the one closer to the local host, or the one with the shorter data path, is selected.
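The mapping from key to routing table host described in step 504 can be sketched as follows. This is only an illustrative sketch under stated assumptions: the patent does not name a hash function or a placement rule, so the use of SHA-256, the modulo placement over an ordered host list, and the names `routing_table_hosts` and `pick_nearest` are all hypothetical.

```python
import hashlib
from typing import Dict, List, Sequence


def routing_table_hosts(key: str, hosts: Sequence[str], replicas: int = 2) -> List[str]:
    """Return the hosts assumed to hold the routing table entry for `key`.

    Assumption: every host knows the same ordered host list and uses the
    same hash function, so any host computes the same result for a key.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(hosts)
    count = min(replicas, len(hosts))
    # The first host acts as the active copy, the following ones as standby.
    return [hosts[(start + i) % len(hosts)] for i in range(count)]


def pick_nearest(candidates: Sequence[str], distance: Dict[str, int]) -> str:
    """Choose the candidate with the shortest (hypothetical) data path."""
    return min(candidates, key=lambda host: distance[host])


hosts = ["host-a", "host-b", "host-c", "host-d"]
candidates = routing_table_hosts("/data/logs", hosts)
```

Because the result depends only on the key and the shared host list, a host that stores an entry and a host that later queries it arrive at the same candidates without any central coordinator.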
505. The querying host obtains the target routing table entry from the target routing table host.
In some possible embodiments, if the target routing table host and the querying host are not the same host, the querying host may send an acquisition request for the target routing table entry to the target routing table host, and the target routing table host returns the target routing table entry when it receives the acquisition request. In other possible embodiments, if the target routing table host and the query host are the same host, the required target routing table entry may be obtained locally in a manner similar to steps 502 and 503 in the above embodiments, which is not described herein again. In other possible embodiments, the querying host may access the target routing table host and traverse the file directory of the target routing table host to query the target routing table entry.
506. The query host determines the target metadata host storing the target file metadata according to the target routing table entry.
In some possible embodiments, the target routing table entry indicates the target metadata host storing the target file metadata. Specifically, the entry maps the target file metadata to the target metadata host, so the target metadata host can be determined from the entry.
507. The query host obtains the target file metadata from the target metadata host.
In some possible embodiments, if the target metadata host and the querying host are not the same host, the querying host may send a get request for the target file metadata to the target metadata host, and the target metadata host returns the target file metadata when it receives the get request. In other possible embodiments, if the target metadata host and the query host are the same host, step 502 and step 503 in the above embodiments may be performed, and are not described herein again. In other possible embodiments, the querying host may access the target metadata host and traverse the file directory of the target metadata host to query the target file metadata.
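Steps 504 to 507 amount to a two-hop lookup: first the routing table entry, then the file metadata. A minimal in-memory sketch is given below. The `Host` class, the `find_metadata` function, the dictionary-based cluster, and the `routing_host_for` callback (which stands in for the hash step) are hypothetical illustrations, not structures defined by the patent; network requests are replaced by dictionary access.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Host:
    """Hypothetical in-memory stand-in for one host in the system."""
    name: str
    # key (file directory) -> name of the host that stores the file metadata
    routing_entries: Dict[str, str] = field(default_factory=dict)
    # key (file directory) -> file metadata
    metadata: Dict[str, dict] = field(default_factory=dict)


def find_metadata(key: str, cluster: Dict[str, Host],
                  routing_host_for: Callable[[str], str]) -> dict:
    # Step 504: derive the target routing table host from the key alone.
    routing_host = cluster[routing_host_for(key)]
    # Step 505: obtain the target routing table entry from that host.
    metadata_host_name = routing_host.routing_entries[key]
    # Step 506: the entry names the target metadata host.
    metadata_host = cluster[metadata_host_name]
    # Step 507: obtain the target file metadata from the target metadata host.
    return metadata_host.metadata[key]


cluster = {
    "query": Host("query"),
    "router": Host("router", routing_entries={"/videos": "store"}),
    "store": Host("store", metadata={"/videos": {"link": "example-link"}}),
}
result = find_metadata("/videos", cluster, routing_host_for=lambda key: "router")
```

In this sketch an unknown key raises `KeyError`; how a real system reports a missing entry is not specified in the text.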
In some possible embodiments, after the query host obtains the target file metadata from the target metadata host, optionally, the target file metadata may be moved to the local host, the target file metadata in the target metadata host may be deleted, and the target routing table entry in the target routing table host may be updated. Specifically, refer to steps 508, 509, and 510 of the following embodiments. It should be noted that steps 508, 509, and 510 have no required order among them.
508. The querying host stores the target file metadata.
In some possible embodiments, the querying host preferably stores the target file metadata when it determines that the target file metadata will be used frequently, so that later uses can be served by a local query without accessing the target metadata host. If this scheme is applied widely in the system, most querying hosts will hold the file metadata they need and most requests for file metadata can be satisfied locally, which greatly reduces lookups on other hosts in the system and improves the performance and efficiency of the system.
509. The target routing table host updates the target routing table entry to obtain an updated target routing table entry, wherein the updated target routing table entry is used for indicating that the target file metadata is stored in the query host.
In some possible embodiments, after the target routing table host returns the target routing table entry to the querying host, the querying host may notify the target routing table host once it has stored the target file metadata, and the target routing table host then updates the target routing table entry to obtain the updated target routing table entry. In other possible embodiments, if the rules of the system preset that the query host stores the target file metadata after acquiring it, the querying host does not need to send this information to the target routing table host, and the target routing table host may update the target routing table entry by itself after returning it. Optionally, the updated target routing table entry may indicate that the target file metadata is stored in the querying host. If the target metadata host deletes the target file metadata, the updated target routing table entry indicates that the target file metadata is stored in the querying host and no longer contains the information that it is stored in the target metadata host; for details, refer to step 510 in the following embodiments. However, if the target metadata host retains the target file metadata, the updated target routing table entry may indicate that the target file metadata is stored in both the querying host and the target metadata host, which is not limited herein.
510. The target metadata host deletes the target file metadata.
In some possible embodiments, optionally, after the target metadata host returns the target file metadata to the querying host, the target metadata host may delete the target file metadata if it determines that the target file metadata is not commonly used by that host. Alternatively, after the target metadata host returns the target file metadata, the querying host may notify the target metadata host once it has stored the target file metadata, and the target metadata host then deletes the target file metadata. In other possible embodiments, if the rules of the system preset that the query host stores the target file metadata after obtaining it, the query host does not need to send this information to the target metadata host, and the target metadata host may delete the target file metadata by itself after returning it.
It should be noted that, although there is no timing relationship among step 508, step 509 and step 510 in the embodiment, step 508 and step 510 should correspond to step 509. If the target file metadata is stored by the querying host in step 508, then the target routing table host updates the target routing table entry in step 509, which should indicate that the target file metadata is stored in the querying host. Similarly, if the target metadata host deletes the target file metadata, the target routing table host should update the target routing table entry accordingly, and the obtained updated target routing table entry should no longer indicate that the target file metadata is stored in the target metadata host.
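The consistency requirement among steps 508, 509 and 510 can be illustrated with a short sketch. The dictionary layout, the `make_host` helper, the `migrate_metadata` function and the `keep_source_copy` flag are hypothetical; the sketch assumes a routing table entry is a set of host names holding the metadata, which is one possible representation and not one stated in the patent.

```python
def make_host(name):
    """Hypothetical host record: metadata store plus routing table entries."""
    return {"name": name, "metadata": {}, "routing_entries": {}}


def migrate_metadata(key, query_host, routing_host, metadata_host,
                     keep_source_copy=False):
    metadata = metadata_host["metadata"][key]
    # Step 508: the querying host stores the target file metadata locally.
    query_host["metadata"][key] = metadata
    holders = {query_host["name"]}
    if keep_source_copy:
        # The target metadata host retains its copy, so the entry lists both.
        holders.add(metadata_host["name"])
    else:
        # Step 510: the target metadata host deletes the target file metadata.
        del metadata_host["metadata"][key]
    # Step 509: the updated entry names exactly the hosts that hold the metadata.
    routing_host["routing_entries"][key] = holders


query, router, store = make_host("query"), make_host("router"), make_host("store")
store["metadata"]["/videos"] = {"link": "example-link"}
router["routing_entries"]["/videos"] = {"store"}
migrate_metadata("/videos", query, router, store)
```

Whatever order the three steps run in, the end state in this sketch satisfies the rule in the text: the entry points at the querying host, and it mentions the former metadata host only if that host kept its copy.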
In the above embodiment, steps 501 to 510 describe the data searching method through an interactive flowchart. For ease of understanding, the following embodiment, shown in Fig. 6, describes the data searching method from the perspective of the hosts in the system. It should be noted that, for simplicity of description, the following embodiment does not include the process in which the querying host queries locally, nor the processes in which the querying host stores the target file metadata, the target routing table host updates the target routing table entry, and the target metadata host deletes the target file metadata. That is, step 601 in the following embodiment corresponds to step 501 in the above embodiment, and steps 602 to 605 correspond to steps 504 to 507 in the above embodiment.
Another embodiment of the data searching method in the embodiment of the present invention includes:
601. The querying host receives a metadata lookup request for metadata of a target file.
602. The query host determines the target routing table host storing the target routing table entry according to the metadata searching request.
603. The querying host obtains the target routing table entry from the target routing table host.
604. The query host determines the target metadata host storing the target file metadata according to the target routing table entry.
605. The query host obtains the target file metadata from the target metadata host.
Step 601 in this embodiment is the same as step 501 in the above embodiment, and steps 602 to 605 are the same as steps 504 to 507 in the above embodiment, and are not repeated here.
In the above embodiment, steps 601 to 605 describe the data searching method from the perspective of the system. For ease of understanding, the following embodiment, shown in Fig. 7, describes the data searching method from the perspective of the flow on the querying host side alone.
Referring to fig. 7, another embodiment of the data searching method according to the embodiment of the present invention includes:
701. The querying host receives a metadata lookup request for metadata of a target file.
702. The querying host determines whether the target file metadata exists locally by searching for the target file metadata locally.
703. The querying host returns the target file metadata.
704. The query host determines the target routing table host storing the target routing table entry according to the metadata searching request.
705. The querying host obtains the target routing table entry from the target routing table host.
706. The query host determines the target metadata host storing the target file metadata according to the target routing table entry.
707. The query host obtains the target file metadata from the target metadata host.
708. The querying host stores the target file metadata.
Steps 701 to 708 in this embodiment are the same as steps 501 to 508 in the above embodiment, and are not described again here.
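The querying-host flow of steps 701 to 708 can be sketched as a local check followed by a remote lookup and a local store. The `QueryHost` class and its `fetch_remote` callback are hypothetical; the callback stands in for steps 704 to 707 and would, in a real system, contact the target routing table host and the target metadata host.

```python
from typing import Callable, Dict


class QueryHost:
    """Hypothetical querying host that serves repeated requests locally."""

    def __init__(self, fetch_remote: Callable[[str], dict]):
        self.local_metadata: Dict[str, dict] = {}
        self._fetch_remote = fetch_remote  # stands in for steps 704 to 707

    def find(self, key: str) -> dict:
        # Step 702: search for the target file metadata locally.
        if key in self.local_metadata:
            # Step 703: found locally, return it without contacting other hosts.
            return self.local_metadata[key]
        # Steps 704 to 707: resolve the metadata through the other hosts.
        metadata = self._fetch_remote(key)
        # Step 708: store the metadata so later requests are served locally.
        self.local_metadata[key] = metadata
        return metadata


remote_calls = []


def fake_remote(key: str) -> dict:
    remote_calls.append(key)
    return {"link": "example-link"}


host = QueryHost(fake_remote)
first = host.find("/videos")
second = host.find("/videos")
```

In this sketch the second request for the same key does not reach `fake_remote`, which is the effect the text attributes to storing the target file metadata on the querying host.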
The host of the system in the embodiment of the present invention may be a server. Fig. 8 is a schematic structural diagram of a server provided in the embodiment of the present invention. The server 800 may vary considerably with configuration or performance, and may include one or more central processing units (CPUs) 822 (e.g., one or more processors), a memory 832, and one or more storage media 830 (e.g., one or more mass storage devices) for storing the application programs 842 or the data 844. The memory 832 and the storage medium 830 may be transient or persistent storage. The program stored in the storage medium 830 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 822 may be configured to communicate with the storage medium 830 and to execute, on the server 800, the series of instruction operations in the storage medium 830.
The server 800 may also include one or more power supplies 826, one or more wired or wireless network interfaces 850, one or more input-output interfaces 858, and/or one or more operating systems 841, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.
The host in the system in the embodiment of the present invention may be a terminal or a server, as long as it has the functions of the query host and the metadata host; preferably, the host may further have the function of a routing table host. Specifically, the terminal may be any terminal device such as a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sales) terminal, or a vehicle-mounted computer. As shown in Fig. 9, a mobile phone is taken as an example. For convenience of description, only the part related to the embodiment of the present invention is shown; for specific technical details that are not disclosed, refer to the method part of the embodiment of the present invention:
Fig. 9 is a block diagram showing a partial structure of a mobile phone related to a terminal provided by an embodiment of the present invention. Referring to Fig. 9, the handset includes: a radio frequency (RF) circuit 910, a memory 920, an input unit 930, a display unit 940, a sensor 950, an audio circuit 960, a wireless fidelity (WiFi) module 970, a processor 980, and a power supply 990. Those skilled in the art will appreciate that the handset configuration shown in Fig. 9 is not limiting: the handset may include more or fewer components than those shown, some components may be combined, or the components may be arranged differently.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative: the division of the units is only one logical division, and other divisions may be used in practice; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
For convenience and brevity of description, the specific working process of the server or the terminal described above may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A data storage system comprising a plurality of hosts, the system comprising:
the metadata host is used for storing the file metadata and returning the target file metadata;
the query host is used for receiving a metadata search request for the target file metadata, determining a target metadata host in which the target file metadata are stored, and acquiring the target file metadata from the target metadata host;
and the routing table host is used for storing routing table entries and returning the target routing table entries, and the target routing table entries are used for indicating that the target file metadata is stored in the target metadata host.
2. The system according to claim 1, wherein the querying host comprises a determining unit, the determining unit is configured to determine a target metadata host storing the target file metadata, and the determining unit comprises:
the first determining subunit is used for determining a target routing table host in which a target routing table entry is stored according to the metadata searching request;
an obtaining subunit, configured to obtain the target routing table entry from the target routing table host determined by the first determining subunit;
a second determining subunit, configured to determine, according to the target routing table entry acquired by the acquiring subunit, the target metadata host in which the target file metadata is stored.
3. The system of claim 2, wherein the querying host further comprises:
and the storage unit is used for storing the target file metadata after the query host acquires the target file metadata from the target metadata host.
4. The system of claim 3, wherein the target routing table host further comprises:
and the updating unit is used for updating the target routing table entry to obtain an updated target routing table entry after the target routing table entry is returned, wherein the updated target routing table entry is used for indicating that the target file metadata is stored in the query host.
5. The system of claim 3 or 4, wherein the target metadata host further comprises:
and the deleting unit is used for deleting the metadata of the target file after returning the metadata of the target file.
6. The system of any of claims 1-4, wherein the querying host further comprises:
and the searching unit is used for searching the target file metadata locally after the querying host receives the metadata searching request, and triggering the querying host to determine a target metadata host storing the target file metadata if the target file metadata cannot be found.
7. A method for data searching, for use in a system as claimed in any one of claims 1 to 6, the method comprising:
the query host receives a metadata search request for metadata of a target file;
the query host determines a target metadata host storing the target file metadata;
the query host acquires the target file metadata from the target metadata host;
the querying host determining a target metadata host storing the target file metadata comprises:
the query host determines a target routing table host in which target routing table entries are stored according to the metadata search request;
the query host acquires the target routing table entry from the target routing table host;
and the query host determines the target metadata host storing the target file metadata according to the target routing table entry.
8. The method of claim 7, wherein after the querying host obtains the target file metadata from the target metadata host, the method further comprises:
the querying host stores the target file metadata.
9. The method of claim 7 or 8, wherein after the querying host receives the metadata lookup request for the metadata of the target file, the method further comprises:
the query host searches the target file metadata from the local;
if the target file metadata is searched, returning the target file metadata;
and if the target file metadata cannot be searched, triggering the inquiry host to determine the target metadata host storing the target file metadata.
CN201610393783.3A 2016-06-06 2016-06-06 Data storage system and data searching method Active CN107463577B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610393783.3A CN107463577B (en) 2016-06-06 2016-06-06 Data storage system and data searching method

Publications (2)

Publication Number Publication Date
CN107463577A CN107463577A (en) 2017-12-12
CN107463577B true CN107463577B (en) 2021-01-29

Family

ID=60545329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610393783.3A Active CN107463577B (en) 2016-06-06 2016-06-06 Data storage system and data searching method

Country Status (1)

Country Link
CN (1) CN107463577B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108449376A (en) * 2018-01-31 2018-08-24 合肥和钧正策信息技术有限公司 A kind of load-balancing method of big data calculate node that serving enterprise

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102201986A (en) * 2011-05-10 2011-09-28 苏州两江科技有限公司 Zonal routing method for non-relational database Cassandra
CN102737130A (en) * 2012-06-21 2012-10-17 广州从兴电子开发有限公司 Method and system for processing metadata of hadoop distributed file system (HDFS)
CN102890678A (en) * 2011-07-20 2013-01-23 华东师范大学 Gray-code-based distributed data layout method and query method
CN103150394A (en) * 2013-03-25 2013-06-12 中国人民解放军国防科学技术大学 Distributed file system metadata management method facing to high-performance calculation
CN103647797A (en) * 2013-11-15 2014-03-19 北京邮电大学 Distributed file system and data access method thereof
CN103793534A (en) * 2014-02-28 2014-05-14 苏州博纳讯动软件有限公司 Distributed file system and implementation method for balancing storage loads and access loads of metadata
CN104391930A (en) * 2014-11-21 2015-03-04 用友软件股份有限公司 Distributed file storage device and method
CN104461792A (en) * 2014-12-03 2015-03-25 浪潮集团有限公司 HA method for clearing single-point failure of NAMENODE of HADOOP distributed file system
WO2015124042A1 (en) * 2014-02-24 2015-08-27 华为技术有限公司 Method, device and host for updating metadata stored in columns in distributed file system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7640267B2 (en) * 2002-11-20 2009-12-29 Radar Networks, Inc. Methods and systems for managing entities in a computing device using semantic objects
US9118425B2 (en) * 2012-05-31 2015-08-25 Magnum Semiconductor, Inc. Transport stream multiplexers and methods for providing packets on a transport stream
CN104967641B (en) * 2014-08-15 2017-06-23 浙江大华技术股份有限公司 A kind of method and device for realizing active and standby meta server data syn-chronization
US20160065479A1 (en) * 2014-08-26 2016-03-03 rift.IO, Inc. Distributed input/output architecture for network functions virtualization

Also Published As

Publication number Publication date
CN107463577A (en) 2017-12-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200201

Address after: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Applicant after: HUAWEI TECHNOLOGIES Co.,Ltd.

Address before: 210012 HUAWEI Nanjing base, 101 software Avenue, Yuhuatai District, Jiangsu, Nanjing

Applicant before: Huawei Technologies Co.,Ltd.

GR01 Patent grant